  • Stars: 5
  • Rank: 2,861,937 (Top 57 %)
  • Language: Python
  • License: Apache License 2.0
  • Created over 4 years ago
  • Updated over 4 years ago

Repository Details

Scripts for crawling the 500 most-visited websites in Thailand (as ranked by Alexa) to collect `th`-`en` parallel texts.

More Repositories

1. thai2transformers: Pretraining transformer based Thai language models (Jupyter Notebook, 116 stars)
2. wav2vec2-large-xlsr-53-th: Finetune wav2vec2-large-xlsr-53 with Thai Common Voice Corpus 7.0 (Jupyter Notebook, 45 stars)
3. WangchanX: WangchanX Fine-tuning Pipeline (Jupyter Notebook, 42 stars)
4. Thai-NNER: Pytorch implementation of paper: Thai Nested Named Entity Recognition (Python, 39 stars)
5. dataset-releases (36 stars)
6. commonvoice-th: Kaldi recipe to train commonvoice corpus in Thai language (Shell, 33 stars)
7. thai2nmt: English-Thai Machine Translation Models (Python, 27 stars)
8. mt-opus: English-Thai Machine Translation with OPUS data (Jupyter Notebook, 19 stars)
9. ai-builders: A program for kids who want to build good AI (Jupyter Notebook, 18 stars)
10. vistec-ser: Speech Emotion Recognition using PyTorch sponsored by AIS and VISTEC-DEPA AIResearch Institute Thailand. (Python, 17 stars)
11. crfcut: Thai sentence segmentation with conditional random fields (Jupyter Notebook, 15 stars)
12. model-releases (14 stars)
13. WangchanX-Eval: WangchanX Eval (Python, 9 stars)
14. wangchan-analytica: Business Analytics class at VISTEC (Jupyter Notebook, 8 stars)
15. colab: Collections of Google Colab notebooks and some data. (Jupyter Notebook, 7 stars)
16. sme-depa: Help small businesses make money from their transaction data; workshop at depa (Jupyter Notebook, 7 stars)
17. WSSET: TF2 implementation of paper: Self-supervised Deep Metric Learning for Pointsets, ICDE 2021 (Python, 7 stars)
18. WangchanLion (Python, 5 stars)
19. mt-datasets: Collecting bi-/tri-lingual sources for MT workstream (Jupyter Notebook, 4 stars)
20. thwiki-text (Python, 4 stars)
21. fake_reviews: Generate fake Amazon review datasets for VISTEC-depa machine translation project using CTRL (Jupyter Notebook, 3 stars)
22. thai2nmt_preprocess (Python, 3 stars)
23. Bilingual-Financial-NER-Model (Python, 3 stars)
24. ai-builders-orientation: Lesson 0 - Orientation (Jupyter Notebook, 2 stars)
25. pdf2parallel: Extract en-th parallel sentences from PDFs (Python, 2 stars)
26. paracrawl-en-th: Replicate paracrawl for en-th parallel texts (Jupyter Notebook, 2 stars)
27. ai2api: Productionize NLP models trained on Pytorch by AIResearch.in.th (Jupyter Notebook, 1 star)
28. ner_workshop (1 star)
29. capital_market_text_data (1 star)
30. scb_workshop (1 star)