  • Stars
    5
  • Rank 2,785,006 (Top 57 %)
  • Language
    Python
  • License
    Apache License 2.0
  • Created about 4 years ago
  • Updated about 4 years ago

Repository Details

Scripts for crawling the 500 most-visited websites in Thailand (as ranked by Alexa) to collect `th`-`en` parallel texts.
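
As an illustration of what collecting `th`-`en` parallel texts from a crawl can involve, here is a minimal sketch of one common heuristic: pairing crawled URLs that are identical except for a language path segment. This is an assumption made for illustration, not the repository's actual method; the function name `pair_by_language_segment`, the `LANG_SEGMENT` pattern, and the `example.com` URLs are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical heuristic: a language tag that forms a whole path segment,
# such as /th/ or /en/ (or /th, /en at the very end of the URL).
LANG_SEGMENT = re.compile(r"/(th|en)(?=/|$)")


def pair_by_language_segment(urls):
    """Return (th_url, en_url) pairs for URLs that differ only in /th/ vs /en/."""
    groups = defaultdict(dict)
    for url in urls:
        match = LANG_SEGMENT.search(url)
        if match is None:
            continue  # no language segment, so this URL cannot be paired this way
        # Replace the first language segment with a placeholder to build a shared key.
        key = LANG_SEGMENT.sub("/{lang}", url, count=1)
        # Keep the first URL seen for each language under this key.
        groups[key].setdefault(match.group(1), url)
    # Keep only keys for which both a Thai and an English URL were seen.
    return [(g["th"], g["en"]) for g in groups.values() if "th" in g and "en" in g]


if __name__ == "__main__":
    crawled = [
        "https://example.com/th/about",
        "https://example.com/en/about",
        "https://example.com/th/news",   # no English counterpart
        "https://example.com/contact",   # no language segment
    ]
    for th_url, en_url in pair_by_language_segment(crawled):
        print(th_url, "<->", en_url)
```

URL pairing only yields candidate page pairs; the page contents would still need sentence-level alignment and filtering before they could be used as parallel text.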

More Repositories

1. thai2transformers (Jupyter Notebook, 111 stars): Pretraining transformer based Thai language models
2. wav2vec2-large-xlsr-53-th (Jupyter Notebook, 43 stars): Finetune wav2vec2-large-xlsr-53 with Thai Common Voice Corpus 7.0
3. Thai-NNER (Python, 35 stars): Pytorch implementation of paper: Thai Nested Named Entity Recognition
4. dataset-releases (34 stars)
5. commonvoice-th (Shell, 28 stars): Kaldi recipe to train commonvoice corpus in Thai language
6. thai2nmt (Python, 26 stars): English-Thai Machine Translation Models
7. WangchanX (Jupyter Notebook, 23 stars): WangchanX Fine-tuning Pipeline
8. mt-opus (Jupyter Notebook, 18 stars): English-Thai Machine Translation with OPUS data
9. ai-builders (Jupyter Notebook, 18 stars): A program for kids who want to build good AI
10. crfcut (Jupyter Notebook, 15 stars): Thai sentence segmentation with conditional random fields
11. vistec-ser (Python, 15 stars): Speech Emotion Recognition using PyTorch sponsored by AIS and VISTEC-DEPA AIResearch Institute Thailand.
12. model-releases (13 stars)
13. wangchan-analytica (Jupyter Notebook, 8 stars): Business Analytics class at VISTEC
14. WangchanX-Eval (Python, 8 stars): WangchanX Eval
15. colab (Jupyter Notebook, 7 stars): Collections of Google Colab notebooks and some data.
16. sme-depa (Jupyter Notebook, 7 stars): Help small businesses make money from their transaction data; workshop at depa
17. WSSET (Python, 7 stars): TF2 implementation of paper: Self-supervised Deep Metric Learning for Pointsets, ICDE 2021
18. WangchanLion (Python, 4 stars)
19. mt-datasets (Jupyter Notebook, 4 stars): Collecting bi-/tri-lingual sources for MT workstream
20. thwiki-text (Python, 4 stars)
21. fake_reviews (Jupyter Notebook, 3 stars): Generate fake Amazon review datasets for VISTEC-depa machine translation project using CTRL
22. thai2nmt_preprocess (Python, 3 stars)
23. pdf2parallel (Python, 2 stars): Extract en-th parallel sentences from PDFs
24. paracrawl-en-th (Jupyter Notebook, 2 stars): Replicate paracrawl for en-th parallel texts
25. ai-builders-orientation (Jupyter Notebook, 1 star): Lesson 0 - Orientation
26. ai2api (Jupyter Notebook, 1 star): Productionize NLP models trained on Pytorch by AIResearch.in.th
27. ner_workshop (1 star)
28. scb_workshop (1 star)