UBC Deep Learning & NLP Lab (@UBC-NLP)

Top repositories

1

marbert

UBC ARBERT and MARBERT Deep Bidirectional Transformers for Arabic
98
star
2

araT5

AraT5: Text-to-Text Transformers for Arabic Language Understanding
78
star
3

turjuman

TURJUMAN, a neural toolkit for translating from 20 languages into Modern Standard Arabic (MSA).
Python
51
star
4

dl-nlp-rg

Deep Learning for Natural Language Processing Reading Group | University of British Columbia (UBC)
Jupyter Notebook
39
star
5

deeplearning-nlp2018

UBC Deep Learning for Natural Language Processing Course
Jupyter Notebook
38
star
6

aoc_id

Arabic Dialect Identification on AOC data.
Python
22
star
7

afrolid

AfroLID, a neural toolkit for African language identification that covers 517 African languages.
Python
22
star
8

AraNet

Python
20
star
9

dlnlp2019

UBC Deep Learning for Natural Language Processing Course (2019)
16
star
10

peacock

This is the official repository for Peacock: A Family of Arabic Multimodal Large Language Models and Benchmarks.
16
star
11

megacov

Mega-COV: A Billion-Scale Dataset of 100+ Languages for COVID-19
13
star
12

EmoNet

Python
11
star
13

ara_emotion_naacl2018

This repository provides our datasets for Arabic emotion detection in Twitter
9
star
14

microdialects

Documenting work on micro-dialects
Jupyter Notebook
8
star
15

IndT5

IndT5: A Text-to-Text Transformer for 10 Indigenous Languages
8
star
16

wanlp2020_arabic_fake_news_detection

Machine Generation and Detection of Arabic Manipulated and Fake News
8
star
17

dlr

Deep Learning Research (The University of British Columbia)
7
star
18

orca

ORCA is a large-scale Arabic Language Understanding Evaluation Benchmark
Python
7
star
19

serengeti

SERENGETI: Massively Multilingual Language Models for Africa
Jupyter Notebook
7
star
20

python2021

6
star
21

python2020

Jupyter Notebook
6
star
22

nadi

Nuanced Arabic Dialect Identification Shared Tasks (NADI) 2020 and 2021
Python
5
star
23

DL2022

Trends in Deep Learning Seminar at UBC
Jupyter Notebook
5
star
24

dialex

DiaLex - A Benchmark for Evaluating Multidialectal Arabic Word Embeddings
Jupyter Notebook
4
star
25

africaNLP2021

3
star
26

L2ASR

Python
3
star
27

LMBERT

Python
3
star
28

octopus

Octopus is a neural machine generation toolkit for Arabic Natural Language Generation (NLG)
Python
3
star
29

itrustai-tutorials

Jupyter Notebook
2
star
30

coling2020_machine_generated_text

Automatic Detection of Machine Generated Text: A Critical Survey
2
star
31

SPARROW

EMNLP 2023
2
star
32

araStance

1
star
33

MDS-CL

JavaScript
1
star
34

OCR

Topics related to OCR
HTML
1
star
35

infodcl

Python
1
star