CAMeL Lab (@CAMeL-Lab)

Top repositories

1

camel_tools

A suite of Arabic natural language processing tools developed by the CAMeL Lab at New York University Abu Dhabi.
Python
376
star
2

CAMeLBERT

Code and models for "The Interplay of Variant, Size, and Task Type in Arabic Pre-trained Language Models". EACL 2021, WANLP.
Python
40
star
3

Arabic_ALA-LC_Romanization

Romanizing Arabic bibliographic records in the ALA-LC standard.
Jupyter Notebook
13
star
4

WIDH_2020_Arabic_Text_Analysis

Material for the Text Analysis of Arabic course taught at the NYU Abu Dhabi Winter Institute in Digital Humanities 2020.
Jupyter Notebook
12
star
5

samer-add-on

HTML
9
star
6

arabic_error_type_annotation

The Arabic Error Type Annotation tool annotates Arabic error types following the ALC tagset.
Python
9
star
7

palmyra

HTML
8
star
8

Gumar-Ngrams

The complete [1 to 5]-gram Gumar Corpus in the style of Google n-grams.
8
star
9

arabic-gec

Code, models, and data for "Advancements in Arabic Grammatical Error Detection and Correction: An Empirical Investigation". EMNLP 2023.
Python
7
star
10

camel_parser

Python
6
star
11

camel_morph

Camel Morph’s goal is to build large open-source morphological models for Arabic and its dialects across many genres and domains.
Python
5
star
12

deSeg

Unsupervised, De-lexical, Linguistic Segmentation
Python
5
star
13

gender-reinflection

Code, models, and data for "Gender-Aware Reinflection using Linguistically Enhanced Neural Models". COLING 2020, GeBNLP.
Python
5
star
14

ced_word_alignment

A character edit distance based word aligner.
Python
4
star
15

camel-tools-data

Repo containing data packages and catalogues used by CAMeL Tools.
4
star
16

arafix_ocr

A tool for improving the output of generic Arabic OCR systems using an n-gram based post-correction approach.
HTML
4
star
17

TOIA-2.0

Jupyter Notebook
2
star
18

muddler

The Muddler derived-file sharing utility.
Python
2
star
19

gender-rewriting

Code, models, and data for "User-Centric Gender Rewriting". NAACL 2022.
Python
2
star
20

seq2seq-transliteration-tool

Python
2
star
21

maknuune_lexicon

TeX
2
star
22

CAMeLBERT_morphosyntactic_tagger

Code, models, and data for "Morphosyntactic Tagging with Pre-trained Language Models for Arabic and its Dialects". Findings of ACL, 2022.
Python
2
star
23

camel-guidelines

TeX
1
star
24

HierarchicalArabicDialectID

Jupyter Notebook
1
star
25

Arabic-ATB-closed-class-list

A Modern Standard Arabic Closed-Class Word List
1
star
26

qalb

Code for "Utilizing Character and Word Embeddings for Text Normalization with Sequence-to-Sequence Models"
Python
1
star
27

gender-rewriting-shared-task

Evaluation code and data for the gender rewriting shared task
Python
1
star