CAMeL Lab (@CAMeL-Lab)

Top repositories

1

camel_tools

A suite of Arabic natural language processing tools developed by the CAMeL Lab at New York University Abu Dhabi.
Python
404
star
2

CAMeLBERT

Code and models for "The Interplay of Variant, Size, and Task Type in Arabic Pre-trained Language Models". EACL 2021, WANLP.
Python
43
star
3

Arabic_ALA-LC_Romanization

Romanizing Arabic bibliographic records in the ALA-LC standard.
Jupyter Notebook
17
star
4

arabic-gec

Code, models, and data for "Advancements in Arabic Grammatical Error Detection and Correction: An Empirical Investigation". EMNLP 2023.
Python
13
star
5

WIDH_2020_Arabic_Text_Analysis

Material for the Text Analysis of Arabic course taught at the NYU Abu Dhabi Winter Institute in Digital Humanities 2020.
Jupyter Notebook
12
star
6

samer-add-on

HTML
9
star
7

arabic_error_type_annotation

The Arabic Error Type Annotation tool aims to annotate Arabic error types following the ALC tagset annotation.
Python
9
star
8

palmyra

JavaScript
8
star
9

Gumar-Ngrams

The complete [1 to 5]-gram Gumar Corpus in the style of Google n-grams.
8
star
10

arafix_ocr

A tool for improving the output of generic Arabic OCR systems using an n-gram based post-correction approach.
HTML
7
star
11

camel_parser

Python
7
star
12

camel_morph

Camel Morph’s goal is to build large open-source morphological models for Arabic and its dialects across many genres and domains.
Python
5
star
13

deSeg

Unsupervised, De-lexical, Linguistic Segmentation
Python
5
star
14

gender-reinflection

Code, models, and data for "Gender-Aware Reinflection using Linguistically Enhanced Neural Models". COLING 2020, GeBNLP.
Python
5
star
15

ced_word_alignment

A character edit distance based word aligner.
Python
4
star
16

camel-tools-data

Repo containing data packages and catalogues used by CAMeL Tools.
4
star
17

Camel_Arabic_Frequency_Lists

The repository for the CAMeL Arabic Frequency Lists dataset
3
star
18

TOIA-2.0

Jupyter Notebook
2
star
19

muddler

The Muddler derived-file sharing utility.
Python
2
star
20

samer-arabic-readability

Code, models, and data for "Strategies for Arabic Readability Modelling". ArabicNLP 2024, ACL.
Python
2
star
21

gender-rewriting

Code, models, and data for "User-Centric Gender Rewriting". NAACL 2022.
Python
2
star
22

seq2seq-transliteration-tool

Python
2
star
23

CAMeLBERT_morphosyntactic_tagger

Code, models, and data for "Morphosyntactic Tagging with Pre-trained Language Models for Arabic and its Dialects". Findings of ACL, 2022.
Python
2
star
24

maknuune_lexicon

TeX
2
star
25

camel-guidelines

TeX
1
star
26

HierarchicalArabicDialectID

Jupyter Notebook
1
star
27

Arabic-ATB-closed-class-list

A Modern Standard Arabic Closed-Class Word List
1
star
28

wild_diacritics

Wild Diacritics paper repo.
Python
1
star
29

qalb

Code for "Utilizing Character and Word Embeddings for Text Normalization with Sequence-to-Sequence Models"
Python
1
star
30

gender-rewriting-shared-task

Evaluation code and data for the gender rewriting shared task
Python
1
star