• Stars
    2
  • Language
    Python
  • License
    MIT License
  • Created about 2 years ago
  • Updated almost 2 years ago

Repository Details

Code, models, and data for "Morphosyntactic Tagging with Pre-trained Language Models for Arabic and its Dialects". Findings of ACL, 2022.

More Repositories

1

camel_tools

A suite of Arabic natural language processing tools developed by the CAMeL Lab at New York University Abu Dhabi.
Python
376
star
2

CAMeLBERT

Code and models for "The Interplay of Variant, Size, and Task Type in Arabic Pre-trained Language Models". EACL 2021, WANLP.
Python
40
star
3

Arabic_ALA-LC_Romanization

Romanizing Arabic bibliographic records in the ALA-LC standard.
Jupyter Notebook
13
star
4

WIDH_2020_Arabic_Text_Analysis

Material for the Text Analysis of Arabic course taught at the NYU Abu Dhabi Winter Institute in Digital Humanities 2020.
Jupyter Notebook
12
star
5

samer-add-on

HTML
9
star
6

arabic_error_type_annotation

The Arabic Error Type Annotation tool aims to annotate Arabic error types following the ALC tagset annotation.
Python
9
star
7

palmyra

HTML
8
star
8

Gumar-Ngrams

The complete [1 to 5]-gram Gumar Corpus in the style of Google n-grams.
8
star
9

arabic-gec

Code, models, and data for "Advancements in Arabic Grammatical Error Detection and Correction: An Empirical Investigation". EMNLP 2023.
Python
7
star
10

camel_parser

Python
6
star
11

camel_morph

Camel Morph’s goal is to build large open-source morphological models for Arabic and its dialects across many genres and domains.
Python
5
star
12

deSeg

Unsupervised, De-lexical, Linguistic Segmentation
Python
5
star
13

gender-reinflection

Code, models, and data for "Gender-Aware Reinflection using Linguistically Enhanced Neural Models". COLING 2020, GeBNLP.
Python
5
star
14

ced_word_alignment

A character edit distance based word aligner.
Python
4
star
15

camel-tools-data

Repo containing data packages and catalogues used by CAMeL Tools.
4
star
16

arafix_ocr

A tool for improving the output of generic Arabic OCR systems using an n-gram based post-correction approach.
HTML
4
star
17

TOIA-2.0

Jupyter Notebook
2
star
18

muddler

The Muddler derived-file sharing utility.
Python
2
star
19

gender-rewriting

Code, models, and data for "User-Centric Gender Rewriting". NAACL 2022.
Python
2
star
20

seq2seq-transliteration-tool

Python
2
star
21

maknuune_lexicon

TeX
2
star
22

camel-guidelines

TeX
1
star
23

HierarchicalArabicDialectID

Jupyter Notebook
1
star
24

Arabic-ATB-closed-class-list

A Modern Standard Arabic Closed-Class Word List
1
star
25

qalb

Code for "Utilizing Character and Word Embeddings for Text Normalization with Sequence-to-Sequence Models"
Python
1
star
26

gender-rewriting-shared-task

Evaluation code and data for the gender rewriting shared task
Python
1
star