Classical Language Toolkit (@cltk)

Top repositories

1

cltk

The Classical Language Toolkit
Python
836
star
2

tutorials

Tutorials for the CLTK
Jupyter Notebook
52
star
3

cltk_frontend

Reading environment connecting to API from cltk/cltk_api repo
CSS
21
star
4

greek_lexica_perseus

Lexica and lemmata for the Ancient Greek language, from various sources
Python
19
star
5

sanskrit_text_gitasupersite

Sanskrit monolingual corpus
Python
17
star
6

lat_text_latin_library

Collected files from thelatinlibrary.com
Python
17
star
7

sanskrit_text_dcs

Sanskrit Corpus
15
star
8

cltk_api

RESTful API for the CLTK
Python
13
star
9

annotations

A tool for annotating texts using Draft.js
JavaScript
13
star
10

greek_treebank_perseus

Greek treebank from the Perseus Digital Library
Python
12
star
11

latin_pos_lemmata_cltk

Python
11
star
12

grc_models_cltk

Trained taggers, tokenizers, etc. for the CLTK
Python
9
star
13

sanskrit_parallel_sacred_texts

This repository contains parallel Sanskrit and English documents.
Python
9
star
14

grc_text_perseus

Collected Greek files from the Perseus Digital Library
Python
9
star
15

lat_text_perseus

Collected Latin files from the Perseus Digital Library
Python
8
star
16

lat_models_cltk

Trained taggers, tokenizers, etc. for the CLTK
Python
8
star
17

sanskrit_parallel_gitasupersite

Parallel corpus
Python
7
star
18

grc_software_tlgu

Utility for converting TLG & PHI corpora to Unicode
C
7
star
19

latin_proper_names_cltk

A list of ~40K Classical Latin proper names
Python
7
star
20

sanskrit_text_wikisource

Python
6
star
21

cltk_docker

Docker script for cltk
Python
6
star
22

lat_text_tesserae

Plaintext files with Latin texts from the Tesserae Project
HTML
6
star
23

marathi_text_wikisource

Python
6
star
24

grc_text_tesserae

Plaintext files with Ancient Greek texts from the Tesserae Project
Jupyter Notebook
5
star
25

sanskrit_text_jnu

Sanskrit Corpora
5
star
26

greek_training_set_sentence_cltk

Training sets and tokenizer for the Classical Greek language, for use with CLTK
Python
5
star
27

telugu_text_wikisource

Classical Telugu texts from Wikisource
Python
5
star
28

latin_lexica_perseus

Lexica and lemmata for the Latin language, from various sources
Python
5
star
29

lapos

Fork of the Lookahead Part-Of-Speech (Lapos) Tagger
C++
5
star
30

ang_models_cltk

Python
4
star
31

greek_proper_names_cltk

A list of ~144K Classical Greek proper names
Python
4
star
32

sanskrit_text_sacred_texts

Sanskrit texts from sacred-texts.com
Python
4
star
33

punjabi_text_gurban

Punjabi Files of Gurbani
Python
4
star
34

greek_word2vec_cltk

Greek Word2Vec models
4
star
35

bengali_text_wikisource

Python
3
star
36

sanskrit_text_sanskrit_documents

Python
3
star
37

english_texts_wikisource

3
star
38

pali_text_ptr_tipitaka

Pali Tipitaka packaged with the Digital Pali Reader
JavaScript
3
star
39

latin_training_set_sentence_cltk

Training sets and tokenizer for the Latin language, for use with CLTK
Python
3
star
40

latin_treebank_perseus

Latin treebank from the Perseus Digital Library
Python
3
star
41

middle_english_text_cmepv

Texts from Corpus of Middle English Prose and Verse
Perl
2
star
42

arabic_morphology_quranic-corpus

2
star
43

old_norse_texts_heimskringla

Texts retrieved from Heimskringla.no for easy use with cltk!
HTML
2
star
44

latin_word2vec_cltk

Latin Word2Vec models
2
star
45

greek_pos_edit_xenophon_anabasis

A human-editable version of a POS-tagged text of Xenophon's Anabasis
Python
2
star
46

old-norse-lemmatizer

Jupyter Notebook
2
star
47

sanskrit_pos_jnu_tagged

2
star
48

old_norse_text_perseus

Python
2
star
49

old_english_text_sacred_texts

HTML
2
star
50

old_norse_runes_corpus

Python
2
star
51

latin_text_corpus_grammaticorum_latinorum

Collected Latin data from Corpus Grammaticorum Latinorum
2
star
52

alatinparser

ALP (A Latin Parser) is a syntactic parser for a small subset of classical Latin.
Prolog
2
star
53

hindi_text_ltrc

Corpus of raw text for Classical Hindi
HTML
2
star
54

cltk.github.io

Static website for CLTK organization, built with Jekyll
SCSS
1
star
55

enm_models_cltk

Models for Middle English provided by CLTK
1
star
56

germanic_models_cltk

Python
1
star
57

cltk_grc_liddell_scott_intermediate

1
star
58

cltk_api_v2

Python
1
star
59

sql_db_quranic

This database contains the Holy Quran
PLpgSQL
1
star
60

latin_text_poeti_ditalia

Corpus of Italian poetry in Latin
HTML
1
star
61

gml_models_cltk

1
star
62

tibetan_pos_tdc

POS tagged corpora from Tibetan in Digital Communication
1
star
63

cltkv1

Experimental repo for new API CLTK
Python
1
star
64

pali_texts_gretil

Python
1
star
65

latin_text_lacus_curtius

Collected Latin files from LacusCurtius
HTML
1
star
66

san_models_cltk

Trained taggers, tokenizers, etc. for the CLTK
1
star
67

gmh_models_cltk

Stored data for tagging Middle High German
Python
1
star
68

arabic_text_perseus

Corpus for Classical Arabic
1
star
69

french_text_wikisource

Collected texts from wikisource.org
1
star
70

old_church_slavonic_ccmh

Python
1
star
71

cltk_community_api

JavaScript
1
star
72

chinese_text_cbeta_02

Chinese Buddhist scriptures from CBETA
Python
1
star
73

escriptorium-deploy

Scripts to deploy the eScriptorium OCR system
Shell
1
star
74

greek_software_tlgu_python

A Python wrapper for greek_software_tlgu
C
1
star
75

latin_text_antique_digiliblt

Antique Latin Corpus from digilibLT
1
star
76

prakrit_texts_gretil

HTML
1
star
77

capitains_text_corpora

Processed docs from capitains_corpora_converter
1
star
78

texts_server

Ruby
1
star