• Stars: 8
• Rank: 2,099,232 (Top 42 %)
• Language: Python
• License: MIT License
• Created about 10 years ago
• Updated over 2 years ago

Repository Details

Trained taggers, tokenizers, etc. for the CLTK

More Repositories

1

cltk

The Classical Language Toolkit
Python
836
star
2

tutorials

Tutorials for the CLTK
Jupyter Notebook
52
star
3

cltk_frontend

Reading environment connecting to API from cltk/cltk_api repo
CSS
21
star
4

greek_lexica_perseus

Lexica and lemmata for the Ancient Greek language, from various sources
Python
19
star
5

sanskrit_text_gitasupersite

Sanskrit monolingual corpus
Python
17
star
6

lat_text_latin_library

Collected files from thelatinlibrary.com
Python
17
star
7

sanskrit_text_dcs

Sanskrit Corpus
15
star
8

cltk_api

RESTful API for the CLTK
Python
13
star
9

annotations

A tool for annotating texts using Draft.js
JavaScript
13
star
10

greek_treebank_perseus

Greek treebank from the Perseus Digital Library
Python
12
star
11

latin_pos_lemmata_cltk

Python
11
star
12

grc_models_cltk

Trained taggers, tokenizers, etc. for the CLTK
Python
9
star
13

sanskrit_parallel_sacred_texts

This repository contains parallel Sanskrit and English documents.
Python
9
star
14

grc_text_perseus

Collected Greek files from the Perseus Digital Library
Python
9
star
15

lat_text_perseus

Collected Latin files from the Perseus Digital Library
Python
8
star
16

sanskrit_parallel_gitasupersite

Parallel corpus
Python
7
star
17

grc_software_tlgu

Utility for converting TLG & PHI corpora to Unicode
C
7
star
18

latin_proper_names_cltk

A list of ~40K Classical Latin proper names
Python
7
star
19

sanskrit_text_wikisource

Python
6
star
20

cltk_docker

Docker script for cltk
Python
6
star
21

lat_text_tesserae

Plaintext files with Latin texts from the Tesserae Project
HTML
6
star
22

marathi_text_wikisource

Python
6
star
23

grc_text_tesserae

Plaintext files with Ancient Greek texts from the Tesserae Project
Jupyter Notebook
5
star
24

sanskrit_text_jnu

Sanskrit Corpora
5
star
25

greek_training_set_sentence_cltk

Training sets and tokenizer for the Classical Greek language, for use with CLTK
Python
5
star
26

telugu_text_wikisource

Classical Telugu texts from Wikisource
Python
5
star
27

latin_lexica_perseus

Lexica and lemmata for the Latin language, from various sources
Python
5
star
28

lapos

Fork of the Lookahead Part-Of-Speech (Lapos) Tagger
C++
5
star
29

ang_models_cltk

Python
4
star
30

greek_proper_names_cltk

A list of ~144K Classical Greek proper names
Python
4
star
31

sanskrit_text_sacred_texts

Sanskrit texts from sacred-texts.com
Python
4
star
32

punjabi_text_gurban

Punjabi Files of Gurbani
Python
4
star
33

greek_word2vec_cltk

Greek Word2Vec models
4
star
34

bengali_text_wikisource

Python
3
star
35

sanskrit_text_sanskrit_documents

Python
3
star
36

english_texts_wikisource

3
star
37

pali_text_ptr_tipitaka

Pali Tipitaka packaged with the Digital Pali Reader
JavaScript
3
star
38

latin_training_set_sentence_cltk

Training sets and tokenizer for the Latin language, for use with CLTK
Python
3
star
39

latin_treebank_perseus

Latin treebank from the Perseus Digital Library
Python
3
star
40

middle_english_text_cmepv

Texts from Corpus of Middle English Prose and Verse
Perl
2
star
41

arabic_morphology_quranic-corpus

2
star
42

old_norse_texts_heimskringla

Texts retrieved from Heimskringla.no for easy use with cltk!
HTML
2
star
43

latin_word2vec_cltk

Latin Word2Vec models
2
star
44

greek_pos_edit_xenophon_anabasis

A human-editable version of a POS-tagged text of Xenophon's Anabasis
Python
2
star
45

old-norse-lemmatizer

Jupyter Notebook
2
star
46

sanskrit_pos_jnu_tagged

2
star
47

old_norse_text_perseus

Python
2
star
48

old_english_text_sacred_texts

HTML
2
star
49

old_norse_runes_corpus

Python
2
star
50

latin_text_corpus_grammaticorum_latinorum

Collected Latin Data from Corpus Grammaticorum Latinorum
2
star
51

alatinparser

ALP (A Latin Parser) is a syntactic parser for a small subset of classical Latin.
Prolog
2
star
52

hindi_text_ltrc

Corpus of raw text for Classical Hindi
HTML
2
star
53

cltk.github.io

Static website for CLTK organization, built with Jekyll
SCSS
1
star
54

enm_models_cltk

Models for Middle English provided by CLTK
1
star
55

germanic_models_cltk

Python
1
star
56

cltk_grc_liddell_scott_intermediate

1
star
57

cltk_api_v2

Python
1
star
58

sql_db_quranic

This database contains the Holy Quran
PLpgSQL
1
star
59

latin_text_poeti_ditalia

Corpus for Italian Poetry in Latin
HTML
1
star
60

gml_models_cltk

1
star
61

tibetan_pos_tdc

POS tagged corpora from Tibetan in Digital Communication
1
star
62

cltkv1

Experimental repo for the new CLTK API
Python
1
star
63

pali_texts_gretil

Python
1
star
64

latin_text_lacus_curtius

Collected Latin files from LacusCurtius
HTML
1
star
65

san_models_cltk

Trained taggers, tokenizers, etc. for the CLTK
1
star
66

gmh_models_cltk

Stored data for tagging Middle High German
Python
1
star
67

arabic_text_perseus

Corpus for Classical Arabic
1
star
68

french_text_wikisource

Collected texts from wikisource.org
1
star
69

old_church_slavonic_ccmh

Python
1
star
70

cltk_community_api

JavaScript
1
star
71

chinese_text_cbeta_02

Chinese Buddhist scriptures from CBETA
Python
1
star
72

escriptorium-deploy

Scripts to deploy the eScriptorium OCR system
Shell
1
star
73

greek_software_tlgu_python

A Python wrapper for greek_software_tlgu
C
1
star
74

latin_text_antique_digiliblt

Antique Latin Corpus from digilibLT
1
star
75

prakrit_texts_gretil

HTML
1
star
76

capitains_text_corpora

Processed docs from capitains_corpora_converter
1
star
77

texts_server

Ruby
1
star