  • Stars: 1
  • Language: Python
  • Created about 5 years ago
  • Updated over 4 years ago

Repository Details

Stored data for tagging Middle High German

More Repositories

1

cltk

The Classical Language Toolkit
Python
836
star
2

tutorials

Tutorials for the CLTK
Jupyter Notebook
52
star
3

cltk_frontend

Reading environment connecting to API from cltk/cltk_api repo
CSS
21
star
4

greek_lexica_perseus

Lexica and lemmata for the Ancient Greek language, from various sources
Python
19
star
5

sanskrit_text_gitasupersite

Sanskrit monolingual corpus
Python
17
star
6

lat_text_latin_library

Collected files from thelatinlibrary.com
Python
17
star
7

sanskrit_text_dcs

Sanskrit Corpus
15
star
8

cltk_api

RESTful API for the CLTK
Python
13
star
9

annotations

A tool for annotating texts using Draft.js
JavaScript
13
star
10

greek_treebank_perseus

Greek treebank from the Perseus Digital Library
Python
12
star
11

latin_pos_lemmata_cltk

Python
11
star
12

grc_models_cltk

Trained taggers, tokenizers, etc. for the CLTK
Python
9
star
13

sanskrit_parallel_sacred_texts

This repository contains parallel Sanskrit and English documents.
Python
9
star
14

grc_text_perseus

Collected Greek files from the Perseus Digital Library
Python
9
star
15

lat_text_perseus

Collected Latin files from the Perseus Digital Library
Python
8
star
16

lat_models_cltk

Trained taggers, tokenizers, etc. for the CLTK
Python
8
star
17

sanskrit_parallel_gitasupersite

Parallel corpus
Python
7
star
18

grc_software_tlgu

Utility for converting TLG & PHI corpora to Unicode
C
7
star
19

latin_proper_names_cltk

A list of ~40K Classical Latin proper names
Python
7
star
20

sanskrit_text_wikisource

Python
6
star
21

cltk_docker

Docker script for the CLTK
Python
6
star
22

lat_text_tesserae

Plaintext files with Latin texts from the Tesserae Project
HTML
6
star
23

marathi_text_wikisource

Python
6
star
24

grc_text_tesserae

Plaintext files with Ancient Greek texts from the Tesserae Project
Jupyter Notebook
5
star
25

sanskrit_text_jnu

Sanskrit Corpora
5
star
26

greek_training_set_sentence_cltk

Training sets and tokenizer for the Classical Greek language, for use with CLTK
Python
5
star
27

telugu_text_wikisource

Classical Telugu texts from Wikisource
Python
5
star
28

latin_lexica_perseus

Lexica and lemmata for the Latin language, from various sources
Python
5
star
29

lapos

Fork of the Lookahead Part-Of-Speech (Lapos) Tagger
C++
5
star
30

ang_models_cltk

Python
4
star
31

greek_proper_names_cltk

A list of ~144K Classical Greek proper names
Python
4
star
32

sanskrit_text_sacred_texts

Sanskrit texts from sacred-texts.com
Python
4
star
33

punjabi_text_gurban

Punjabi Files of Gurbani
Python
4
star
34

greek_word2vec_cltk

Greek Word2Vec models
4
star
35

bengali_text_wikisource

Python
3
star
36

sanskrit_text_sanskrit_documents

Python
3
star
37

english_texts_wikisource

3
star
38

pali_text_ptr_tipitaka

Pali Tipitaka packaged with the Digital Pali Reader
JavaScript
3
star
39

latin_training_set_sentence_cltk

Training sets and tokenizer for the Latin language, for use with CLTK
Python
3
star
40

latin_treebank_perseus

Latin treebank from the Perseus Digital Library
Python
3
star
41

middle_english_text_cmepv

Texts from Corpus of Middle English Prose and Verse
Perl
2
star
42

arabic_morphology_quranic-corpus

2
star
43

old_norse_texts_heimskringla

Texts retrieved from Heimskringla.no for easy use with cltk!
HTML
2
star
44

latin_word2vec_cltk

Latin Word2Vec models
2
star
45

greek_pos_edit_xenophon_anabasis

A human-editable version of a POS-tagged text of Xenophon's Anabasis
Python
2
star
46

old-norse-lemmatizer

Jupyter Notebook
2
star
47

sanskrit_pos_jnu_tagged

2
star
48

old_norse_text_perseus

Python
2
star
49

old_english_text_sacred_texts

HTML
2
star
50

old_norse_runes_corpus

Python
2
star
51

latin_text_corpus_grammaticorum_latinorum

Collected Latin Data from Corpus Grammaticorum Latinorum
2
star
52

alatinparser

ALP (A Latin Parser) is a syntactic parser for a small subset of classical Latin.
Prolog
2
star
53

hindi_text_ltrc

Corpus of raw text for Classical Hindi
HTML
2
star
54

cltk.github.io

Static website for CLTK organization, built with Jekyll
SCSS
1
star
55

enm_models_cltk

Models for Middle English provided by CLTK
1
star
56

germanic_models_cltk

Python
1
star
57

cltk_grc_liddell_scott_intermediate

1
star
58

cltk_api_v2

Python
1
star
59

sql_db_quranic

This database contains the Holy Quran
PLpgSQL
1
star
60

latin_text_poeti_ditalia

Corpus for Italian Poetry in Latin
HTML
1
star
61

gml_models_cltk

1
star
62

tibetan_pos_tdc

POS tagged corpora from Tibetan in Digital Communication
1
star
63

cltkv1

Experimental repo for new API CLTK
Python
1
star
64

pali_texts_gretil

Python
1
star
65

latin_text_lacus_curtius

Collected Latin files from LacusCurtius
HTML
1
star
66

san_models_cltk

Trained taggers, tokenizers, etc. for the CLTK
1
star
67

arabic_text_perseus

Corpus for Classical Arabic
1
star
68

french_text_wikisource

Collected texts from wikisource.org
1
star
69

old_church_slavonic_ccmh

Python
1
star
70

cltk_community_api

JavaScript
1
star
71

chinese_text_cbeta_02

Chinese Buddhist scriptures from CBETA
Python
1
star
72

escriptorium-deploy

Scripts to deploy the eScriptorium OCR system
Shell
1
star
73

greek_software_tlgu_python

A Python wrapper for greek_software_tlgu
C
1
star
74

latin_text_antique_digiliblt

Antique Latin Corpus from digilibLT
1
star
75

prakrit_texts_gretil

HTML
1
star
76

capitains_text_corpora

Processed docs from capitains_corpora_converter
1
star
77

texts_server

Ruby
1
star