• Stars: 4
  • Rank 3,303,837 (Top 66 %)
  • Language
  • License: Other
  • Created about 9 years ago
  • Updated 12 months ago

More Repositories

1

docs

Universal Dependencies online documentation
HTML
247
star
2

tools

Various utilities for processing the data.
Python
192
star
3

UD_English-EWT

English data
Python
188
star
4

UD_Chinese-GSD

86
star
5

UD_Portuguese-Bosque

Universal Dependencies (UD) Portuguese treebank.
Common Lisp
49
star
6

UD_Indonesian-GSD

Indonesian conversion
42
star
7

UD_Turkish-IMST

38
star
8

universaldependencies.github.io

Universal Dependencies homepage
HTML
36
star
9

UD_Chinese-GSDSimp

Conversion of UD_Chinese-GSD to simplified Chinese characters.
35
star
10

UD_Japanese-GSD

Japanese data from the Google UDT 2.0.
34
star
11

UD_Vietnamese-VTB

34
star
12

UD_English-GUM

28
star
13

UD_Persian-Seraji

UD_Persian
26
star
14

UD_Ukrainian-IU

26
star
15

UD_Cantonese-HK

Spoken Cantonese from Hong Kong.
24
star
16

UD_Classical_Chinese-Kyoto

24
star
17

UD_Spanish-AnCora

Spanish data from the AnCora corpus.
23
star
18

UD_French-GSD

22
star
19

UD_Korean-GSD

Korean UD Treebank.
22
star
20

UD_English-ESL

English as a Second Language
Python
21
star
21

UD_Hindi-HDTB

21
star
22

UD_Portuguese-GSD

Brazilian Portuguese data from the Google Universal Dependency Treebanks 2.0.
Python
20
star
23

UD_Japanese-BCCWJ

Python
20
star
24

UD_Greek-GDT

UD Greek
18
star
25

UD_Korean-Kaist

Data from KAIST (a Korean treebank).
18
star
26

UD_Italian-ISDT

17
star
27

UD_Romanian-RRT

17
star
28

UD_Norwegian-Bokmaal

16
star
29

UD_Turkish-BOUN

16
star
30

UD_German-GSD

15
star
31

UD_Thai-PUD

Parallel Universal Dependencies.
14
star
32

UD_Russian-GSD

Shell
14
star
33

UD_Chinese-CFL

Chinese as a foreign language.
14
star
34

UD_Swedish-Talbanken

Swedish data
Python
12
star
35

UD_Russian-Taiga

11
star
36

UD_Bulgarian-BTB

Perl
11
star
37

UD_Spanish-GSD

11
star
38

UD_Polish-PDB

Polish data.
10
star
39

UD_Armenian-ArmTDP

Armenian data.
9
star
40

UD_German-HDT

9
star
41

UD_Hebrew-HTB

Hebrew Universal Dependencies Treebank
9
star
42

UD_Indonesian-PUD

Parallel Universal Dependencies.
9
star
43

UD_Hebrew-IAHLTwiki

9
star
44

UD_Sanskrit-UFAL

Sanskrit data.
8
star
45

UD_Chinese-HK

Spoken Mandarin Chinese from Hong Kong.
8
star
46

UD_Dutch-Alpino

Dutch data.
8
star
47

UD_Tamil-TTB

Tamil data.
8
star
48

UD_French-Sequoia

Data from the Sequoia treebank.
8
star
49

UD_English-PUD

Parallel Universal Dependencies.
8
star
50

UD_Chinese-PUD

Parallel Universal Dependencies.
7
star
51

UD_Danish-DDT

7
star
52

UD_Finnish-TDT

Finnish data
7
star
53

cairo

Cairo CICLing Corpus – a multi-lingual parallel UD-style treebank of short sentences
Perl
7
star
54

UD_Nheengatu-CompLin

6
star
55

UD_English-Pronouns

6
star
56

UD_Irish-IDT

Irish data
Shell
6
star
57

UD_Amharic-ATT

Python
6
star
58

UD_Galician-TreeGal

6
star
59

UD_Polish-LFG

6
star
60

UD_Icelandic-Modern

Python
6
star
61

UD_Urdu-UDTB

6
star
62

UD_Portuguese-PUD

Parallel Universal Dependencies.
6
star
63

UD_Telugu-MTG

Telugu data.
6
star
64

UD_Pomak-Philotis

6
star
65

UD_Norwegian-NynorskLIA

5
star
66

UD_Latin-Perseus

5
star
67

UD_Turkish-Kenet

5
star
68

UD_Hindi_English-HIENCS

Python
5
star
69

UD_Uyghur-UDT

Uyghur data.
Perl
5
star
70

UD_Turkish-PUD

Parallel Universal Dependencies.
5
star
71

UD_Kazakh-KTB

5
star
72

UD_Estonian-EDT

Estonian data
5
star
73

UD_Latin-ITTB

Latin data from the Index Thomisticus Treebank.
5
star
74

UD_Welsh-CCG

4
star
75

UD_Latin-PROIEL

Latin data from the PROIEL treebank.
4
star
76

UD_Lithuanian-ALKSNIS

Lithuanian data from the Alksnis treebank.
4
star
77

UD_Catalan-AnCora

Catalan data from the AnCora corpus.
4
star
78

UD_Tagalog-TRG

Shell
4
star
79

UD_Wolof-WTB

4
star
80

UD_Kurmanji-MG

Northern Kurdish data.
4
star
81

UD_Yoruba-YTB

Shell
4
star
82

UD_Japanese-PUD

Parallel Universal Dependencies.
4
star
83

UD_Romanian-Nonstandard

4
star
84

UD_Serbian-SET

Serbian data.
4
star
85

UD_Old_Church_Slavonic-PROIEL

Old Church Slavonic data from the PROIEL project.
4
star
86

UD_Hungarian-Szeged

Hungarian data
Shell
4
star
87

UD_Dutch-LassySmall

Wikipedia sample from the Lassy Small treebank.
4
star
88

UD_French-FTB

Data from the French Treebank.
4
star
89

UD_English-ParTUT

English part of the ParTUT parallel treebank.
4
star
90

UD_Japanese-KTC

Perl 6
3
star
91

UD_Karelian-KKPP

3
star
92

UD_Japanese-GSDLUW

Long-unit-word version of UD_Japanese-GSD
3
star
93

UD_Faroese-FarPaHC

Python
3
star
94

UD_Japanese-Modern

3
star
95

UD_Tupinamba-TuDeT

3
star
96

UD_Norwegian-Nynorsk

Nynorsk version of the Norwegian Dependency Treebank.
3
star
97

UD_Icelandic-IcePaHC

Python
3
star
98

UD_Latin-LLCT

3
star
99

UD_Javanese-CSUI

3
star
100

UD_Icelandic-PUD

3
star