  • Stars: 6
  • Rank: 2,478,263 (Top 50 %)
  • Language: C
  • Created: about 6 years ago
  • Updated: about 2 years ago

Repository Details

Predictions on concreteness and imageability of words in 77 languages

More Repositories

1. reldi-tagger
   A tagger and lemmatiser for Croatian, Serbian and Slovene.
   Language: Python
   Stars: 32

2. csmtiser
   A tool for text normalisation via character-level machine translation
   Language: Python
   Stars: 13

3. tweetcat
   TweetCaT - a tool for building Twitter corpora of smaller languages or specific geographical regions
   Language: Python
   Stars: 12

4. geobert
   Language: Python
   Stars: 12

5. reldi-lib
   Language: Python
   Stars: 9

6. vejice
   Language: Python
   Stars: 6

7. Slovene_ASR_e2e
   Automatic Speech Recognition tool
   Language: Python
   Stars: 6

8. reldi-tokeniser
   A two-mode (standard, nonstandard) tokeniser for South Slavic languages
   Language: Python
   Stars: 5

9. mte-msd
   MULTEXT-East morphosyntactic specifications
   Language: HTML
   Stars: 5

10. janes-ner
    NER system for South Slavic languages
    Language: Python
    Stars: 4

11. redi
    Diacritic restoration tool for Croatian, Serbian and Slovene
    Language: Python
    Stars: 4

12. babushka-bench
    Benchmarking NLP tools on Slovene, Croatian and Serbian
    Language: Python
    Stars: 4

13. tweetgeo
    A Tool for Collecting, Visualising and Inferring from Geo-encoded Linguistic Data
    Language: Python
    Stars: 3

14. TEI-schema
    Recommended TEI schema for CLARIN.SI resources, cf. also https://clarinsi.github.io/TEI-schema/
    Language: XSLT
    Stars: 2

15. reldi-api
    Language: Python
    Stars: 2

16. parlaspeech
    Code for bootstrapping ASR datasets from parliamentary recordings and transcripts
    Language: Jupyter Notebook
    Stars: 2

17. Slovene_NMT
    Neural Machine Translation tool
    Language: Python
    Stars: 2

18. slovene_syllable_splitter
    A rule-based syllable splitter for Slovene that takes an input word and returns a list of syllables in the word, e.g. predsedovati -> ['pred', 'se', 'do', 'va', 'ti']; decembrskega -> ['de', 'cem', 'brs', 'ke', 'ga'].
    Language: Python
    Stars: 2

19. reldi-depparse
    Language: HTML
    Stars: 1

20. classla-spoken
    Language: Shell
    Stars: 1

21. jos2ud
    Language: Perl
    Stars: 1

22. cordex
    Language: Python
    Stars: 1

23. Obeliks4J
    Language: Java
    Stars: 1

24. wikitalk-extractor
    A corpus extractor from the Wikipedia page and user talk pages
    Language: Python
    Stars: 1

25. benchich
    BENCHić - the benchmark for Bosnian, Croatian, Montenegrin, Serbian (and friends)
    Language: Python
    Stars: 1

26. sb-abbr
    NLP dataset of the Slovenian Biography
    Language: XSLT
    Stars: 1

27. drevesnik
    Web portal for searching and displaying syntactically annotated corpora
    Language: JavaScript
    Stars: 1