clarinsi/classla-spoken

Stars: 1
Language: Shell
Created about 2 years ago
Updated almost 2 years ago

clarinsi

reldi-tagger

A tagger and lemmatiser for Croatian, Serbian and Slovene.

geobert

csmtiser

A tool for text normalisation via character-level machine translation

tweetcat

TweetCaT - a tool for building Twitter corpora of smaller languages or specific geographical regions

reldi-lib

vejice

megahr-crossling

Predictions on concreteness and imageability of words in 77 languages

Slovene_ASR_e2e

Automatic Speech Recognition tool

reldi-tokeniser

A two-mode (standard, nonstandard) tokeniser for South Slavic languages

mte-msd

MULTEXT-East morphosyntactic specifications

janes-ner

NER system for South Slavic languages

redi

Diacritic restoration tool for Croatian, Serbian and Slovene

babushka-bench

Benchmarking NLP tools on Slovene, Croatian and Serbian

tweetgeo

A Tool for Collecting, Visualising and Inferring from Geo-encoded Linguistic Data

TEI-schema

Recommended TEI schema for CLARIN.SI resources, cf. also https://clarinsi.github.io/TEI-schema/

parlaspeech

Code for bootstrapping ASR datasets from parliamentary recordings and transcripts

Jupyter Notebook

reldi-api

Slovene_NMT

Neural Machine Translation tool

slovene_syllable_splitter

A rule-based syllable splitter for Slovene that takes an input word and returns a list of syllables in the word, e.g. predsedovati -> ['pred', 'se', 'do', 'va', 'ti']; decembrskega -> ['de', 'cem', 'brs', 'ke', 'ga'].

reldi-depparse

jos2ud

cordex

Obeliks4J

wikitalk-extractor

A corpus extractor for Wikipedia page and user talk pages

benchich

BENCHić - the benchmark for Bosnian, Croatian, Montenegrin, Serbian (and friends)

sb-abbr

NLP dataset of the Slovenian Biography

drevesnik

Web portal for searching and displaying syntactically annotated corpora