whisper_streaming
Whisper realtime streaming for long speech-to-text transcription and translation
neuralmonkey
An open-source tool for sequence learning in NLP built on TensorFlow.
udpipe
UDPipe: Trainable pipeline for tokenizing, tagging, lemmatizing and parsing Universal Treebanks and other CoNLL-U files
acl2019_nested_ner
Source code for paper Neural Architectures for Nested NER through Linearization
unilib
Embeddable C++17 Unicode library offering UTF encodings, general category info, simple and full casing, normalization forms, and combining marks stripping.
morphodita
MorphoDiTa: Morphologic Dictionary and Tagger
public-license-selector
Tool that will help you select the right open license for your data or software
perin
PERIN is Permutation-Invariant Semantic Parser developed for MRP 2020
nametag
NameTag: Named Entity Tagger
mtmonkey
Distributed infrastructure for Machine Translation web services (using Moses, Python, JSON-RPC/web interface)
treex
Treex NLP framework
npfl114
Materials for the Deep Learning -- ÚFAL course NPFL114
npfl129
NPFL129 repository
lindat-translation
Frontend of LINDAT translation service
factgenie
Lightweight self-hosted span annotation tool
augpt
DSTC9 Submission
korektor
Statistical spell- and (occasional) grammar-checker.
npfl117
Deep Learning Seminar -- ÚFAL course NPFL117
multilexnorm2021
MultiLexNorm 2021 competition system from ÚFAL
parsito
Parsito: Fast non-projective transition-based dependency parser
npfl122
NPFL122 repository
microrestd
MicroRestD is a small C++11 cross-platform REST server built on top of libmicrohttpd http://www.gnu.org/software/libmicrohttpd/.
low-resource-gec-wnut2019
Source code for paper Grammatical Error Correction in Low-Resource Scenarios (W-NUT 2019)
correctable-lecture-translator
A system for live lecture translation (speech to text) where the audience can easily provide corrections.
olimpic-icdar24
Practical End-to-End Optical Music Recognition for Pianoform Music
pytreex
A minimal Python implementation of the Treex API
linpipe
LinPipe: Multilingual Processing Tool
nlgi_eval
NLI evaluation for NLG
chu_liu_edmonds
Chu-Liu-Edmonds maximum spanning algorithm from TurboParser for use within Python
marian-tensorboard
a simple tool to parse marian training logs and display them in tensorboard
sigmorphon2019
UFAL-Prague entry to the Sigmorphon 2019 Shared Task 2
hamledt
Makefiles, scenarios and support scripts for the development of HamleDT within the Treex infrastructure
wnut2021_character_transformations_gec
The code from the paper Character Transformations for Non-Autoregressive GEC Tagging
lindat-repository-obsolete
LINDAT/CLARIN repository for linguistics (http://lindat.cz)
charles-translator-web-frontend
Charles Translator: MT from Charles University
clarin-sp-aaggregator
mrpipe-conll2019
ÚFAL MRPipe submission to CoNLL 2019 shared task
slimd
SliMD presentation system based on Markdown and HTML5&js.
universal-segmentations
Build scripts for the UniSegments collection of morphologically segmented lexicons for many languages
UFAL_poster
LaTeX repository for a poster design
bert-diacritics-restoration
Repository storing code and data for our paper "Diacritics Restoration using BERT with Analysis on Czech language".
MLASK
EACL 2023 paper "MLASK: Multimodal Summarization of Video-based News Articles"
evalatin2024-latinpipe
LatinPipe – the winning entry to parsing task of EvaLatin 2024
optimal-reference-translations
conll2017
CoNLL 2017 Shared Task Proposal: UD End-to-End parsing
wiki-error-corpus
Scripts for extracting errors from Wikipedia revisions
weighteddist
A tiny toolkit for weighted word/character edit distance, including cost estimation.
rg
ÚFAL Reading Group
thesis_info
ÚFAL Thesis Information Repository
perl-pmltq
Query engine and query language for trees in PML format
rh_nntagging
Reading Hackathon -- NN Tagging Project
perl-pmltq-server
Refactored and simplified PMLTQ::CGI
pcedt2.0-coref
Coreference extension to Prague Czech-English Dependency Treebank 2.0
kazitext
corefud-scorer
Coreference and anaphora scorer for CorefUD data
quickjudge
A handy tool for quick manual evaluation of line-oriented outputs, e.g. of machine translation.
teitok-tools
Conversion tools to and from the TEITOK TEI/XML format
conll2018
CoNLL 2018 UD Shared Task
charles-translator-android
Android app of LINDAT translation service
crac2023-corpipe
ÚFAL CorPipe: CRAC 2023 Winning System for Multilingual Coreference Resolution
qtleap
QTLeap Pilot MT systems using TectoMT
PDT-C
Consolidated Czech PDT-style annotated corpus; consists of PDT, Czech part of PCEDT, PDTSC, PDT-Faust
lindat-corpora-conversions
LINDAT Corpora Conversions
lindat-aai-attributes
Parse shibboleth logs for important information about attributes from IdPs and other
ufal-tools
deltacorpus
Delexicalized tagging and parsing.
js-treex-view
Javascript library for visualizing Treex files
phd-thesis-template
A template PhD thesis at UFAL
cpp_builtem
C++ Builtem is a cross-platform Makefile-based build system for C++11
ambiguity-grammaticality-complexity
Code for the paper Sentence Ambiguity, Grammaticality and Complexity Probes
lindat-common
Common files and branding for Lindat projects
crac2022-corpipe
ÚFAL CorPipe: CRAC 2022 Winning System for Multilingual Coreference Resolution
lindat_piwik_reports
Caching important counts from PIWIK periodically and creating customized reports for LINDAT/CLARIN
eyetracked-multi-modal-translation
EMMT (Eyetracked Multi-Modal Translation), a simultaneous eye-tracking, 4-electrode EEG and audio corpus for multi-modal reading and translation scenarios
uk-cs-data-scripts
Scripts for processing data for Czech-Ukrainian MT
errant_czech
UFAL_MT_service
nametag3
NameTag3: Named Entity Tagger
mrptask
lindat-aai-discovery
pyclarindspace
Python package using clarin-dspace API
ParCzech
ParCzech is a project on compiling Czech parliamentary data into annotated corpora.
theaitrobot
THEaiTRE bot
auto-hume
Semantic MT metric trained on HUME annotations
npfl101
Repository of the seminar NPFL101 Competing in Machine Translation.
bilingual-abstracts-corpus
Bilingual corpus of scientific abstracts from ÚFAL Charles University publications.
continuous-rating
tamiltb
nmt-pe-effects-2021
Experiment relating NMT quality and post-editing efforts
MTEQA
cpp_utils
UFAL C++ Utils
europarlmin
Corpus of European Parliament debates organized as a corpus for meeting summarization, i.e. matching full transcripts and minutes from the sessions. Used in the shared task of AutoMin 2023.
pmltq-cgi
PMLTQ::CGI has been removed from PMLTQ module in order to decrease number of dependencies. It should be installed separately.
SynSemClassSearch
ker
Simple Czech and English keyword extractor
npfl087
NPFL087 Statistical Machine Translation
diaser
treex-web
Online interface for Treex
wembedding_service
TF2 service for word embeddings computation
NPFL095
web of the course "Modern Methods in Computational Linguistics"