whisper_streaming
Whisper realtime streaming for long speech-to-text transcription and translation

neuralmonkey
An open-source tool for sequence learning in NLP built on TensorFlow.

udpipe
UDPipe: Trainable pipeline for tokenizing, tagging, lemmatizing and parsing Universal Treebanks and other CoNLL-U files

acl2019_nested_ner
Source code for the paper Neural Architectures for Nested NER through Linearization

unilib
Embeddable C++17 Unicode library offering UTF encodings, general category info, simple and full casing, normalization forms, and combining marks stripping.

morphodita
MorphoDiTa: Morphologic Dictionary and Tagger

public-license-selector
Tool that will help you select the right open license for your data or software

perin
PERIN is a Permutation-Invariant Semantic Parser developed for MRP 2020

nametag
NameTag: Named Entity Tagger

mtmonkey
Distributed infrastructure for Machine Translation web services (using Moses, Python, JSON-RPC/web interface)

treex
Treex NLP framework

npfl114
Materials for the Deep Learning -- ÚFAL course NPFL114

npfl129
NPFL129 repository

lindat-translation
Frontend of LINDAT translation service

factgenie
Lightweight self-hosted span annotation tool

augpt
DSTC9 Submission

korektor
Statistical spell- and (occasional) grammar-checker.

npfl117
Deep Learning Seminar -- ÚFAL course NPFL117

multilexnorm2021
MultiLexNorm 2021 competition system from ÚFAL

parsito
Parsito: Fast non-projective transition-based dependency parser

npfl122
NPFL122 repository

microrestd
MicroRestD is a small C++11 cross-platform REST server built on top of libmicrohttpd http://www.gnu.org/software/libmicrohttpd/.

low-resource-gec-wnut2019
Source code for the paper Grammatical Error Correction in Low-Resource Scenarios (W-NUT 2019)

correctable-lecture-translator
A system for live lecture translation (speech to text) where the audience can easily provide corrections.

olimpic-icdar24
Practical End-to-End Optical Music Recognition for Pianoform Music

pytreex
A minimal Python implementation of the Treex API

linpipe
LinPipe: Multilingual Processing Tool

nlgi_eval
NLI evaluation for NLG

chu_liu_edmonds
Chu-Liu-Edmonds maximum spanning tree algorithm from TurboParser for use within Python

marian-tensorboard
A simple tool to parse Marian training logs and display them in TensorBoard

sigmorphon2019
UFAL-Prague entry to the Sigmorphon 2019 Shared Task 2

hamledt
Makefiles, scenarios and support scripts for the development of HamleDT within the Treex infrastructure

lindat-repository-obsolete
LINDAT/CLARIN repository for linguistics (http://lindat.cz)

charles-translator-web-frontend
Charles Translator: MT from Charles University

clarin-sp-aaggregator

mrpipe-conll2019
ÚFAL MRPipe submission to CoNLL 2019 shared task

slimd
SliMD presentation system based on Markdown and HTML5&js.

universal-segmentations
Build scripts for the UniSegments collection of morphologically segmented lexicons for many languages

UFAL_poster
LaTeX repository for a poster design

bert-diacritics-restoration
Repository storing code and data for our paper "Diacritics Restoration using BERT with Analysis on Czech language".

MLASK
EACL 2023 paper "MLASK: Multimodal Summarization of Video-based News Articles"

evalatin2024-latinpipe
LatinPipe – the winning entry to the parsing task of EvaLatin 2024

optimal-reference-translations

conll2017
CoNLL 2017 Shared Task Proposal: UD End-to-End parsing

wiki-error-corpus
Scripts for extracting errors from Wikipedia revisions

weighteddist
A tiny toolkit for weighted word/character edit distance, including cost estimation.

rg
ÚFAL Reading Group

thesis_info
ÚFAL Thesis Information Repository

perl-pmltq
Query engine and query language for trees in PML format

rh_nntagging
Reading Hackathon -- NN Tagging Project

perl-pmltq-server
Refactored and simplified PMLTQ::CGI

pcedt2.0-coref
Coreference extension to Prague Czech-English Dependency Treebank 2.0

kazitext

corefud-scorer
Coreference and anaphora scorer for CorefUD data

quickjudge
A handy tool for quick manual evaluation of line-oriented outputs, e.g. of machine translation.

teitok-tools
Conversion tools to and from the TEITOK TEI/XML format

conll2018
CoNLL 2018 UD Shared Task

charles-translator-android
Android app of LINDAT translation service

crac2023-corpipe
ÚFAL CorPipe: CRAC 2023 Winning System for Multilingual Coreference Resolution

qtleap
QTLeap Pilot MT systems using TectoMT

PDT-C
Consolidated Czech PDT-style annotated corpus; consists of PDT, Czech part of PCEDT, PDTSC, PDT-Faust

lindat-corpora-conversions
LINDAT Corpora Conversions

lindat-aai-attributes
Parse Shibboleth logs for important information about attributes from IdPs and other

ufal-tools

deltacorpus
Delexicalized tagging and parsing.

js-treex-view
JavaScript library for visualizing Treex files

phd-thesis-template
A template PhD thesis at UFAL

cpp_builtem
C++ Builtem is a cross-platform Makefile-based build system for C++11

ambiguity-grammaticality-complexity
Code for the paper Sentence Ambiguity, Grammaticality and Complexity Probes

lindat-common
Common files and branding for Lindat projects

crac2022-corpipe
ÚFAL CorPipe: CRAC 2022 Winning System for Multilingual Coreference Resolution

lindat_piwik_reports
Caching important counts from PIWIK periodically and creating customized reports for LINDAT/CLARIN

eyetracked-multi-modal-translation
EMMT (Eyetracked Multi-Modal Translation), a simultaneous eye-tracking, 4-electrode EEG and audio corpus for multi-modal reading and translation scenarios

uk-cs-data-scripts
Scripts for processing data for Czech-Ukrainian MT

errant_czech

UFAL_MT_service

nametag3
NameTag3: Named Entity Tagger

mrptask

lindat-aai-discovery

pyclarindspace
Python package using clarin-dspace API

ParCzech
ParCzech is a project on compiling Czech parliamentary data into annotated corpora.

theaitrobot
THEaiTRE bot

auto-hume
Semantic MT metric trained on HUME annotations

npfl101
Repository of the seminar NPFL101 Competing in Machine Translation.

bilingual-abstracts-corpus
Bilingual corpus of scientific abstracts from ÚFAL Charles University publications.

continuous-rating

tamiltb

nmt-pe-effects-2021
Experiment relating NMT quality and post-editing efforts

MTEQA

cpp_utils
UFAL C++ Utils

europarlmin
Corpus of European Parliament debates organized as a corpus for meeting summarization, i.e. matching full transcripts and minutes from the sessions. Used in the shared task of AutoMin 2023.

pmltq-cgi
PMLTQ::CGI has been removed from PMLTQ module in order to decrease number of dependencies. It should be installed separately.

qsubmit
A wrapper over various grid submission scripts

SynSemClassSearch

ker
Simple Czech and English keyword extractor

npfl087
NPFL087 Statistical Machine Translation

diaser

treex-web
Online interface for Treex

wembedding_service
TF2 service for word embeddings computation

NPFL095
web of the course "Modern Methods in Computational Linguistics"