Turku-neural-parser-pipeline
A neural parsing pipeline for segmentation, morphological tagging, dependency parsing and lemmatization with pre-trained models for more than 50 languages. Top ranker in the CoNLL-18 Shared Task.
FinBERT
BERT model trained from scratch on Finnish
Finnish-dep-parser
The Finnish dependency parsing pipeline being developed by the Turku NLP group. Documentation:
wikibert
BERT models for many languages created from Wikipedia texts
Text_Mining_Course
Stuff for the Text Mining course
ocr-correction
Post-processing OCR errors with seq2seq models
finngen-tools
Tools for training causal language models for Finnish
Deep_Learning_in_LangTech_course
Materials for the University of Turku course TKO_8965 Deep Learning in Human Language Technology (previously named TKO_2101 Natural Language Processing)
bert-eval
turku-ner-corpus
Open broad-coverage corpus for Finnish named entity recognition.
turku-one
Turku OntoNotes Entities Corpus (TurkuONE)
pubmed_parses
Syntactic parses and named entity recognition for PubMed abstracts and PubMed Central full documents
finnish-generative-model-eval
Evaluation of Finnish generative models
class-explainer
IR_Course
Stuff for the upcoming IR course 2017
Finnish_PropBank
Finnish Proposition Bank
intro-to-nlp
Introduction to Natural Language Processing
register-labeling
RAG-web-app
Turku-paraphrase-corpus
biBERT
Finnish-English bilingual BERT models
BINF_Programming
Stuff for the BINF programming course (@fginter)
ATP_kurssi
multilingual-register-labeling
Multilingual, multilabel modeling of registers
CAFA3
University of Turku CAFA3 project
conll17-system
Instructions for TurkuNLP system in CoNLL 2017 Shared Task on Multilingual Parsing from Raw Text to Universal Dependencies.
WAC-XII
Data presented in the paper "From Web Crawl to Clean Register-Annotated Corpora"
DIKI1002-Working-with-Text-in-Python
FinCORE
Finnish Corpus of Online REgisters
BioCreativeVI_CHEMPROT_RE
Deep learning-based systems for biomedical relation extraction: recognizing the statements of relations between chemical compounds/drugs and genes/proteins from biomedical literature. The code is developed for our participation in the BioCreative VI Task 5 (CHEMPROT) challenge. Contact: [email protected]
BioCreativeVI_BioID_assignment
Corpus-linguistics
Code and data for the examples and use cases described in the article "Määrällinen korpuslingvistiikka", to be published in Finnish in the book "Kielentutkimuksen metodologian käsikirja".
korona-tweets
Stuff for our korona-tweets
ocr_errors_simulator
Functions and code used to estimate the probabilities of OCR errors and to simulate them
registerlabeling
Cell-line-recognition
Cell line name recognition and normalization
Digi_menetelmat
Workshop "Digital humanities in linguistics: text mining" for the course Introduction to Digital Humanities
Multilingual-register-corpora
French Corpus of Online REgisters (FreCORE) and Swedish Corpus of Online REgisters (SweCORE)
BHE
End-to-end System for Bacteria Habitat Extraction: Named-entity recognition (NER), named-entity normalization, relation extraction. email: [email protected]
dolly-fi
Finnish version of databricks-dolly-15k instruction dataset
sentiment-target-corpus
Targeted sentiment corpus
dep_search
deepfin-tools
DeepFin tools
SRNNMT
Sentence representation for translation finding
CORE-corpus
FinCORE_full
oasst-fi
Open Assistant dataset translated to Finnish
DigiHum16
Random course notes for the DigiHum16 course
wikipedia-toxicity-data-fi
toxicity-classifier
Repository for all things related to classifying whether a text is toxic or not using data from https://github.com/TurkuNLP/wikipedia-toxicity-data-fi
PB_solr
Work towards indexing the Finnish Parsebank in SOLR
TDT_editor
The tree editor used to annotate the Turku Dependency Treebank. Vintage code, but putting it online in case someone finds it in any way useful.
pytorch-registerlabeling