Turku-neural-parser-pipeline
A neural parsing pipeline for segmentation, morphological tagging, dependency parsing and lemmatization with pre-trained models for more than 50 languages. Top ranker in the CoNLL-18 Shared Task.
FinBERT
BERT model trained from scratch on Finnish
Finnish-dep-parser
The Finnish dependency parsing pipeline being developed by the Turku NLP group. Documentation:
wikibert
BERT models for many languages created from Wikipedia texts
Text_Mining_Course
Stuff for the Text Mining course
ocr-correction
Post-processing OCR errors with seq2seq models
finngen-tools
Tools for training causal language models for Finnish
Deep_Learning_in_LangTech_course
Materials for the University of Turku course TKO_8965 Deep Learning in Human Language Technology (previously named TKO_2101 Natural Language Processing)
bert-eval
turku-ner-corpus
Open broad-coverage corpus for Finnish named entity recognition.
turku-one
Turku OntoNotes Entities Corpus (TurkuONE)
pubmed_parses
Syntactic parses and named entity recognition for PubMed abstracts and PubMed Central full documents
finnish-generative-model-eval
Evaluation of Finnish generative models
class-explainer
IR_Course
Stuff for the upcoming IR course 2017
Finnish_PropBank
Finnish Proposition Bank
intro-to-nlp
Introduction to Natural Language Processing
register-labeling
Turku-paraphrase-corpus
biBERT
Finnish-English bilingual BERT models
BINF_Programming
Stuff for the BINF programming course (@fginter)
ATP_kurssi
multilingual-register-labeling
Multilingual, multilabel modeling of registers
CAFA3
University of Turku CAFA3 project
conll17-system
Instructions for TurkuNLP system in CoNLL 2017 Shared Task on Multilingual Parsing from Raw Text to Universal Dependencies.
WAC-XII
Data presented in the paper "From Web Crawl to Clean Register-Annotated Corpora"
textual-data-analysis-course
DIKI1002-Working-with-Text-in-Python
FinCORE
Finnish Corpus of Online REgisters
BioCreativeVI_BioID_assignment
BioCreativeVI_CHEMPROT_RE
Deep learning-based systems for biomedical relation extraction: recognizing statements of relations between chemical compounds/drugs and genes/proteins in biomedical literature. The code was developed for our participation in the BioCreative VI Task 5 (CHEMPROT) challenge. Contact: [email protected]
Corpus-linguistics
Code and data for the examples and use cases described in the article "Määrällinen korpuslingvistiikka", to be published in Finnish in the book "Kielentutkimuksen metodologian käsikirja".
korona-tweets
Stuff for our korona-tweets
ocr_errors_simulator
Functions and code used to estimate the probabilities of OCR errors and simulate them
Digi_menetelmat
Workshop "Digital humanities in linguistics: text mining" for the course Introduction to Digital Humanities
registerlabeling
Cell-line-recognition
Cell line name recognition and normalization
Multilingual-register-corpora
French Corpus of Online REgisters (FreCORE) and Swedish Corpus of Online REgisters (SweCORE)
BHE
End-to-end system for bacteria habitat extraction: named-entity recognition (NER), named-entity normalization, relation extraction. Email: [email protected]
dolly-fi
Finnish version of the databricks-dolly-15k instruction dataset
sentiment-target-corpus
Targeted sentiment corpus
dep_search
deepfin-tools
DeepFin tools
SRNNMT
Sentence representation for translation finding
CORE-corpus
oasst-fi
Open Assistant dataset translated to Finnish
DigiHum16
Random course notes for the DigiHum16 course
wikipedia-toxicity-data-fi
toxicity-classifier
Repository for all things related to classifying whether a text is toxic, using data from https://github.com/TurkuNLP/wikipedia-toxicity-data-fi
PB_solr
Work towards indexing the Finnish Parsebank in SOLR
TDT_editor
The tree editor used to annotate the Turku Dependency Treebank. Vintage code, but putting it online in case someone finds it in any way useful.