• Stars: 1
• Created over 3 years ago
• Updated about 3 years ago

Repository Details

French Corpus of Online REgisters (FreCORE) and Swedish Corpus of Online REgisters (SweCORE)

More Repositories

1

Turku-neural-parser-pipeline

A neural parsing pipeline for segmentation, morphological tagging, dependency parsing and lemmatization with pre-trained models for more than 50 languages. Top ranker in the CoNLL-18 Shared Task.
Python
103
star
2

FinBERT

BERT model trained from scratch on Finnish
Shell
86
star
3

Finnish-dep-parser

The Finnish dependency parsing pipeline being developed by the Turku NLP group. Documentation:
Python
49
star
4

wikibert

BERT models for many languages created from Wikipedia texts
31
star
5

Text_Mining_Course

Stuff for the Text Mining course
Jupyter Notebook
25
star
6

ocr-correction

Post-processing OCR errors with seq2seq models
Python
25
star
7

finngen-tools

Tools for training causal language models for Finnish
Python
25
star
8

Deep_Learning_in_LangTech_course

Materials for the University of Turku course TKO_8965 Deep Learning in Human Language Technology (previously named TKO_2101 Natural Language Processing)
Jupyter Notebook
14
star
9

bert-eval

Python
9
star
10

turku-ner-corpus

Open broad-coverage corpus for Finnish named entity recognition.
Python
9
star
11

turku-one

Turku OntoNotes Entities Corpus (TurkuONE)
8
star
12

pubmed_parses

Syntactic parses and named entity recognition for PubMed abstracts and PubMed Central full documents
8
star
13

finnish-generative-model-eval

Evaluation of Finnish generative models
Python
6
star
14

class-explainer

Python
5
star
15

IR_Course

Stuff for the upcoming IR course 2017
Jupyter Notebook
5
star
16

Finnish_PropBank

Finnish Proposition Bank
CSS
4
star
17

intro-to-nlp

Introduction to Natural Language Processing
Jupyter Notebook
4
star
18

register-labeling

Python
4
star
19

Turku-paraphrase-corpus

Python
3
star
20

biBERT

Finnish-English bilingual BERT models
3
star
21

BINF_Programming

Stuff for the BINF programming course (@fginter)
Jupyter Notebook
3
star
22

ATP_kurssi

Jupyter Notebook
3
star
23

CAFA3

University of Turku CAFA3 project
Python
3
star
24

conll17-system

Instructions for the TurkuNLP system in the CoNLL 2017 Shared Task on Multilingual Parsing from Raw Text to Universal Dependencies.
Shell
2
star
25

WAC-XII

Data presented in the paper "From Web Crawl to Clean Register-Annotated Corpora"
2
star
26

textual-data-analysis-course

Jupyter Notebook
2
star
27

DIKI1002-Working-with-Text-in-Python

Jupyter Notebook
2
star
28

multilingual-register-labeling

Multilingual, multilabel modeling of registers
Python
2
star
29

FinCORE

Finnish Corpus of Online REgisters
Python
2
star
30

BioCreativeVI_BioID_assignment

Python
2
star
31

BioCreativeVI_CHEMPROT_RE

Deep learning-based systems for biomedical relation extraction: recognizing the statements of relations between chemical compounds/drugs and genes/proteins from biomedical literature. The code is developed for our participation in the BioCreative VI Task 5 (CHEMPROT) challenge. Contact: [email protected]
Python
2
star
32

Corpus-linguistics

Code and data for the examples and use cases described in the article "Määrällinen korpuslingvistiikka" to be published in the book "Kielentutkimuksen metodologian käsikirja" in Finnish.
Python
2
star
33

korona-tweets

Stuff for our korona-tweets
Python
1
star
34

ocr_errors_simulator

Functions and code used to estimate the probabilities of OCR errors and to simulate them
Python
1
star
35

Digi_menetelmat

Workshop "Digital humanities in linguistics: text mining" for the course Introduction to Digital Humanities (Johdatus digitaalisiin ihmistieteisiin)
Python
1
star
36

registerlabeling

Python
1
star
37

Cell-line-recognition

Cell line name recognition and normalization
CSS
1
star
38

BHE

End-to-end System for Bacteria Habitat Extraction: Named-entity recognition (NER), named-entity normalization, relation extraction. email: [email protected]
Python
1
star
39

dolly-fi

Finnish version of databricks-dolly-15k instruction dataset
Python
1
star
40

sentiment-target-corpus

Targeted sentiment corpus
1
star
41

dep_search

JavaScript
1
star
42

deepfin-tools

DeepFin tools
Python
1
star
43

SRNNMT

Sentence representation for translation finding
Python
1
star
44

CORE-corpus

1
star
45

oasst-fi

Open Assistant dataset translated to Finnish
Python
1
star
46

DigiHum16

Random course notes for the DigiHum16 course
Jupyter Notebook
1
star
47

wikipedia-toxicity-data-fi

Python
1
star
48

toxicity-classifier

Repository for all things related to classifying whether a text is toxic or not using data from https://github.com/TurkuNLP/wikipedia-toxicity-data-fi
Python
1
star
49

PB_solr

Work towards indexing the Finnish Parsebank in SOLR
Python
1
star
50

TDT_editor

The tree editor used to annotate the Turku Dependency Treebank. Vintage code, but putting it online in case someone finds it in any way useful.
Python
1
star
51

pytorch-registerlabeling

Jupyter Notebook
1
star