  • Stars
    86
  • Rank 371,692 (Top 8 %)
  • Language
    Shell
  • License
    Other
  • Created over 4 years ago
  • Updated almost 3 years ago

Repository Details

BERT model trained from scratch on Finnish

More Repositories

1

Turku-neural-parser-pipeline

A neural parsing pipeline for segmentation, morphological tagging, dependency parsing and lemmatization with pre-trained models for more than 50 languages. Top ranker in the CoNLL-18 Shared Task.
Python
103
star
2

Finnish-dep-parser

The Finnish dependency parsing pipeline being developed by the Turku NLP group. Documentation:
Python
49
star
3

wikibert

BERT models for many languages created from Wikipedia texts
31
star
4

Text_Mining_Course

Stuff for the Text Mining course
Jupyter Notebook
25
star
5

ocr-correction

Post-processing OCR errors with seq2seq models
Python
25
star
6

finngen-tools

Tools for training causal language models for Finnish
Python
25
star
7

Deep_Learning_in_LangTech_course

Materials for the University of Turku course TKO_8965 Deep Learning in Human Language Technology (previously named TKO_2101 Natural Language Processing)
Jupyter Notebook
14
star
8

bert-eval

Python
9
star
9

turku-ner-corpus

Open broad-coverage corpus for Finnish named entity recognition.
Python
9
star
10

turku-one

Turku OntoNotes Entities Corpus (TurkuONE)
8
star
11

pubmed_parses

Syntactic parses and named entity recognition for PubMed abstracts and PubMed Central full documents
8
star
12

finnish-generative-model-eval

Evaluation of Finnish generative models
Python
6
star
13

class-explainer

Python
5
star
14

IR_Course

Stuff for the upcoming IR course 2017
Jupyter Notebook
5
star
15

Finnish_PropBank

Finnish Proposition Bank
CSS
4
star
16

intro-to-nlp

Introduction to Natural Language Processing
Jupyter Notebook
4
star
17

register-labeling

Python
4
star
18

Turku-paraphrase-corpus

Python
3
star
19

biBERT

Finnish English bilingual BERT models
3
star
20

BINF_Programming

Stuff for the BINF programming course (@fginter)
Jupyter Notebook
3
star
21

ATP_kurssi

Jupyter Notebook
3
star
22

multilingual-register-labeling

Multilingual, multilabel modeling of registers
Python
3
star
23

CAFA3

University of Turku CAFA3 project
Python
3
star
24

conll17-system

Instructions for TurkuNLP system in CoNLL 2017 Shared Task on Multilingual Parsing from Raw Text to Universal Dependencies.
Shell
2
star
25

WAC-XII

Data presented in the paper "From Web Crawl to Clean Register-Annotated Corpora"
2
star
26

textual-data-analysis-course

Jupyter Notebook
2
star
27

DIKI1002-Working-with-Text-in-Python

Jupyter Notebook
2
star
28

FinCORE

Finnish Corpus of Online REgisters
Python
2
star
29

BioCreativeVI_BioID_assignment

Python
2
star
30

BioCreativeVI_CHEMPROT_RE

Deep learning-based systems for biomedical relation extraction: recognizing the statements of relations between chemical compounds/drugs and genes/proteins from biomedical literature. The code is developed for our participation in the BioCreative VI Task 5 (CHEMPROT) challenge. Contact: [email protected]
Python
2
star
31

Corpus-linguistics

Code and data for the examples and use cases described in the article "Määrällinen korpuslingvistiikka" to be published in the book "Kielentutkimuksen metodologian käsikirja" in Finnish.
Python
2
star
32

korona-tweets

stuff for our korona-tweets
Python
1
star
33

ocr_errors_simulator

Functions and code for estimating the probabilities of OCR errors and simulating them
Python
1
star
34

Digi_menetelmat

Workshop "Digital humanities in linguistics: text mining" for the course Introduction to Digital Humanities
Python
1
star
35

registerlabeling

Python
1
star
36

Cell-line-recognition

Cell line names recognition and normalization
CSS
1
star
37

Multilingual-register-corpora

French Corpus of Online REgisters (FreCORE) and Swedish Corpus of Online REgisters (SweCORE)
1
star
38

BHE

End-to-end System for Bacteria Habitat Extraction: Named-entity recognition (NER), named-entity normalization, relation extraction. email: [email protected]
Python
1
star
39

dolly-fi

Finnish version of databricks-dolly-15k instruction dataset
Python
1
star
40

sentiment-target-corpus

Targeted sentiment corpus
1
star
41

dep_search

JavaScript
1
star
42

deepfin-tools

DeepFin tools
Python
1
star
43

SRNNMT

Sentence representation for translation finding
Python
1
star
44

CORE-corpus

1
star
45

oasst-fi

Open Assistant dataset translated to Finnish
Python
1
star
46

DigiHum16

Random course notes for the DigiHum16 course
Jupyter Notebook
1
star
47

wikipedia-toxicity-data-fi

Python
1
star
48

toxicity-classifier

Repository for all things related to classifying whether a text is toxic or not using data from https://github.com/TurkuNLP/wikipedia-toxicity-data-fi
Python
1
star
49

PB_solr

Work towards indexing the Finnish Parsebank in SOLR
Python
1
star
50

TDT_editor

The tree editor used to annotate the Turku Dependency Treebank. Vintage code, but putting it online in case someone finds it in any way useful.
Python
1
star
51

pytorch-registerlabeling

Jupyter Notebook
1
star