  • Stars: 1
  • Created over 3 years ago
  • Updated over 3 years ago

Repository Details

Collection of corpora built in the project "Rich Context in Neural Machine Translation" (2017-2020).

More Repositories

1. mbr (Python, 51 stars): Minimum Bayes Risk Decoding for Hugging Face Transformers
2. ContraDecode (Python, 33 stars): The implementation of "Mitigating Hallucinations and Off-target Machine Translation with Source-Contrastive and Language-Contrastive Decoding"
3. xstance (Python, 33 stars): A Multilingual Multi-Target Dataset for Stance Detection
4. nmtscore (Python, 25 stars): A library of translation-based text similarity measures
5. swissbert (Jupyter Notebook, 25 stars): The multilingual language model for Switzerland
6. ContraPro (Perl, 24 stars): Contrastive evaluation of pronoun translation in neural machine translation
7. multilingual-instruction-tuning (Jupyter Notebook, 23 stars): Code and data for the paper "Turning English-centric LLMs Into Polyglots: How Much Multilinguality Is Needed?"
8. coverage-contrastive-conditioning (Python, 20 stars): Data and code accompanying the paper "As Little as Possible, as Much as Necessary: Detecting Over- and Undertranslations with Contrastive Conditioning" (ACL 2022)
9. ContraWSD (Python, 19 stars): Word sense disambiguation test sets for NMT
10. understanding-mbr (Shell, 17 stars)
11. domain-robustness (Shell, 12 stars)
12. segtest (Shell, 7 stars): A Test Suite for Morphological Phenomena in Neural Machine Translation
13. mtrain (Python, 7 stars): Training automation for neural and statistical machine translation engines
14. mbr-sensitivity (Python, 6 stars): Data and code for the paper "Identifying Weaknesses in Machine Translation Metrics Through Minimum Bayes Risk Decoding: A Case Study for COMET"
15. sdg_swisstext_2024_sharedtask (Python, 5 stars): Repository for data and evaluation of 2024 Shared Task on SDG classification held by the Swiss Text Conference.
16. BLESS (Jupyter Notebook, 5 stars): Code for the EMNLP 2023 paper "BLESS: Benchmarking Large Language Models on Sentence Simplification"
17. emnlp2018-imitation-learning-for-neural-morphology (Python, 4 stars): Code for Paper "Imitation Learning for Neural Morphological String Transduction" by Peter Makarov and Simon Clematide. 2018. EMNLP
18. monotonicity_loss (PLSQL, 4 stars)
19. romanesco (Python, 4 stars): Simple recurrent neural network (RNN) language model
20. translation-direction-detection (Python, 4 stars): Unsupervised translation direction detection using NMT systems
21. mt-parity-assessment-data (HTML, 3 stars): experimental data for paper "A Set of Recommendations for Assessing Human–Machine Parity in Language Translation"
22. acl2020-historical-text-normalization (Python, 3 stars): Code for the ACL 2020 paper "Semi-supervised Contextual Historical Text Normalization" by Peter Makarov and Simon Clematide
23. contrastive-conditioning (Python, 3 stars): Code and data accompanying the paper "Contrastive Conditioning for Assessing Disambiguation in MT: A Case Study of Distilled Bias"
24. coling2018-neural-transition-based-morphology (Python, 3 stars): Code repository for COLING 2018 paper by Makarov and Clematide
25. distil-lingeval (Python, 3 stars): Data and code accompanying the paper "On the Limits of Minimal Pairs in Contrastive Evaluation"
26. 20Minuten (Jupyter Notebook, 3 stars)
27. MultiPivotNMT (Python, 3 stars): The implementation of "Investigating Multi-Pivot Ensembling with Massively Multilingual Machine Translation Models"
28. multilingual-lemma-disambiguation-gold-standard (2 stars): A Multilingual Lemma Disambiguation Gold Standard for German, Finnish, French and Italian (as described in the MA thesis)
29. specific_hospo_respo (Jupyter Notebook, 2 stars): Code for hospitality review response generation
30. voting-booklet-bias (Jupyter Notebook, 2 stars): Code for the paper "Voting Booklet Bias: Stance Detection in Swiss Federal Communication"
31. recognizing-semantic-differences (Python, 2 stars): Code for the paper "Towards Unsupervised Recognition of Token-level Semantic Differences in Related Documents"
32. swiss-german-text-encoders (Python, 2 stars): Code for the paper "Modular Adaptation of Multilingual Encoders to Written Swiss German Dialect"
33. RANLP2021-German-ATS (Shell, 1 star)
34. daikon (Python, 1 star): Simple encoder-decoder neural machine translation written in tensorflow
35. SockUeye (Vue, 1 star)
36. RumantschCorpora (1 star)
37. SockAPeye (Python, 1 star)
38. understanding-ctx-aug (Jupyter Notebook, 1 star): Code for the 2023 ACL Findings paper, Uncovering Hidden Consequences of Pre-training Objectives in Sequence-to-Sequence Models (Kew & Sennrich, 2023)
39. romanisation-transfer (Mathematica, 1 star): Code for the Paper "On Romanization for Model Transfer Between Scripts in Neural Machine Translation"