Graham Neubig (@neubig)

Top repositories

1

nn4nlp-code

Code Samples from Neural Networks for NLP
Python
1,303
star
2

lowresource-nlp-bootcamp-2020

The website for the CMU Language Technologies Institute low-resource NLP bootcamp 2020
Jupyter Notebook
598
star
3

nlptutorial

A Tutorial about Programming for Natural Language Processing
Perl
423
star
4

nmt-tips

A tutorial about neural machine translation including tips on building practical systems
Perl
368
star
5

kytea

The Kyoto Text Analysis Toolkit for word segmentation and pronunciation estimation, etc.
C++
197
star
6

nlp-from-scratch-assignment-2022

An assignment for CMU CS11-711 Advanced NLP, building NLP systems from scratch
Python
168
star
7

lamtram

lamtram: A toolkit for neural language and translation modeling
C++
138
star
8

anlp-code

Jupyter Notebook
130
star
9

research-career-tools

Python
128
star
10

naacl18tutorial

NAACL 2018 Tutorial: Modelling Natural Language, Programs, and their Intersection
TeX
102
star
11

minbert-assignment

Minimalist BERT implementation assignment for CS11-711
Python
70
star
12

minnn-assignment

An assignment on creating a minimalist neural network toolkit for CS11-747
Python
64
star
13

yrsnlp-2016

Structured Neural Networks for NLP: From Idea to Code
Jupyter Notebook
59
star
14

minllama-assignment

Python
48
star
15

util-scripts

Various utility scripts useful for natural language processing, machine translation, etc.
Perl
46
star
16

latticelm

Software for unsupervised word segmentation and language model learning using lattices
C++
45
star
17

coderx

A highly sophisticated sequence-to-sequence model for code generation
Python
40
star
18

rapid-adaptation

Reproduction instructions for "Rapid Adaptation of Neural Machine Translation to New Languages"
Shell
39
star
19

mtandseq2seq-code

Code examples for CMU CS11-731, Machine Translation and Sequence-to-sequence Models
Python
33
star
20

travatar

This is a repository for the Travatar forest-to-string translation decoder
C++
28
star
21

lxmls-2017

Slides/code for the Lisbon Machine Learning School 2017
Python
28
star
22

modlm

modlm: A toolkit for mixture of distributions language models
C++
27
star
23

kylm

The Kyoto Language Modeling Toolkit
Java
27
star
24

pialign

pialign - A Phrasal ITG Aligner
C++
23
star
25

pgibbs

An implementation of parallel Gibbs sampling for word segmentation and POS tagging.
C++
16
star
26

nlp-from-scratch-assignment-spring2024

An assignment for building an NLP system from scratch.
16
star
27

lader

A reordering tool for machine translation.
C++
15
star
28

howtocode-2017

An example of DyNet autobatching for the NIPS "how to code a paper" workshop
Jupyter Notebook
13
star
29

kyfd

A decoder for finite state models for text processing.
C++
12
star
30

egret

A fork of the Egret parser that fixes a few bugs
C++
10
star
31

latticelm-v2

Second version of latticelm, a tool for learning language models from lattices
C++
7
star
32

globalutility

TeX
6
star
33

nafil

A program for performing bilingual corpus filtering
C++
4
star
34

prontron

A discriminative pronunciation estimator using the structured perceptron algorithm.
Perl
4
star
35

wat2014

Scripts for creating a system similar to the NAIST submission to WAT2014
Shell
3
star
36

multi-extract

A script for extracting multi-synchronous context-free grammars
Python
2
star
37

nile

A clone of the nile alignment toolkit
C++
1
star
38

webigator

A program to aggregate, rank, and search text information
Perl
1
star
39

ribes-c

A C++ implementation of the RIBES machine translation evaluation measure.
C++
1
star
40

swe-bench-zeno

Scripts for analyzing SWE-bench with Zeno
Python
1
star