  • Stars: 176
  • Rank: 216,987 (Top 5%)
  • Language: Python
  • Created: over 5 years ago
  • Updated: over 4 years ago

Repository Details

Unsupervised Recurrent Neural Network Grammars

This is an implementation of the paper:
Unsupervised Recurrent Neural Network Grammars
Yoon Kim, Alexander Rush, Lei Yu, Adhiguna Kuncoro, Chris Dyer, Gabor Melis
NAACL 2019

Dependencies

The code was tested with Python 3.6 and PyTorch 1.0.

Data

Sample train/val/test data is in the data/ folder. These are the standard datasets from PTB. First preprocess the data:

python preprocess.py --trainfile data/train.txt --valfile data/valid.txt --testfile data/test.txt \
--outputfile data/ptb --vocabminfreq 1 --lowercase 0 --replace_num 0 --batchsize 16

Running this will save the following files in the data/ folder: ptb-train.pkl, ptb-val.pkl, ptb-test.pkl, ptb.dict. Here ptb.dict is the word-to-index mapping, and you can change the output folder/name via the --outputfile argument. Also note that this preprocessing replaces singleton words with a single <unk> token rather than applying the Berkeley parser's mapping rules (see below for results under this setup).
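
For reference, the singleton handling amounts to replacing every word that occurs exactly once in the training data with <unk>. A minimal sketch of that step (not the actual preprocess.py code, which also handles lowercasing, number replacement, and batching):

from collections import Counter

def replace_singletons(sentences):
    # Count word frequencies over the tokenized training sentences.
    counts = Counter(w for sent in sentences for w in sent)
    # Words occurring exactly once are mapped to a single <unk> token.
    return [[w if counts[w] > 1 else "<unk>" for w in sent] for sent in sentences]

train = [["the", "cat", "sat"], ["the", "dog", "barked"]]
print(replace_singletons(train))
# [['the', '<unk>', '<unk>'], ['the', '<unk>', '<unk>']]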

Training

To train the URNNG:

python train.py --train_file data/ptb-train.pkl --val_file data/ptb-val.pkl --save_path urnng.pt \
--mode unsupervised --gpu 0

where save_path is where you want to save the model, and --gpu 0 runs training on the first visible GPU (the PyTorch device index, which may differ from your cluster's physical GPU numbering). Training should take 2 to 3 days depending on your setup.
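
If you later want to inspect the saved checkpoint (for example, before evaluation), something like the following should work; this assumes the checkpoint is a standard torch.save artifact, and its exact contents depend on the implementation:

import torch

# Load the checkpoint on CPU so no GPU is needed for inspection.
checkpoint = torch.load("urnng.pt", map_location="cpu")
# train.py decides what gets passed to torch.save (a model object, a dict, etc.),
# so printing the type/keys is a safe first step.
print(type(checkpoint))
if isinstance(checkpoint, dict):
    print(list(checkpoint.keys()))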

To train the RNNG:

python train.py --train_file data/ptb-train.pkl --val_file data/ptb-val.pkl --save_path rnng.pt \
--mode supervised --train_q_epochs 18 --gpu 0

For fine-tuning:

python train.py --train_from rnng.pt --train_file data/ptb-train.pkl --val_file data/ptb-val.pkl \
--save_path rnng-urnng.pt --mode unsupervised --lr 0.1 --train_q_epochs 10 --epochs 10 \
--min_epochs 6 --gpu 0 --kl_warmup 0
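
Here --kl_warmup controls annealing of the KL term in the variational objective (0 disables the warm-up). As an illustration only, and assuming a standard linear schedule (the actual schedule, and whether its unit is steps or epochs, is defined in train.py), the annealed objective looks like:

def kl_weight(step, kl_warmup_steps):
    # Linearly ramp the KL coefficient from 0 to 1 over the warm-up period;
    # with kl_warmup_steps == 0 the full KL term is applied from the start.
    if kl_warmup_steps <= 0:
        return 1.0
    return min(1.0, step / kl_warmup_steps)

# Schematic loss: reconstruction NLL plus the annealed KL term.
# loss = recon_nll + kl_weight(step, kl_warmup_steps) * kl_term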

To train the LM:

python train_lm.py --train_file data/ptb-train.pkl --val_file data/ptb-val.pkl \
--test_file data/ptb-test.pkl --save_path lm.pt

Evaluation

To evaluate perplexity with importance sampling on the test set:

python eval_ppl.py --model_file urnng.pt --test_file data/ptb-test.pkl --samples 1000 \
--is_temp 2 --gpu 0

The --samples argument sets the number of importance-weighted samples, and --is_temp flattens the inference network's distribution (footnote 14 in the paper). The same evaluation code works for the RNNG.
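
Concretely, the estimator draws K samples z_1, ..., z_K from the (flattened) inference network q and approximates log p(x) by log (1/K) sum_k p(x, z_k) / q(z_k | x); perplexity is then computed from the summed log-likelihoods over the test set. A schematic version with hypothetical per-sample log-probability tensors:

import math
import torch

def iw_log_likelihood(log_joint, log_q):
    # log_joint[k] = log p(x, z_k) under the generative model,
    # log_q[k]     = log q(z_k | x) under the (temperature-flattened) inference network;
    # both are 1-D tensors over the K importance samples.
    k = log_joint.size(0)
    log_weights = log_joint - log_q
    return torch.logsumexp(log_weights, dim=0) - math.log(k)

# Perplexity: exp(-(sum of per-sentence log-likelihoods) / total number of words).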

For LM evaluation:

python train_lm.py --train_from lm.pt --test_file data/ptb-test.pkl --test 1

To evaluate F1, first we need to parse the test set:

python parse.py --model_file urnng.pt --data_file data/ptb-test.txt --out_file pred-parse.txt \
--gold_out_file gold-parse.txt --gpu 0

This will output the predicted parse trees into pred-parse.txt. We also output a version of the gold parses, gold-parse.txt, to be used as input for evalb, since parse.py skips sentences containing only trivial spans. Note that the corpus/sentence F1 results printed here do not correspond to the results reported in the paper, since this evaluation does not ignore punctuation.
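
For reference, corpus-level unlabeled F1 reduces to precision and recall over predicted versus gold constituent spans, aggregated over sentences. A minimal sketch (punctuation and trivial-span handling omitted):

def corpus_unlabeled_f1(pred_trees, gold_trees):
    # pred_trees / gold_trees: one set of (start, end) constituent spans per sentence.
    matched = sum(len(p & g) for p, g in zip(pred_trees, gold_trees))
    n_pred = sum(len(p) for p in pred_trees)
    n_gold = sum(len(g) for g in gold_trees)
    precision = matched / n_pred if n_pred else 0.0
    recall = matched / n_gold if n_gold else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0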

Finally, download/install evalb, available here. Then run:

evalb -p COLLINS.prm gold-parse.txt pred-parse.txt

where COLLINS.prm is the parameter file (provided in this repo) that tells evalb to ignore punctuation and evaluate on unlabeled F1.

Note Regarding Preprocessing

Note that some of the details regarding the preprocessing are slightly different from the original paper. In particular, this implementation replaces singleton words with a single <unk> token instead of using the Berkeley parser's mapping rules. This results in slightly lower perplexity for all models, since the vocabulary size is smaller. Here are the perplexity numbers I get in this setting:

  • RNNLM: 89.2
  • RNNG: 83.7
  • URNNG: 85.1 (F1: 38.4)
  • RNNG --> URNNG: 82.5

Acknowledgements

Some of our preprocessing and evaluation code is based on the following repositories:

License

MIT

More Repositories

1. annotated-transformer (Jupyter Notebook, 5,683 stars): An annotated implementation of the Transformer paper.
2. seq2seq-attn (Lua, 1,257 stars): Sequence-to-sequence model with LSTM encoder/decoders and attention
3. im2markup (Lua, 1,203 stars): Neural model for converting Image-to-Markup (by Yuntian Deng, yuntiandeng.com)
4. pytorch-struct (Jupyter Notebook, 1,107 stars): Fast, general, and tested differentiable structured prediction in PyTorch
5. sent-conv-torch (Lua, 448 stars): Text classification using a convolutional neural network.
6. namedtensor (Jupyter Notebook, 443 stars): Named Tensor implementation for Torch
7. var-attn (Python, 326 stars): Latent Alignment and Variational Attention
8. sent-summary (300 stars)
9. neural-template-gen (Python, 262 stars)
10. struct-attn (Lua, 237 stars): Code for Structured Attention Networks (https://arxiv.org/abs/1702.00887)
11. NeuralSteganography (Python, 183 stars): STEGASURAS: STEGanography via Arithmetic coding and Strong neURAl modelS
12. botnet-detection (Python, 169 stars): Topological botnet detection datasets and graph neural network applications
13. data2text (Lua, 158 stars)
14. sa-vae (Python, 154 stars)
15. compound-pcfg (Python, 127 stars)
16. cascaded-generation (Python, 127 stars): Cascaded Text Generation with Markov Transformers
17. TextFlow (Python, 116 stars)
18. boxscore-data (HTML, 111 stars)
19. decomp-attn (Lua, 95 stars): Decomposable Attention Model for Sentence Pair Classification (from https://arxiv.org/abs/1606.01933)
20. encoder-agnostic-adaptation (Python, 79 stars): Encoder-Agnostic Adaptation for Conditional Language Generation
21. genbmm (Jupyter Notebook, 79 stars): CUDA kernels for generalized matrix-multiplication in PyTorch
22. DeepLatentNLP (61 stars)
23. nmt-android (Lua, 59 stars): Neural Machine Translation on Android
24. BSO (Lua, 54 stars)
25. hmm-lm (Python, 42 stars)
26. seq2seq-talk (TeX, 39 stars)
27. Talk-Latent (TeX, 31 stars)
28. regulatory-prediction (Python, 28 stars): Code and data to accompany "Dilated Convolutions for Modeling Long-Distance Genomic Dependencies", presented at the ICML 2017 Workshop on Computational Biology
29. harvardnlp.github.io (JavaScript, 26 stars)
30. strux (Python, 18 stars)
31. lie-access-memory (Lua, 17 stars)
32. annotated-attention (Jupyter Notebook, 15 stars)
33. DataModules (Python, 8 stars): A state-less module system for torch-like languages
34. rush-nlp (JavaScript, 8 stars)
35. seq2seq-attn-web (CSS, 8 stars)
36. tutorial-deep-latent (TeX, 7 stars)
37. MemN2N (Lua, 6 stars): Torch implementation of End-to-End Memory Networks (https://arxiv.org/abs/1503.08895)
38. image-extraction (Jupyter Notebook, 4 stars): Extract images from PDFs
39. paper-explorer (JavaScript, 3 stars)
40. readcomp (Python, 3 stars): Entity Tracking Improves Cloze-style Reading Comprehension
41. banded (Cuda, 2 stars): Sparse banded diagonal matrices for pytorch
42. torax (Python, 2 stars)
43. cs6741 (HTML, 2 stars)
44. simple-recs (Python, 1 star)
45. poser (Python, 1 star)
46. iclr (1 star)
47. cs6741-materials (1 star)