  • Stars: 520
  • Rank: 84,640 (Top 2%)
  • Language: Python
  • License: MIT License
  • Created: over 7 years ago
  • Updated: almost 5 years ago

Repository Details

Sequence-to-Sequence learning using PyTorch

Seq2Seq in PyTorch

This is a complete suite for training sequence-to-sequence models in PyTorch. It consists of several models and code to both train and infer using them.

Using this code you can train:

  • Neural-machine-translation (NMT) models
  • Language models
  • Image to caption generation
  • Skip-thought sentence representations
  • And more...

Installation

git clone --recursive https://github.com/eladhoffer/seq2seq.pytorch
cd seq2seq.pytorch; python setup.py develop
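
Assuming the setup script registers the package under the name seq2seq (an assumption based on the repository layout, not verified here), the development install can be sanity-checked with a short Python snippet:

# hypothetical sanity check; `seq2seq` is assumed to be the installed package name
import torch
import seq2seq

print("torch", torch.__version__)
print("seq2seq imported from", seq2seq.__file__)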

Models

Available models include:

  • Transformer - the attention-only model from "Attention Is All You Need" (used in the first training example below)
  • RecurrentAttentionSeq2Seq - a recurrent (LSTM) encoder-decoder with an attentional decoder (used in the second training example below)

Datasets

Available datasets include:

  • WMT16_de_en - WMT16 German-English translation (used in the training scripts below)

All datasets can be tokenized using one of three available segmentation methods:

  • Character-based segmentation
  • Word-based segmentation
  • Byte-pair encoding (BPE), as suggested in the original BPE paper, with a selectable number of tokens

After choosing a tokenization method, a vocabulary will be generated and saved for future inference.
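
As an illustration only (not the repository's tokenizer API), the sketch below shows character-based and word-based segmentation and the kind of token-to-id vocabulary that gets saved; BPE would additionally require a merge table learned from the training corpus:

# toy example of the two simpler segmentation schemes and a vocabulary built from them
from collections import Counter

sentence = "the cat sat on the mat"

char_tokens = list(sentence)      # character-based: every character (incl. spaces) is a token
word_tokens = sentence.split()    # word-based: whitespace split

def build_vocab(tokens):
    # reserve low ids for special tokens, as seq2seq models typically do
    specials = ["<pad>", "<unk>", "<s>", "</s>"]
    counts = Counter(tokens)
    return {tok: i for i, tok in enumerate(specials + sorted(counts))}

print("chars:", char_tokens)
print("words:", word_tokens)
print("vocab:", build_vocab(word_tokens))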

Training methods

The models can be trained using several methods:

  • Basic Seq2Seq - given an encoded sequence, generate (decode) the output sequence. Training is done with teacher forcing (a minimal sketch follows this list).
  • Multi Seq2Seq - several tasks (such as multiple languages) are trained simultaneously by using the data sequences as both input to the encoder and output of the decoder.
  • Image2Seq - used to train image-to-caption generators.
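
A minimal PyTorch sketch of the teacher-forcing idea used in basic Seq2Seq training (toy modules, not this repository's classes): at every decoding step the ground-truth previous token, rather than the model's own prediction, is fed to the decoder.

import torch
import torch.nn as nn

vocab_size, emb, hidden = 100, 32, 64
embed = nn.Embedding(vocab_size, emb)
encoder = nn.GRU(emb, hidden, batch_first=True)
decoder = nn.GRU(emb, hidden, batch_first=True)
proj = nn.Linear(hidden, vocab_size)
criterion = nn.CrossEntropyLoss()

src = torch.randint(0, vocab_size, (8, 12))   # (batch, src_len)
tgt = torch.randint(0, vocab_size, (8, 10))   # (batch, tgt_len), starting with <s>

_, state = encoder(embed(src))                # encode the source sequence

# Teacher forcing: the decoder input is the gold target shifted right by one,
# so each step conditions on the true previous token instead of a sampled one.
dec_in, dec_target = tgt[:, :-1], tgt[:, 1:]
dec_out, _ = decoder(embed(dec_in), state)
logits = proj(dec_out)                        # (batch, tgt_len - 1, vocab)

loss = criterion(logits.reshape(-1, vocab_size), dec_target.reshape(-1))
loss.backward()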

Usage

Example training scripts are available in the scripts folder; inference examples are available in the examples folder.

  • Example of training a Transformer on WMT16 following the original paper's regime (the learning-rate schedule it uses is sketched after the script):
DATASET=${1:-"WMT16_de_en"}
DATASET_DIR=${2:-"./data/wmt16_de_en"}
OUTPUT_DIR=${3:-"./results"}

# warm-up steps and base learning rate for the schedule in --optimization-config below
WARMUP="4000"
LR0="512**(-0.5)"

python main.py \
  --save transformer \
  --dataset ${DATASET} \
  --dataset-dir ${DATASET_DIR} \
  --results-dir ${OUTPUT_DIR} \
  --model Transformer \
  --model-config "{'num_layers': 6, 'hidden_size': 512, 'num_heads': 8, 'inner_linear': 2048}" \
  --data-config "{'moses_pretok': True, 'tokenization':'bpe', 'num_symbols':32000, 'shared_vocab':True}" \
  --b 128 \
  --max-length 100 \
  --device-ids 0 \
  --label-smoothing 0.1 \
  --trainer Seq2SeqTrainer \
  --optimization-config "[{'step_lambda':
                          \"lambda t: { \
                              'optimizer': 'Adam', \
                              'lr': ${LR0} * min(t ** -0.5, t * ${WARMUP} ** -1.5), \
                              'betas': (0.9, 0.98), 'eps':1e-9}\"
                          }]"
  • Example of training an attentional LSTM-based model with 3 layers in both the encoder and decoder (the epoch-based optimization regime is sketched after the script):
python main.py \
  --save de_en_wmt17 \
  --dataset ${DATASET} \
  --dataset-dir ${DATASET_DIR} \
  --results-dir ${OUTPUT_DIR} \
  --model RecurrentAttentionSeq2Seq \
  --model-config "{'hidden_size': 512, 'dropout': 0.2, \
                   'tie_embedding': True, 'transfer_hidden': False, \
                   'encoder': {'num_layers': 3, 'bidirectional': True, 'num_bidirectional': 1, 'context_transform': 512}, \
                   'decoder': {'num_layers': 3, 'concat_attention': True,\
                               'attention': {'mode': 'dot_prod', 'dropout': 0, 'output_transform': True, 'output_nonlinearity': 'relu'}}}" \
  --data-config "{'moses_pretok': True, 'tokenization':'bpe', 'num_symbols':32000, 'shared_vocab':True}" \
  --b 128 \
  --max-length 80 \
  --device-ids 0 \
  --trainer Seq2SeqTrainer \
  --optimization-config "[{'epoch': 0, 'optimizer': 'Adam', 'lr': 1e-3},
                          {'epoch': 6, 'lr': 5e-4},
                          {'epoch': 8, 'lr':1e-4},
                          {'epoch': 10, 'lr': 5e-5},
                          {'epoch': 12, 'lr': 1e-5}]"
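
The epoch-keyed --optimization-config describes a piecewise training regime. Assuming later entries update the earlier settings once their epoch is reached (an assumption about the trainer's behavior, not confirmed here), the effective settings at a given epoch can be pictured as:

# hypothetical resolution of the epoch-keyed regime shown above
regime = [{'epoch': 0, 'optimizer': 'Adam', 'lr': 1e-3},
          {'epoch': 6, 'lr': 5e-4},
          {'epoch': 8, 'lr': 1e-4},
          {'epoch': 10, 'lr': 5e-5},
          {'epoch': 12, 'lr': 1e-5}]

def settings_at(epoch, regime):
    current = {}
    for entry in regime:
        if epoch >= entry['epoch']:
            current.update({k: v for k, v in entry.items() if k != 'epoch'})
    return current

print(settings_at(7, regime))   # -> {'optimizer': 'Adam', 'lr': 0.0005}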

More Repositories

  • convNet.pytorch - ConvNet training using pytorch (Python, 346 stars)
  • quantized.pytorch (Python, 212 stars)
  • TripletNet - Deep metric learning using Triplet network (Lua, 189 stars)
  • bigBatch - Code used to generate the results appearing in "Train longer, generalize better: closing the generalization gap in large batch training of neural networks" (Python, 148 stars)
  • captionGen - Generate captions for an image using PyTorch (Jupyter Notebook, 128 stars)
  • ImageNet-Training - ImageNet training using torch (Lua, 102 stars)
  • utils.pytorch - Utilities for Pytorch (Python, 90 stars)
  • DeepDream.torch - Torch version for https://github.com/google/deepdream (Lua, 53 stars)
  • fix_your_classifier (Python, 34 stars)
  • recurrent.torch - Recurrent modules for Torch (Lua, 27 stars)
  • lmdb.torch - LMDB for Torch (Lua, 26 stars)
  • norm_matters (Python, 23 stars)
  • SemiSupContrast - Semi-supervised deep learning by metric embedding (Lua, 19 stars)
  • DeepLearningCourse - Deep learning mini-course given at Technion (Jupyter Notebook, 18 stars)
  • convNet.torch - Convolutional network training using Torch (Lua, 18 stars)
  • captionGeneration.torch - Generate captions for an image using convolutional and recurrent networks (Jupyter Notebook, 12 stars)
  • eladtools (Lua, 11 stars)
  • ConvNet-torch - Training Deep Convolutional Networks on visual classification tasks (Lua, 11 stars)
  • GoogLeNet.torch - Trained network models for Torch (9 stars)
  • convNet.tf - Convolutional network training using TensorFlow (Python, 8 stars)
  • stl10.torch - STL10 Dataset on Torch (Lua, 3 stars)
  • DataProvider.torch - Data providers for Torch (Lua, 3 stars)
  • eladhoffer.github.io (CSS, 3 stars)
  • colab-notebooks (Jupyter Notebook, 2 stars)
  • seq2seq.torch (Lua, 1 star)
  • DescriptorLearning (Lua, 1 star)