Seq2Seq-Vis
A visual debugging tool for Sequence-to-Sequence models
by IBM Research in Cambridge and Harvard SEAS -- more info at seq2seq-vis.io
- Seq2Seq-Vis
- Cite us
- Contributors
- License
Install and run with conda
Installation requires miniconda to create a virtual environment; all dependencies are installed via scripts. Seq2Seq-Vis currently works with a special version of OpenNMT-py, modified by Sebastian Gehrmann. We provide a script to install this special branch.
After installation, you should have a file structure like this:
MyS2S/Seq2Seq-Vis ==> the tool
MyS2S/Seq2Seq-Vis/0316-fakedates/ ==> example data
MyS2S/OpenNMT-py ==> modified OpenNMT
1 - Install dependencies (server and client) and create virtual environment
Create a root directory (MyS2S) and then:
git clone https://github.com/HendrikStrobelt/Seq2Seq-Vis.git
cd Seq2Seq-Vis
Then, inside the Seq2Seq-Vis directory, run:
source setup_cpu.sh
2 - Install custom OpenNMT-py version
Return to the root directory and run:
cd ..
source Seq2Seq-Vis/setup_onmt_custom.sh
3 - Download some example data
Here we provide example data for a character-based dataset which converts date strings (e.g. "March 03, 1999", "03/03/99") into the base form "mm-dd-yyyy". Download the example data (~177MB), save it to the Seq2Seq-Vis directory, and unzip it:
unzip fakedates.zip
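To make the task behind the example data concrete, here is a minimal sketch of the conversion the model is trained to perform, written as ordinary Python rather than as a learned model. The function name `to_base_form` and the two accepted input formats are assumptions taken from the two examples in the text; the actual dataset may contain other formats.

```python
from datetime import datetime

# Input formats taken from the two examples in the text; the actual
# dataset may cover more variants (assumption).
KNOWN_FORMATS = ("%B %d, %Y", "%m/%d/%y")


def to_base_form(date_string):
    """Convert a date string into the base form mm-dd-yyyy."""
    for fmt in KNOWN_FORMATS:
        try:
            parsed = datetime.strptime(date_string, fmt)
        except ValueError:
            continue  # try the next known format
        return parsed.strftime("%m-%d-%Y")
    raise ValueError(f"unrecognized date format: {date_string!r}")


print(to_base_form("March 03, 1999"))
print(to_base_form("03/03/99"))
```

Both calls print 03-03-1999, the shared base form that the sequence-to-sequence model has to produce character by character.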
4 - Run the system
python3 server.py --dir 0316-fakedates/
Then open http://localhost:8080/client/index.html?in=M a r c h _ 0 3 , 1 9 9 9 in your browser.
You should see the Seq2Seq-Vis client loaded with this example.
Enjoy exploring!
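If you want to try other inputs, the query string can be built programmatically. The sketch below assumes the encoding suggested by the example URL: characters are separated by spaces and original spaces are written as `_`. This is inferred from the URL alone, not from the tool's code, and the helper names `to_char_tokens` and `client_url` are hypothetical.

```python
from urllib.parse import quote

# Base URL from the step above; the port matches the server default.
CLIENT_URL = "http://localhost:8080/client/index.html"


def to_char_tokens(text):
    """Separate characters with spaces and mark original spaces as '_'.

    This encoding is inferred from the example URL (assumption).
    """
    return " ".join("_" if ch == " " else ch for ch in text)


def client_url(text, base=CLIENT_URL):
    """Build a client URL whose `in` parameter carries the tokenized text."""
    return f"{base}?in={quote(to_char_tokens(text))}"


print(to_char_tokens("March 03,1999"))
print(client_url("March 03,1999"))
```

The first line printed is `M a r c h _ 0 3 , 1 9 9 9`, the same token string as in the example URL; the second is the percent-encoded URL.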
Install and run with docker
Thanks to Samuel Gratzl for contributing a docker configuration and image. Here are the steps:
- pull image:
docker pull sgratzl/seq2seq-vis
- download the example data (~177MB) and unzip:
unzip fakedates.zip
- run container with bound data:
docker run --rm -it -v "${PWD}/0316-fakedates:/data" -p "8080:8080" sgratzl/seq2seq-vis
Prepare and run your own models
1 - Prepare your data
You can use any model trained with OpenNMT-py to extract your own data. To gain access to the extraction scripts, follow the instructions above to install the modified OpenNMT-py version.
First, create a folder s2s that will be used to save all the extractions by calling mkdir s2s.
Then, call
python extract_context.py -src $your_input_file \
-tgt $your_target_file \
-model $your_model.pt \
-batch_size $your_batch_size \
-gpu $your_GPU_id # omit -gpu for CPU extraction
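The same call can be assembled from a script, which makes it easy to leave out the GPU flag for CPU extraction. This is a minimal sketch: the helper `build_extract_command` and the file names in the usage lines are hypothetical, and only the flags shown in the command above are used.

```python
import shlex


def build_extract_command(src, tgt, model, batch_size, gpu=None):
    """Return the argument list for extract_context.py.

    `-gpu` is added only when a GPU id is given, so the default is
    CPU extraction.
    """
    args = [
        "python", "extract_context.py",
        "-src", src,
        "-tgt", tgt,
        "-model", model,
        "-batch_size", str(batch_size),
    ]
    if gpu is not None:
        args += ["-gpu", str(gpu)]
    return args


# Hypothetical file names, for illustration only.
cmd = build_extract_command("src-val.txt", "tgt-val.txt", "model.pt", 32)
print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run` from the directory that contains extract_context.py.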
You can customize the maximum sequence lengths by setting max_src_len and max_tgt_len in the script. If you want to restrict the number of examples in your state file, you can uncomment the following lines and set the limit to your desired size:
# if bcounter > 100:
# break
The script creates the file s2s/states.h5. This file is what you need to create the indices for searching.
The script for this step is located in this directory at scripts/h5_to_faiss.py.
Call it three times (once for each type of state) with the parameters
-states s2s/states.h5 # Your states file location
-data [decoder_out, encoder_out, cstar] # One of the three datasets within the states h5 file
-output $your_index_name # We recommend naming the indices decoder.faiss, encoder.faiss, and context.faiss
-stepsize 100 # Number of batches added to the index at once; you can increase it, but it is limited by your memory
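The three calls differ only in the dataset and the output name, so they can be generated in a loop. The sketch below only builds the command lines and does not run them. The pairing of `cstar` with context.faiss and the choice to write the indices into the s2s folder are assumptions based on the recommended names; the helper `build_index_commands` is hypothetical.

```python
import shlex

# Dataset name -> index file name. The pairing of `cstar` with
# context.faiss is an assumption based on the recommended names.
INDEX_NAMES = {
    "decoder_out": "decoder.faiss",
    "encoder_out": "encoder.faiss",
    "cstar": "context.faiss",
}


def build_index_commands(states="s2s/states.h5", out_dir="s2s", stepsize=100):
    """Return one h5_to_faiss.py argument list per state type."""
    commands = []
    for dataset, index_name in INDEX_NAMES.items():
        commands.append([
            "python", "scripts/h5_to_faiss.py",
            "-states", states,
            "-data", dataset,
            "-output", f"{out_dir}/{index_name}",
            "-stepsize", str(stepsize),
        ])
    return commands


for command in build_index_commands():
    print(shlex.join(command))
```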
To generate the dictionary and embedding files, modify this line with the location of your model and call
python VisServer.py
This will also test that your model works with our server as it calls the same API. The script will create three files:
- s2s/embs.h5
- s2s/src.dict
- s2s/tgt.dict
2 - Create an s2s.yaml file to describe the project
# -- minimal config
model: date_acc_100.00_ppl_1.00_e7.pt # model file
dicts:
src: src.dict # source dictionary file
tgt: tgt.dict # target dictionary file
embeddings: embs.h5 # word embeddings for src and tgt
train: train.h5 # training data
# -- OPTIONAL: FAISS indices for Neighborhoods
indexType: faiss # index type should be 'faiss' (or 'annoy')
indices:
decoder: decoder.faiss # index for decoder states
encoder: encoder.faiss # index for encoder states
# -- OPTIONAL: model for linear projection
project_model: linear_projection.pkl # pickl-ed scikit-learn model
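Before starting the server, it can help to confirm that the files named in the config actually exist. The following is a minimal sketch that works on an already parsed config (for example, a dict loaded with a YAML library). It assumes that file names are resolved relative to the directory containing s2s.yaml, which the source does not state explicitly; `lookup` and `check_config` are hypothetical helpers.

```python
import os
import tempfile

# Keys from the "minimal config" part of the example; nested keys use dots.
REQUIRED_KEYS = ("model", "dicts.src", "dicts.tgt", "embeddings", "train")


def lookup(config, dotted_key):
    """Return the value at a dotted key such as 'dicts.src', or None."""
    value = config
    for part in dotted_key.split("."):
        if not isinstance(value, dict) or part not in value:
            return None
        value = value[part]
    return value


def check_config(config, project_dir):
    """Return a list of problems: missing keys and files that do not exist."""
    problems = []
    for key in REQUIRED_KEYS:
        filename = lookup(config, key)
        if filename is None:
            problems.append(f"missing key: {key}")
        elif not os.path.isfile(os.path.join(project_dir, filename)):
            problems.append(f"file not found for {key}: {filename}")
    return problems


# Usage with an empty scratch directory: all keys are present, no files are.
config = {
    "model": "date_acc_100.00_ppl_1.00_e7.pt",
    "dicts": {"src": "src.dict", "tgt": "tgt.dict"},
    "embeddings": "embs.h5",
    "train": "train.h5",
}
with tempfile.TemporaryDirectory() as scratch:
    for problem in check_config(config, scratch):
        print(problem)
```

An empty list from `check_config` means every required entry points to an existing file; the optional index and projection entries are not checked here.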
3 - Command Line Parameters
usage: server.py [-h] [--nodebug NODEBUG] [--port PORT]
[--dir DIR]
optional arguments:
--nodebug TRUE if not in debug mode
--port port to run system (default: 8080)
--dir directory with s2s.yaml file
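For readers who want to wrap or script the server, the documented options can be mirrored with `argparse`. This is a sketch reconstructed from the usage text, not the code in server.py: the integer type for `--port` and the default for `--nodebug` are assumptions.

```python
import argparse


def build_parser():
    """Rebuild the documented options; types and defaults beyond
    `--port 8080` are assumptions, not taken from server.py."""
    parser = argparse.ArgumentParser(prog="server.py")
    parser.add_argument("--nodebug", default=False,
                        help="TRUE if not in debug mode")
    parser.add_argument("--port", type=int, default=8080,
                        help="port to run system (default: 8080)")
    parser.add_argument("--dir", help="directory with s2s.yaml file")
    return parser


args = build_parser().parse_args(["--dir", "0316-fakedates/"])
print(args.port, args.dir)
```

With only `--dir` given, the port falls back to 8080, which matches the URL used in the example above.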
Cite us
@ARTICLE{seq2seqvisv1,
author = {{Strobelt}, H. and {Gehrmann}, S. and {Behrisch}, M. and {Perer}, A. and {Pfister}, H. and {Rush}, A.~M.},
title = "{Seq2Seq-Vis: A Visual Debugging Tool for Sequence-to-Sequence Models}",
journal = {ArXiv e-prints},
archivePrefix = "arXiv",
eprint = {1804.09299v1},
primaryClass = "cs.CL",
keywords = {Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Neural and Evolutionary Computing},
year = 2018,
month = April
}
Contributors
- Hendrik Strobelt (IBM Research & MIT-IBM Watson AI Lab)
- Sebastian Gehrmann (Harvard NLP)
- Alexander M. Rush (Harvard NLP)
- Michael Behrisch (Harvard VCG), Adam Perer (IBM Research), Hanspeter Pfister (Harvard VCG)
- PR #16 signed-off-by: Samuel Gratzl
License
Seq2Seq-Vis is licensed under the Apache 2 license.