Sequence-to-Sequence Tutorial with GitHub Issues Data
Code For Medium Article: "How To Create Data Products That Are Magical Using Sequence-to-Sequence Models"
Installation
pip install -r requirements.txt
If you are using the AWS Deep Learning Ubuntu AMI, many of the required dependencies will already be installed, so you only need to run:
source activate tensorflow_p36
pip install ktext annoy nltk pydot
To run this tutorial with Docker instead, see the Nvidia Docker Container entry under Resources below.
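After the pip commands above finish, it can help to confirm that the extra packages are visible to the active Python environment before opening the notebook. The following is a minimal sketch, not part of the tutorial code: the helper name `missing_packages` is hypothetical, and it assumes each package's import name matches its pip name.

```python
import importlib.util

# Packages named in the pip command above. Assumption: each import name
# matches the pip name, which is not guaranteed for every package.
REQUIRED = ["ktext", "annoy", "nltk", "pydot"]


def missing_packages(names):
    """Return the top-level names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages are importable.")
```

Run it inside the same environment you activated for the tutorial (for example, after `source activate tensorflow_p36` on the AWS AMI); anything it reports as missing can be installed with pip.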
Resources:
- Tutorial Notebook: The Jupyter notebook that accompanies the Medium post.
- seq2seq_utils.py: convenience functions that are used in the tutorial notebook to make predictions.
- ktext: this library is used in the tutorial to clean data. This library can be installed with pip.
- Nvidia Docker Container: contains all libraries that are required to run the tutorial. This container is built with Nvidia-Docker v1.0. You can install Nvidia-Docker and run this container like so:
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update
sudo apt-get install nvidia-docker
sudo nvidia-docker run hamelsmu/seq2seq_tutorial
This should work with both Nvidia-Docker v1.0 and v2.0.