  • Stars: 176
  • Rank: 216,987 (Top 5 %)
  • Language: Python
  • License: Apache License 2.0
  • Created over 6 years ago
  • Updated almost 2 years ago

Repository Details

This is the code for "Chatbot Tutorial" by Siraj Raval on Youtube

Overview

This is the code for this video on Youtube by Siraj Raval on Chatbots for Marketing.

Deep Q&A

Join the chat at https://gitter.im/chatbot-pilots/DeepQA

Presentation

This work tries to reproduce the results of A Neural Conversational Model (aka the Google chatbot). It uses an RNN (seq2seq model) for sentence prediction. It is written in Python with TensorFlow.

The loading corpus part of the program is inspired by the Torch neuralconvo from macournoyer.

For now, DeepQA supports the following dialog corpora:

To speed up training, it's also possible to use pre-trained word embeddings (thanks to Eschnou). More info here.

Installation

The program requires the following dependencies (easy to install using pip: pip3 install -r requirements.txt):

  • python 3.5
  • tensorflow (tested with v1.0)
  • numpy
  • CUDA (for using GPU)
  • nltk (natural language toolkit, used to tokenize the sentences)
  • tqdm (for the nice progress bars)

You might also need to download additional data to make nltk work.

python3 -m nltk.downloader punkt

The Cornell dataset is already included. For the other datasets, look at the readme files in their respective folders (inside data/).

The web interface requires some additional packages:

  • django (tested with 1.10)
  • channels
  • Redis (see here)
  • asgi_redis (at least 1.0)

A Docker installation is also available. More detailed instructions here.

Running

Chatbot

To train the model, simply run main.py. Once trained, you can test the results with main.py --test (results generated in 'save/model/samples_predictions.txt') or main.py --test interactive (more fun).

Here are some flags which could be useful. For more help and options, use python main.py -h:

  • --modelTag <name>: gives a name to the current model, so that different models can be told apart when testing/training.
  • --keepAll: use this flag when training if, when testing, you want to see the predictions at different steps (it can be interesting to see the program change its name and age as the training progresses). Warning: this can quickly take a lot of storage space if you don't increase the --saveEvery option.
  • --filterVocab 20 or --vocabularySize 30000: limit the vocabulary size to optimize performance and memory usage. The first replaces the words used less than 20 times by the <unknown> token; the second sets a maximum vocabulary size.
  • --verbose: when testing, prints the sentences as they are computed.
  • --playDataset: shows some dialogue samples from the dataset (can be used together with --createDataset if this is the only action you want to perform).
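
The effect of --filterVocab and --vocabularySize can be pictured with a minimal sketch. This is not DeepQA's actual implementation: the helper names build_vocab and replace_rare, their parameters, and the order in which the two limits are applied are assumptions made for illustration only.

```python
from collections import Counter

UNKNOWN = "<unknown>"  # placeholder token named in the flag description


def build_vocab(sentences, min_count=20, max_size=30000):
    """Return the set of words to keep (hypothetical helper, not DeepQA's API).

    A word is kept when it appears at least `min_count` times; if more than
    `max_size` words qualify, only the most frequent ones are kept.
    """
    counts = Counter(word for sentence in sentences for word in sentence)
    frequent = [(w, c) for w, c in counts.most_common() if c >= min_count]
    return {w for w, _ in frequent[:max_size]}


def replace_rare(sentence, vocab):
    """Replace every out-of-vocabulary word by the unknown token."""
    return [w if w in vocab else UNKNOWN for w in sentence]


# Toy corpus: "hi" appears 3 times, "there" twice, the rest once.
corpus = [["hi", "there"], ["hi", "you"], ["hi", "there", "friend"]]
vocab = build_vocab(corpus, min_count=2, max_size=10)
print(replace_rare(["hi", "friend"], vocab))  # ['hi', '<unknown>']
```

A smaller vocabulary shrinks both the embedding table and the output layer, which is where the memory saving would come from.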

To visualize the computational graph and the cost with TensorBoard, just run tensorboard --logdir save/.

By default, the network architecture is a standard encoder/decoder with two LSTM layers (hidden size of 256) and a word embedding size of 32. The network is trained using Adam. The maximum sentence length is set to 10 words, but can be increased.
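
To get a feel for how small the default network is, the sketch below counts the LSTM weights for the default sizes. It assumes a standard LSTM cell (four gates, each with input weights, recurrent weights and a bias) and that the encoder and the decoder each own a separate two-layer stack. The embedding table and the output projection are left out because their size depends on the vocabulary. The function names are illustrative and do not come from the DeepQA code.

```python
def lstm_layer_params(input_size, hidden_size):
    """Weights of one standard LSTM layer: 4 gates x (input + recurrent + bias)."""
    return 4 * (input_size + hidden_size + 1) * hidden_size


def stack_params(embedding_size, hidden_size, num_layers):
    """Weights of a stack of LSTM layers fed by word embeddings."""
    total = 0
    input_size = embedding_size
    for _ in range(num_layers):
        total += lstm_layer_params(input_size, hidden_size)
        input_size = hidden_size  # upper layers read the layer below
    return total


# Defaults quoted above: 2 layers, hidden size 256, embedding size 32.
per_stack = stack_params(embedding_size=32, hidden_size=256, num_layers=2)
print(per_stack)      # 821248
print(2 * per_stack)  # encoder + decoder (assumed unshared): 1642496
```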

Web interface

Once the model is trained, it's possible to chat with it using a more user-friendly interface. The server will look for the model copied to save/model-server/model.ckpt. The first time you want to use it, you'll need to configure it with:

export CHATBOT_SECRET_KEY="my-secret-key"
cd chatbot_website/
python manage.py makemigrations
python manage.py migrate

Then, to launch the server locally, use the following commands:

cd chatbot_website/
redis-server &  # Launch Redis in background
python manage.py runserver

After launch, the interface should be available on http://localhost:8000/. If you want to deploy the program on a server, use python manage.py runserver 0.0.0.0 instead. More info here.

Results

Surprisingly, it's possible to get some results after only 1 or 2 hours of training (on a GeForce GT 740M) by drastically reducing the input sequence to 5 words and the output to 3 (plus the go and eos tokens) and by using a small embedding size (something like 25). Since then, I have modified the code so that the output length has to match the input length, but you can still reproduce the original results using this version. Of course, the network won't be really chatty:

Q: Hi
A: Hi.

Q: What is your name ?
A: Laura.

Q: What does that mean ?
A: I dunno.

Q: How old are you ?
A: thirty-five.

Q: Will Google hire me ?
A: No.

Q: Tell me the alphabet
A: Fuck you.

Q: That's not nice
A: Yeah.

Here are some cases where it fails:

Q: Who is Laura ?
A: My brother.

Q: Say 'goodbye'
A: Alright.

Q: What is cooking ?
A: A channel.

Q: Can you say no ?
A: No.

Q: Two plus two
A: Manny...

With longer sentences, the network is much slower to train. After 250000 iterations with maxLength=10, the network still mainly gives evasive answers (answering most of the open-ended questions with 'I don't know'), but it seems to have picked up some basic semantics:

Q: When are the fireworks ?
A: Tomorrow.

Q: What color ?
A: Gray.

Q: Which apple ?
A: That's the first one.

Q: What time is it ?
A: Eleven o'clock.

Q: Who is Skywalker ?
A: She's a freelance bio-exorcist.

Q: Who is John ?
A: The old man.

Q: Who is Jenny ?
A: The man who killed herself.

Q: When were you born ?
A: Soon.

As the last questions show, the network still confuses genders and the present/future. I also tried some deeper philosophical questions, with more or less success.

Q: What is the purpose of being intelligent ?
A: The man are obsolete.

Q: What is immoral ?
A: The truth.

Q: What is morality ?
A: The worst.

Q: What is the purpose of existence ?
A: It 's private.

Q: What happens if machines can think ?
A: I think I'll just lie with you.

The model I trained is quite limited by the embedding size I chose, by its own size, and by the size of the training corpus. Its 'thought vector' is also probably too small to answer questions like the following:

Q: My name is David. What is my name ?
A: Debbie.

Q: My name is John. What is my name ?
A: Carl Smith.

Q: How many legs does a cat have ?
A: Four.

Q: How many legs does a spider have ?
A: Four.

Q: Do you prefer football or soccer ?
A: Whiskey.

It also seems to overfit, as it will sometimes just pop out sentences from its training set that are unrelated to the question. Adding some dropout should reduce the issue.

Q: I play tennis. What do I play ?
A: I'm going to get laid.

Q: What color is the sky ?
A: One of the deputies is everywhere.

Pretrained model

You can find a pre-trained model here, trained on the default corpus. To use it:

  1. Extract the zip file inside DeepQA/save/
  2. Copy the preprocessed dataset from save/model-pretrainedv2/dataset-cornell-old-lenght10-filter0-vocabSize0.pkl to data/samples/.
  3. Run ./main.py --modelTag pretrainedv2 --test interactive.
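
Steps 1 and 2 can also be scripted. The following is only a sketch: the function name install_pretrained is hypothetical, and it assumes the archive contains a top-level model-pretrainedv2/ folder, as the path in step 2 suggests. The name of the downloaded zip file is not given above, so it is left as a parameter.

```python
import shutil
import zipfile
from pathlib import Path

# File name exactly as written in step 2.
DATASET_NAME = "dataset-cornell-old-lenght10-filter0-vocabSize0.pkl"


def install_pretrained(zip_path, repo_root):
    """Extract the archive into save/ and copy the dataset into data/samples/.

    Hypothetical helper; `repo_root` is the DeepQA checkout directory.
    Returns the path of the copied dataset file.
    """
    repo_root = Path(repo_root)
    save_dir = repo_root / "save"
    samples_dir = repo_root / "data" / "samples"
    save_dir.mkdir(parents=True, exist_ok=True)
    samples_dir.mkdir(parents=True, exist_ok=True)

    # Step 1: extract the zip file inside save/
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(save_dir)

    # Step 2: copy the preprocessed dataset to data/samples/
    source = save_dir / "model-pretrainedv2" / DATASET_NAME
    return Path(shutil.copy(source, samples_dir))
```

Step 3 is then run from the repository root as written above.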

Thanks to Nicholas C., here (original) are some additional pre-trained models (compatible with TF 1.2) for diverse datasets. The folder also contains the pre-processed datasets for Cornell, OpenSubtitles, Ubuntu and Scotus (to move inside data/samples/). Those are required if you don't want to process the datasets yourself.

If you have a high-end GPU, don't hesitate to play with the hyper-parameters/corpus to train a better model. From my experiments, it seems that the learning rate and dropout rate have the most impact on the results. Also, if you want to share your models, don't hesitate to contact me and I'll add them here.

Improvements

In addition to trying larger/deeper models, there are a lot of small improvements which could be tested. Don't hesitate to send a pull request if you implement one of those. Here are some ideas:

  • For now, the predictions are deterministic (the network just takes the most likely output), so the network will always give the same answer to a given question. By adding a sampling mechanism, the network could give more diverse (and maybe more interesting) answers. The easiest way to do that is to sample the next predicted word from the SoftMax probability distribution. By combining that with the loop_function argument of tf.nn.seq2seq.rnn_decoder, it shouldn't be too difficult to add. After that, it should be possible to play with the SoftMax temperature to get more conservative or more exotic predictions.
  • Adding attention could potentially improve the predictions, especially for longer sentences. It should be straightforward: replace embedding_rnn_seq2seq by embedding_attention_seq2seq in model.py.
  • Having more data usually doesn't hurt. Training on a bigger corpus should be beneficial. The Reddit comments dataset seems to be the biggest for now (and is too big for this program to support). Another trick to artificially increase the dataset size when creating the corpus could be to split the sentences of each training sample. For example, from the sample Q:Sentence 1. Sentence 2. => A:Sentence X. Sentence Y. we could generate 3 new samples: Q:Sentence 1. Sentence 2. => A:Sentence X., Q:Sentence 2. => A:Sentence X. Sentence Y. and Q:Sentence 2. => A:Sentence X. Warning: other combinations like Q:Sentence 1. => A:Sentence X. won't work because they would break the transition 2 => X which links the question to the answer.
  • The testing curve should really be monitored as done in my other music generation project. This would greatly help to see the impact of dropout on overfitting. For now it's just done empirically by manually checking the testing prediction at different training steps.
  • For now, the questions are independent from each other. To link questions together, a straightforward way would be to feed all previous questions and answers to the encoder before giving the answer. Some caching could be done on the final encoder state to avoid recomputing it each time. To improve the accuracy, the network should be retrained on entire dialogues instead of just individual QA pairs. Also, when feeding the previous dialogue to the encoder, new tokens <Q> and <A> could be added so the encoder knows when the interlocutor changes. I'm not sure, though, that the simple seq2seq model would be sufficient to capture long-term dependencies between sentences.
  • Adding a bucket system to group similar input lengths together could greatly improve training speed.
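
The sampling idea from the first item can be illustrated outside TensorFlow. The sketch below is plain Python, not the loop_function integration itself. The function names and the example scores are made up for illustration.

```python
import math
import random


def sample_with_temperature(logits, temperature=1.0, rng=random):
    """Sample an index from softmax(logits / temperature)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in scaled]
    # random.choices normalizes the weights, so no explicit division is needed
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def greedy(logits):
    """Current deterministic behaviour: always pick the most likely word."""
    return max(range(len(logits)), key=lambda i: logits[i])


logits = [2.0, 1.0, 0.1]  # made-up scores for a 3-word vocabulary
print(greedy(logits))                         # always 0
print(sample_with_temperature(logits, 0.05))  # almost always 0
print(sample_with_temperature(logits, 5.0))   # close to uniform over 0, 1, 2
```

A low temperature sharpens the distribution towards the greedy choice (conservative answers), while a high temperature flattens it (more exotic answers).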

Credits

Credits for this code go to Conchylicultor. I've merely created a wrapper to get people started.
