• This repository has been archived on 27/Feb/2024
  • Language
    Python
  • License
    MIT License

Repository Details

Code for the paper 'Personalization in Goal-oriented Dialog' (NeurIPS 2017 Conversational AI Workshop)

Personalization in Goal-Oriented Dialog

This repository contains code for the paper "Personalization in Goal-Oriented Dialog" by Chaitanya Joshi, Fei Mi and Boi Faltings. We introduce a set of five tasks for testing end-to-end dialog systems in a goal-oriented setting, with a focus on personalization of conversation. We also provide baselines using various models, and publish our implementations and experimental results through this repository.

Check out the accompanying blog post here.

Update: The paper was published at the NIPS 2017 Workshop on Conversational AI!

Dataset

The Personalized Dialog dataset can be downloaded using build_data.py. Alternatively, it is accessible using this link or through the ParlAI framework for dialog AI research.

Each of the tasks can also be generated from the bAbI Dialog tasks using the files in the scripts/ directory.

Usage

Please refer to the README file inside each model directory for exact installation and usage instructions. In general, we used Python 3.6 and TensorFlow 0.12.1 for our experiments. (Apologies for not providing the exact package versions of other libraries!)

Dialog Format

We provide two variants of the dataset: a full set with 6000/12000 dialogs each for training, development and testing, and a small set restricted to 1000 dialogs each for training, development and testing. We also include a second test set (with suffix -OOV.txt) containing dialogs with entities not present in the training and development sets.

The file format for each task is as follows:

1 user_attributes
ID user_utterance [tab] bot_utterances
...

The IDs for a given dialog start at 1 and increase. When the IDs in a file reset to 1, the lines that follow belong to a new dialog. The line with ID 1 contains the attributes associated with the user profile, and subsequent lines contain the turns forming the dialog. When the bot speaks twice in a row, we use the special token "<SILENCE>" to fill in for the missing user utterance.

For example (for Task 1):

1 male elderly
2 hello	greetings sir what may i assist you with today
3 may i have a table in a moderate price range with italian food for eight	excellent sir i will start the request now
4 <SILENCE>	may i know where the restaurant should be located
5 bombay please	excellent sir please give me a moment to provide you with options
6 <SILENCE>	api_call italian bombay eight moderate

The goal of the tasks is to predict the bot utterances, which can be sentences or API calls (sentences starting with the special token "api_call").
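The format described above can be read with a short parser. The following is a minimal sketch, not the loading code used by the models in this repository: the names parse_dialogs and is_api_call are hypothetical, and it assumes that any line without a tab after the profile line can be skipped.

```python
SILENCE = "<SILENCE>"   # placeholder for a missing user utterance
API_CALL = "api_call"   # first token of a bot API call


def parse_dialogs(lines):
    """Group raw lines into dialogs; a new dialog starts whenever the ID is 1.

    Returns a list of dicts with the user profile attributes and the
    (user_utterance, bot_utterance) turns. Hypothetical helper.
    """
    dialogs = []
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # ignore blank separator lines
        turn_id, _, rest = line.partition(" ")
        if int(turn_id) == 1:
            # The line with ID 1 carries the user profile attributes.
            current = {"profile": rest.split(), "turns": []}
            dialogs.append(current)
        elif current is not None and "\t" in rest:
            user, bot = rest.split("\t", 1)
            current["turns"].append((user, bot))
        # Assumption: other lines without a tab are skipped in this sketch.
    return dialogs


def is_api_call(bot_utterance):
    """True if the bot utterance starts with the special api_call token."""
    return bot_utterance.split(" ", 1)[0] == API_CALL


example = [
    "1 male elderly",
    "2 hello\tgreetings sir what may i assist you with today",
    "3 <SILENCE>\tapi_call italian bombay eight moderate",
]
for dialog in parse_dialogs(example):
    for user, bot in dialog["turns"]:
        kind = "API call" if is_api_call(bot) else "sentence"
        print(dialog["profile"], user, "->", kind)
```

Keeping the profile separate from the turns makes it easy to feed the attributes to a model as extra context, or to drop them when comparing against a non-personalized baseline.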

Sample Dialogs

Along with the train, dev and test sets, we also include a knowledge base file (personalized-dialog-kb-all.txt) that contains all entities appearing in dialogs for tasks 1-5. We also include a file containing the candidates to select the answer from (personalized-dialog-candidates.txt) for tasks 1-5, which simply consists of all the bot utterances in train, dev and test for these tasks.
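To illustrate how such a candidate set relates to the dialog files, here is a minimal sketch that collects the unique bot utterances from raw dialog lines. The function name collect_candidates is hypothetical, and the sketch does not reproduce the on-disk layout of personalized-dialog-candidates.txt, which is not described here.

```python
def collect_candidates(lines):
    """Return the unique bot utterances in first-seen order.

    Assumes each dialog turn is 'ID user_utterance<TAB>bot_utterance';
    lines without a tab (such as the profile line) are ignored.
    """
    seen = {}
    for raw in lines:
        line = raw.rstrip("\n")
        if "\t" not in line:
            continue
        _, bot = line.split("\t", 1)
        seen.setdefault(bot, len(seen))  # dicts preserve insertion order
    return list(seen)


lines = [
    "1 male elderly",
    "2 hello\tgreetings sir what may i assist you with today",
    "3 <SILENCE>\tmay i know where the restaurant should be located",
    "1 male elderly",
    "2 hello\tgreetings sir what may i assist you with today",
]
candidates = collect_candidates(lines)
print(len(candidates))
```

A model that ranks responses can then score each candidate against the dialog history and return the highest-scoring one.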

In addition to the small and full datasets, we also provide a split-by-profile dataset where each directory contains 1000 dialogs each for training, development and testing for a specific user profile. This set can be used to analyze multi-task learning capabilities of models.

Models

We provide our implementations of three models: Supervised Embeddings (supervised-embedding/), Memory Networks (MemN2N/) and Memory Networks with a split-memory architecture (MemN2N-split-memory/). Each directory contains scripts, experimental logs and model checkpoints. Instructions for using each model are given in its README.

License

The dataset is released under the Creative Commons Attribution 3.0 Unported license. A copy of this license is included with the data.

References

  • Antoine Bordes, Y-Lan Boureau, Jason Weston, "Learning End-to-End Goal-Oriented Dialog", arXiv:1605.07683 [cs.CL].
  • Chaitanya K. Joshi, Fei Mi, Boi Faltings, "Personalization in Goal-Oriented Dialog", arXiv:1706.07503 [cs.CL].
