  • Stars
    136
  • Rank 267,670 (Top 6 %)
  • Language
    Python
  • License
    MIT License
  • Created over 5 years ago
  • Updated about 5 years ago

Repository Details

Code and Data for ACL 2019 "Semantically Conditioned Dialog Response Generation via Hierarchical Disentangled Self-Attention"

HDSA-Dialog

This is the code and data for the ACL 2019 long paper "Semantically Conditioned Dialog Response Generation via Hierarchical Disentangled Self-Attention". The up-to-date version is at http://arxiv.org/abs/1905.12866.

The full architecture is shown below:

The architecture consists of two components:

  • Dialog act predictor (Fine-tuned BERT model)
  • Response generator (Hierarchical Disentangled Self-Attention Network)

The basic idea of the paper is to enable controlled response generation under the Transformer framework. We construct a dialog act graph to represent the semantic space in MultiWOZ tasks, then assign different heads at different levels to specific nodes in the dialog act graph. For example, the picture above demonstrates the merge of two dialog acts, "hotel->inform->location" and "hotel->inform->name". The generated sentence is controlled to deliver a message about the name and location of a recommended hotel.
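The merge of two dialog acts can be pictured as an element-wise OR over one vector per level of the graph. The following is only a minimal sketch of that idea: the label lists are a tiny hypothetical subset rather than the real MultiWOZ act graph, and `encode_act` and `merge_acts` are illustrative names that do not come from the repository. Under this reading, each switched-on entry would mark a node (and so a head) that is active at that level.

```python
# Illustrative sketch: the label sets below are a tiny hypothetical subset,
# not the full MultiWOZ dialog act graph used in the repository.
DOMAINS = ["hotel", "restaurant", "train"]
ACTS = ["inform", "request", "recommend"]
SLOTS = ["name", "location", "price"]


def encode_act(act):
    """Turn 'domain->act->slot' into one one-hot vector per graph level."""
    domain, function, slot = act.split("->")
    return (
        [int(d == domain) for d in DOMAINS],
        [int(a == function) for a in ACTS],
        [int(s == slot) for s in SLOTS],
    )


def merge_acts(acts):
    """Merge several dialog acts with an element-wise OR at each level."""
    merged = [[0] * len(DOMAINS), [0] * len(ACTS), [0] * len(SLOTS)]
    for act in acts:
        for level, one_hot in enumerate(encode_act(act)):
            merged[level] = [a | b for a, b in zip(merged[level], one_hot)]
    return merged


merged = merge_acts(["hotel->inform->location", "hotel->inform->name"])
print(merged)  # [[1, 0, 0], [1, 0, 0], [1, 1, 0]]
```

Both acts share the "hotel" and "inform" nodes, so only the slot level ends up with two active entries.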

Requirements

Please see the instructions to install the required packages before running experiments.

Folder

  • data: all the needed training/evaluation/testing data
  • transformer: all the baseline and proposed models, which include the hierarchical disentangled self-attention (class TableSemanticDecoder)
  • preprocessing: the code for pre-processing the database and original downloaded data

1. Dialog Act Predictor

This module predicts the next-step dialog acts from the conversation history. Here we adopt the state-of-the-art NLU module BERT to get the best prediction accuracy. Make sure that you install Pytorch-pretrained-BERT beforehand; it will automatically download the pre-trained model into your tmp folder.
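Since several dialog acts can be active in the same turn, one plausible way to read the predictor's output is as a multi-label decision, with one independent score per graph node. The sketch below assumes that setup; the label inventory, the logits, and the 0.5 threshold are all hypothetical and are not taken from train_predictor.py.

```python
import math

# Hypothetical label inventory, for illustration only.
ACT_LABELS = ["hotel", "inform", "request", "name", "location"]


def sigmoid(x):
    """Map a raw score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_acts(logits, threshold=0.5):
    """Keep every label whose independent probability passes the threshold."""
    return [
        label
        for label, z in zip(ACT_LABELS, logits)
        if sigmoid(z) >= threshold
    ]


# Made-up logits standing in for the classifier output on one dialog turn.
predicted = predict_acts([2.1, 1.3, -3.0, 0.4, -0.2])
print(predicted)  # ['hotel', 'inform', 'name']
```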

Download pre-trained models and the delex.json (it is needed for calculating the inform/request success rate)

sh collect_data.sh

Prepare data (optional, already in the github repo)

python preprocess_data_for_predictor.py

Training (if you use multiple GPUs, the batch size can be enlarged)

rm -r checkpoints/predictor/
CUDA_VISIBLE_DEVICES=0 python3.5 train_predictor.py --do_train --do_eval --train_batch_size 6 --eval_batch_size 6

Testing (using the model saved at xxx step)

CUDA_VISIBLE_DEVICES=0 python3.5 train_predictor.py --do_eval --test_set dev --load_dir /tmp/output/save_step_xxx
CUDA_VISIBLE_DEVICES=0 python3.5 train_predictor.py --do_eval --test_set test --load_dir /tmp/output/save_step_xxx

The output values are saved in data/BERT_dev_prediction.json and data/BERT_test_prediction.json. These two files need to be kept for the generator training.

2. Response Generator

This module controls the language generation based on the output of the pre-trained act predictor. The training data is already preprocessed and placed in the data/ folder (train.json, val.json and test.json).

Training

CUDA_VISIBLE_DEVICES=0 python3.5 train_generator.py --option train --model BERT_dim128_w_domain_exp --batch_size 512 --max_seq_length 50 --field

Delexicalized Testing (The entities are normalized into placeholders like [restaurant_name])

CUDA_VISIBLE_DEVICES=0 python3.5 train_generator.py --option test --model BERT_dim128_w_domain_exp --batch_size 512 --max_seq_length 50 --field

Non-Delexicalized Testing (The entities need to be restored from the database record)

CUDA_VISIBLE_DEVICES=0 python3.5 train_generator.py --option postprocess --output_file /tmp/results.txt.pred.BERT_dim128_w_domain_exp.pred --model BERT --non_delex
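The difference between the two testing modes can be illustrated with a small placeholder-filling routine. This is a sketch under stated assumptions, not the repository's postprocess step: `lexicalize`, the placeholder pattern, and the record contents are hypothetical, and the real code restores entities from the MultiWOZ database.

```python
import re

# Assumed placeholder shape: [domain_slot], as in [restaurant_name].
PLACEHOLDER = re.compile(r"\[([a-z]+_[a-z]+)\]")


def lexicalize(template, record):
    """Replace each [domain_slot] placeholder with the value from a record.

    Placeholders with no matching key are left untouched.
    """
    def fill(match):
        key = match.group(1)
        return str(record.get(key, match.group(0)))

    return PLACEHOLDER.sub(fill, template)


# Hypothetical database record and delexicalized model output.
record = {"restaurant_name": "Golden Wok", "restaurant_area": "north"}
delexicalized = "[restaurant_name] is in the [restaurant_area] . phone : [restaurant_phone]"
restored = lexicalize(delexicalized, record)
print(restored)  # Golden Wok is in the north . phone : [restaurant_phone]
```

Leaving unknown placeholders in place makes missing database fields visible in the output instead of silently dropping them.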

3. Reproducibility

  • We release the pre-trained predictor model in checkpoints/predictor. Put the zip file into checkpoints/predictor and unzip it to get the save_step_15120 folder.
CUDA_VISIBLE_DEVICES=0 python3.5 train_predictor.py --do_eval --test_set test --load_dir /tmp/output/save_step_15120
  • The pre-trained generator model is already under checkpoints/generator. You can use this model to obtain 23.6 BLEU on the delexicalized test set.
CUDA_VISIBLE_DEVICES=0 python3.5 train_generator.py --option test --model BERT_dim128_w_domain --batch_size 512 --max_seq_length 50 --field
CUDA_VISIBLE_DEVICES=0 python3.5 train_generator.py --option postprocess --output_file /tmp/results.txt.pred.BERT_dim128_w_domain.pred --model BERT --non_delex

Acknowledgements

We sincerely thank the University of Cambridge and PolyAI for releasing the dataset and code.
