PAML: Personalizing Dialogue Agents via Meta-Learning

License: MIT

This is the PyTorch implementation of the paper Personalizing Dialogue Agents via Meta-Learning (Zhaojiang Lin*, Andrea Madotto*, Chien-Sheng Wu, Pascale Fung; ACL 2019) [PDF].

Zhaojiang Lin and Andrea Madotto contributed equally to this work.

This code has been written using PyTorch >= 0.4.1. If you use any source code or datasets included in this toolkit in your work, please cite the following paper. The BibTeX is listed below:

@article{lin2019personalizing,
  title={Personalizing Dialogue Agents via Meta-Learning},
  author={Lin, Zhaojiang and Madotto, Andrea and Wu, Chien-Sheng and Fung, Pascale},
  journal={arXiv preprint arXiv:1905.10033},
  year={2019}
}

Abstract

Existing personalized dialogue models use human-designed persona descriptions to improve dialogue consistency. Collecting such descriptions from existing dialogues is expensive and requires hand-crafted feature designs. In this paper, we propose to extend Model-Agnostic Meta-Learning (MAML) to personalized dialogue learning without using any persona descriptions. Our model learns to quickly adapt to new personas by leveraging only a few dialogue samples collected from the same user, which is fundamentally different from conditioning the response on persona descriptions. Empirical results on the Persona-chat dataset indicate that our solution outperforms non-meta-learning baselines in terms of both automatic evaluation metrics and human-evaluated fluency and consistency.
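
As a minimal sketch of the idea (not this repo's actual training loop), a first-order MAML meta-update over persona tasks can be written in PyTorch as follows. The model.loss(batch) interface and the (support_batch, query_batch) task pairs are illustrative assumptions; the inner SGD and outer optimizer loosely mirror the --use_sgd --lr and --meta_lr --meta_optimizer flags used in the training commands below.

import copy
import torch

def maml_step(model, persona_tasks, meta_optimizer, inner_lr=0.01):
    # One first-order MAML meta-update over a batch of persona tasks.
    # persona_tasks yields (support_batch, query_batch) pairs, one pair per
    # persona; model.loss(batch) is assumed to return a scalar loss tensor.
    meta_optimizer.zero_grad()
    for support_batch, query_batch in persona_tasks:
        # Inner loop: adapt a throwaway copy of the model to one persona
        # with plain SGD.
        learner = copy.deepcopy(model)
        inner_opt = torch.optim.SGD(learner.parameters(), lr=inner_lr)
        inner_opt.zero_grad()
        learner.loss(support_batch).backward()
        inner_opt.step()
        # Outer step: evaluate the adapted copy on held-out dialogues of
        # the same persona and accumulate its gradients into the original
        # parameters (first-order approximation; second-order terms dropped).
        query_loss = learner.loss(query_batch)
        grads = torch.autograd.grad(query_loss, tuple(learner.parameters()))
        for param, grad in zip(model.parameters(), grads):
            param.grad = grad if param.grad is None else param.grad + grad
    meta_optimizer.step()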

Persona-agnostic meta-learning

The difference between fine-tuning from (a) joint training on all personas and (b) meta-learning across personas. The solid line represents the optimization path of the initial parameters and the dashed line the fine-tuning path. Meta-learned initial parameters adapt faster to a new persona.
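
In the standard MAML formulation this figure illustrates (notation ours, following the original MAML paper), the initial parameters theta are updated through each persona's post-adaptation loss:

\theta \leftarrow \theta - \beta \, \nabla_\theta \sum_{p \sim \mathcal{P}} \mathcal{L}_p\big(\theta - \alpha \, \nabla_\theta \mathcal{L}_p(\theta)\big)

where \mathcal{L}_p is the dialogue loss on persona p's data, \alpha is the inner (fine-tuning) learning rate, and \beta is the meta learning rate.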

Consistency improvement

Fine-tuning iterations versus consistency: the consistency of PAML grows linearly with the number of iterations.

K-shot (10 iterations) results for different settings: the consistency of PAML grows linearly with the number of fine-tuning dialogues.

Dependencies

Check the required packages, or simply install them with:

❱❱❱ pip install -r requirements.txt

Pre-trained GloVe embeddings: put glove.6B.300d.txt inside the folder /vectors/.

Trained NLI model described in Section 3.1 of the paper: put pytorch_model.bin inside the folder /data/nli_model/.
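
For reference, here is a hedged sketch of how GloVe vectors of this kind are typically loaded into an embedding matrix. The load_glove helper and the vocab dict (token to index) are illustrative assumptions, not this repo's actual loader; only the file path mirrors the instructions above.

import numpy as np

def load_glove(path="vectors/glove.6B.300d.txt", vocab=None, dim=300):
    # Build a (len(vocab), dim) matrix; words missing from GloVe keep a
    # small random initialization.
    emb = np.random.uniform(-0.1, 0.1, (len(vocab), dim)).astype("float32")
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            word, vec = parts[0], parts[1:]
            if word in vocab and len(vec) == dim:
                emb[vocab[word]] = np.asarray(vec, dtype="float32")
    return emb

# Example usage: vocab maps token -> index, e.g. {"hello": 0, "world": 1}.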

Experiments

Training

Training PAML

❱❱❱ python MAML.py --cuda --model trs --batch_size 16 --use_sgd --lr 0.01 --meta_lr 0.0003 --meta_batch_size 16 --meta_optimizer adam --pretrain_emb --weight_sharing --emb_dim 300 --hidden_dim 300 --fix_dialnum_train --pointer_gen --save_path save/paml/

Training the baseline without persona input

❱❱❱ python main.py --cuda --model trs --pretrain_emb --weight_sharing --label_smoothing --noam --emb_dim 300 --hidden_dim 300 --pointer_gen --save_path save/no_persona/ 

Training the baseline with persona input

❱❱❱ python main.py --cuda --model trs --pretrain_emb --weight_sharing --label_smoothing --noam --emb_dim 300 --hidden_dim 300 --pointer_gen --persona --save_path save/persona/

After training, take the model with the lowest perplexity (PPL) on the validation set to fine-tune and test (replace ${model} in the commands below); a sketch of this selection step follows.
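
A hypothetical helper for that selection step, assuming checkpoint filenames embed their validation PPL; this naming scheme is an assumption, not necessarily what this repo writes to disk.

import os
import re

def best_checkpoint(save_path="save/paml/"):
    # Pick the checkpoint whose filename contains the smallest float,
    # interpreted here as its validation PPL (an assumption).
    best_name, best_ppl = None, float("inf")
    for name in os.listdir(save_path):
        match = re.search(r"\d+\.\d+", name)
        if match and float(match.group()) < best_ppl:
            best_name, best_ppl = name, float(match.group())
    return best_name  # substitute this for ${model} in the commands below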

Automatic metrics

Fine-tuning and testing PAML

❱❱❱ python main_fine_tune.py --cuda --model trs --batch_size 16 --use_sgd --lr 0.01 --meta_lr 0.0003 --meta_batch_size 16 --meta_optimizer adam --pretrain_emb --weight_sharing --emb_dim 300 --hidden_dim 300 --pointer_gen --save_path save/paml/${model} --save_path_dataset save/paml/ --test

Fine-tuning and testing the baseline without persona input

❱❱❱ python main_fine_tune.py --cuda --model trs --batch_size 16 --use_sgd --lr 0.01 --meta_lr 0.0003 --meta_batch_size 16 --meta_optimizer adam --pretrain_emb --weight_sharing --emb_dim 300 --hidden_dim 300 --pointer_gen --save_path save/no_persona/${model} --save_path_dataset save/no_persona/ --test

Fine-tuning and testing the baseline with persona input

❱❱❱ python main_fine_tune.py --cuda --model trs --batch_size 16 --use_sgd --lr 0.01 --meta_lr 0.0003 --meta_batch_size 16 --meta_optimizer adam --pretrain_emb --weight_sharing --emb_dim 300 --hidden_dim 300 --pointer_gen --persona --save_path save/persona/${model} --save_path_dataset save/persona/ --test

Generation samples

To check the responses generated by PAML:

❱❱❱ python generate_samples.py --cuda --model trs --batch_size 1 --use_sgd --lr 0.01 --meta_lr 0.0003 --meta_batch_size 16 --meta_optimizer adam --pretrain_emb --weight_sharing --emb_dim 300 --hidden_dim 300 --pointer_gen --save_path save/paml/${model} --save_path_dataset save/paml/ --test

To skip training and fine-tuning, check the generated samples here: paml_generation

More Repositories

1. Mem2Seq (Python, 353 stars): Mem2Seq: Effectively Incorporating Knowledge Bases into End-to-End Task-Oriented Dialog Systems
2. chatgpt-evaluation (Python, 76 stars): Code for extracting the test samples used in the paper "A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity"
3. cantonese-asr (Python, 72 stars)
4. MoEL (Python, 72 stars): MoEL: Mixture of Empathetic Listeners
5. Xpersona (Python, 67 stars): XPersona: Evaluating Multilingual Personalized Chatbot
6. VG-GPLMs (Python, 54 stars): Code for the EMNLP 2021 paper "Vision Guided Generative Pre-trained Language Models for Multimodal Abstractive Summarization"
7. ke-dialogue (Python, 45 stars): KE-Dialogue: Injecting knowledge graph into a fully end-to-end dialogue system
8. dialogue-emotion (Python, 42 stars): Hierarchical Attention for Dialogue Emotion Classification (SemEval, NAACL)
9. CI-AVSR (Python, 37 stars): Code repository for the Cantonese In-car Audio-Visual Speech Recognition (CI-AVSR) dataset
10. adapterbot (Python, 36 stars): The Adapter-Bot: All-In-One Controllable Conversational Model
11. MulQG (Python, 28 stars): Multi-hop Question Generation with Graph Convolutional Network
12. CAiRE-COVID (Python, 26 stars): A machine learning-based system that uses state-of-the-art natural language processing (NLP) question answering (QA) techniques combined with summarization for mining the available scientific literature
13. sensational_headline (Python, 25 stars): Repo for the sensational headline generation paper published at EMNLP 2019
14. BiToD (Python, 23 stars): A Bilingual Multi-Domain Dataset For Task-Oriented Dialogue Modeling
15. ASCEND (Jupyter Notebook, 21 stars): ASCEND Chinese-English code-switching dataset
16. sentiment-lookahead (Python, 18 stars): Original code for our work on Sentiment Look-ahead
17. KnowExpert (Python, 17 stars): Implementation of the paper "Retrieval-Free Knowledge-Grounded Dialogue Response Generation with Adapters"
18. eigenvector-analysis (Jupyter Notebook, 16 stars): Code for "Interpreting Word Embeddings with Eigenvector Analysis" (https://openreview.net/forum?id=rJfJiR5ooX)
19. snp2vec (Jupyter Notebook, 15 stars): Code and dataset for the SNP2Vec paper
20. UniVaR (Python, 15 stars): Official repository for the paper "High-Dimension Human Value Representation in Large Language Models"
21. hyperpartisan-news-detection (Python, 14 stars): SemEval 2019 Task 4: Hyperpartisan News Detection
22. elderly_ser (Jupyter Notebook, 13 stars): Transferability of cross-lingual and cross-age speech emotion recognition
23. Perplexity-FactChecking (Python, 13 stars): Towards Few-Shot Fact-Checking via Perplexity
24. cqr4cqa (Python, 13 stars)
25. CAiRE_in_DialDoc21 (Python, 11 stars): CAiRE in DialDoc21: Data Augmentation for Information-Seeking Dialogue System
26. QFS (Python, 9 stars)
27. covid19-misinfo-data (9 stars)
28. HLTC-MRQA (Python, 7 stars): Generalizing Question Answering System with Pre-trained Language Model Fine-tuning
29. framing-bias-metric (Python, 7 stars)
30. InstructAlign (Python, 5 stars)
31. UnifiedM2 (Jupyter Notebook, 4 stars)
32. long-biomedical-model (Python, 4 stars): How Long Is Enough? Exploring the Optimal Intervals of Long-Range Clinical Note Language Modeling
33. chatbot-political-prudence-test (2 stars): Chatbot Political Prudence Test Sets
34. contrastive_inference_dialogue (Python, 2 stars): Contrastive Learning for Inference in Dialogue