  • Stars: 100
  • Rank: 328,782 (Top 7%)
  • Language: Python
  • License: Apache License 2.0
  • Created almost 4 years ago
  • Updated 11 months ago

Repository Details

DSTC9 Track 1 - Beyond Domain APIs: Task-oriented Conversational Modeling with Unstructured Knowledge Access

This repository contains the data, scripts and baseline codes for DSTC9 Track 1.

This challenge track aims to support frictionless task-oriented conversations, where the dialogue flow does not break when users have requests that are out of the scope of APIs/DB but potentially are already available in external knowledge sources. Track participants will develop dialogue systems to understand relevant domain knowledge, and generate system responses with the relevant selected knowledge.

Organizers: Seokhwan Kim, Mihail Eric, Behnam Hedayatnia, Karthik Gopalakrishnan, Yang Liu, Chao-Wei Huang, Dilek Hakkani-Tur

News

  • March 30, 2022 - The post-challenge leaderboard is online here.
  • January 26, 2021 - The human evaluation scores for each finalist entry are released at results/.
  • November 4, 2020 - The system outputs submitted by the participants are released at results/.
  • October 19, 2020 - The human evaluation results are now available: See Results.
  • October 12, 2020 - The objective evaluation results are now available: See Results.
  • October 12, 2020 - The ground-truth labels/responses for the evaluation data are released at data_eval/test/labels.json.
  • September 21, 2020 - The evaluation data is released. Please find the details in data_eval/.
  • August 18, 2020 - Patched data released with labeling error fixes. Please update your local branch.

Important Links

If you want to publish experimental results with this dataset or use the baseline models, please cite the following article:

@article{kim2020domain,
  title={Beyond Domain APIs: Task-oriented Conversational Modeling with Unstructured Knowledge Access},
  author={Seokhwan Kim and Mihail Eric and Karthik Gopalakrishnan and Behnam Hedayatnia and Yang Liu and Dilek Hakkani-Tur},
  journal={arXiv preprint arXiv:2006.03533},
  year={2020}
}

NOTE: This paper reports the results with an earlier version of the dataset and the baseline models, which will differ from the baseline performances on the official challenge resources.

Tasks

This challenge track distinguishes between turns that can be handled by existing task-oriented conversational models with no extra knowledge and turns that require external knowledge resources to be answered by the dialogue system. The evaluation in this track targets the turns that require knowledge access, through the following three tasks:

Task #1: Knowledge-seeking Turn Detection
  • Goal: To decide whether to continue the existing scenario or trigger the knowledge access branch for a given utterance and dialogue history
  • Input: Current user utterance, Dialogue context, Knowledge snippets
  • Output: Binary class (requires knowledge access or not)

Task #2: Knowledge Selection
  • Goal: To select proper knowledge sources from the domain knowledge-base given a dialogue state at each turn with knowledge access
  • Input: Current user utterance, Dialogue context, Knowledge snippets
  • Output: Ranked list of top-k knowledge candidates

Task #3: Knowledge-grounded Response Generation
  • Goal: To take a triple of input utterance, dialog context, and the selected knowledge snippets and generate a system response
  • Input: Current user utterance, Dialogue context, and Selected knowledge snippets
  • Output: Generated system response
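
The three tasks chain naturally into a pipeline. The sketch below is only an illustration of that control flow, not the baseline system from this repository: `respond` and the `toy_*` components are hypothetical names, and the keyword-overlap logic stands in for real models purely so the example runs on its own.

```python
import re
from typing import Callable, List, Optional


def respond(
    history: List[str],
    snippets: List[str],
    detect: Callable[[List[str], List[str]], bool],
    rank: Callable[[List[str], List[str]], List[str]],
    generate: Callable[[List[str], List[str]], str],
    top_k: int = 5,
) -> Optional[dict]:
    """Run the three tasks in order for the latest user turn.

    Returns None when the turn needs no knowledge access, meaning the
    existing task-oriented system should handle it instead.
    """
    if not detect(history, snippets):          # Task 1: turn detection
        return None
    ranked = rank(history, snippets)[:top_k]   # Task 2: knowledge selection
    response = generate(history, ranked[:1])   # Task 3: grounded response
    return {"knowledge": ranked, "response": response}


def _words(text: str) -> set:
    """Lowercased word set, ignoring punctuation."""
    return set(re.findall(r"\w+", text.lower()))


# Toy stand-ins (assumptions for illustration only, not real models).
def toy_detect(history: List[str], snippets: List[str]) -> bool:
    return history[-1].strip().endswith("?")


def toy_rank(history: List[str], snippets: List[str]) -> List[str]:
    query = _words(history[-1])
    return sorted(snippets, key=lambda s: -len(query & _words(s)))


def toy_generate(history: List[str], selected: List[str]) -> str:
    return selected[0] if selected else ""


SNIPPETS = [
    "Breakfast is served from 7 am.",
    "Pets are not allowed at the hotel.",
]
result = respond(
    ["I need a hotel in the centre.", "Does the hotel allow pets?"],
    SNIPPETS, toy_detect, toy_rank, toy_generate,
)
```

Passing the components in as callables keeps each task replaceable on its own, which matches the recommendation to fall back on the baseline for any task a participant does not work on.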

Participants will develop systems to generate the outputs for each task. They can leverage the annotations and the ground-truth responses available in the training and validation datasets.

In the test phase, participants will be given a set of unlabeled test instances and will submit up to 5 system outputs for all three tasks.

NOTE: For those who are interested in only one or two of the tasks, we recommend using our baseline system for the remaining tasks to complete the system outputs.

Evaluation

Each submission will first be evaluated using the following task-specific automated metrics:

  • Knowledge-seeking Turn Detection: Precision/Recall/F-measure
  • Knowledge Selection: Recall@1, Recall@5, MRR@5
  • Knowledge-grounded Response Generation: BLEU, ROUGE, METEOR
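
For reference, the detection and selection metrics can be computed as in the minimal sketch below. It uses the standard definitions and assumes a single reference knowledge snippet per turn; the function names are illustrative, and the official implementation in scores.py remains authoritative. BLEU, ROUGE and METEOR are left out because they need external libraries.

```python
from typing import Sequence, Tuple


def precision_recall_f1(
    ref: Sequence[bool], hyp: Sequence[bool]
) -> Tuple[float, float, float]:
    """Binary precision/recall/F-measure for knowledge-seeking turn detection."""
    tp = sum(1 for r, h in zip(ref, hyp) if r and h)
    fp = sum(1 for r, h in zip(ref, hyp) if not r and h)
    fn = sum(1 for r, h in zip(ref, hyp) if r and not h)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def recall_at_k(ref_id: str, ranked: Sequence[str], k: int) -> float:
    """1.0 if the reference snippet appears in the top-k candidates, else 0.0."""
    return 1.0 if ref_id in ranked[:k] else 0.0


def mrr_at_k(ref_id: str, ranked: Sequence[str], k: int) -> float:
    """Reciprocal rank of the reference within the top-k, 0.0 if it is absent."""
    for position, candidate in enumerate(ranked[:k], start=1):
        if candidate == ref_id:
            return 1.0 / position
    return 0.0
```

With the ranked list `["a", "b", "c"]` and reference `"b"`, Recall@1 is 0.0, Recall@5 is 1.0 and MRR@5 is 0.5. Corpus-level scores are the averages of these per-turn values.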

To account for the dependencies between the tasks, the scores for knowledge selection and knowledge-grounded response generation are weighted by the knowledge-seeking turn detection performance. See scores.py for more details.
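
One plausible reading of this weighting is sketched below; it is an assumption for illustration, not a transcription of scores.py. A turn's selection or generation score counts only when the system correctly triggers knowledge access, and both missed and spurious triggers contribute zero, which yields precision-like and recall-like aggregates. The name `detection_weighted` and the tuple layout are hypothetical.

```python
from typing import Iterable, Tuple


def detection_weighted(
    turns: Iterable[Tuple[bool, bool, float]]
) -> Tuple[float, float, float]:
    """Aggregate per-turn scores while penalising detection errors.

    Each turn is (ref_needs_knowledge, hyp_needs_knowledge, score), where
    score is a per-turn metric such as Recall@1 or BLEU. The score is only
    used when both flags are True.
    """
    tp = fp = fn = 0
    total = 0.0
    for ref, hyp, score in turns:
        if ref and hyp:
            tp += 1
            total += score
        elif hyp:
            fp += 1   # spurious trigger: adds to the denominator, scores 0
        elif ref:
            fn += 1   # missed trigger: adds to the denominator, scores 0
    precision = total / (tp + fp) if tp + fp else 0.0
    recall = total / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


example = detection_weighted([
    (True, True, 1.0),    # detected and answered correctly
    (True, False, 0.0),   # knowledge-seeking turn that was missed
    (False, False, 0.0),  # ordinary turn, correctly ignored
])
```

Under this reading, a system with perfect selection still loses score on the recall side for every knowledge-seeking turn it fails to detect.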

The final ranking will be based on human evaluation results, which will be collected only for systems selected according to their automated evaluation scores. The human evaluation will address the following aspects: grammatical/semantic correctness, naturalness, appropriateness, informativeness and relevance to given knowledge.

Data

In this challenge track, participants will use an augmented version of MultiWOZ 2.1 which includes newly introduced knowledge-seeking turns. All the ground-truth annotations for the Knowledge-seeking Turn Detection and Knowledge Selection tasks, as well as the agent's responses for the Knowledge-grounded Response Generation task, are available in the training and validation sets for developing the components. In addition, relevant knowledge snippets for each domain or entity are provided in knowledge.json.

In the test phase, participants will be evaluated on the results generated by their models for two data sets: one is the unlabeled test set of the augmented MultiWOZ 2.1, and the other is a new set of unseen conversations, collected from scratch, which also includes turns that require knowledge access. To evaluate the generalizability and the portability of each model, the unseen test set will be collected on different domains, entities and locales than MultiWOZ.

Data and system output format details can be found in data/README.md.

Timeline

  • Training data released: Jun 15, 2020
  • Test data released: Sep 21, 2020
  • Entry submission deadline: Sep 28, 2020
  • Objective evaluation completed: Oct 12, 2020
  • Human evaluation completed: Oct 19, 2020

Rules

  • Participation is welcome from any team (academic, corporate, non-profit, government).
  • The identity of participants will NOT be published or made public. In written results, teams will be identified by team IDs (e.g. team1, team2, etc). The organizers will verbally indicate the identities of all teams at the workshop chosen for communicating results.
  • Participants may identify their own team label (e.g. team5) in publications or presentations if they desire, but may not reveal the identities of other teams.
  • Participants are allowed to use any external datasets, resources or pre-trained models.

Contact

Join the DSTC mailing list to get the latest updates about DSTC9.

For specific enquiries about DSTC9 Track 1, please feel free to contact: seokhwk (at) amazon (dot) com
