

Repository Details

Attention-based joint model for intent detection and slot filling

Joint model for intent detection and slot filling based on attention, input alignment and knowledge.

It can detect whether an input sentence is noise or a meaningful input by combining features from domain detection, intent detection and slot filling.

It can assign a probability to an input sentence by using a language model.

Introduction:

1. Intent detection and slot filling are handled by a joint model that shares encoding information.

2. Knowledge information is incorporated through an embedding for both intent detection and slot filling. This embedding shares the same embedding space as the slots output.

3. A bi-directional RNN and a CNN are used for intent detection.

4. The intermediate output of slot filling is used as a feature for intent detection to boost performance.

5. Domain detection is available using a CNN with the same structure as intent detection. A domain is a high-level concept that indicates the area that one or more intents belong to.

6. A similarity module is used to find the most similar training example for any user input.

7. Toy task: the input is a sequence of natural numbers, such as [5,7,2,6,8,3]. For slot filling, label each number as 0 or 1: if the sum of a number together with its previous and next number is greater than a threshold (such as 14), mark it as 1, otherwise 0. In this case, the slot filling output is [0,0,1,1,1,0]. For intent detection, count how many numbers are marked as 1. In this case, the intent output is 3.
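
The toy task labels can be sketched in a few lines of plain Python. This is only an illustration, not code from the repository: the function name label_toy_sequence is hypothetical, and it assumes that a missing neighbour at either end of the sequence counts as 0, which matches the example output above.

```python
def label_toy_sequence(numbers, threshold=14):
    """Return (slot_labels, intent) for the toy task.

    Assumption: a missing previous/next neighbour at the edges counts as 0.
    """
    slots = []
    for i, value in enumerate(numbers):
        prev_value = numbers[i - 1] if i > 0 else 0
        next_value = numbers[i + 1] if i + 1 < len(numbers) else 0
        window_sum = prev_value + value + next_value
        # Slot label is 1 only when the sum is strictly greater than the threshold.
        slots.append(1 if window_sum > threshold else 0)
    # Intent is the number of positions labelled 1.
    intent = sum(slots)
    return slots, intent


slots, intent = label_toy_sequence([5, 7, 2, 6, 8, 3])
print(slots, intent)  # [0, 0, 1, 1, 1, 0] 3
```

Note that 7 is labelled 0 because 5+7+2 equals 14, which is not greater than the threshold.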

Performance:

dataset1:

| slot_naive (V6) | slot_alime (V7) |
|-----------------|-----------------|
| 97.9%           | 99.8%           |

dataset2:

| intent_tmall | intent_tmall (similarity) | intent_alime | intent_alime (similarity) | TextCNN | TextCNN (similarity) |
|--------------|---------------------------|--------------|---------------------------|---------|----------------------|
| 95.37%       | 72.0%                     | 93.0%        | 62.9%                     | 95.70%  | 73.5%                |

Usage:

1. Train the model: train() of xxx_train.py

2. Test the model: predict() of xxx_predict.py

3. For the model structure, check xxx_model.py

Description for different versions:

V0 (seq2seq version): uses TextCNN for intent and an encoder-decoder (seq2seq) model for slots. train() and predict() for the toy task are available in a1_joint_intent_slots_model.py


V1 (naive version):

A bi-directional GRU encodes the input. The encoding is shared between intent detection and slot filling.

The intent is predicted directly by a fully connected layer applied to the sum of the hidden states over all time steps.

The slots are predicted directly by a fully connected layer applied at each time step.
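
The two prediction heads of the naive version can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions, not the repository's code: hidden_states stands in for the bi-directional GRU output, the weight matrices are made-up numbers, biases are omitted, and the names naive_joint_heads, matvec and argmax are hypothetical.

```python
def matvec(matrix, vector):
    """Multiply a (rows x cols) matrix by a vector of length cols."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


def naive_joint_heads(hidden_states, intent_weights, slot_weights):
    """hidden_states: one encoder vector per time step (stand-in for bi-GRU output)."""
    dim = len(hidden_states[0])
    # Intent: sum the hidden states over time, then apply one fully connected layer.
    summed = [sum(h[d] for h in hidden_states) for d in range(dim)]
    intent = argmax(matvec(intent_weights, summed))
    # Slots: a fully connected layer applied independently at every time step.
    slots = [argmax(matvec(slot_weights, h)) for h in hidden_states]
    return intent, slots


# Made-up encoder outputs and weights, for illustration only.
hidden_states = [[1.0, 0.0], [0.0, 2.0], [0.75, 0.25]]
intent_weights = [[1.0, 0.0], [0.0, 1.0]]   # 2 intents
slot_weights = [[1.0, -1.0], [-1.0, 1.0]]   # 2 slot labels
intent, slots = naive_joint_heads(hidden_states, intent_weights, slot_weights)
print(intent, slots)
```

Both heads read the same hidden states, which is what lets the two tasks share one encoder.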


V2 (simple version): adds knowledge to the naive version. The knowledge is represented as an embedding and used as an additional feature when predicting both intent and slots.


V3 (p-BOW, TextCNN, similarity module):

A positional bag of words encodes the input sentence. The encoding is shared between intent detection and slot filling.

TextCNN is used for intent detection. Knowledge is embedded, transformed and used as a feature together with the output of TextCNN to make a prediction.

The similarity module finds the most similar question for an input sentence. It uses the representation learned by the positional bag of words. This module is useful when you want to check for similar questions or when you want to know the coverage of your dataset; you can also get a prediction by simply using the intent (also called the answer) of the question most similar to the input sentence.
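
The lookup done by the similarity module can be sketched as a nearest-neighbour search. This is an illustration only: the text above does not say which similarity measure is used, so cosine similarity is an assumption here, the vectors are made-up stand-ins for the learned positional bag-of-words representations, and the names cosine and most_similar are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar(query_vector, training_examples):
    """training_examples: list of (question, vector, intent) tuples."""
    return max(training_examples, key=lambda ex: cosine(query_vector, ex[1]))


# Made-up sentence representations, for illustration only.
training_examples = [
    ("play some music", [0.9, 0.1, 0.0], "play_music"),
    ("what is the weather", [0.0, 0.2, 0.9], "ask_weather"),
]
question, _, predicted_intent = most_similar([0.8, 0.3, 0.1], training_examples)
print(question, predicted_intent)
```

Returning the stored intent of the nearest training question is the "prediction by similarity" described above.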


V4 (Ali me style TextCNN): the word embedding is concatenated with the knowledge embedding to get a better representation for each word, in the hope of capturing additional information that is relevant to the prediction.

The other parts are the same as V3.


V5 (TextCNN):

A plain TextCNN without any knowledge, used only as a comparison with V4.


V6 (+domain version)

The domain is predicted in addition to intent detection and slot filling.


V7 (+context window for slot filling)

Mainly changes the slot filling part: 1. word vector + symbol vector 2. context window 3. non-linear projection 4. bi-directional LSTM

For intent and domain detection, the representation is the concatenation of the word vector and the symbol vector.
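
The context window step can be sketched as follows. This is a hedged illustration rather than the repository's implementation: it assumes each token vector is concatenated with its neighbours inside the window and that positions outside the sentence are zero-padded. The function name context_windows and the window size are hypothetical.

```python
def context_windows(token_vectors, window_size=1):
    """Concatenate each token vector with its neighbours within window_size.

    Assumption: positions outside the sentence are filled with zero vectors.
    """
    dim = len(token_vectors[0])
    pad = [0.0] * dim
    windows = []
    for i in range(len(token_vectors)):
        window = []
        for j in range(i - window_size, i + window_size + 1):
            inside = 0 <= j < len(token_vectors)
            window.extend(token_vectors[j] if inside else pad)
        windows.append(window)
    return windows


# Made-up token vectors (for example word vector + symbol vector), for illustration only.
vectors = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
windows = context_windows(vectors, window_size=1)
print(windows[1])  # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
```

Each window would then go through the non-linear projection and the bi-directional LSTM listed above.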


V8 (intent conditioned on domain; slot filling conditioned on intent)

Given a domain, the intent is limited to a subset of all intents; given an intent, the slot name is limited to a subset of all slot names. The model does this by providing the hidden states of the domain together with the other features before doing intent detection; slot filling works similarly, conditioned on the intent.
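
One way to picture this conditioning is to concatenate the domain hidden state with the sentence features before the intent layer. The sketch below is an assumption about how such conditioning could look, not the repository's code: the weights are made-up numbers, biases are omitted, and the names linear and intent_logits_given_domain are hypothetical.

```python
def linear(weights, features):
    """Apply a bias-free fully connected layer (rows of weights = output units)."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def intent_logits_given_domain(sentence_features, domain_hidden, weights):
    # Concatenate so the intent layer can see which domain was inferred.
    return linear(weights, sentence_features + domain_hidden)


# Made-up numbers: the same sentence features under two different domain states.
sentence_features = [0.5, 0.5]
weights = [
    [1.0, 1.0, 2.0, -2.0],   # intent 0 is favoured by domain A
    [1.0, 1.0, -2.0, 2.0],   # intent 1 is favoured by domain B
]
logits_domain_a = intent_logits_given_domain(sentence_features, [1.0, 0.0], weights)
logits_domain_b = intent_logits_given_domain(sentence_features, [0.0, 1.0], weights)
print(logits_domain_a, logits_domain_b)
```

With identical sentence features, the preferred intent flips when the domain state changes, which is the soft restriction described above. Slot filling conditioned on the intent would follow the same pattern.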



Conclusion:

Different models can be used for intent detection and slot filling. Some models perform better than others on one dataset, while other models perform better on another. So we need to experiment with different models to get the best performance on a given dataset.

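Reference:

1. Attention-Based Recurrent Neural Network Models for Joint Intent Detection and Slot Filling, https://arxiv.org/pdf/1609.01454.pdf

2. 阿里AI Labs王刚解读9小时卖出百万台的“天猫精灵” | 高山大学(GASA) (Wang Gang of Alibaba AI Labs on the Tmall Genie, which sold one million units in 9 hours | Gaoshan University (GASA)), http://www.sohu.com/a/206109679_473283

3. 史上最全!阿里智能人机交互的核心技术解析 (The most complete yet! An analysis of the core technologies behind Alibaba's intelligent human-computer interaction), https://yq.aliyun.com/articles/277907?spm=5176.100244.teamhomeleft.54.SKEyCU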
