Awesome chatbot resource list: a collection of deep learning chatbot resources

Know a resource that isn't listed below? Feel free to create a new pull request, or open an issue.

A collected list of Seq2seq and chatbot resources.

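This list builds on the original chatbot-links list and adds more Chinese-language resources; it is mainly a personal collection. If you are interested, help build a more complete list that makes it easier for newcomers to get started: send a PR or open an issue to add more resources. Resources at any level are welcome, including your own work.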


Codes

chatbot-rnn

Framework: Tensorflow

A toy chatbot powered by deep learning and trained on data from Reddit.

> Hi, how are you?
 I'm not the only one here that's having fun with you, it's all me.
> Who else is with you?
 Haha. I'm not the one who made the comment. I'm the one who said that and didn't think that much of anything. 
> Oh, I see. So what's new?
 Nothing. I just thought it was a bit of an overreaction. It's been a while since I've read it. 
> It's all good
 It's all good. It's one of those things that makes me smile :)
> That's great. This is actually my demo chat, so it's nice that you're keeping things positive!
 Thank you :) 

DeepQA

Framework: Tensorflow

Demo result:

Hi			            	→ Hi.
What is your name ?			→ Laura.
What does that mean ?		→ I dunno.
How old are you ?			→ thirty-five.
Who is Laura ?				→ My brother.
Say 'goodbye'				→ Alright.
Two plus two				→ manny...

tf_seq2seq_chatbot

Framework: Tensorflow

No answer randomisation is implemented in this code, so the model answers the same way each time; only a sparse layer was added at the end of the model.

hello baby					→ hello
how old are you ?			→ twenty .
i am lonely					→ i am not
nice						→ you ' re not going to be okay .
so rude						→ i ' m sorry .
are you a robot or human?	→ no .
are you better than siri?	→ yes .
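
The deterministic behaviour noted for tf_seq2seq_chatbot typically comes from always picking the highest-scoring token at each decoding step. The following is a minimal sketch of the difference between that greedy choice and temperature sampling. It is not taken from tf_seq2seq_chatbot; the function names `greedy_pick` and `sample_pick` and the toy `logits` are hypothetical illustrations.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_pick(logits):
    """Deterministic: always return the index of the highest score."""
    return max(range(len(logits)), key=lambda i: logits[i])


def sample_pick(logits, temperature=1.0, rng=random):
    """Randomised: draw an index in proportion to its softmax probability."""
    probs = softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


# Hypothetical scores for a four-word vocabulary at one decoding step.
vocab = ["hello", "hi", "hey", "no"]
logits = [2.0, 1.5, 1.0, -1.0]

print(vocab[greedy_pick(logits)])       # same word on every call
print(vocab[sample_pick(logits, 0.8)])  # may differ between calls
```

Raising the temperature flattens the distribution and makes replies more varied; lowering it moves the behaviour back toward the greedy choice.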

machine translation model

Framework: Tensorflow

Google's official seq2seq implementation, with an attention mechanism (Luong et al., 2015). Originally built for translation, it can also be used for simple Q/A.
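
For readers new to attention, here is a rough sketch of one attention step using the dot-product score described in Luong et al. (2015). It is not the Google code, which may use a different scoring function and adds further layers; the function name `luong_dot_attention` and the toy vectors are assumptions made for illustration.

```python
import math


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def luong_dot_attention(decoder_state, encoder_states):
    """Return (weights, context) for one decoder step.

    score(h_t, h_s) = h_t . h_s, weights = softmax(scores),
    context = weighted sum of the encoder states.
    """
    scores = [dot(decoder_state, h_s) for h_s in encoder_states]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(encoder_states[0])
    context = [
        sum(w * h_s[d] for w, h_s in zip(weights, encoder_states))
        for d in range(dim)
    ]
    return weights, context


# Toy 2-dimensional states for a three-token source sentence (made-up numbers).
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [2.0, 0.0]

weights, context = luong_dot_attention(decoder_state, encoder_states)
print([round(w, 3) for w in weights])
print([round(c, 3) for c in context])
```

The weights show which source positions the decoder attends to at this step; the context vector is what gets combined with the decoder state before predicting the next token.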

Neural-Dialogue-Generation

Framework: Torch 6.x

Source code for a bunch of super nice articles by Jiwei Li, one of my favorite researchers in the domain of neural conversational systems. Worth checking out.

tell me ... how long have you had this falling sickness ?       → a few months, I guess .
so i had the doctors test sammy ’s response to conditioning .   → so he took the pills .
what are you going to do ?                                      → i ’m going to get you a new car .
they fear your power your intellect .                           → i ’m not afraid of your power .

neuralconvo

Framework: Torch 7

One of the earliest implementations of "A Neural Conversational Model". The demo results published at the time were very entertaining.

Hello?					→ Hi.
How are you?			→ I'm fine.
What's your name?		→ It's hard to describe.
How so?					→ I'm not sure.
What color is the sky?	→ It's blue.
What is your job?		→ It's not that i'm a fucking werewolf!

chatbot-zh-torch7

Framework: Torch 7

Adapted from neuralconvo. At the time it was probably the first Chinese RNN-based chatbot demo.

Seq2seq_Chatbot_QA

Framework: Tensorflow

A Chinese demo with fairly detailed documentation. Recommended.

Result:

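你到哪里去 (Where are you going?)                → 你不是说你不是我的 (Didn't you say you aren't mine?)
你喜欢我吗? (Do you like me?)                    → 我喜欢你 (I like you.)
你吃了吗? (Have you eaten?)                      → 我还没吃饭呢 (I haven't eaten yet.)
你喜欢喝酒吗? (Do you like drinking?)            → 我不知道 (I don't know.)
你说话没有逻辑啊 (What you say makes no sense.)   → 没有 (No.)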

ChatterBot

Framework: python

This looks like a fairly complete project: it is easy to install, well documented, and ships with training material, mostly in English.

Good morning! How are you doing?   →I am doing very well, thank you for asking.
You're welcome.                    →Do you like hats?

farizrahman4u/seq2seq

nicolas-ivanov/debug_seq2seq

codekansas/keras-language-modeling

Framework: keras

These are all seq2seq implementations, but none of them has a good chatbot demo result yet.

oswaldoludwig/Seq2seq-Chatbot-for-Keras

A new generative chatbot whose training converges in few epochs, including a model pre-trained on a small but consistent dataset collected from dialogues of English courses online.

This one takes a generative-model approach; more details to be added.

    user:  Hello! How are you?
computer:  i am fine , ok , well , thank you  .  

    user:  What's your name?
computer:  janet .   

    user:  What do you like to do now?
computer:  let me think a minute .   

    user:  I want to chat about computer science.
computer:  i am glad to hear there .   

    user:  Are you a real girl or a computer program?
computer:  i am self employed . i would love to be well .   

    user:  Why don't you look for a job?
computer:  i do not know . i am the bank .   

    user:  So, you don't need a job!
computer:  no . i have to make a lot of money .  

pydial

A development framework for multi-domain, multi-turn dialogue.

paper

Corpus

AlJohri/OpenSubtitles

Get a lot of raw movie subtitles (~1.2 GB)

Cornell Movie-Dialogs Corpus

~40 MB after clearing out the technical data.
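
As an illustration of what clearing out the technical data can look like, the sketch below turns the corpus metadata into plain (prompt, reply) pairs. It assumes the ` +++$+++ ` field separator and the field order used in the corpus's `movie_lines.txt` and `movie_conversations.txt` files; check both against the README shipped with the corpus. The sample rows and the helper names are made up for the example.

```python
import ast

SEP = " +++$+++ "


def parse_lines(raw_lines):
    """Map line ID -> utterance text, dropping the other metadata fields."""
    id_to_text = {}
    for raw in raw_lines:
        parts = raw.rstrip("\n").split(SEP)
        if len(parts) == 5:  # lineID, characterID, movieID, name, text
            id_to_text[parts[0]] = parts[4]
    return id_to_text


def parse_conversations(raw_convs, id_to_text):
    """Yield (prompt, reply) pairs from consecutive lines of each conversation."""
    for raw in raw_convs:
        parts = raw.rstrip("\n").split(SEP)
        if len(parts) != 4:  # characterID, characterID, movieID, line ID list
            continue
        line_ids = ast.literal_eval(parts[3])
        for first, second in zip(line_ids, line_ids[1:]):
            if first in id_to_text and second in id_to_text:
                yield id_to_text[first], id_to_text[second]


# Made-up rows in the assumed format, standing in for the real files.
raw_lines = [
    "L1 +++$+++ u0 +++$+++ m0 +++$+++ ANNA +++$+++ Hello there.\n",
    "L2 +++$+++ u1 +++$+++ m0 +++$+++ BEN +++$+++ Hi. How are you?\n",
    "L3 +++$+++ u0 +++$+++ m0 +++$+++ ANNA +++$+++ Fine, thanks.\n",
]
raw_convs = ["u0 +++$+++ u1 +++$+++ m0 +++$+++ ['L1', 'L2', 'L3']\n"]

pairs = list(parse_conversations(raw_convs, parse_lines(raw_lines)))
for prompt, reply in pairs:
    print(prompt, "→", reply)
```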

dgk_lost_conv

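[Chinese] corpus. Most of it was generated from subtitles, plus a small amount of other dialogue (for example the old Xiaohuangji (小黄鸡) material, which I obtained from an online friend; thanks to him). The file results/xiaohuangji50w_fenciA.conv.zip is the training material for the chatbot-zh-torch7 demo above.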

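[Packaged subtitle collection from the former Shooter.cn (射手网), 17 GB]

The now-closed Shooter.cn had a single archive containing all of its subtitles. Anyone interested will need to search for it online and download it themselves.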

Some English QA Material

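This is a collection of natural language processing datasets gathered by someone else. It covers three areas: Question Answering, Dialogue Systems, and Goal-Oriented Dialogue Systems. All of it is English text, which can be machine-translated into Chinese for use in Chinese dialogue.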

TODO

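The problem with the subtitle-generated material in dgk_lost_conv is its poor quality: subtitle files contain a lot of narration and stretches where a single person speaks continuously, and none of this was filtered out during processing. Hopefully someone can find a way to do so. Alternatively, mine more one-on-one dialogue from Weibo, QQ groups, WeChat groups, and similar sources.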
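
One possible starting point for the filtering problem, offered only as an untested heuristic and not as the method used for dgk_lost_conv: many subtitle files mark a speaker change inside a single cue by starting each line with a dash. Keeping only those cues trades a lot of recall for cleaner pairs, and it only helps when the files follow that convention. The names `is_narration` and `dash_dialogue_pairs` and the sample cues are hypothetical.

```python
def is_narration(line):
    """Treat fully bracketed lines such as "[door slams]" as non-dialogue."""
    stripped = line.strip()
    return (
        len(stripped) >= 2
        and stripped[0] in "[(（【"
        and stripped[-1] in "])）】"
    )


def dash_dialogue_pairs(cues):
    """Collect (prompt, reply) pairs from cues that mark a speaker change.

    `cues` is a list of strings, one per subtitle cue, with a newline
    between the displayed lines. Only cues in which every remaining line
    starts with a dash are used, since that layout usually signals that
    more than one person is speaking.
    """
    pairs = []
    for cue in cues:
        lines = [ln.strip() for ln in cue.split("\n") if ln.strip()]
        lines = [ln for ln in lines if not is_narration(ln)]
        if len(lines) < 2 or not all(ln.startswith("-") for ln in lines):
            continue  # narration, monologue, or an unmarked cue
        turns = [ln.lstrip("- ").strip() for ln in lines]
        turns = [t for t in turns if t]
        pairs.extend(zip(turns, turns[1:]))
    return pairs


# Made-up cues: a two-speaker cue, a narration-style cue, and a mixed cue.
cues = [
    "- Have you eaten?\n- Not yet.",
    "A long time ago, in a land far away",
    "[music]\n- Where are you going?\n- Home.",
]

for prompt, reply in dash_dialogue_pairs(cues):
    print(prompt, "→", reply)
```

Cues without dash markers are dropped entirely, so this sketch discards most of a typical subtitle file; it is meant as a high-precision baseline to compare other approaches against.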

Other resources:

Papers

Other:

Github fork
