  • Stars
    203
  • Rank 192,890 (Top 4 %)
  • Language
    Python
  • License
    MIT License
  • Created about 6 years ago
  • Updated over 2 years ago

Repository Details

Neural abstractive summarization (attention seq2seq with a copy mechanism (pointer network) and coverage) implemented in PyTorch on the CNN/Daily Mail dataset.

Abstractive Summarization on CNN-DailyMail

Results

Model-1: attention-seq2seq

Model-2: attention-seq2seq + copy

GRU (NLL loss + norm-clip=5):
C ROUGE-1 Average_R: 0.43722 (95%-conf.int. 0.43441 - 0.43994)
C ROUGE-1 Average_P: 0.33340 (95%-conf.int. 0.33090 - 0.33587)
C ROUGE-1 Average_F: 0.36604 (95%-conf.int. 0.36376 - 0.36829)

C ROUGE-2 Average_R: 0.18389 (95%-conf.int. 0.18144 - 0.18637)
C ROUGE-2 Average_P: 0.14111 (95%-conf.int. 0.13902 - 0.14328)
C ROUGE-2 Average_F: 0.15435 (95%-conf.int. 0.15225 - 0.15645)

C ROUGE-L Average_R: 0.39519 (95%-conf.int. 0.39245 - 0.39785)
C ROUGE-L Average_P: 0.30170 (95%-conf.int. 0.29935 - 0.30418)
C ROUGE-L Average_F: 0.33105 (95%-conf.int. 0.32886 - 0.33341)

C ROUGE-SU4 Average_R: 0.19460 (95%-conf.int. 0.19234 - 0.19681)
C ROUGE-SU4 Average_P: 0.14813 (95%-conf.int. 0.14624 - 0.15019)
C ROUGE-SU4 Average_F: 0.16220 (95%-conf.int. 0.16028 - 0.16405)
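
The copy mechanism in Model-2 can be illustrated with a small sketch. It assumes the pointer-generator formulation (a generation probability `p_gen` mixes the vocabulary distribution with the attention weights over source tokens); the function name, arguments, and toy numbers are hypothetical and are not taken from this repository's code.

```python
def final_distribution(p_gen, vocab_dist, attention, source_ids, extended_vocab_size):
    """Mix generation and copy probabilities over an extended vocabulary.

    Assumption: pointer-generator style mixing. Source words outside the
    fixed vocabulary get temporary ids >= len(vocab_dist).
    """
    final = [0.0] * extended_vocab_size
    # Generation part: scale the decoder's vocabulary distribution by p_gen.
    for word_id, prob in enumerate(vocab_dist):
        final[word_id] = p_gen * prob
    # Copy part: add attention mass to the id of each source token.
    for position, word_id in enumerate(source_ids):
        final[word_id] += (1.0 - p_gen) * attention[position]
    return final


# Toy example: fixed vocabulary of 3 words, 2 source tokens,
# the second source token is out-of-vocabulary (temporary id 3).
vocab_dist = [0.5, 0.3, 0.2]
attention = [0.6, 0.4]
source_ids = [1, 3]
dist = final_distribution(0.7, vocab_dist, attention, source_ids, 4)
print(dist)
```

Because both input distributions sum to 1, the mixed distribution also sums to 1, and the out-of-vocabulary source word receives probability only through the copy term.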

Model-3: attention-seq2seq + coverage

GRU (NLL loss + norm-clip=5):
C ROUGE-1 Average_R: 0.38197 (95%-conf.int. 0.37955 - 0.38433)
C ROUGE-1 Average_P: 0.36479 (95%-conf.int. 0.36235 - 0.36742)
C ROUGE-1 Average_F: 0.36002 (95%-conf.int. 0.35802 - 0.36230)

C ROUGE-2 Average_R: 0.15487 (95%-conf.int. 0.15277 - 0.15708)
C ROUGE-2 Average_P: 0.14912 (95%-conf.int. 0.14701 - 0.15130)
C ROUGE-2 Average_F: 0.14638 (95%-conf.int. 0.14440 - 0.14846)

C ROUGE-L Average_R: 0.35101 (95%-conf.int. 0.34873 - 0.35333)
C ROUGE-L Average_P: 0.33577 (95%-conf.int. 0.33346 - 0.33824)
C ROUGE-L Average_F: 0.33113 (95%-conf.int. 0.32923 - 0.33335)

C ROUGE-SU4 Average_R: 0.16692 (95%-conf.int. 0.16490 - 0.16894)
C ROUGE-SU4 Average_P: 0.16021 (95%-conf.int. 0.15818 - 0.16222)
C ROUGE-SU4 Average_F: 0.15709 (95%-conf.int. 0.15528 - 0.15892)
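
The coverage idea in Model-3 can be sketched as follows, assuming the common formulation in which the coverage vector is the running sum of past attention distributions and the per-step penalty is the sum of `min(attention, coverage)`. The function name and the toy attention values are hypothetical; the repository's actual loss weighting is not shown here.

```python
def coverage_losses(attention_steps):
    """Return per-step coverage penalties and the final coverage vector.

    Assumption: coverage is the sum of attention from previous decoder
    steps, and the penalty discourages attending to covered positions.
    """
    coverage = [0.0] * len(attention_steps[0])
    losses = []
    for attention in attention_steps:
        # Penalize overlap between current attention and accumulated coverage.
        losses.append(sum(min(a, c) for a, c in zip(attention, coverage)))
        # Update coverage after computing the penalty for this step.
        coverage = [c + a for a, c in zip(attention, coverage)]
    return losses, coverage


# Three decoder steps over a three-token source.
steps = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1], [0.1, 0.1, 0.8]]
losses, coverage = coverage_losses(steps)
print(losses, coverage)
```

In this toy run the second step repeats the first step's focus and is penalized most, while the third step moves to a new position and is penalized less.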

Model-4: attention-seq2seq + copy + coverage

GRU (NLL loss + norm-clip=5):
C ROUGE-1 Average_R: 0.44517 (95%-conf.int. 0.44225 - 0.44785)
C ROUGE-1 Average_P: 0.37019 (95%-conf.int. 0.36757 - 0.37286)
C ROUGE-1 Average_F: 0.39081 (95%-conf.int. 0.38862 - 0.39309)

C ROUGE-2 Average_R: 0.19478 (95%-conf.int. 0.19212 - 0.19740)
C ROUGE-2 Average_P: 0.16320 (95%-conf.int. 0.16092 - 0.16562)
C ROUGE-2 Average_F: 0.17147 (95%-conf.int. 0.16920 - 0.17378)

C ROUGE-L Average_R: 0.40901 (95%-conf.int. 0.40618 - 0.41163)
C ROUGE-L Average_P: 0.34055 (95%-conf.int. 0.33796 - 0.34318)
C ROUGE-L Average_F: 0.35930 (95%-conf.int. 0.35708 - 0.36152)

C ROUGE-SU4 Average_R: 0.20330 (95%-conf.int. 0.20083 - 0.20567)
C ROUGE-SU4 Average_P: 0.16916 (95%-conf.int. 0.16704 - 0.17141)
C ROUGE-SU4 Average_F: 0.17787 (95%-conf.int. 0.17579 - 0.18000)

GRU (avg NLL loss + norm-clip=2):
C ROUGE-1 Average_R: 0.46082 (95%-conf.int. 0.45804 - 0.46365)
C ROUGE-1 Average_P: 0.37176 (95%-conf.int. 0.36919 - 0.37447)
C ROUGE-1 Average_F: 0.39686 (95%-conf.int. 0.39461 - 0.39909)

C ROUGE-2 Average_R: 0.20237 (95%-conf.int. 0.19977 - 0.20520)
C ROUGE-2 Average_P: 0.16415 (95%-conf.int. 0.16175 - 0.16654)
C ROUGE-2 Average_F: 0.17448 (95%-conf.int. 0.17225 - 0.17683)

C ROUGE-L Average_R: 0.42083 (95%-conf.int. 0.41817 - 0.42347)
C ROUGE-L Average_P: 0.33970 (95%-conf.int. 0.33722 - 0.34230)
C ROUGE-L Average_F: 0.36250 (95%-conf.int. 0.36024 - 0.36468)

C ROUGE-SU4 Average_R: 0.21030 (95%-conf.int. 0.20801 - 0.21277)
C ROUGE-SU4 Average_P: 0.16956 (95%-conf.int. 0.16745 - 0.17182)
C ROUGE-SU4 Average_F: 0.18028 (95%-conf.int. 0.17835 - 0.18236)
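
The two training settings named in the headings ("NLL loss + norm-clip=5" and "avg NLL loss + norm-clip=2") can be illustrated with a plain-Python sketch. It assumes that "avg" means the negative log-likelihood is averaged over target tokens and that "norm-clip" means rescaling gradients by their global L2 norm; both readings are assumptions, and the helper names are hypothetical. In PyTorch the clipping step is usually done with `torch.nn.utils.clip_grad_norm_`.

```python
import math


def nll(target_probs, average=False):
    """Negative log-likelihood of the gold tokens, summed or token-averaged."""
    total = -sum(math.log(p) for p in target_probs)
    return total / len(target_probs) if average else total


def clip_by_global_norm(grads, max_norm):
    """Rescale a flat list of gradients so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm:
        return list(grads), norm
    scale = max_norm / norm
    return [g * scale for g in grads], norm


summed = nll([0.5, 0.5])
averaged = nll([0.5, 0.5], average=True)
clipped, original_norm = clip_by_global_norm([3.0, 4.0], max_norm=2.0)
print(summed, averaged, clipped, original_norm)
```

Averaging makes the loss scale independent of summary length, which is one plausible reason a smaller clipping threshold is paired with it.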

LSTM-v1 (avg NLL loss + norm-clip=2): Pointer-Generator in ACL'17
C ROUGE-1 Average_R: 0.39289 (95%-conf.int. 0.39037 - 0.39537)
C ROUGE-1 Average_P: 0.40210 (95%-conf.int. 0.39939 - 0.40500)
C ROUGE-1 Average_F: 0.38322 (95%-conf.int. 0.38101 - 0.38563)

C ROUGE-2 Average_R: 0.17302 (95%-conf.int. 0.17077 - 0.17537)
C ROUGE-2 Average_P: 0.17902 (95%-conf.int. 0.17642 - 0.18162)
C ROUGE-2 Average_F: 0.16941 (95%-conf.int. 0.16720 - 0.17173)

C ROUGE-L Average_R: 0.36119 (95%-conf.int. 0.35873 - 0.36362)
C ROUGE-L Average_P: 0.37002 (95%-conf.int. 0.36737 - 0.37298)
C ROUGE-L Average_F: 0.35247 (95%-conf.int. 0.35025 - 0.35482)

C ROUGE-SU4 Average_R: 0.17830 (95%-conf.int. 0.17624 - 0.18045)
C ROUGE-SU4 Average_P: 0.18458 (95%-conf.int. 0.18225 - 0.18696)
C ROUGE-SU4 Average_F: 0.17415 (95%-conf.int. 0.17214 - 0.17623)

LSTM-v2 (avg NLL loss + norm-clip=2):
C ROUGE-1 Average_R: 0.43865 (95%-conf.int. 0.43618 - 0.44132)
C ROUGE-1 Average_P: 0.38804 (95%-conf.int. 0.38547 - 0.39081)
C ROUGE-1 Average_F: 0.39701 (95%-conf.int. 0.39489 - 0.39922)

C ROUGE-2 Average_R: 0.19277 (95%-conf.int. 0.19049 - 0.19538)
C ROUGE-2 Average_P: 0.17168 (95%-conf.int. 0.16935 - 0.17413)
C ROUGE-2 Average_F: 0.17480 (95%-conf.int. 0.17261 - 0.17711)

C ROUGE-L Average_R: 0.40145 (95%-conf.int. 0.39890 - 0.40411)
C ROUGE-L Average_P: 0.35558 (95%-conf.int. 0.35308 - 0.35830)
C ROUGE-L Average_F: 0.36359 (95%-conf.int. 0.36142 - 0.36591)

C ROUGE-SU4 Average_R: 0.19977 (95%-conf.int. 0.19763 - 0.20212)
C ROUGE-SU4 Average_P: 0.17737 (95%-conf.int. 0.17521 - 0.17965)
C ROUGE-SU4 Average_F: 0.18040 (95%-conf.int. 0.17845 - 0.18253)
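
To compare the runs above programmatically, the score lines can be parsed into a dictionary. This is a minimal sketch that assumes every score line follows the format shown in this section; the regular expression and the `parse_rouge` helper are hypothetical and not part of the repository.

```python
import re

# Matches lines such as:
# C ROUGE-1 Average_R: 0.43865 (95%-conf.int. 0.43618 - 0.44132)
ROUGE_LINE = re.compile(
    r"^\S+\s+(ROUGE-\S+)\s+Average_([RPF]):\s+([\d.]+)\s+"
    r"\(95%-conf\.int\.\s+([\d.]+)\s+-\s+([\d.]+)\)"
)


def parse_rouge(text):
    """Map (metric, R/P/F) to (score, lower bound, upper bound)."""
    scores = {}
    for line in text.splitlines():
        match = ROUGE_LINE.match(line.strip())
        if match is None:
            continue  # skip blank lines, separators, and headings
        metric, kind, value, low, high = match.groups()
        scores[(metric, kind)] = (float(value), float(low), float(high))
    return scores


# Sample taken from the LSTM-v2 block above.
sample = """\
C ROUGE-1 Average_R: 0.43865 (95%-conf.int. 0.43618 - 0.44132)
C ROUGE-1 Average_P: 0.38804 (95%-conf.int. 0.38547 - 0.39081)
C ROUGE-1 Average_F: 0.39701 (95%-conf.int. 0.39489 - 0.39922)
"""
scores = parse_rouge(sample)
print(scores[("ROUGE-1", "F")])
```

Keeping the confidence bounds alongside each score makes it easy to check whether two models' intervals overlap before reading a difference as meaningful.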

How to run:

More Repositories

1

App-DL

Deep Learning and applications in Startups, CV, NLP
801
star
2

AIStartups

Startups about artificial intelligence. (DM, ML, NLP, CV...)
595
star
3

SongNet

Code for ACL 2020 paper "Rigid Formats Controlled Text Generation": https://www.aclweb.org/anthology/2020.acl-main.68/
Python
226
star
4

Guyu

Chinese GPT2: pre-training and fine-tuning framework for text generation
Python
188
star
5

TranSummar

Transformer for abstractive summarization on cnn/daily-mail and gigawords
Python
139
star
6

JRNN

LSTM and GRU in JAVA
Java
114
star
7

PG_BOW_DEMO

Image Classification using Bag of Words and Spatial Pyramid BoW
C++
110
star
8

hierarchical-encoder-decoder

Hierarchical encoder-decoder framework for sequences of words, sentences, paragraphs and documents using LSTM and GRU in Theano
Python
109
star
9

TtT

code for ACL2021 paper "Tail-to-Tail Non-Autoregressive Sequence Prediction for Chinese Grammatical Error Correction"
Python
99
star
10

PG_Curve

Matlab code for computing and visualization: Confusion Matrix, Precision/Recall, ROC, Accuracy, F-Measure etc. for Classification.
Objective-C
91
star
11

PG_DEEP

demo of deep belief nets
C++
69
star
12

rnn-theano

RNN(LSTM, GRU) in Theano with mini-batch training; character-level language models in Theano
Python
69
star
13

variational-autoencoder-theano

Variational Autoencoders (VAEs) in Theano for Images and Text
Python
55
star
14

DRGD-LCSTS

code for "Deep Recurrent Generative Decoder for Abstractive Text Summarization"
Python
53
star
15

dialogue-hred-vhred

HRED VHRED VHCR for Multi-Turn Dialogue Systems
Python
44
star
16

lipiji.github.io

HTML
31
star
17

datasets

datasets for NLP research
24
star
18

HFT

Code for "Hidden factors and hidden topics: Understanding rating dimensions with review text."
Shell
23
star
19

uChecker

Code of the COLING22 paper "uChecker: Masked Pretrained Language Models as Unsupervised Chinese Spelling Checkers"
18
star
20

rnn-pytorch

study pytorch
Python
15
star
21

NRT-theano

Code for our SIGIR'2017 paper "Neural Rating Regression with Abstractive Tips Generation for Recommendation"
Python
14
star
22

data-summ-cnn_dailymail

non-anonymized cnn/dailymail dataset for text summarization
Python
12
star
23

JLBFGS

Limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) in Java
Java
10
star
24

PG_PageRank

pagerank using mapreduce format
Shell
9
star
25

vae-salience

"Salience Estimation via Variational Auto-Encoders for Multi-Document Summarization"
HCL
9
star
26

PG_PLSA

plsa demo in python
Python
7
star
27

neural-dialogue-s2s-weibo-py3

Python
7
star
28

world2vec

Pre-trained word and phrase vectors
7
star
29

neural-topic-model

neural topic model based on VAE - theano
Python
6
star
30

stopwords

6
star
31

t5_summarization

Python
6
star
32

language_model_transformer

language model via transformer
Python
6
star
33

vae-salience-ramds

"Reader-Aware Multi-Document Summarization: An Enhanced Model and The First Dataset"
HCL
6
star
34

gan-bow-text

Generative Adversarial Network (GAN) for text modeling
Python
5
star
35

bert_zh_open200g_wordpiece

Python
5
star
36

jilp

Java ILP is a simplified java interface to (mixed) integer linear programming solvers like, e.g., lp_solve, Glpk, SAT4J (0-1 ILP), CPLEX, or Mosek.
Java
4
star
37

cws-seq2seq

Chinese word segmentation using a Seq2Seq framework.
4
star
38

SwarmRank

Particle Swarm Optimization for Classification and Recommender Systems
MATLAB
4
star
39

PG_ROC_PR_R

ROC and PR curve using R
R
4
star
40

collaborative-topic-regression

C++
4
star
41

textrank_keyword_summary

textrank based keywords extraction and summarization
Python
3
star
42

interpretability-methods

gradients based interpretability methods
Python
3
star
43

tokenizer_zh

Python
3
star
44

TextAdventure

3
star
45

pointer_generator_csc_lstm

Python
3
star
46

PG_LINEAR

L_p-regularized logistic regression using gradient descent (batch)
Shell
3
star
47

TopCJ

Top Conference Timeline
3
star
48

PATG

Code for WWW2019 paper "Persona-Aware Tips Generation"
2
star
49

gan-intro-theano

Generative Adversarial Networks (GAN) example in Theano.
Python
2
star
50

water_level_prediction

Python
2
star
51

S5

2
star
52

adversarial-variational-autoencoders

Adversarial Variational Auto-Encoders (AVAEs) in Theano
Python
2
star
53

TextRefiner

Java
1
star
54

neural-fig2txt

Python
1
star
55

VecComp

Vector completion using sparse coding
Python
1
star
56

instruction_data

Python
1
star
57

corr

Python
1
star
58

pointer_generator_transformer

Python
1
star
59

WebHarvester

Web-Harvest is Open Source Web Data Extraction tool written in Java. This is an extension of the original version.
Java
1
star
60

Finetrain-BERT

Python
1
star
61

big_tpl_zh_10_base

Python
1
star
62

mlp-theano

MLP using Theano
Python
1
star
63

civrealm

CivRealm: A Learning and Reasoning Odyssey for Decision-Making Agents: https://civrealm.github.io/civrealm/
Python
1
star
64

RAG4DocReader

llamaindex testing
Python
1
star