• This repository has been archived on 13/Jan/2022
  • Stars
    star
    910
  • Rank 48,213 (Top 1.0 %)
  • Language
    Lua
  • License
    Other
  • Created over 8 years ago
  • Updated about 6 years ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Neural Attention Model for Abstractive Summarization

Attention-Based Summarization

This project contains the Abs. neural abstractive summarization system from the paper

 A Neural Attention Model for Abstractive Summarization.
 Alexander M. Rush, Sumit Chopra, Jason Weston.

The release includes code for:

  • Extracting the summarization data set
  • Training the neural summarization model
  • Constructing evaluation sets with ROUGE
  • Tuning extractive features

Setup

To run the system, you will need to have Torch7 installed. You will also need Python 2.7, NLTK, and GNU Parallel to run the data processing scripts. Additionally the code currently requires a CUDA GPU for training and decoding.

Finally the scripts require that you set the $ABS environment variable.

> export ABS=$PWD
> export LUA_PATH="$LUA_PATH;$ABS/?.lua"

Constructing the Data Set

The model is trained to perform title generation from the first line of newspaper articles. Since the system is completely data-driven it requires a large set of aligned input-title pairs for training.

To provide these pairs we use the Annotated Gigaword corpus as our main data set. The corpus is available on LDC, but it requires membership. Once the annotated gigaword is obtained, you can simply run the provided script to extract the data set in text format.

Generating the data

To construct the data set run the following script to produce working_dir/, where `working_dir/' is the path to the directory where you want to store the processed data. The script 'construct_data.sh' makes use of the 'parallel' utility, so please make sure that it is in your path. WARNING: This may take a couple hours to run.

 > ./construct_data.sh agiga/ working_dir/

Format of the data files

The above command builds aligned files of the form split.type.txt where split is train/valid/test and type is title/article.

The output of the script is several aligned plain-text files. Each has one title or article per line.

 > head train.title.txt
 australian current account deficit narrows sharply
 at least two dead in southern philippines blast
 australian stocks close down #.# percent
 envoy urges north korea to restart nuclear disablement
 skorea announces tax cuts to stimulate economy

These files can be used to train the ABS system or be used by other baseline models.

Training the Model

Once the data set has been constructed, we provide a simple script to train the model.

./train_model.sh working_dir/ model.th

The training process consists of two stages. First we convert the text files into generic input-title matrices and then we train a conditional NNLM on this representation.

Once the model has been fully trained (this may require 3-4 days), you can use the test script to produce summaries of any plain text file.w

./test_model.sh working_dir/valid.article.filter.txt model.th length_of_summary

Training options

These scripts utilize the Torch code available in $ABS/summary/

There are two main torch entry points. One for training the model from data matrices and the other for evaluating the model on plain-text.

 > th summary/train.lua -help

 Train a summarization model.

   -articleDir      Directory containing article training matrices. []
   -titleDir        Directory containing title training matrices. []
   -validArticleDir Directory containing article matricess for validation. []
   -validTitleDir   Directory containing title matrices for validation. []
   -auxModel        The encoder model to use. [bow]
   -bowDim          Article embedding size. [50]
   -attenPool       Attention model pooling size. [5]
   -hiddenUnits     Conv net encoder hidden units. [1000]
   -kernelWidth     Conv net encoder kernel width. [5]
   -epochs          Number of epochs to train. [5]
   -miniBatchSize   Size of training minibatch. [64]
   -printEvery      How often to print during training. [1000]
   -modelFilename   File for saving loading/model. []
   -window          Size of NNLM window. [5]
   -embeddingDim    Size of NNLM embeddings. [50]
   -hiddenSize      Size of NNLM hidden layer. [100]
   -learningRate    SGD learning rate. [0.1]

Testing options

The run script is used for beam-search decoding with a trained model. See the paper for a description of the extractive features used at decoding time.

> th summary/run.lua -help

-blockRepeatWords Disallow generating a repeated word. [false]
-allowUNK         Allow generating <unk>. [false]
-fixedLength      Produce exactly -length words. [true]
-lmWeight         Weight for main model. [1]
-beamSize         Size of the beam. [100]
-extractive       Force fully extractive summary. [false]
-lmWeight         Feature weight for the neural model. [1]
-unigramBonus     Feature weight for unigram extraction. [0]
-bigramBonus      Feature weight for bigram extraction. [0]
-trigramBonus     Feature weight for trigram extraction. [0]
-lengthBonus      Feature weight for length. [0]
-unorderBonus     Feature weight for out-of-order extraction. [0]
-modelFilename    Model to test. []
-inputf           Input article files.  []
-nbest            Write out the nbest list in ZMert format. [false]
-length           Maximum length of summary.. [5]

Evaluation Data Sets

We evaluate the ABS model using the shared task from the Document Understanding Conference (DUC).

This release also includes code for interactive with the DUC shared task on headline generation. The scripts for processing and evaluating on this data set are in the DUC/ directory.

The DUC data set is available online, unfortunately you must manually fill out a form to request the data from NIST. Send the request to Angela Ellis.

Processing DUC

After receiving credentials you should obtain a series of tar files containing the data used as part of this shared task.

  1. Make a directory DUC_data/ which should contain the given files

    >DUC2003\_Summarization\_Documents.tgz
    >DUC2004\_Summarization\_Documents.tgz
    >duc2004\_results.tgz
    >detagged.duc2003.abstracts.tar.gz
    
  2. Run the setup script (this requires python and NLTK for tokenization)

    ./DUC/setup.sh DUC_data/

After running the scripts there should be directories

   DUC_data/clean_2003/
   DUC_data/clean_2004/

Each contains a file input.txt where each line is a tokenized first line of an article.

 > head DUC_data/clean_2003/input.txt
 schizophrenia patients whose medication could n't stop the imaginary voices in their heads gained some relief after researchers repeatedly sent a magnetic field into a small area of their brains .
 scientists trying to fathom the mystery of schizophrenia say they have found the strongest evidence to date that the disabling psychiatric disorder is caused by gene abnormalities , according to a researcher at two state universities .
 a yale school of medicine study is expanding upon what scientists know  about the link between schizophrenia and nicotine addiction .
 exploring chaos in a search for order , scientists who study the reality-shattering mental disease schizophrenia are becoming fascinated by the chemical environment of areas of the brain where perception is regulated .

As well as a set of references:

> head DUC_data/clean_2003/references/task1_ref0.txt
Magnetic treatment may ease or lessen occurrence of schizophrenic voices.
Evidence shows schizophrenia caused by gene abnormalities of Chromosome 1.
Researchers examining evidence of link between schizophrenia and nicotine addiction.
Scientists focusing on chemical environment of brain to understand schizophrenia.
Schizophrenia study shows disparity between what's known and what's provided to patients.

System output should be added to the directory system/task1_{name}.txt. For instance the script includes a baseline PREFIX system.

DUC_data/clean_2003/references/task1_prefix.txt

ROUGE for Eval

To evaluate the summaries you will need the ROUGE eval system.

The ROUGE script requires output in a very complex HTML form. To simplify this process we include a script to convert the simple output to one that ROUGE can handle.

Export the ROUGE directory export ROUGE={path_to_rouge} and then run the eval scripts

> ./DUC/eval.sh DUC_data/clean_2003/
FULL LENGTH
   ---------------------------------------------
   prefix ROUGE-1 Average_R: 0.17831 (95%-conf.int. 0.16916 - 0.18736)
   prefix ROUGE-1 Average_P: 0.15445 (95%-conf.int. 0.14683 - 0.16220)
   prefix ROUGE-1 Average_F: 0.16482 (95%-conf.int. 0.15662 - 0.17318)
   ---------------------------------------------
   prefix ROUGE-2 Average_R: 0.04936 (95%-conf.int. 0.04420 - 0.05452)
   prefix ROUGE-2 Average_P: 0.04257 (95%-conf.int. 0.03794 - 0.04710)
   prefix ROUGE-2 Average_F: 0.04550 (95%-conf.int. 0.04060 - 0.05026)

Tuning Feature Weights

For our system ABS+ we additionally tune extractive features on the DUC summarization data. The final features we obtained our distributed with the system as tuning/params.best.txt.

The MERT tuning code itself is located in the tuning/ directory. Our setup uses ZMert for this process.

It should be straightforward to tune the system on any developments summarization data. Take the following steps to run tuning on the DUC-2003 data set described above.

First copy over reference files to the tuning directoy. For instance to tune on DUC-2003:

ln -s DUC_data/clean_2003/references/task1_ref0.txt tuning/ref.0
ln -s DUC_data/clean_2003/references/task1_ref1.txt tuning/ref.1
ln -s DUC_data/clean_2003/references/task1_ref2.txt tuning/ref.2
ln -s DUC_data/clean_2003/references/task1_ref3.txt tuning/ref.3

Next copy the SDecoder template, cp SDecoder_cmd.tpl SDecoder_cmd.py and modify the SDecoder_cmd.py to point to the model and input text.

{"model" : "model.th",
 "src" : "/data/users/sashar/DUC_data/clean_2003/input.txt",
 "title_len" : 14}

Now you should be able to run Z-MERT and let it do its thing.

> cd tuning/; java -cp zmert/lib/zmert.jar ZMERT ZMERT_cfg.txt

When Z-MERT has finished you can run on new data using command:

> python SDecoder_test.py input.txt model.th

More Repositories

1

draft-js

A React framework for building text editors.
JavaScript
22,506
star
2

pop

An extensible iOS and OS X animation library, useful for physics-based interactions.
Objective-C++
19,716
star
3

flux

Application Architecture for Building User Interfaces
JavaScript
17,397
star
4

prepack

A JavaScript bundle optimizer.
JavaScript
14,271
star
5

AsyncDisplayKit

Smooth asynchronous user interfaces for iOS apps.
Objective-C++
13,447
star
6

stetho

Stetho is a debug bridge for Android applications, enabling the powerful Chrome Developer Tools and much more.
Java
12,653
star
7

Shimmer

An easy way to add a simple, shimmering effect to any view in an iOS app.
Objective-C
9,375
star
8

react-360

Create amazing 360 and VR content using React
JavaScript
8,702
star
9

caffe2

Caffe2 is a lightweight, modular, and scalable deep learning framework.
Shell
8,420
star
10

nuclide

An open IDE for web and native mobile development, built on top of Atom
JavaScript
7,816
star
11

KVOController

Simple, modern, thread-safe key-value observing for iOS and OS X.
Objective-C
7,359
star
12

three20

Three20 is an Objective-C library for iPhone developers
Objective-C
7,265
star
13

xctool

An extension for Apple's xcodebuild that makes it easier to test iOS and macOS apps.
Objective-C
6,954
star
14

fbctf

Platform to host Capture the Flag competitions
Hack
6,495
star
15

rebound

A Java library that models spring dynamics and adds real world physics to your app.
Java
5,444
star
16

Keyframes

A library for converting Adobe AE shape based animations to a data format and playing it back on Android and iOS devices.
JavaScript
5,343
star
17

shimmer-android

An easy, flexible way to add a shimmering effect to any view in an Android app.
Java
5,265
star
18

grace

Graceful restart & zero downtime deploy for Go servers.
Go
4,899
star
19

Tweaks

An easy way to fine-tune, and adjust parameters for iOS apps in development.
Objective-C
4,751
star
20

augmented-traffic-control

Augmented Traffic Control: A tool to simulate network conditions
Python
4,331
star
21

fixed-data-table

A React table component designed to allow presenting thousands of rows of data.
JavaScript
4,314
star
22

WebDriverAgent

A WebDriver server for iOS that runs inside the Simulator.
Objective-C
4,096
star
23

huxley

A testing system for catching visual regressions in Web applications.
Python
4,086
star
24

codemod

Codemod is a tool/library to assist you with large-scale codebase refactors that can be partially automated but still require human oversight and occasional intervention. Codemod was developed at Facebook and released as open source.
Python
4,069
star
25

scribe

Scribe is a server for aggregating log data streamed in real time from a large number of servers.
C++
3,932
star
26

FBMemoryProfiler

iOS tool that helps with profiling iOS Memory usage.
Objective-C
3,417
star
27

mention-bot

Automatically mention potential reviewers on pull requests.
JavaScript
3,371
star
28

facebook-php-sdk

This SDK is deprecated. Find the new SDK here: https://github.com/facebook/facebook-php-sdk-v4
PHP
3,289
star
29

origami

A Quartz Composer framework that enables interactive design prototyping without programming.
Objective-C
3,280
star
30

RakNet

RakNet is a cross platform, open source, C++ networking engine for game programmers.
HTML
3,211
star
31

network-connection-class

Listen to current network traffic in the app and categorize the quality of the network.
Java
3,178
star
32

beringei

Beringei is a high performance, in-memory storage engine for time series data.
C++
3,159
star
33

php-graph-sdk

The Facebook SDK for PHP provides a native interface to the Graph API and Facebook Login. https://developers.facebook.com/docs/php
PHP
3,146
star
34

react-native-fbsdk

A React Native wrapper around the Facebook SDKs for Android and iOS. Provides access to Facebook login, sharing, graph requests, app events etc.
Java
2,993
star
35

python-instagram

Python Client for Instagram API
Python
2,966
star
36

conceal

Conceal provides easy Android APIs for performing fast encryption and authentication of data.
C++
2,966
star
37

webscalesql-5.6

WebScaleSQL, Version 5.6, based upon the MySQL-5.6 community releases.
C++
2,954
star
38

ios-snapshot-test-case

Snapshot view unit tests for iOS
Objective-C
2,674
star
39

device-year-class

A library that analyzes an Android device's specifications and calculates which year the device would be considered "high end”.
Java
2,581
star
40

BOLT

Binary Optimization and Layout Tool - A linux command-line utility used for optimizing performance of binaries
2,497
star
41

pfff

Tools for code analysis, visualizations, or style-preserving source transformation.
OCaml
2,439
star
42

fb.resnet.torch

Torch implementation of ResNet from http://arxiv.org/abs/1512.03385 and training scripts
Lua
2,243
star
43

redux-react-hook

React Hook for accessing state and dispatch from a Redux store
TypeScript
2,164
star
44

Surround360

Surround360 is Facebook's open source hardware and software for capturing stereoscopic 3D 360 video for VR. The repo contains hardware designs, as well as software for camera control and rendering.
C++
2,153
star
45

xcbuild

Xcode-compatible build tool.
C++
2,000
star
46

LogDevice

Distributed storage for sequential data
C++
1,888
star
47

MemNN

Memory Networks implementations
Lua
1,757
star
48

rebound-js

Spring dynamics in JavaScript.
JavaScript
1,754
star
49

redis-faina

A query analyzer that parses Redis' MONITOR command for counter/timing stats about query patterns
Python
1,749
star
50

fb-flo

A Chrome extension that lets you modify running apps without reloading them.
JavaScript
1,692
star
51

planout

PlanOut is a library and interpreter for designing online experiments.
JavaScript
1,664
star
52

libphenom

An eventing framework for building high performance and high scalability systems in C.
C
1,662
star
53

flashcache

A general purpose, write-back block cache for Linux.
C
1,601
star
54

python-nubia

A command-line and interactive shell framework.
Python
1,595
star
55

profilo

A library for performance traces from production.
C
1,577
star
56

facebook-swift-sdk

Integrate your iOS apps in Swift with Facebook Platform.
Swift
1,519
star
57

instagram-ruby-gem

The official gem for the Instagram API
Ruby
1,461
star
58

inject

Package inject provides a reflect based injector.
Go
1,393
star
59

Flicks

A unit of time defined in C++.
C++
1,388
star
60

duckling_old

Deprecated in favor of https://github.com/facebook/duckling
Clojure
1,322
star
61

connect-js

Legacy JavaScript SDK
JavaScript
1,237
star
62

atom-in-orbit

Putting Atom in the browser
JavaScript
1,183
star
63

phpsh

A read-eval-print-loop for php
Emacs Lisp
1,160
star
64

C3D

C3D is a modified version of BVLC caffe to support 3D ConvNets.
Jupyter Notebook
1,159
star
65

sublime-react

Sublime Text helpers for React. Syntax highlighting DEPRECATED in favor of babel/babel-sublime
JavaScript
1,144
star
66

fb-adb

A better shell for Android devices
C
1,139
star
67

iTorch

IPython kernel for Torch with visualization and plotting
Jupyter Notebook
1,104
star
68

FBAllocationTracker

iOS library that helps tracking all allocated Objective-C objects
Objective-C++
1,094
star
69

fbcunn

Facebook's extensions to torch/cunn.
Lua
1,069
star
70

emitter

A JS EventEmitter foundation for evented code
JavaScript
1,041
star
71

bistro

Bistro is a flexible distributed scheduler, a high-performance framework supporting multiple paradigms while retaining ease of configuration, management, and monitoring.
C++
1,040
star
72

relay-starter-kit

Barebones starting point for a Relay application.
JavaScript
1,017
star
73

torchnet

Torch on steroids
Lua
992
star
74

react-meteor

React rendering for Meteor apps
JavaScript
953
star
75

atom-ide-ui

A collection of user interfaces for Atom IDE.
JavaScript
936
star
76

nifty

Thrift on Netty
Java
899
star
77

swift

An annotation-based Java library for creating Thrift serializable types and services.
Java
889
star
78

bAbI-tasks

Task generation for testing text understanding and reasoning
Lua
886
star
79

hadoop-20

Facebook's Realtime Distributed FS based on Apache Hadoop 0.20-append
Java
876
star
80

loop

A method to generate speech across multiple speakers
Python
872
star
81

IGInterfaceDataTable

A category on WKInterfaceTable that makes configuring tables with multi-dimensional data easier.
Objective-C
837
star
82

mononoke

A Mercurial source control server, specifically designed to support large monorepos.
822
star
83

react-page

Easy Application Development with React JavaScript
JavaScript
795
star
84

f8DeveloperConferenceApp

[Archive] f8 2014 Conference App
HTML
761
star
85

nailgun

Nailgun is a client, protocol, and server for running Java programs from the command line without incurring the JVM startup overhead.
Java
734
star
86

WEASEL

DNS covert channel implant for Red Teams.
Python
725
star
87

RiftDK1

Firmware, Schematics, and Mechanicals for the Oculus Rift Development Kit 1
C
688
star
88

jcommon

concurrency, collections, stats/analytics, config, testing, etc
Java
677
star
89

proguard

A fork of ProGuard.
Java
661
star
90

bootstrapped

Generate bootstrapped confidence intervals for A/B testing in Python.
Python
631
star
91

ig-lazy-module-loader

Library that implements module lazy loading.
Java
630
star
92

opencompute

A community of engineers whose mission is to design and enable the delivery of the most efficient server, storage and data center hardware designs for scalable computing.
TeX
624
star
93

flint

An open-source lint program for C++ developed by, and formerly used at Facebook.
D
622
star
94

fblualib

Facebook libraries and utilities for Lua
Lua
615
star
95

remodel

Remodel is a tool that helps iOS and OS X developers avoid repetitive code by generating Objective-C models that support coding, value comparison, and immutability.
TypeScript
609
star
96

eyescream

natural image generation using ConvNets
Lua
599
star
97

react-python

Python bridge to JSX & the React JavaScript library.
Python
576
star
98

spacetime

Experimental iOS library for live transformations on parts of layers.
Objective-C
528
star
99

warp

A fast preprocessor for C and C++
D
521
star
100

FBNotifications

Facebook Analytics In-App Notifications Framework
Objective-C
494
star