TensorFlow implementation of "Recurrent Convolutional Neural Network for Text Classification" (AAAI 2015)

Recurrent Convolutional Neural Network for Text Classification

TensorFlow implementation of "Recurrent Convolutional Neural Network for Text Classification".

[Figure: rcnn]

Data: Movie Review

  • Movie reviews with one sentence per review. Classification involves detecting positive/negative reviews (Pang and Lee, 2005).
  • Download "sentence polarity dataset v1.0" at the Official Download Page.
  • The data is located in "data/rt-polaritydata/" in this repository.
  • rt-polarity.pos contains 5331 positive snippets.
  • rt-polarity.neg contains 5331 negative snippets.
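
As an illustration of how the two files can be turned into labeled examples, here is a minimal sketch. The function name `load_polarity_data`, the label convention (1 = positive, 0 = negative), and the `latin-1` default encoding are assumptions made for this example; they are not taken from this repository's code.

```python
def load_polarity_data(pos_path, neg_path, encoding="latin-1"):
    """Read one snippet per line and return (texts, labels).

    Labels are an assumed convention: 1 = positive, 0 = negative.
    The default encoding is an assumption; adjust it if decoding fails.
    """
    def read_lines(path):
        # Iterating over the file splits on newlines only; blank lines are skipped.
        with open(path, encoding=encoding) as f:
            return [line.strip() for line in f if line.strip()]

    pos = read_lines(pos_path)
    neg = read_lines(neg_path)
    texts = pos + neg
    labels = [1] * len(pos) + [0] * len(neg)
    return texts, labels
```

Called with "data/rt-polaritydata/rt-polarity.pos" and "data/rt-polaritydata/rt-polarity.neg", this should return 10662 texts if the files match the snippet counts listed above.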

Implementation of Recurrent Structure

[Figure: recurrent_structure]

  • A bidirectional RNN (Bi-RNN) is used to compute the left and right context vectors.
  • Each context vector is created by shifting the Bi-RNN output and concatenating a zero state that marks the start of the context.
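
The shifting step can be sketched in plain Python for a single sentence. This is an illustration under stated assumptions, not this repository's TensorFlow code: the helper names are hypothetical, the shift is assumed to be one time step (left context of word i is the forward output at i-1, right context is the backward output at i+1), and the final concatenation of left context, word embedding, and right context follows the word representation described in the referenced paper.

```python
def build_context_vectors(fw_outputs, bw_outputs):
    """Shift Bi-RNN outputs so each word sees only its neighbours' states.

    fw_outputs / bw_outputs: per-time-step output vectors (lists of floats)
    of the forward and backward RNN for one sentence.
    """
    zero_fw = [0.0] * len(fw_outputs[0])
    zero_bw = [0.0] * len(bw_outputs[0])
    # Left context of word i = forward output at i-1; zero state for the first word.
    left = [zero_fw] + fw_outputs[:-1]
    # Right context of word i = backward output at i+1; zero state for the last word.
    right = bw_outputs[1:] + [zero_bw]
    return left, right


def word_representations(left, embeddings, right):
    """Concatenate [left context; word embedding; right context] per word."""
    return [l + e + r for l, e, r in zip(left, embeddings, right)]


# Toy sentence of three words with 1-dimensional states and embeddings.
fw = [[1.0], [2.0], [3.0]]
bw = [[7.0], [8.0], [9.0]]
emb = [[0.1], [0.2], [0.3]]
left, right = build_context_vectors(fw, bw)
x = word_representations(left, emb, right)
# x == [[0.0, 0.1, 8.0], [1.0, 0.2, 9.0], [2.0, 0.3, 0.0]]
```

Because of the shift, the representation of each word never includes the RNN state computed at that word itself, only the states of the words to its left and right.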

Usage

Train

  • Positive data is located at "data/rt-polaritydata/rt-polarity.pos".

  • Negative data is located at "data/rt-polaritydata/rt-polarity.neg".

  • "GoogleNews-vectors-negative300" is used as the pre-trained word2vec model.

  • Display help message:

     $ python train.py --help
  • Training example:

     $ python train.py --cell_type "lstm" \
     --pos_dir "data/rt-polaritydata/rt-polarity.pos" \
     --neg_dir "data/rt-polaritydata/rt-polarity.neg" \
     --word2vec "GoogleNews-vectors-negative300.bin"

Evaluation

  • The Movie Review dataset has no separate test set.

  • To evaluate, hold out a test set from the training data or use cross-validation. Cross-validation is not implemented in this project.

  • The example below simply evaluates on the full rt-polarity dataset, which is the same data used for training.

  • Evaluation Example:

     $ python eval.py \
     --pos_dir "data/rt-polaritydata/rt-polarity.pos" \
     --neg_dir "data/rt-polaritydata/rt-polarity.neg" \
     --checkpoint_dir "runs/1523902663/checkpoints"
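
Since cross-validation is not implemented in this project, the sketch below shows one way to generate k-fold train/test index splits with only the standard library. The function `k_fold_indices` and its parameters are hypothetical and not part of `train.py` or `eval.py`; wiring the splits into training and evaluation is left to the reader.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    # Shuffle with a fixed seed so the folds are reproducible.
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of (almost) equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx
```

Each sample appears in exactly one test fold, so averaging the per-fold accuracy gives an estimate that does not reuse training data for evaluation.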

Result

  • Comparison between the Recurrent Convolutional Neural Network and a Convolutional Neural Network.
  • dennybritz's cnn-text-classification-tf is used as the CNN model for comparison.
  • The same pre-trained word2vec model is used for both models.

Accuracy for validation set

[Figure: accuracy on the validation set]

Loss for validation set

[Figure: loss on the validation set]

Reference

  • S. Lai et al., "Recurrent Convolutional Neural Network for Text Classification", AAAI 2015.