Convolutional Neural Network for Relation Extraction
Note: This project is mostly based on https://github.com/yuhaozhang/sentence-convnet
Requirements
- Python 2.7
- Tensorflow (tested with versions 0.10.0rc0 -> 1.0.1)
- Numpy
To download Wikipedia articles (distant_supervision.py):
- Beautifulsoup
- Pandas
- Stanford NER
*The path to Stanford NER is specified in the ner_path variable in distant_supervision.py.
To visualize the results (visualize.ipynb)
Data
- The data directory includes preprocessed data:

    cnn-re-tf
    ├── ...
    ├── word2vec
    └── data
        ├── er                  # binary-classification dataset
        │   ├── source.txt      # source sentences
        │   └── target.txt      # target labels
        └── mlmi                # multi-label multi-instance dataset
            ├── source.att      # attention
            ├── source.left     # left context
            ├── source.middle   # middle context
            ├── source.right    # right context
            ├── source.txt      # source sentences
            └── target.txt      # target labels

To reproduce:
python ./distant_supervision.py
- The word2vec directory is empty. Please download the Google News pretrained vector data from this Google Drive link and unzip it into the directory. The result is a .bin file.
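For readers unfamiliar with the word2vec binary layout, the following is a minimal sketch of a reader, not the code this project uses. It assumes the standard layout: a text header line `vocab_size dim`, then for each entry the word, a single space, and `dim` little-endian float32 values. The function name `read_word2vec_bin` and its `limit` parameter are hypothetical.

```python
import io
import struct


def read_word2vec_bin(stream, limit=None):
    """Read (words, vectors) from a word2vec binary stream opened in 'rb' mode.

    Assumed layout: header line "vocab_size dim", then per entry the word,
    one space, and dim little-endian float32 values.
    """
    header = stream.readline().split()
    vocab_size, dim = int(header[0]), int(header[1])
    if limit is not None:
        vocab_size = min(vocab_size, limit)

    words, vectors = [], []
    for _ in range(vocab_size):
        # Collect bytes up to the space that terminates the word.
        chars = []
        while True:
            ch = stream.read(1)
            if ch == b"":
                raise EOFError("stream ended inside a word")
            if ch == b" ":
                break
            if ch != b"\n":  # some files put a newline between entries
                chars.append(ch)
        raw = stream.read(4 * dim)
        if len(raw) != 4 * dim:
            raise EOFError("stream ended inside a vector")
        words.append(b"".join(chars).decode("utf-8", "ignore"))
        vectors.append(struct.unpack("<%df" % dim, raw))
    return words, vectors


# Tiny in-memory file with two 3-dimensional vectors, for illustration only.
demo = io.BytesIO(
    b"2 3\n"
    + b"king " + struct.pack("<3f", 1.0, 2.0, 3.0) + b"\n"
    + b"queen " + struct.pack("<3f", 4.0, 5.0, 6.0) + b"\n"
)
demo_words, demo_vectors = read_word2vec_bin(demo)
```

To inspect the downloaded file, open it with `open(path, "rb")` and pass a small `limit`, since the full vocabulary is large.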
Usage
Preprocess
python ./util.py
It creates vocab.txt, ids.txt and emb.npy files.
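The exact contents of these files are defined by util.py and are not described here. As a rough illustration of what a preprocessing step of this kind usually does, the sketch below builds a vocabulary from whitespace-tokenized sentences and maps each sentence to a fixed-length list of word ids. The special tokens `<pad>` and `<unk>`, the function names, and the padding scheme are all assumptions, not taken from the repository.

```python
from collections import Counter

PAD, UNK = "<pad>", "<unk>"  # hypothetical special tokens


def build_vocab(sentences, max_size=None):
    """Return a word list: special tokens first, then words by frequency."""
    counts = Counter(tok for s in sentences for tok in s.split())
    # Descending frequency, ties broken alphabetically for determinism.
    ordered = sorted(counts, key=lambda t: (-counts[t], t))
    if max_size is not None:
        ordered = ordered[:max(0, max_size - 2)]
    return [PAD, UNK] + ordered


def to_ids(sentence, vocab_index, sent_len):
    """Map a sentence to exactly sent_len ids, truncating or padding."""
    ids = [vocab_index.get(tok, vocab_index[UNK]) for tok in sentence.split()]
    ids = ids[:sent_len]
    return ids + [vocab_index[PAD]] * (sent_len - len(ids))


sentences = ["obama was born in hawaii", "obama visited hawaii"]
vocab = build_vocab(sentences)
vocab_index = {word: i for i, word in enumerate(vocab)}
example_ids = to_ids("obama visited paris", vocab_index, sent_len=5)
```

Under these assumptions, the vocabulary length and the padded length are what the --vocab_size and --sent_len options would need to match.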
Training
- Binary classification (ER-CNN):
python ./train.py --sent_len=3 --vocab_size=11208 --num_classes=2 --train_size=15000 \
    --data_dir=./data/er --attention=False --multi_label=False --use_pretrain=False
- Multi-label multi-instance learning (MLMI-CNN):
python ./train.py --sent_len=255 --vocab_size=36112 --num_classes=23 --train_size=10000 \
    --data_dir=./data/mlmi --attention=True --multi_label=True --use_pretrain=True
- Multi-label multi-instance context-wise learning (MLMI-CONT):
python ./train_context.py --sent_len=102 --vocab_size=36112 --num_classes=23 --train_size=10000 \
    --data_dir=./data/mlmi --attention=True --multi_label=True --use_pretrain=True
Caution: A wrong value for the input-data-dependent options (sent_len, vocab_size and num_classes) may cause an error. If you want to train the model on another dataset, please check these values.
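One way to check these values before training on a new dataset is a small consistency test over the preprocessed inputs. The sketch below is only an illustration: it assumes each example is already a list of integer word ids and each label is a single integer class index, which may not match the on-disk formats used by this project (multi-label targets in particular would need a different check). The function `check_flags` is hypothetical.

```python
def check_flags(id_rows, labels, sent_len, vocab_size, num_classes):
    """Return a list of problems; an empty list means the values agree."""
    problems = []
    for n, row in enumerate(id_rows):
        if len(row) != sent_len:
            problems.append(
                "row %d has length %d, expected sent_len=%d"
                % (n, len(row), sent_len))
        if row and (min(row) < 0 or max(row) >= vocab_size):
            problems.append(
                "row %d has an id outside [0, vocab_size=%d)"
                % (n, vocab_size))
    for n, label in enumerate(labels):
        if not 0 <= label < num_classes:
            problems.append(
                "label %d is %d, outside [0, num_classes=%d)"
                % (n, label, num_classes))
    return problems


# Illustrative data: the second row is too short and the last label is too large.
rows = [[3, 6, 1, 0, 0], [2, 4, 0, 0]]
found = check_flags(rows, labels=[0, 2], sent_len=5, vocab_size=8, num_classes=2)
```

An empty result suggests the three options are at least mutually consistent with the data; it does not guarantee that training will succeed.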
Evaluation
python ./eval.py --train_dir=./train/1473898241
Replace the --train_dir value with the output directory from training.
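The directory name in the example looks like a Unix timestamp. Assuming each training run writes to ./train/<timestamp> (an inference from the example path, not something stated here), a short helper can pick the most recent run. The function name `latest_run_dir` is hypothetical.

```python
import os


def latest_run_dir(train_root):
    """Return the sub-directory of train_root with the largest numeric name."""
    runs = [
        name for name in os.listdir(train_root)
        if name.isdigit() and os.path.isdir(os.path.join(train_root, name))
    ]
    if not runs:
        raise ValueError("no numeric run directories in %r" % train_root)
    # Compare as integers so that names of different lengths sort correctly.
    return os.path.join(train_root, max(runs, key=int))
```

For example, `latest_run_dir("./train")` would return a path that can be passed as --train_dir to eval.py or as --logdir to TensorBoard.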
Run TensorBoard
tensorboard --logdir=./train/1473898241
Architecture
Results
Model | P | R | F | AUC | init_lr | l2_reg |
---|---|---|---|---|---|---|
ER-CNN | 0.9410 | 0.8630 | 0.9003 | 0.9303 | 0.005 | 0.05 |
MLMI-CNN | 0.8205 | 0.6406 | 0.7195 | 0.7424 | 1e-3 | 1e-4 |
MLMI-CONT | 0.8819 | 0.7158 | 0.7902 | 0.8156 | 1e-3 | 1e-4 |
*As you see above, these models somewhat suffer from overfitting ...
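The F column in the table is consistent with the usual F1 definition, F = 2PR / (P + R), assuming that is the measure reported. The sketch below recomputes it from the P and R values in the table; the dictionary simply copies those numbers.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# (P, R, reported F) copied from the results table.
reported = {
    "ER-CNN": (0.9410, 0.8630, 0.9003),
    "MLMI-CNN": (0.8205, 0.6406, 0.7195),
    "MLMI-CONT": (0.8819, 0.7158, 0.7902),
}

for name in sorted(reported):
    p, r, f = reported[name]
    print("%-10s recomputed F=%.4f, reported F=%.4f" % (name, f_score(p, r), f))
```

All three recomputed values agree with the reported ones to four decimal places.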
References
- http://github.com/yuhaozhang/sentence-convnet
- http://github.com/dennybritz/cnn-text-classification-tf
- http://www.wildml.com/2015/12/implementing-a-cnn-for-text-classification-in-tensorflow/
- http://tkengo.github.io/blog/2016/03/14/text-classification-by-cnn/
- Adel et al. Comparing Convolutional Neural Networks to Traditional Models for Slot Filling. NAACL 2016.
- Nguyen and Grishman. Relation Extraction: Perspective from Convolutional Neural Networks. NAACL 2015.
- Lin et al. Neural Relation Extraction with Selective Attention over Instances. ACL 2016.