E2E-TBSA
Source code of our AAAI paper on End-to-End Target/Aspect-Based Sentiment Analysis.
Requirements
- Python 3.6
- DyNet 2.0.2 (For building DyNet and enabling the python bindings, please follow the instructions in this link)
- nltk 3.2.2
- numpy 1.13.3
Data
- rest_total consists of the reviews from the SemEval-2014, SemEval-2015 and SemEval-2016 restaurant datasets.
- (IMPORTANT) rest14, rest15, rest16: restaurant reviews from SemEval 2014 (task 4), SemEval 2015 (task 12) and SemEval 2016 (task 5) respectively. We have prepared data files with a train/dev/test split in another project of ours; check it out if needed.
- (IMPORTANT) DO NOT use the rest_total dataset that we built ourselves any more; more details can be found in Updated Results.
- laptop14 is identical to the SemEval-2014 laptop dataset.
- twitter is built by Mitchell et al. (EMNLP 2013).
- We also provide the data in the format of the conll03 NER dataset.
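The warning about rest_total comes from an overlap between its training and test files (see Updated Results). A minimal sketch of how such an overlap could be checked for any pair of splits is given below. It is not part of the released code. It assumes one example per line; the helper names `normalize`, `find_overlap` and `find_overlap_in_files` are hypothetical, and the comparison key may need adapting to the actual file layout.

```python
from typing import Callable, Iterable, Set


def normalize(line: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not hide duplicates."""
    return " ".join(line.lower().split())


def find_overlap(train_lines: Iterable[str],
                 test_lines: Iterable[str],
                 key: Callable[[str], str] = normalize) -> Set[str]:
    """Return the normalized examples that occur in both splits.

    Assumption: each non-empty line is one example. Pass a different `key`
    if only part of the line (e.g. the raw sentence) should be compared.
    """
    train_keys = {key(line) for line in train_lines if line.strip()}
    test_keys = {key(line) for line in test_lines if line.strip()}
    return train_keys & test_keys


def find_overlap_in_files(train_path: str, test_path: str) -> Set[str]:
    """Convenience wrapper that reads both splits from disk."""
    with open(train_path, encoding="utf-8") as f_train, \
            open(test_path, encoding="utf-8") as f_test:
        return find_overlap(f_train, f_test)
```

Calling `find_overlap_in_files` on a train/test pair (the paths depend on where the data files are stored) returns an empty set when the two splits are disjoint under this comparison.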
Parameter Settings
- To reproduce the results, please refer to the settings in config.py.
Environment
- OS: RHEL Server 6.4 (Santiago)
- CPU: Intel Xeon CPU E5-2620 (yes, we do not use a GPU, in order to guarantee deterministic outputs)
Updated results (IMPORTANT)
- The data files of the rest_total dataset are created by concatenating the train/test counterparts from rest14, rest15 and rest16. Our motivation was to build a larger training/testing dataset to stabilize training and faithfully reflect the capability of the ABSA model. However, we recently found that the SemEval organizers directly treat the union of rest15.train and rest15.test as the training set of rest16 (i.e., rest16.train). As a result, rest_total_train.txt and rest_total_test.txt overlap, which makes this dataset invalid. If you follow our work on this E2E-ABSA task, please DO NOT use the rest_total dataset any more; switch to the officially released rest14, rest15 and rest16 instead. We have prepared data files with a train/dev/test split in another project of ours; check it out if needed.
- To facilitate future comparison, we re-ran our models following the settings in config.py and report the results (micro-averaged F1) on rest14, rest15 and rest16:

| Model | rest14 | rest15 | rest16 |
| --- | --- | --- | --- |
| E2E-ABSA (OURS) | 67.10 | 57.27 | 64.31 |
| (He et al., 2019) | 69.54 | 59.18 | - |
| (Liu et al., 2020) | 68.91 | 58.37 | - |
| BERT-Linear (OURS) | 72.61 | 60.29 | 69.67 |
| BERT-GRU (OURS) | 73.17 | 59.60 | 70.21 |
| BERT-SAN (OURS) | 73.68 | 59.90 | 70.51 |
| BERT-TFM (OURS) | 73.98 | 60.24 | 70.25 |
| BERT-CRF (OURS) | 73.17 | 60.70 | 70.37 |
| (Chen and Qian, 2020) | 75.42 | 66.05 | - |
| (Liang et al., 2020) | 72.60 | 62.37 | - |
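For readers unfamiliar with the metric, the sketch below shows one common way to compute micro-averaged F1 over predicted targets with their sentiments. It is an illustration under stated assumptions, not the evaluation script used to produce the numbers above: the `Span` representation (start token, end token, sentiment label) and the function name `micro_f1` are hypothetical, and a prediction counts as correct only when both the boundaries and the sentiment match exactly.

```python
from typing import Iterable, Set, Tuple

# Hypothetical representation of one target: (start token, end token, sentiment).
Span = Tuple[int, int, str]


def micro_f1(gold: Iterable[Set[Span]], pred: Iterable[Set[Span]]) -> float:
    """Micro-averaged F1: pool the counts over all sentences, then compute P/R/F1 once."""
    n_gold = n_pred = n_hit = 0
    for gold_spans, pred_spans in zip(gold, pred):
        n_gold += len(gold_spans)
        n_pred += len(pred_spans)
        # A hit requires identical boundaries and identical sentiment label.
        n_hit += len(gold_spans & pred_spans)
    precision = n_hit / n_pred if n_pred else 0.0
    recall = n_hit / n_gold if n_gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Two sentences: one fully correct prediction, one with the wrong sentiment.
gold = [{(0, 1, "POS")}, {(2, 2, "NEG"), (4, 5, "NEU")}]
pred = [{(0, 1, "POS")}, {(2, 2, "POS")}]
score = micro_f1(gold, pred)  # precision 1/2, recall 1/3
```

Because the counts are pooled before dividing, sentences with many targets weigh more than sentences with few, which is what distinguishes micro-averaging from macro-averaging.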
Citation
If the code is used in your research, please star this repo and cite our paper as follows:
@inproceedings{li2019unified,
title={A unified model for opinion target extraction and target sentiment prediction},
author={Li, Xin and Bing, Lidong and Li, Piji and Lam, Wai},
booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
pages={6714--6721},
year={2019}
}