A PyTorch solution to the NER task using a BiLSTM-CRF model.

This repo contains a PyTorch implementation of a BiLSTM-CRF model for the named entity recognition (NER) task.
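For orientation, here is a minimal sketch of how a BiLSTM-CRF tagger is typically wired up in PyTorch: a bidirectional LSTM turns token embeddings into per-tag emission scores, and a CRF layer adds transition scores and performs Viterbi decoding. This is an illustrative sketch rather than the code in this repo; it assumes the third-party `pytorch-crf` package (imported as `torchcrf`), and the class name `BiLSTMCRF` and all hyperparameters are placeholders.

```python
# Minimal BiLSTM-CRF sketch (illustrative only, not this repo's exact model).
# Assumes the third-party `pytorch-crf` package; hyperparameters are placeholders.
import torch.nn as nn
from torchcrf import CRF


class BiLSTMCRF(nn.Module):
    def __init__(self, vocab_size, num_tags, embed_dim=128, hidden_dim=256):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim // 2, batch_first=True,
                            bidirectional=True)
        self.hidden2tag = nn.Linear(hidden_dim, num_tags)  # per-token emission scores
        self.crf = CRF(num_tags, batch_first=True)

    def _emissions(self, tokens):
        lstm_out, _ = self.lstm(self.embedding(tokens))
        return self.hidden2tag(lstm_out)

    def loss(self, tokens, tags, mask):
        # Negative log-likelihood of the gold tag sequence under the CRF.
        return -self.crf(self._emissions(tokens), tags, mask=mask)

    def predict(self, tokens, mask):
        # Viterbi decoding: best tag sequence for each sentence.
        return self.crf.decode(self._emissions(tokens), mask=mask)
```

Training minimizes the negative log-likelihood returned by `loss` (where `mask` marks real tokens versus padding), and `predict` returns the highest-scoring tag sequence per sentence under the learned transition scores.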
Structure of the code
At the root of the project, you will see:
├── pyner
|  ├── callback
|  |  ├── lrscheduler.py
|  |  ├── trainingmonitor.py
|  |  └── ...
|  ├── config
|  |  └── basic_config.py  # a configuration file for storing model parameters
|  ├── dataset
|  ├── io
|  |  ├── data_loader.py
|  |  └── data_transformer.py
|  ├── model
|  |  ├── embedding
|  |  ├── layers
|  |  └── nn
|  ├── output  # saves the output of the model
|  ├── preprocessing  # text preprocessing
|  ├── train  # used for training a model
|  |  ├── trainer.py
|  |  └── ...
|  ├── utils  # a set of utility functions
|  └── test
├── test_predict.py
├── train_bilstm_crf.py
└── train_word2vec.py
Dependencies
- csv
- tqdm
- numpy
- pickle
- scikit-learn
- PyTorch 1.0
- matplotlib
How to use the code
- Download `source_BIO_2014_cropus.txt` from BaiduPan (password: 1fa3) and place it in the `pyner/dataset/raw` directory.
- Modify the configuration in `pyner/config/basic_config.py` (the path of the data, ...); a hypothetical sketch of such a file follows this list.
- Run `python train_bilstm_crf.py`.
- Run `python test_predict.py`.
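As a rough illustration of the kind of settings `basic_config.py` typically stores, here is a hypothetical sketch. Every key, path, and value below is a placeholder and does not reflect the repo's actual configuration file.

```python
# Hypothetical sketch of a basic_config.py-style settings module.
# All names, paths, and values are placeholders, not the repo's real config.
from pathlib import Path

BASE_DIR = Path(__file__).resolve().parents[1]  # the pyner/ package root

configs = {
    # data paths (step 1 above drops the corpus into dataset/raw)
    "raw_data_path": BASE_DIR / "dataset" / "raw" / "source_BIO_2014_cropus.txt",
    "output_dir": BASE_DIR / "output",

    # model hyperparameters
    "embedding_dim": 128,
    "hidden_dim": 256,
    "dropout": 0.5,

    # training hyperparameters
    "batch_size": 32,
    "epochs": 20,
    "learning_rate": 1e-3,
    "seed": 42,
}
```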
Result
Train entity score:

| Type | Precision | Recall | F1     |
|------|-----------|--------|--------|
| LOC  | 0.9043    | 0.9089 | 0.9066 |
| PER  | 0.8925    | 0.9215 | 0.9068 |
| ORG  | 0.8279    | 0.9016 | 0.8632 |
| T    | 0.9408    | 0.9462 | 0.9435 |

Valid entity score:

| Type | Precision | Recall | F1     |
|------|-----------|--------|--------|
| T    | 0.9579    | 0.9558 | 0.9568 |
| PER  | 0.9058    | 0.9205 | 0.9131 |
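For reference, the F1 column is the harmonic mean of precision and recall; the snippet below reproduces the LOC row of the train entity scores.

```python
# Entity-level F1 is the harmonic mean of precision and recall.
# Reproducing the LOC row of the train entity scores above.
precision, recall = 0.9043, 0.9089
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # 0.9066
```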