  • Stars: 860
  • Rank 52,605 (Top 2 %)
  • Language: Python
  • License: MIT License
  • Created over 5 years ago
  • Updated over 1 year ago

Bert multi-label text classification by PyTorch

This repo contains a PyTorch implementation of the pretrained BERT and XLNet models for multi-label text classification.
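
For readers new to the task: in multi-label classification each label is decided independently, so one text can receive several labels or none. Below is a minimal sketch of that decision step, assuming one raw score (logit) per label passed through a sigmoid and cut at a 0.5 threshold. The logits are made up for illustration, the threshold is an assumption, and the code is not taken from this repo. The label names are the six that appear in the result section.

```python
import math

# Label names as they appear in the result section of this README.
LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]


def sigmoid(x: float) -> float:
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_labels(logits, threshold=0.5):
    """Return every label whose independent probability reaches the threshold.

    Unlike softmax, the probabilities are not forced to sum to 1,
    so zero, one, or several labels can be selected for the same text.
    """
    if len(logits) != len(LABELS):
        raise ValueError("expected one logit per label")
    return [name for name, z in zip(LABELS, logits) if sigmoid(z) >= threshold]


# Hypothetical logits for one comment (illustrative values only).
example_logits = [2.1, -3.0, 1.2, -4.5, 0.3, -2.2]
print(predict_labels(example_logits))
```

With these made-up logits, three labels clear the threshold at once, which a single-label softmax classifier could not express.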

Structure of the code

At the root of the project, you will see:

├── pybert
|  └── callback
|  |  └── lrscheduler.py
|  |  └── trainingmonitor.py
|  |  └── ...
|  └── config
|  |  └── basic_config.py # a configuration file for storing model parameters
|  └── dataset
|  └── io
|  |  └── dataset.py
|  |  └── data_transformer.py
|  └── model
|  |  └── nn
|  |  └── pretrain
|  └── output # save the output of the model
|  └── preprocessing # text preprocessing
|  └── train # used for training a model
|  |  └── trainer.py
|  |  └── ...
|  └── common # a set of utility functions
├── run_bert.py
├── run_xlnet.py

Dependencies

  • csv
  • tqdm
  • numpy
  • pickle
  • scikit-learn
  • PyTorch 1.1+
  • matplotlib
  • pandas
  • transformers=2.5.1

How to use the code

You need to download the pretrained BERT and XLNet models.

BERT: bert-base-uncased

XLNET: xlnet-base-cased

  1. Download the Bert pretrained model from s3

  2. Download the Bert config file from s3

  3. Download the Bert vocab file from s3

  4. Rename:

    • bert-base-uncased-pytorch_model.bin to pytorch_model.bin
    • bert-base-uncased-config.json to config.json
    • bert-base-uncased-vocab.txt to bert_vocab.txt
  5. Place the model, config, and vocab files into the /pybert/pretrain/bert/base-uncased directory.

  6. pip install pytorch-transformers from GitHub.

  7. Download the Kaggle data and place it in pybert/dataset.

    • You can modify io.task_data.py to adapt it to your data.
  8. Modify the configuration in pybert/configs/basic_config.py (the data paths, ...).

  9. Run python run_bert.py --do_data to preprocess the data.

  10. Run python run_bert.py --do_train --save_best --do_lower_case to fine-tune the BERT model.

  11. Run python run_bert.py --do_test --do_lower_case to predict on new data.
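
Steps 4 and 5 can be scripted. Below is a minimal sketch using only the standard library, assuming the three downloaded files sit together in one folder. `place_bert_files` is a hypothetical helper written for this illustration; the file names are the ones listed in step 4, and the target directory is passed in by the caller.

```python
import shutil
from pathlib import Path

# Downloaded name -> name expected after renaming (step 4).
RENAMES = {
    "bert-base-uncased-pytorch_model.bin": "pytorch_model.bin",
    "bert-base-uncased-config.json": "config.json",
    "bert-base-uncased-vocab.txt": "bert_vocab.txt",
}


def place_bert_files(download_dir, target_dir):
    """Copy the downloaded files into target_dir under their new names.

    Returns the names of files that were missing from download_dir, so the
    caller can see what still needs to be downloaded.
    """
    download_dir, target_dir = Path(download_dir), Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    missing = []
    for old_name, new_name in RENAMES.items():
        source = download_dir / old_name
        if source.is_file():
            # copy2 keeps the original download untouched and preserves metadata
            shutil.copy2(source, target_dir / new_name)
        else:
            missing.append(old_name)
    return missing
```

For example, calling `place_bert_files("downloads", "pybert/pretrain/bert/base-uncased")` from the project root (where `downloads` is an assumed folder name) returns an empty list once all three files are in place.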

training

[training] 8511/8511 [>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] -0.8s/step- loss: 0.0640
training result:
[2019-01-14 04:01:05]: bert-multi-label trainer.py[line:176] INFO  
Epoch: 2 - loss: 0.0338 - val_loss: 0.0373 - val_auc: 0.9922

training figure

result

---- train report every label -----
Label: toxic - auc: 0.9903
Label: severe_toxic - auc: 0.9913
Label: obscene - auc: 0.9951
Label: threat - auc: 0.9898
Label: insult - auc: 0.9911
Label: identity_hate - auc: 0.9910
---- valid report every label -----
Label: toxic - auc: 0.9892
Label: severe_toxic - auc: 0.9911
Label: obscene - auc: 0.9945
Label: threat - auc: 0.9955
Label: insult - auc: 0.9903
Label: identity_hate - auc: 0.9927
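
The per-label numbers above are AUC scores computed separately for each label column. Below is a minimal sketch of that bookkeeping in plain Python, assuming AUC is taken as the probability that a random positive example is scored above a random negative one, with ties counted as half. `auc` and `per_label_auc` are hypothetical helpers written for this illustration, not this repo's code, and the toy targets and scores are invented. In practice, `sklearn.metrics.roc_auc_score` from the scikit-learn dependency computes the same quantity.

```python
def auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a pair.
    This O(n_pos * n_neg) form is meant for clarity, not for large datasets.
    """
    positives = [s for t, s in zip(y_true, y_score) if t == 1]
    negatives = [s for t, s in zip(y_true, y_score) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def per_label_auc(labels, y_true_rows, y_score_rows):
    """Compute one AUC per label from row-wise targets and scores."""
    report = {}
    for j, name in enumerate(labels):
        column_true = [row[j] for row in y_true_rows]
        column_score = [row[j] for row in y_score_rows]
        report[name] = auc(column_true, column_score)
    return report


# Toy data: 4 texts, 2 labels (values invented for illustration).
labels = ["toxic", "insult"]
y_true = [[1, 0], [0, 0], [1, 1], [0, 1]]
y_score = [[0.9, 0.2], [0.1, 0.4], [0.8, 0.7], [0.3, 0.3]]
for name, value in per_label_auc(labels, y_true, y_score).items():
    print(f"Label: {name} - auc: {value:.4f}")
```

Reporting one score per label, rather than a single average, makes it easier to spot a rare label that the model handles poorly.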

Tips

  • When converting the TensorFlow checkpoint to PyTorch, choose "bert_model.ckpt" as the input file, not "bert_model.ckpt.index". Otherwise the model learns nothing and gives almost the same random output for any input, which means the real checkpoint was never loaded.
  • When using multiple GPUs, non-tensor calculations such as accuracy and f1_score are not supported by a DataParallel instance.
  • As recommended by Jacob Devlin et al. in the BERT paper (https://arxiv.org/pdf/1810.04805.pdf), the hyperparameters for fine-tuning tasks should be set as follows: Batch_size: 16 or 32, learning_rate: 5e-5 or 2e-5 or 3e-5, num_train_epoch: 3 or 4.
  • The pretrained model limits the input length to at most 512, the number of position embeddings. Data flows into the model as Raw_data -> WordPieces -> Model. Because the WordPiece sequence is generally longer than the raw text, a safe maximum length for raw_data is about 128 to 256.
  • In our tests, fine-tuning all layers gave much better results than fine-tuning only the last classifier layer. The latter is effectively a feature-based approach.
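
The length limit from the tips can be made concrete. Below is a minimal sketch, assuming the input is already split into WordPiece tokens and that two positions are reserved for the `[CLS]` and `[SEP]` markers that BERT places around a single sentence. `truncate_for_bert` is a hypothetical helper, and the word pieces in the example are hand-written for illustration rather than produced by a real tokenizer.

```python
MAX_POSITION_EMBEDDINGS = 512  # limit stated in the tips above


def truncate_for_bert(word_pieces, max_seq_length=MAX_POSITION_EMBEDDINGS):
    """Cut a WordPiece sequence so it fits the model together with [CLS] and [SEP]."""
    if not 2 <= max_seq_length <= MAX_POSITION_EMBEDDINGS:
        raise ValueError("max_seq_length must be between 2 and 512")
    budget = max_seq_length - 2  # room left after the two special tokens
    return ["[CLS]"] + list(word_pieces[:budget]) + ["[SEP]"]


# Three raw words become five word pieces: the token count exceeds the word count.
raw_words = ["unbelievably", "toxic", "comment"]
pieces = ["un", "##believ", "##ably", "toxic", "comment"]  # hand-written split
model_input = truncate_for_bert(pieces, max_seq_length=6)
print(len(raw_words), len(pieces), model_input)
```

Here the last word piece is dropped to make room for the special tokens, which is why the raw text should stay well below the 512 limit.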
