• Stars
    star
    520
  • Rank 84,547 (Top 2 %)
  • Language
    Python
  • Created over 5 years ago
  • Updated over 5 years ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

TextCNN Pytorch实现 中文文本分类 情感分析

TextCNN Pytorch实现 中文文本分类

论文

Convolutional Neural Networks for Sentence Classification

参考

依赖项

  • python3.5
  • pytorch==1.0.0
  • torchtext==0.3.1
  • jieba==0.39

词向量

https://github.com/Embedding/Chinese-Word-Vectors
(这里用的是Zhihu_QA 知乎问答训练出来的word Word2vec)

用法

python3 main.py -h

训练

python3 main.py

准确率

  • CNN-rand 随机初始化Embedding
      python main.py
      Batch[1800] - loss: 0.009499  acc: 100.0000%(128/128)
      Evaluation - loss: 0.000026  acc: 94.0000%(6616/7000)
      early stop by 1000 steps, acc: 94.0000%
    
  • CNN-static 使用预训练的静态词向量
      python main.py -static=true
      Batch[1900] - loss: 0.011894  acc: 100.0000%(128/128)
      Evaluation - loss: 0.000018  acc: 95.0000%(6679/7000)
      early stop by 1000 steps, acc: 95.0000%
    
  • CNN-non-static 微调预训练的词向量
      python main.py -static=true -non-static=true
      Batch[1500] - loss: 0.008823  acc: 99.0000%(127/128))
      Evaluation - loss: 0.000016  acc: 96.0000%(6729/7000)
      early stop by 1000 steps, acc: 96.0000%
    
  • CNN-multichannel 微调加静态
      python main.py -static=true -non-static=true -multichannel=true
      Batch[1500] - loss: 0.023020  acc: 98.0000%(126/128))
      Evaluation - loss: 0.000016  acc: 96.0000%(6744/7000)
      early stop by 1000 steps, acc: 96.0000%