18th-place solution to the 2018 DaGuan Cup (达观杯) Long-Text Classification Intelligent Processing Challenge

DaGuan Cup 2018

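The hyperparameters were not well tuned and the competition was rushed. The single models were never evaluated online; the offline score was 0.784 and the final score was 0.791, ranking 18/3462. Since the ranking is not high, I won't write much here and will wait for the top teams to share. The approach is exactly what the code does, and it is very simple.

Please download the data from DaGuan Data (达观数据) and place it in the data directory.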

1. Environment

Environment/Library    Version
Ubuntu                 14.04.5 LTS
python                 3.6
jupyter notebook       4.2.3
tensorflow-gpu         1.10.1
numpy                  1.14.1
pandas                 0.23.0
matplotlib             2.2.2
gensim                 3.5.0
tqdm                   4.24.0
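
To confirm that a local setup matches the versions listed above, something like the following sketch could be used. It is not part of the repository: the function name `check_versions` is hypothetical, and the pip distribution names are assumed to be the same as the names in the table.

```python
# Sketch only: compares installed package versions against the table above.
try:
    from importlib import metadata          # Python 3.8+
except ImportError:                          # Python 3.6/3.7 need the backport
    import importlib_metadata as metadata

# Versions copied from the environment table; distribution names are assumed.
PINNED = {
    "tensorflow-gpu": "1.10.1",
    "numpy": "1.14.1",
    "pandas": "0.23.0",
    "matplotlib": "2.2.2",
    "gensim": "3.5.0",
    "tqdm": "4.24.0",
}

def check_versions(pinned, lookup=metadata.version):
    """Return {name: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = lookup(name)
        except metadata.PackageNotFoundError:
            installed = None                 # package is not installed at all
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches

if __name__ == "__main__":
    for name, (expected, installed) in check_versions(PINNED).items():
        print(f"{name}: expected {expected}, found {installed}")
```

An empty result means every listed package is installed at the listed version.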

2. Data preprocessing

Everything is written in Jupyter. Run src/preprocess/EDA.ipynb to generate the required files.

3. Baseline model training

Run the following in src/preprocess/:

python baseline-x-cv.py
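
The contents of baseline-x-cv.py are not shown here; its name suggests a cross-validated baseline. As a rough illustration of that idea only, the sketch below builds stratified folds in plain Python. The function names, the fold count of 5, and the seed are all assumptions, not values taken from the script.

```python
import random
from collections import defaultdict

def stratified_folds(labels, n_folds=5, seed=0):
    """Assign each sample index to a fold so every class is spread evenly."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    fold_of = [None] * len(labels)
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of one class round-robin across folds.
        for position, idx in enumerate(indices):
            fold_of[idx] = position % n_folds
    return fold_of

def cv_splits(labels, n_folds=5, seed=0):
    """Yield (train_indices, valid_indices) once per fold."""
    fold_of = stratified_folds(labels, n_folds, seed)
    for fold in range(n_folds):
        train = [i for i, f in enumerate(fold_of) if f != fold]
        valid = [i for i, f in enumerate(fold_of) if f == fold]
        yield train, valid
```

Each sample lands in exactly one validation fold, so the validation predictions collected across folds cover the whole training set and can be scored offline.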

4. Deep model training

Then train a model directly. To run on a single GPU, with the model of your choice:

python train_predict.py --gpu 4 --option 5 --model convlstm --feature char

Multi-GPU training example:

python train_predict.py --gpu 4,5,6,7 --option 5 --model convlstm --feature char
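
How train_predict.py handles these flags internally is not shown in this README. One common pattern, sketched below under that caveat, is to parse the comma-separated `--gpu` value and export it as `CUDA_VISIBLE_DEVICES` before TensorFlow is imported. The defaults, the `int` type for `--option`, and the helper name `select_gpus` are assumptions made for illustration.

```python
import argparse
import os

def parse_args(argv=None):
    """Parse the flags used in the commands above (defaults are assumed)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--gpu", default="0",
                        help="comma-separated GPU ids, e.g. 4 or 4,5,6,7")
    parser.add_argument("--option", type=int, default=5)  # meaning not documented here
    parser.add_argument("--model", default="convlstm")
    parser.add_argument("--feature", default="char")
    return parser.parse_args(argv)

def select_gpus(gpu_arg):
    """Restrict CUDA to the requested devices and return how many were given."""
    gpu_ids = [g.strip() for g in gpu_arg.split(",") if g.strip()]
    # Must be set before the first TensorFlow import to take effect.
    os.environ["CUDA_VISIBLE_DEVICES"] = ",".join(gpu_ids)
    return len(gpu_ids)

if __name__ == "__main__":
    args = parse_args()
    n_gpus = select_gpus(args.gpu)
    print(f"model={args.model} feature={args.feature} gpus={n_gpus}")
```

With this pattern the physical devices 4,5,6,7 appear inside the process as devices 0 to 3, so the training code can address them without knowing the physical ids.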

5. Model ensembling and output

python stacking.py --gpu 1 --tfidf True --option 5

Stacking and pseudo-labeling are done together here; edit the code to choose whether pseudo-labels are used.
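
For readers unfamiliar with the two techniques, here is a minimal sketch of the general idea, not the code in stacking.py. Stacking feeds the class probabilities of several base models to a second-level model as features; pseudo-labeling adds confidently predicted test samples to the training set. The function names and the 0.9 confidence threshold are hypothetical.

```python
def stack_features(per_model_probs):
    """Concatenate each base model's class-probability vector per sample.

    per_model_probs: list of models, each a list of per-sample probability lists.
    The result is the feature matrix for a second-level (meta) model.
    """
    n_samples = len(per_model_probs[0])
    return [
        [p for model in per_model_probs for p in model[i]]
        for i in range(n_samples)
    ]

def select_pseudo_labels(test_probs, threshold=0.9):
    """Return (sample_index, predicted_label) for confident test predictions."""
    selected = []
    for idx, probs in enumerate(test_probs):
        best = max(range(len(probs)), key=probs.__getitem__)
        if probs[best] >= threshold:   # threshold is an assumed value
            selected.append((idx, best))
    return selected

# Toy usage: two base models, two samples, two classes.
model_a = [[0.9, 0.1], [0.4, 0.6]]
model_b = [[0.8, 0.2], [0.3, 0.7]]
meta_features = stack_features([model_a, model_b])
pseudo = select_pseudo_labels([[0.95, 0.05], [0.5, 0.5], [0.08, 0.92]])
```

Skipping the `select_pseudo_labels` step, and training the meta-model on the labeled data only, corresponds to running stacking without pseudo-labels.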