


Code for the 2019 Tencent College Algorithm Contest. This solution ranked 1st on the online leaderboard in the preliminary round.

1. Problem description

See guide.pdf for the task description. This project is the model that ranked 1st in the preliminary round.

2. Model overview

[Three model figures; images not reproduced here]

3. Environment setup

  • scikit-learn
  • tqdm
  • pandas
  • numpy
  • scipy
  • TensorFlow 1.12.0 (other versions ≥ 1.4 also work, except 1.5 and 1.6)
  • Linux Ubuntu 16.04, 128 GB RAM (64 GB should be enough), one GPU
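
The TensorFlow constraint above can be checked before running anything else. The following is a minimal sketch, not part of the repository: the helper names `parse_major_minor` and `is_supported_tf_version` are hypothetical, and the check encodes only the rule as stated (at least 1.4, not 1.5 or 1.6). Nothing in this README confirms that 2.x releases work, so treat a passing result for those as untested.

```python
def parse_major_minor(version):
    """Return (major, minor) from a version string such as '1.12.0'."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def is_supported_tf_version(version):
    """Apply the README's rule: >= 1.4, excluding the 1.5 and 1.6 series.

    Hypothetical helper for illustration; it only looks at major.minor.
    """
    major_minor = parse_major_minor(version)
    if major_minor in {(1, 5), (1, 6)}:
        return False
    return major_minor >= (1, 4)


if __name__ == "__main__":
    # With TensorFlow installed, pass tf.__version__ instead of a literal.
    for candidate in ["1.12.0", "1.4.0", "1.5.0", "1.6.1", "1.3.0"]:
        print(candidate, is_supported_tf_version(candidate))
```

With TensorFlow installed, the same function can be called as `is_supported_tf_version(tf.__version__)` after `import tensorflow as tf`.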

4. Data download

mkdir data 
cd data
#Download data from https://pan.baidu.com/s/1ASQMms_u70psRgW_KEyT2Q 
#Password: burw
unzip algo.qq.com_641013010_testa.zip imps_log.zip user.zip
cd ..

5. Data preprocessing

python src/preprocess.py

6. Feature extraction

python src/extract_feature.py

7. Data format conversion

python src/convert_format.py

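1) Fill missing values (NA) with 0.

2) Concatenate the embeddings obtained from Word2Vec and DeepWalk, and mask 5% of the ads.

3) Normalize the dense features that are fed in as key-values to the range [0, 1].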
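
The three conversion steps can be illustrated with a small NumPy sketch. This is not the code in src/convert_format.py; the function names, the array shapes, and the fixed seed are assumptions made for the example. In particular, masking is shown here as zeroing the embedding rows of a random 5% of ads, which is only one plausible reading of "mask"; the actual script may handle masked ads differently.

```python
import numpy as np


def fill_missing(values, fill=0.0):
    """Step 1: replace NaN entries with a constant (0 by default)."""
    arr = np.asarray(values, dtype=np.float64)
    return np.where(np.isnan(arr), fill, arr)


def concat_and_mask(w2v, deepwalk, mask_ratio=0.05, seed=0):
    """Step 2: concatenate per-ad embeddings, then mask a random subset of ads.

    Assumption: one row per ad in both tables, and "masking" means
    zeroing the whole concatenated row.
    """
    emb = np.concatenate([w2v, deepwalk], axis=1)  # new array, inputs untouched
    rng = np.random.default_rng(seed)
    n_masked = int(round(mask_ratio * emb.shape[0]))
    masked_rows = rng.choice(emb.shape[0], size=n_masked, replace=False)
    emb[masked_rows] = 0.0
    return emb, masked_rows


def min_max_scale(values):
    """Step 3: rescale a dense feature to [0, 1] with min-max scaling."""
    arr = np.asarray(values, dtype=np.float64)
    lo, hi = arr.min(), arr.max()
    if hi == lo:
        # Constant feature: avoid division by zero.
        return np.zeros_like(arr)
    return (arr - lo) / (hi - lo)


if __name__ == "__main__":
    print(fill_missing([1.0, float("nan"), 3.0]))
    emb, masked = concat_and_mask(np.ones((100, 4)), np.ones((100, 3)))
    print(emb.shape, len(masked))
    print(min_max_scale([2.0, 4.0, 6.0]))
```

Min-max scaling is used here because the step asks for values in [0, 1]; whether the script computes the minimum and maximum over the training set only or over all data is not stated.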

8. Model training

mkdir submission
python train.py