  • Stars
    7
  • Rank 2,284,518 (Top 46 %)
  • Language
    Python
  • Created almost 5 years ago
  • Updated over 4 years ago

Repository Details

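As my work has gradually brought me into contact with the Spark ecosystem, I am learning Spark and PySpark by using them. I take top solutions based on Pandas and LightGBM that participants have shared from high-quality competitions on Kaggle, DataCastle, TianChi, JData, Kesci, ppd, and AiChallenger, and reproduce them with PySpark and LightGBM on Apache Spark. This familiarizes me with the features and interfaces of the relevant packages, and it also helps me understand the approaches, routines, and tricks that top competitors use for data mining and analysis. Since I started working I have had very little time for competitions; if I place well in a competition I enter myself, I will try to reproduce that solution too. Includes: IEEE-CIS Fraud Detection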