  • Stars: 1,755
  • Rank: 26,530 (Top 0.6%)
  • Language: C++
  • Created over 9 years ago
  • Updated about 3 years ago

Repository Details

1st Place Solution for CrowdFlower Product Search Results Relevance Competition on Kaggle.

Kaggle_CrowdFlower

1st Place Solution for Search Results Relevance Competition on Kaggle

The best single model we obtained during the competition was an XGBoost model with a linear booster, which scored 0.69322 on the Public LB and 0.70768 on the Private LB. Our final winning submission was a median ensemble of the 35 best Public LB submissions; it scored 0.70807 on the Public LB and 0.72189 on the Private LB.
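As a rough illustration of what a median ensemble does, the sketch below takes the per-row median across several prediction lists. It is not the repository's code: the function name `median_ensemble` and the toy data are hypothetical, and reading the actual submission files is left out.

```python
from statistics import median


def median_ensemble(submissions):
    """Combine several prediction lists by taking the per-row median.

    `submissions` is a list of equally long lists, one per submission.
    (Hypothetical helper for illustration only.)
    """
    if not submissions:
        raise ValueError("need at least one submission")
    n_rows = len(submissions[0])
    if any(len(s) != n_rows for s in submissions):
        raise ValueError("all submissions must have the same number of rows")
    # zip(*submissions) groups the predictions made for the same row.
    return [median(row) for row in zip(*submissions)]


# Toy example: three made-up submissions with four rows each.
subs = [
    [1, 4, 2, 3],
    [2, 4, 2, 1],
    [1, 3, 4, 3],
]
ensemble = median_ensemble(subs)
print(ensemble)
```

With an odd number of submissions (such as 35), the median of each row is always one of the submitted values, so integer predictions stay integers.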

FlowChart

[Figure: FlowChart (image not included)]

Documentation

See ./Doc/Kaggle_CrowdFlower_ChenglongChen.pdf for documentation.

Instructions

  • Download the data from the competition website and put all of it into the folder ./Data.
  • Run python ./Code/Feat/run_all.py to generate features. This will take a few hours.
  • Run python ./Code/Model/generate_best_single_model.py to generate the best single model submission. In our experience, it takes only a few trials to produce a model with the best or similar performance. See the training log in ./Output/Log/[Pre@solution]_[Feat@svd100_and_bow_Jun27]_[Model@reg_xgb_linear]_hyperopt.log for an example.
  • Run python ./Code/Model/generate_model_library.py to generate the model library. This is quite time-consuming, but you don't have to wait for the script to finish: you can run the next step once some models have been trained.
  • Run python ./Code/Model/generate_ensemble_submission.py to generate a submission via ensemble selection.
  • If you don't want to run the code, just submit the file in ./Output/Subm.
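The steps above could be chained with a small driver script such as the sketch below. The script paths come from the instructions; the helper names `build_commands` and `run_pipeline`, the dry-run default, and the choice of interpreter are assumptions for illustration. The repository dates from 2015, so it may expect an older Python than the one running this driver.

```python
import subprocess
import sys

# Script paths as listed in the instructions; run from the repository root.
PIPELINE = [
    "./Code/Feat/run_all.py",
    "./Code/Model/generate_best_single_model.py",
    "./Code/Model/generate_model_library.py",
    "./Code/Model/generate_ensemble_submission.py",
]


def build_commands(python=sys.executable, scripts=PIPELINE):
    """Return one argv list per pipeline step (hypothetical helper)."""
    return [[python, script] for script in scripts]


def run_pipeline(commands, dry_run=True):
    """Print each command and, unless dry_run is set, execute it in order."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the pipeline as soon as a step fails.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run by default: only prints the commands that would be executed.
    run_pipeline(build_commands(), dry_run=True)
```

Note that this driver runs the steps strictly one after another, so it waits for the model library step to finish. To start the ensemble step earlier, as the instructions allow, the last two scripts would have to be launched separately.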

More Repositories

  1. tensorflow-DeepFM (Python, 2,015 stars)
     TensorFlow implementation of DeepFM for CTR prediction.
  2. kaggle-HomeDepot (Python, 464 stars)
     3rd Place Solution for HomeDepot Product Search Results Relevance Competition on Kaggle.
  3. pytorch-DRL (Python, 405 stars)
     PyTorch implementations of various Deep Reinforcement Learning (DRL) algorithms for both single agent and multi-agent.
  4. tensorflow-XNN (Python, 278 stars)
     4th Place Solution for Mercari Price Suggestion Competition on Kaggle using DeepFM variant.
  5. tensorflow-DSMM (Python, 228 stars)
     TensorFlow implementations of various Deep Semantic Matching Models (DSMM).
  6. tensorflow-LTR (Python, 218 stars)
     TensorFlow implementations of various Learning to Rank (LTR) algorithms.
  7. caffe-windows (C++, 88 stars)
     Caffe Windows with realtime data augmentation.
  8. word2vec_cbow (Cuda, 43 stars)
     A high-performance CUDA port of the CBOW model of word2vec.
  9. Kaggle_Walmart-Recruiting-Store-Sales-Forecasting (R, 41 stars)
     R code for Kaggle's Walmart Recruiting - Store Sales Forecasting.
  10. batch_normalization (C++, 35 stars)
      Batch Normalization Layer for Caffe.
  11. Kaggle_The_Hunt_for_Prohibited_Content (Python, 28 stars)
      4th Place Solution for The Hunt for Prohibited Content Competition on Kaggle (http://www.kaggle.com/c/avito-prohibited-content).
  12. tensorflow-ASP-MTL (Python, 25 stars)
      A TensorFlow implementation of Adversarial Shared-Private Model for Multi-Task Learning and Transfer Learning.
  13. Kaggle_Loan_Default_Prediction (R, 22 stars)
      R code for Kaggle's Loan Default Prediction - Imperial College London challenge.
  14. Kaggle_Galaxy_Zoo (Python, 8 stars)
      Python & Theano code for Kaggle's Galaxy Zoo - The Galaxy Challenge.
  15. image-rotation-angle-estimation (MATLAB, 7 stars)
      Effective estimation of image rotation angle using spectral method.
  16. tensorflow-DTN (Python, 7 stars)
      A TensorFlow implementation of Domain Transfer Network.
  17. Kaggle_Higgs_Boson_Machine_Learning_Challenge (R, 6 stars)
      R's GBM model for Higgs Boson Machine Learning Challenge.
  18. Long-Capital (Python, 6 stars)
      Quant Trading with Microsoft Qlib (https://github.com/microsoft/qlib).
  19. Kaggle_Acquire_Valued_Shoppers_Challenge (Python, 5 stars)
      Code for Kaggle's Acquire Valued Shoppers Challenge.
  20. Kaggle_Greek_Media_Monitoring_Multilabel_Classification (MATLAB, 5 stars)
      Code for Kaggle's Greek Media Monitoring Multilabel Classification (WISE 2014).
  21. Stanford_CS229_Note (TeX, 3 stars)
      A draft note for Stanford CS229 Machine Learning course.
  22. GLF_Features_for_Median_Filtering_Forensics (MATLAB, 2 stars)
      MATLAB Toolbox for GLF Features for Median Filtering Forensics.