Code for the Kaggle Ensembling Guide Article on MLWave

Kaggle-Ensemble-Guide

A collection of model ensembling methods that can help improve the accuracy of Kaggle submissions. For more information: http://mlwave.com/kaggle-ensembling-guide/

Installation:

$ pip install -r requirements.txt

Example:

$ python ./src/correlations.py ./samples/method1.csv ./samples/method2.csv
Finding correlation between: ./samples/method1.csv and ./samples/method2.csv
Column to be measured: Label
Pearson's correlation score: 0.67898
Kendall's correlation score: 0.66667
Spearman's correlation score: 0.71053

$ python ./src/kaggle_vote.py "./samples/method*.csv" "./samples/kaggle_vote.csv"
parsing: ./samples/method1.csv
parsing: ./samples/method2.csv
parsing: ./samples/method3.csv
wrote to ./samples/kaggle_vote.csv


$ python ./src/kaggle_vote.py "./samples/_*.csv" "./samples/kaggle_vote_weighted.csv" "weighted"
parsing: ./samples/_w3_method1.csv
Using weight: 3
parsing: ./samples/_w2_method2.csv
Using weight: 2
parsing: ./samples/_w2_method3.csv
Using weight: 2
wrote to ./samples/kaggle_vote_weighted.csv

$ python ./src/kaggle_rankavg.py "./samples/method*.csv" "./samples/kaggle_rankavg.csv"
parsing: ./samples/method1.csv
parsing: ./samples/method2.csv
parsing: ./samples/method3.csv
wrote to ./samples/kaggle_rankavg.csv

$ python ./src/kaggle_avg.py "./samples/method*.csv" "./samples/kaggle_avg.csv"
parsing: ./samples/method1.csv
parsing: ./samples/method2.csv
parsing: ./samples/method3.csv
wrote to ./samples/kaggle_avg.csv

$ python ./src/kaggle_geomean.py "./samples/method*.csv" "./samples/kaggle_geomean.csv"
parsing: ./samples/method1.csv
parsing: ./samples/method2.csv
parsing: ./samples/method3.csv
wrote to ./samples/kaggle_geomean.csv
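
The three scores printed by correlations.py can be checked by hand on the sample data. The following is a minimal standard-library sketch, not the repository's implementation: the function names are hypothetical, the Label columns are copied from method1.csv and method2.csv as listed under Result, and Kendall's score is assumed to be the tau-b variant (the one that corrects for ties).

```python
from itertools import combinations
from math import sqrt

# Label columns copied from ./samples/method1.csv and ./samples/method2.csv.
method1 = [1, 0, 9, 9, 3]
method2 = [2, 0, 6, 2, 3]


def pearson(x, y):
    """Pearson's r: covariance divided by the product of the spreads."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    covariance = sum(a * b for a, b in zip(dx, dy))
    return covariance / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for i in order[start:end + 1]:
            ranks[i] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def kendall_tau_b(x, y):
    """Kendall's tau-b: concordant minus discordant pairs, tie-corrected."""
    concordant = discordant = ties_x_only = ties_y_only = 0
    for i, j in combinations(range(len(x)), 2):
        dx, dy = x[i] - x[j], y[i] - y[j]
        if dx == 0 and dy == 0:
            continue  # tied in both columns: excluded from both factors
        if dx == 0:
            ties_x_only += 1
        elif dy == 0:
            ties_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    pairs = concordant + discordant
    return (concordant - discordant) / sqrt(
        (pairs + ties_x_only) * (pairs + ties_y_only)
    )


print("Pearson:  %.5f" % pearson(method1, method2))
print("Kendall:  %.5f" % kendall_tau_b(method1, method2))
print("Spearman: %.5f" % spearman(method1, method2))
```

On these two columns the sketch should land close to the scores in the correlations.py output above. Lower correlation between two submissions generally means their errors overlap less, which is when ensembling them tends to help most.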

Result:

==> ./samples/method1.csv <==
ImageId,Label
1,1
2,0
3,9
4,9
5,3

==> ./samples/method2.csv <==
ImageId,Label
1,2
2,0
3,6
4,2
5,3

==> ./samples/method3.csv <==
ImageId,Label
1,2
2,0
3,9
4,2
5,3

==> ./samples/kaggle_avg.csv <==
ImageId,Label
1,1.666667
2,0.000000
3,8.000000
4,4.333333
5,3.000000

==> ./samples/kaggle_rankavg.csv <==
ImageId,Label
1,0.25
2,0.0
3,1.0
4,0.5
5,0.75

==> ./samples/kaggle_vote.csv <==
ImageId,Label
1,2
2,0
3,9
4,2
5,3

==> ./samples/kaggle_geomean.csv <==
ImageId,Label
1,1.587401
2,0.000000
3,7.862224
4,3.301927
5,3.000000
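
The four output files above can be reproduced from the three method files with a few lines of Python. This is a minimal sketch under stated assumptions, not the code in ./src: the function names are hypothetical, the inputs are hard-coded instead of read from CSV, Python 3.8+ is assumed for math.prod, and ties in the rank average are assumed to be broken by row order, which happens to match the sample output.

```python
from collections import Counter
from math import prod

# Label columns copied from method1.csv, method2.csv and method3.csv.
method1 = [1, 0, 9, 9, 3]
method2 = [2, 0, 6, 2, 3]
method3 = [2, 0, 9, 2, 3]
predictions = [method1, method2, method3]


def average(preds):
    """Arithmetic mean of each row across all models."""
    return [sum(row) / len(row) for row in zip(*preds)]


def geometric_mean(preds):
    """n-th root of the product of each row (assumes non-negative values)."""
    return [prod(row) ** (1.0 / len(row)) for row in zip(*preds)]


def vote(preds, weights=None):
    """Most common label per row; each model casts `weight` votes."""
    weights = weights or [1] * len(preds)
    winners = []
    for row in zip(*preds):
        tally = Counter()
        for label, weight in zip(row, weights):
            tally[label] += weight
        winners.append(tally.most_common(1)[0][0])
    return winners


def rank_average(preds):
    """Rank rows within each model, sum the ranks, rescale to [0, 1]."""
    n = len(preds[0])
    rank_sums = [0] * n
    for column in preds:
        # sorted() is stable, so tied values keep their row order (assumption).
        order = sorted(range(n), key=lambda i: column[i])
        for rank, row in enumerate(order):
            rank_sums[row] += rank
    # Rank the summed ranks once more, then divide by the largest rank.
    final_order = sorted(range(n), key=lambda i: rank_sums[i])
    result = [0.0] * n
    for rank, row in enumerate(final_order):
        result[row] = rank / (n - 1)
    return result


print("avg:    ", ["%f" % v for v in average(predictions)])
print("geomean:", ["%f" % v for v in geometric_mean(predictions)])
print("vote:   ", vote(predictions))
print("rankavg:", rank_average(predictions))
```

Passing weights=[3, 2, 2] to vote mirrors the _w3_ and _w2_ filename prefixes used in the weighted example. A single zero in a row forces the geometric mean of that row to zero, so it suits strictly positive scores such as probabilities better than class labels.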
