Kaggle - Classification
"Those who cannot remember the past are condemned to repeat it." -- George Santayana
This is a compiled list of Kaggle competitions and their winning solutions for classification problems.
This list was compiled to make these solutions easier to find, and therefore easier to learn from the best in data science.
Literature review is a crucial yet sometimes overlooked part of data science. To avoid reinventing the wheel and to get ideas on how to preprocess, engineer, and model the data, it's worth spending 1/10 to 1/5 of the project time just researching how people have dealt with similar problems and datasets.
Time spent on literature review is time well spent.
This is only one list in the whole compilation. For the other lists of competitions and solutions, please refer to:
- Kaggle - Regression
- Kaggle - Sequence
- Kaggle - Image
- Kaggle - Miscellaneous
I hope the compilation saves you effort and offers you insights. Enjoy!
======
Titanic: Machine Learning from Disaster
Fri 28 Sep 2012 – Sat 31 Dec 2016
Predict survival on the Titanic using Excel, Python, R & Random Forests
======
TalkingData Mobile User Demographics
Mon 11 Jul 2016 – Mon 5 Sep 2016
Get to know millions of mobile device users
======
Shelter Animal Outcomes
Mon 21 Mar 2016 – Sun 31 Jul 2016
Help improve outcomes for shelter animals
======
San Francisco Crime Classification
Tue 2 Jun 2015 – Mon 6 Jun 2016
Predict the category of crimes that occurred in the city by the bay
======
Santander Customer Satisfaction
Wed 2 Mar 2016 – Mon 2 May 2016
Which customers are happy customers?
======
BNP Paribas Cardif Claims Management
Wed 3 Feb 2016 – Mon 18 Apr 2016
Can you accelerate BNP Paribas Cardif's claims management process?
======
March Machine Learning Mania 2016
Thu 11 Feb 2016 – Tue 5 Apr 2016
Predict the 2016 NCAA Basketball Tournament
======
Telstra Network Disruptions
Wed 25 Nov 2015 – Mon 29 Feb 2016
Telstra is challenging Kagglers to predict the severity of service disruptions on their network. Using a dataset of features from their service logs, you're tasked with predicting if a disruption is a momentary glitch or a total interruption of connectivity.
======
Prudential Life Insurance Assessment
Mon 23 Nov 2015 – Mon 15 Feb 2016
By developing a predictive model that accurately classifies risk using a more automated approach, you can greatly impact public perception of the industry.
======
The Allen AI Science Challenge
Wed 7 Oct 2015 – Sat 13 Feb 2016
Using a dataset of multiple-choice questions and answers from a standardized 8th grade science exam, AI2 is challenging you to create a model that gets to the head of the class.
======
Airbnb New User Bookings
Wed 25 Nov 2015 – Thu 11 Feb 2016
In this recruiting competition, Airbnb challenges you to predict in which country a new user will make his or her first booking.
======
Homesite Quote Conversion
Mon 9 Nov 2015 – Mon 8 Feb 2016
Which customers will purchase a quoted insurance plan?
======
Walmart Recruiting: Trip Type Classification
Mon 26 Oct 2015 – Sun 27 Dec 2015
Walmart is challenging Kagglers to focus on the (data) science and classify customer trips using only a transactional dataset of the items they've purchased.
======
What's Cooking?
Wed 9 Sep 2015 – Sun 20 Dec 2015
Use recipe ingredients to categorize the cuisine
======
Springleaf Marketing Response
Fri 14 Aug 2015 – Mon 19 Oct 2015
Determine whether to send a direct mail piece to a customer
======
Truly Native?
Thu 6 Aug 2015 – Wed 14 Oct 2015
Predict which web pages served by StumbleUpon are sponsored
======
Flavours of Physics: Finding τ → μμμ
Mon 20 Jul 2015 – Mon 12 Oct 2015
Identify a rare decay phenomenon
======
Avito Context Ad Clicks
Tue 2 Jun 2015 – Tue 28 Jul 2015
Predict if context ads will earn a user's click
======
Crowdflower Search Results Relevance
Mon 11 May 2015 – Mon 6 Jul 2015
Predict the relevance of search results from eCommerce sites
======
West Nile Virus Prediction
Wed 22 Apr 2015 – Wed 17 Jun 2015
Predict West Nile virus in mosquitos across the city of Chicago
======
Facebook Recruiting IV: Human or Robot?
Mon 27 Apr 2015 – Mon 8 Jun 2015
The goal of this competition is to identify online auction bids that are placed by "robots", helping the site owners easily flag these users for removal from their site to prevent unfair auction activity.
======
Poker Rule Induction
Wed 3 Dec 2014 – Mon 1 Jun 2015
Determine the poker hand of five playing cards
======
Random Acts of Pizza
Thu 29 May 2014 – Mon 1 Jun 2015
Predicting altruism through free pizza
======
Otto Group Product Classification Challenge
Tue 17 Mar 2015 – Mon 18 May 2015
Classify products into the correct category
======
Forest Cover Type Prediction
Fri 16 May 2014 – Mon 11 May 2015
Use cartographic variables to classify forest categories
======
Microsoft Malware Classification Challenge (BIG 2015)
Tue 3 Feb 2015 – Fri 17 Apr 2015
Classify malware into families based on file content and characteristics
======
March Machine Learning Mania 2015
Mon 2 Feb 2015 – Tue 7 Apr 2015
Predict the 2015 NCAA Basketball Tournament
======
BCI Challenge @ NER 2015
Wed 19 Nov 2014 – Tue 24 Feb 2015
A spell on you if you cannot detect errors!
======
Click-Through Rate Prediction
Tue 18 Nov 2014 – Mon 9 Feb 2015
Predict whether a mobile ad will be clicked
======
Data Science London + Scikit-learn
Wed 6 Mar 2013 – Wed 31 Dec 2014
Scikit-learn is an open-source machine learning library for Python. Give it a try here!
======
Display Advertising Challenge
Tue 24 Jun 2014 – Tue 23 Sep 2014
Predict click-through rates on display ads
======
MLSP 2014 Schizophrenia Classification Challenge
Thu 5 Jun 2014 – Sun 20 Jul 2014
Diagnose schizophrenia using multimodal features from MRI scans
======
Greek Media Monitoring Multilabel Classification (WISE 2014)
Mon 2 Jun 2014 – Tue 15 Jul 2014
Multi-label classification of printed media articles to topics
======
KDD Cup 2014 - Predicting Excitement at DonorsChoose.org
Thu 15 May 2014 – Tue 15 Jul 2014
Predict funding requests that deserve an A+
======
Acquire Valued Shoppers Challenge
Thu 10 Apr 2014 – Mon 14 Jul 2014
Predict which shoppers will become repeat buyers
======
Allstate Purchase Prediction Challenge
Tue 18 Feb 2014 – Mon 19 May 2014
Predict a purchased policy based on transaction history
======
March Machine Learning Mania
Tue 7 Jan 2014 – Tue 8 Apr 2014
Tip off college basketball by predicting the 2014 NCAA Tournament
======
Accelerometer Biometric Competition
Tue 23 Jul 2013 – Fri 22 Nov 2013
Recognize users of mobile devices from accelerometer data
======
StumbleUpon Evergreen Classification Challenge
Fri 16 Aug 2013 – Thu 31 Oct 2013
Build a classifier to categorize webpages as evergreen or non-evergreen
======
Cause-effect pairs
Fri 29 Mar 2013 – Mon 2 Sep 2013
Given samples from a pair of variables A, B, find whether A is a cause of B.
======
Amazon.com - Employee Access Challenge
Wed 29 May 2013 – Wed 31 Jul 2013
Predict an employee's access needs, given his/her job role
======
KDD Cup 2013 - Author Disambiguation Challenge (Track 2)
Fri 19 Apr 2013 – Wed 12 Jun 2013
Identify which authors correspond to the same person
======
Predict Closed Questions on Stack Overflow
Tue 21 Aug 2012 – Sat 3 Nov 2012
Predict which new questions asked on Stack Overflow will be closed
======
Merck Molecular Activity Challenge
Thu 16 Aug 2012 – Tue 16 Oct 2012
Help develop safe and effective medicines by predicting molecular activity.
======
Data Mining Hackathon on BIG DATA (7GB) Best Buy mobile web site
Sat 18 Aug 2012 – Sun 30 Sep 2012
Predict which BestBuy product a mobile web visitor will be most interested in based on their search query or behavior over 2 years (7 GB).
======
Data Mining Hackathon on (20 mb) Best Buy mobile web site - ACM SF Bay Area Chapter
Sat 18 Aug 2012 – Sun 30 Sep 2012
Getting Started - Predict which Xbox game a visitor will be most interested in based on their search query. (20 MB)
======
Practice Fusion Diabetes Classification
Tue 10 Jul 2012 – Mon 10 Sep 2012
Identify patients diagnosed with Type 2 Diabetes
======
Personality Prediction Based on Twitter Stream
Tue 8 May 2012 – Fri 29 Jun 2012
Identify the best performing model(s) to predict personality traits based on Twitter usage
======
Predicting a Biological Response
Fri 16 Mar 2012 – Fri 15 Jun 2012
Predict a biological response of molecules from their chemical properties
======
Eye Movements Verification and Identification Competition
Tue 20 Mar 2012 – Sun 15 Apr 2012
Determine how people may be identified based on their eye movement characteristics.
======
What Do You Know?
Fri 18 Nov 2011 – Wed 29 Feb 2012
Improve the state of the art in student evaluation by predicting whether a student will answer the next test question correctly.
======
Don't Get Kicked!
Fri 30 Sep 2011 – Thu 5 Jan 2012
Predict if a car purchased at auction is a lemon (new car with defects)
======
Give Me Some Credit
Mon 19 Sep 2011 – Thu 15 Dec 2011
Improve on the state of the art in credit scoring by predicting the probability that somebody will experience financial distress in the next two years.
======
Photo Quality Prediction
Sat 29 Oct 2011 – Sun 20 Nov 2011
Given anonymized information on thousands of photo albums, predict whether a human evaluator would mark them as 'good'.
======
Don't Overfit!
Mon 28 Feb 2011 – Sun 15 May 2011
With nearly as many variables as training cases, what are the best techniques to avoid disaster?
======
Stay Alert! The Ford Challenge
Wed 19 Jan 2011 – Wed 9 Mar 2011
Driving while not alert can be deadly. The objective is to design a classifier that will detect whether the driver is alert or not alert, employing data that are acquired while driving.
======
Predict Grant Applications
Mon 13 Dec 2010 – Sun 20 Feb 2011
This task requires participants to predict the outcome of grant applications for the University of Melbourne.
======
IJCNN Social Network Challenge
Mon 8 Nov 2010 – Tue 11 Jan 2011
This competition requires participants to predict edges in an online social network. The winner will receive free registration and the opportunity to present their solution at IJCNN 2011.
======
R Package Recommendation Engine
Sun 10 Oct 2010 – Tue 8 Feb 2011
The aim of this competition is to develop a recommendation engine for R libraries (or packages). (R is open-source statistics software.)
======
INFORMS Data Mining Contest 2010
Mon 21 Jun 2010 – Sun 10 Oct 2010
The goal of this contest is to predict short-term movements in stock prices. The winners of this contest will be honoured at the INFORMS Annual Meeting in Austin, Texas (November 7-10).
======
Predict HIV Progression
Tue 27 Apr 2010 – Mon 2 Aug 2010
This contest requires competitors to predict the likelihood that an HIV patient's infection will become less severe, given a small dataset and limited clinical information.
======
Forecast Eurovision Voting
Wed 7 Apr 2010 – Tue 25 May 2010
This competition requires contestants to forecast the voting for this year's Eurovision Song Contest in Norway on May 25th, 27th and 29th.
======