  • Stars
    267
  • Rank 152,734 (Top 4 %)
  • Language
    Jupyter Notebook
  • Created almost 6 years ago
  • Updated almost 5 years ago

Repository Details

Humpback Whale Identification Competition Starter Pack

The code in this repo is all you need to make a first submission to the Humpback Whale Identification Competition. Everything up to point 7 in the Navigating through the repository list below uses fastai release 1.0.36.post1. This is important: you are likely to encounter an error if you use any other version of the library. After point 7 I switch to 1.0.39.

For additional information please refer to discussion threads on Kaggle forums: classification, feature learning, detection.

Some people have reported issues running the first_submission notebook. If you run into them, you should be okay to skip to the subsequent notebooks. The one that scores 0.760 on the LB is only_known_train.ipynb.

Making first submission

  1. Install the fastai library, specifically version 1.0.36.post1. The easiest way to do this is to follow the developer install as outlined in the README of the fastai repository. Once the installation is done, navigate to the fastai directory and execute git checkout 1.0.36.post1. You can verify that this worked by executing the following inside jupyter notebook or a Python REPL (it should show 1.0.36.post1):
     import fastai
     fastai.__version__
  2. Clone this repository and cd into data. Download the competition data by running kaggle competitions download -c humpback-whale-identification. If you get a 403, you might need to agree to the competition rules on the competition website.
  3. Create the train directory and extract the files by running mkdir train && unzip train.zip -d train
  4. Do the same for test: mkdir test && unzip test.zip -d test
  5. Open first_submission.ipynb in jupyter notebook and run all cells.
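
The two extraction steps above can also be done from Python, which may be handy on systems without unzip. This is only a minimal sketch, assuming train.zip and test.zip have already been downloaded into the data directory. The helper name extract_archives is made up for illustration and is not part of this repository.

```python
import zipfile
from pathlib import Path


def extract_archives(data_dir, names=("train", "test")):
    """Extract <name>.zip into <data_dir>/<name>/ for each name.

    Roughly mirrors `mkdir train && unzip train.zip -d train`.
    Hypothetical helper: not part of this repository.
    Returns a dict mapping each name to the number of archive entries.
    """
    data_dir = Path(data_dir)
    counts = {}
    for name in names:
        archive = data_dir / f"{name}.zip"
        target = data_dir / name
        # Unlike a plain mkdir, this does not fail if the directory exists.
        target.mkdir(exist_ok=True)
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
            counts[name] = len(zf.namelist())
    return counts
```

If you are already inside data, calling extract_archives(".") should leave you with the same train and test directories that the shell commands produce.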

Navigating through the repository

Here is the order in which I worked on the notebooks:

  1. first_submission - getting all the basics in place
  2. new_whale_detector - binary classifier known_whale / new_whale
  3. oversample - addressing class imbalance
  4. only_known_research - how to modify the architecture and what hyperparams to use
  5. only_known_train - training on full dataset
  6. resize - resize the images before training to free up CPU
  7. siamese network - a fully working prototype of a siamese network
  8. !!! Important !!! - at this point I switch to fastai 1.0.39 to make use of some of the new functionality it offers.
  9. fluke detection - train a model to draw bounding boxes surrounding flukes
  10. !!! Important !!! - here I switch to fastai master to incorporate a bug fix, will annotate with version once a new release comes out
  11. fluke detection redux - better results, less code, works with current fastai master
  12. extract bboxes - predicted bounding box extraction in images of specified size
  13. classification and metric learning - training the model for predicting whale ids, places in top 7% of the competition
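
Since the required fastai version changes partway through the list, it may help to keep the mapping in one place. The sketch below is an illustration only: the names FASTAI_BY_NOTEBOOK and check_fastai are made up, the keys are the labels from the list above rather than confirmed file names, and treating every entry after the switch to master as needing master is my reading of the list, not something the repository states per notebook.

```python
# fastai version per notebook, as read from the list above.
# "master" means an unreleased checkout, so it has no fixed version string.
FASTAI_BY_NOTEBOOK = {
    "first_submission": "1.0.36.post1",
    "new_whale_detector": "1.0.36.post1",
    "oversample": "1.0.36.post1",
    "only_known_research": "1.0.36.post1",
    "only_known_train": "1.0.36.post1",
    "resize": "1.0.36.post1",
    "siamese network": "1.0.36.post1",
    "fluke detection": "1.0.39",
    # Assumption: everything after the switch to master needs master.
    "fluke detection redux": "master",
    "extract bboxes": "master",
    "classification and metric learning": "master",
}


def check_fastai(notebook, installed):
    """Compare an installed version string with what the notebook expects.

    Returns "ok", "mismatch", or "unverifiable" (for notebooks that need
    fastai master, where a version string cannot confirm the checkout).
    Hypothetical helper: not part of this repository.
    """
    required = FASTAI_BY_NOTEBOOK[notebook]
    if required == "master":
        return "unverifiable"
    return "ok" if installed == required else "mismatch"
```

Inside a notebook you could call check_fastai("only_known_train", fastai.__version__) before running anything else and stop early on "mismatch".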
