TGS Salt Identification Challenge

This is an open solution to the TGS Salt Identification Challenge.

Note

This repository was archived in March 2021 and, unfortunately, we can no longer provide support for it. Hopefully it still works, but if it doesn't, we cannot really help.

More competitions πŸŽ‡

Check our collection of public projects 🎁, where you can find multiple Kaggle competitions with code, experiments, and outputs.

Our goals

We are building an entirely open solution to this competition. Specifically, we want to:

  1. Learn from the process: posting updates about new ideas, code, and experiments is the best way to learn data science. Our activity is especially useful for people who want to enter the competition but lack the appropriate experience.
  2. Encourage more Kagglers to start working on this competition.
  3. Deliver an open source solution with no strings attached. The code is available in our GitHub repository 💻. This solution should establish a solid benchmark, as well as provide a good base for your custom ideas and experiments. We care about clean code 😃
  4. Open our experiments as well: everybody can have a live preview of our experiments, parameters, code, etc. Check the TGS Salt Identification Challenge project 📈 or the screen below.
Train and validation monitor 📊

[screenshot: training monitor]

Disclaimer

In this open source solution you will find references to neptune.ai. It is a free platform for community users, which we use daily to keep track of our experiments. Please note that using neptune.ai is not necessary to proceed with this solution. You may run it as a plain Python script 🐍.

How to start?

Learn about our solutions

  1. Check the Kaggle forum and participate in the discussions.
  2. See the solutions below (CV is our local cross-validation score; LB is the public leaderboard score):
Link to Experiments   CV      LB      Open
solution 1            0.413   0.745   True
solution 2            0.794   0.798   True
solution 3            0.807   0.801   True
solution 4            0.802   0.809   True
solution 5            0.804   0.813   True
solution 6            0.819   0.824   True
solution 7            0.829   0.837   True
solution 8            0.830   0.845   True
solution 9            0.853   0.849   True

Start experimenting with ready-to-use code

You can jump-start your participation in the competition by using our starter pack. The installation instructions below will guide you through the setup.

Installation

Clone repository

git clone https://github.com/minerva-ml/open-solution-salt-identification.git
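
Then enter the project directory:

cd open-solution-salt-identification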

Set up the environment

You can set up the project with the default environment variables and the open NEPTUNE_API_TOKEN by running:

source Makefile

We suggest at least reading the step-by-step instructions below to know what is happening.

Install the conda environment salt

conda env create -f environment.yml

After it is installed, you can activate/deactivate it by running:

conda activate salt
conda deactivate

Register at neptune.ai (if you wish to use it). Even if you don't register, you can still see your experiment in Neptune: just go to the shared/showroom project and find it.

Set the environment variables NEPTUNE_API_TOKEN and CONFIG_PATH.

If you are using the default neptune.yaml config, run:

export CONFIG_PATH=neptune.yaml

otherwise, point CONFIG_PATH at your own config file.

Registered in Neptune:

Set the NEPTUNE_API_TOKEN variable to your personal token:

export NEPTUNE_API_TOKEN=your_account_token

Create a new project in Neptune, then open your config file (neptune.yaml) and change the project name:

project: USER_NAME/PROJECT_NAME

Not registered in Neptune:

Use the open (public) token:

export NEPTUNE_API_TOKEN=eyJhcGlfYWRkcmVzcyI6Imh0dHBzOi8vdWkubmVwdHVuZS5tbCIsImFwaV9rZXkiOiJiNzA2YmM4Zi03NmY5LTRjMmUtOTM5ZC00YmEwMzZmOTMyZTQifQ==

Create the data folder structure and set the data paths in your config file (neptune.yaml).

Suggested directory structure:

project
|-- README.md
|-- ...
|-- data
    |-- raw
        |-- train
            |-- images
            |-- masks
        |-- test
            |-- images
        |-- train.csv
        |-- sample_submission.csv
    |-- meta
        |-- depths.csv
        |-- metadata.csv            # this is generated
        |-- auxiliary_metadata.csv  # this is generated
    |-- stacking_data
        |-- out_of_folds_predictions  # put oof predictions for multiple models/pipelines here
    |-- experiments
        |-- baseline  # this is where your experiment files will be dumped
            |-- checkpoints   # neural network checkpoints
            |-- transformers  # serialized transformers after fitting
            |-- outputs       # outputs of transformers if you specified save_output=True anywhere
            |-- out_of_fold_train_predictions.pkl  # oof predictions on train
            |-- out_of_fold_test_predictions.pkl   # oof predictions on test
            |-- submission.csv
        |-- empty_non_empty
        |-- new_idea_exp
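
If you are starting from scratch, you can bootstrap the empty directories with a few lines of Python (a convenience sketch, not part of the repo; the CSV files come from the Kaggle competition data):

from pathlib import Path

# Create the suggested data layout; existing directories are left untouched.
for d in [
    "data/raw/train/images",
    "data/raw/train/masks",
    "data/raw/test/images",
    "data/meta",
    "data/stacking_data/out_of_folds_predictions",
    "data/experiments",
]:
    Path(d).mkdir(parents=True, exist_ok=True)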

In the neptune.yaml config file, change the data paths if you decide on a different structure:

  # Data Paths
  train_images_dir: data/raw/train
  test_images_dir: data/raw/test
  metadata_filepath: data/meta/metadata.csv
  depths_filepath: data/meta/depths.csv
  auxiliary_metadata_filepath: data/meta/auxiliary_metadata.csv
  stacking_data_dir: data/stacking_data
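
A missing or misspelled path is an easy mistake at this point. The snippet below (our suggestion, not part of the repo; it assumes PyYAML is installed) walks the parsed config and reports every *_dir or *_filepath entry that does not exist yet. Note that the generated files (metadata.csv, auxiliary_metadata.csv) will show as missing until you run prepare_metadata.py:

import os
import yaml

with open("neptune.yaml") as f:
    config = yaml.safe_load(f)

def report_paths(node):
    # Recursively visit the config and print the status of every data path.
    if isinstance(node, dict):
        for key, value in node.items():
            if isinstance(value, str) and key.endswith(("_dir", "_filepath")):
                status = "ok" if os.path.exists(value) else "MISSING"
                print(f"{status:8s}{key}: {value}")
            else:
                report_paths(value)

report_paths(config)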

Run an experiment based on U-Net:

Prepare metadata:

python prepare_metadata.py

Training and inference: everything happens in main.py. Whenever you try a new idea, make sure to change the experiment name:

EXPERIMENT_NAME = 'baseline'

to a new name.

python main.py

You can always change the pipeline you want to run in main.py. For example, to run just training and evaluation, change main.py to:

if __name__ == '__main__':
    train_evaluate_cv()

References

1. Lovász Loss

@InProceedings{Berman_2018_CVPR,
  author    = {Berman, Maxim and Rannen Triki, Amal and Blaschko, Matthew B.},
  title     = {The Lovász-Softmax Loss: A Tractable Surrogate for the Optimization of the Intersection-Over-Union Measure in Neural Networks},
  booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  month     = {June},
  year      = {2018}
}
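
For context, below is a minimal PyTorch sketch of the binary Lovász hinge loss from this paper, adapted from the authors' public reference implementation (github.com/bermanmaxim/LovaszSoftmax). It is shown for illustration only and is not necessarily the exact code used in this pipeline:

import torch
import torch.nn.functional as F

def lovasz_grad(gt_sorted):
    # Gradient of the Lovasz extension of the Jaccard loss,
    # evaluated at the errors sorted in decreasing order.
    p = len(gt_sorted)
    gts = gt_sorted.sum()
    intersection = gts - gt_sorted.cumsum(0)
    union = gts + (1.0 - gt_sorted).cumsum(0)
    jaccard = 1.0 - intersection / union
    if p > 1:
        jaccard[1:p] = jaccard[1:p] - jaccard[0:-1]
    return jaccard

def lovasz_hinge_flat(logits, labels):
    # logits: flat tensor of raw scores; labels: flat float tensor of 0s and 1s.
    signs = 2.0 * labels - 1.0
    errors = 1.0 - logits * signs
    errors_sorted, perm = torch.sort(errors, dim=0, descending=True)
    grad = lovasz_grad(labels[perm])
    return torch.dot(F.relu(errors_sorted), grad)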

Get involved

You are welcome to contribute your code and ideas to this open solution. To get started:

  1. Check the competition project on GitHub to see what we are working on right now.
  2. Express your interest in a particular task by writing a comment on that task, or by creating a new one with your fresh idea.
  3. We will get back to you quickly in order to start working together.
  4. Check CONTRIBUTING for some more information.

User support

There are several ways to seek help:

  1. The Kaggle discussion forum is our primary way of communication.
  2. Submit an issue directly in this repo.
