
carrier-of-tricks-for-classification-pytorch

A carrier of tricks for image classification tutorials using PyTorch. Based on the 2019 CVPR paper "Bag of Tricks for Image Classification with Convolutional Neural Networks", this repository implements a classification codebase on a custom dataset.

0. Experimental Setup (I used a single GTX 1080 Ti GPU!)

0-1. Prepare Library

pip install -r requirements.txt

0-2. Download dataset (Kaggle Intel Image Classification)

This dataset contains around 25k images of size 150x150, distributed across 6 categories: {'buildings' -> 0, 'forest' -> 1, 'glacier' -> 2, 'mountain' -> 3, 'sea' -> 4, 'street' -> 5}.
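
For reference, a minimal sketch of loading the extracted dataset with torchvision's ImageFolder; the directory path is an assumption about where the Kaggle archive was unpacked, and ImageFolder's alphabetical folder order reproduces the label mapping above:

import torchvision.transforms as T
from torch.utils.data import DataLoader
from torchvision.datasets import ImageFolder

# Assumed extraction path; adjust to your local layout.
train_tf = T.Compose([T.Resize((256, 256)), T.RandomHorizontalFlip(), T.ToTensor()])
train_set = ImageFolder("data/seg_train", transform=train_tf)
print(train_set.class_to_idx)  # {'buildings': 0, ..., 'street': 5}
train_loader = DataLoader(train_set, batch_size=64, shuffle=True, num_workers=4)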

0-3. Download ImageNet-Pretrained Weights (EfficientNet, RegNet)

1. Baseline Training Setting

  • ImageNet Pretrained ResNet-50 from torchvision.models
  • 1080 Ti 1 GPU / Batch Size 64 / Epochs 120 / Initial Learning Rate 0.1
  • Training Augmentation: Resize((256, 256)), RandomHorizontalFlip()
  • SGD + Momentum(0.9) + learning rate step decay (x0.1 at epochs 30, 60, 90); a sketch of this recipe follows the command below
python main.py --checkpoint_name baseline;
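
A minimal sketch of the baseline recipe described above, assuming the standard torchvision API rather than the repo's exact code:

import torch
import torchvision

model = torchvision.models.resnet50(pretrained=True)
model.fc = torch.nn.Linear(model.fc.in_features, 6)  # 6 Intel Image classes

optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
# Multiply the learning rate by 0.1 at epochs 30, 60, and 90.
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[30, 60, 90], gamma=0.1)

for epoch in range(120):
    ...  # one full training pass over the data
    scheduler.step()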

1-1. Simple Trials

  • Randomly initialized ResNet-50 (trained from scratch)
python main.py --checkpoint_name baseline_scratch --pretrained 0;
  • Adam optimizer with a small learning rate (1e-4 worked best!)
python main.py --checkpoint_name baseline_Adam --optimizer ADAM --learning_rate 0.0001

2. Bag of Tricks from Original Papers

Before starting, note that I did not try No Bias Decay, Low-Precision Training, ResNet Model Tweaks, or Knowledge Distillation.

2-1. Learning Rate Warmup

  • Use the first 5 epochs for linear warmup (see the sketch after the commands below)
python main.py --checkpoint_name baseline_warmup --decay_type step_warmup;
python main.py --checkpoint_name baseline_Adam_warmup --optimizer ADAM --learning_rate 0.0001 --decay_type step_warmup;
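
A sketch of one common way to express this schedule with LambdaLR: linear warmup over the first 5 epochs, then the x0.1 step decay from the baseline (the repo's step_warmup may differ in detail):

import torch

model = torch.nn.Linear(10, 6)  # stand-in for the real network
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

def step_warmup(epoch, warmup_epochs=5):
    # Ramp the LR linearly for 5 epochs, then decay x0.1 at epochs 30/60/90.
    if epoch < warmup_epochs:
        return (epoch + 1) / warmup_epochs
    return 0.1 ** sum(epoch >= m for m in (30, 60, 90))

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=step_warmup)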

2-2. Zero gamma in Batch Normalization

  • Zero-initialize the gamma of the last BN in each residual branch (see the sketch after the commands below)
python main.py --checkpoint_name baseline_zerogamma --zero_gamma;
python main.py --checkpoint_name baseline_warmup_zerogamma --decay_type step_warmup --zero_gamma;
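
For torchvision's ResNet-50, this means zeroing the weight (gamma) of bn3, the last BatchNorm in every Bottleneck, so each residual branch contributes nothing at initialization; a sketch:

import torch.nn as nn
import torchvision

model = torchvision.models.resnet50()
for module in model.modules():
    if isinstance(module, torchvision.models.resnet.Bottleneck):
        # With gamma = 0, the block initially outputs only its identity
        # shortcut, which eases optimization early in training.
        nn.init.zeros_(module.bn3.weight)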

2-3. Cosine Learning Rate Annealing

python main.py --checkpoint_name baseline_Adam_warmup_cosine --optimizer ADAM --learning_rate 0.0001 --decay_type cosine_warmup;
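
Cosine annealing decays the learning rate as eta_t = 0.5 * eta_0 * (1 + cos(pi * t / T)); combined with the 5-epoch warmup, it can be expressed as a single LambdaLR (a sketch, not necessarily the repo's exact cosine_warmup):

import math
import torch

model = torch.nn.Linear(10, 6)  # stand-in for the real network
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def cosine_warmup(epoch, warmup_epochs=5, total_epochs=120):
    # Linear warmup, then a half cosine from full LR down toward zero.
    if epoch < warmup_epochs:
        return (epoch + 1) / warmup_epochs
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return 0.5 * (1.0 + math.cos(math.pi * progress))

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=cosine_warmup)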

2-4. Label Smoothing

  • The paper uses a smoothing coefficient of 0.1; I use the same value (see the sketch after the commands below).
  • The number of classes in ImageNet (1,000) differs from the number in our dataset (6), but I did not re-tune the coefficient.
python main.py --checkpoint_name baseline_Adam_warmup_cosine_labelsmooth --optimizer ADAM --learning_rate 0.0001 --decay_type cosine_warmup --label_smooth 0.1;
python main.py --checkpoint_name baseline_Adam_warmup_labelsmooth --optimizer ADAM --learning_rate 0.0001 --decay_type step_warmup --label_smooth 0.1;
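
With coefficient eps and K classes, the paper's soft target puts 1 - eps on the true class and eps / (K - 1) on each other class; a sketch of the resulting loss (newer PyTorch also offers nn.CrossEntropyLoss(label_smoothing=0.1), which uses a slightly different eps / K formulation):

import torch
import torch.nn.functional as F

def label_smooth_ce(logits, target, smoothing=0.1):
    # Soft targets: 1 - eps on the true class, eps/(K-1) elsewhere.
    k = logits.size(-1)
    log_probs = F.log_softmax(logits, dim=-1)
    with torch.no_grad():
        soft = torch.full_like(log_probs, smoothing / (k - 1))
        soft.scatter_(-1, target.unsqueeze(-1), 1.0 - smoothing)
    return -(soft * log_probs).sum(dim=-1).mean()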

2-5. MixUp Augmentation

  • MixUp paper: "mixup: Beyond Empirical Risk Minimization" (ICLR 2018)
  • lambda is a random number drawn from a Beta(alpha, alpha) distribution.
  • I use alpha=0.2, as in the paper (see the sketch after the commands below).
python main.py --checkpoint_name baseline_Adam_warmup_mixup --optimizer ADAM --learning_rate 0.0001 --decay_type step_warmup --mixup 0.2;
python main.py --checkpoint_name baseline_Adam_warmup_cosine_mixup --optimizer ADAM --learning_rate 0.0001 --decay_type cosine_warmup --mixup 0.2;
python main.py --checkpoint_name baseline_Adam_warmup_labelsmooth_mixup --optimizer ADAM --learning_rate 0.0001 --decay_type step_warmup --label_smooth 0.1 --mixup 0.2;
python main.py --checkpoint_name baseline_Adam_warmup_cosine_labelsmooth_mixup --optimizer ADAM --learning_rate 0.0001 --decay_type cosine_warmup --label_smooth 0.1 --mixup 0.2;
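
A sketch of a MixUp training step under these settings (lambda ~ Beta(0.2, 0.2); the repo's implementation may differ in detail):

import numpy as np
import torch

def mixup_batch(x, y, alpha=0.2):
    # Blend each image with a randomly permuted partner from the same batch.
    lam = float(np.random.beta(alpha, alpha))
    index = torch.randperm(x.size(0), device=x.device)
    mixed_x = lam * x + (1.0 - lam) * x[index]
    return mixed_x, y, y[index], lam

# Inside the training loop:
#   mixed_x, y_a, y_b, lam = mixup_batch(images, labels)
#   out = model(mixed_x)
#   loss = lam * criterion(out, y_a) + (1 - lam) * criterion(out, y_b)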

3. Additional Tricks from hoya012's survey note

3-1. CutMix Augmentation

  • CutMix paper: "CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features" (ICCV 2019)
  • I use the same hyperparameters (cutmix_alpha=1.0, cutmix_prob=1.0) as the paper's ImageNet experimental setting (see the sketch after the command below).
python main.py --checkpoint_name baseline_Adam_warmup_cosine_cutmix --optimizer ADAM --learning_rate 0.0001 --decay_type cosine_warmup --cutmix_alpha 1.0 --cutmix_prob 1.0;
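
A sketch of CutMix with alpha=1.0; the area-corrected lambda is then used to mix the two targets' losses, exactly as in the MixUp usage above:

import numpy as np
import torch

def cutmix_batch(x, y, alpha=1.0):
    # Paste a random box from a permuted partner image into each image.
    lam = float(np.random.beta(alpha, alpha))
    index = torch.randperm(x.size(0), device=x.device)
    h, w = x.size(2), x.size(3)
    rh, rw = int(h * np.sqrt(1.0 - lam)), int(w * np.sqrt(1.0 - lam))
    cy, cx = np.random.randint(h), np.random.randint(w)
    y1, y2 = np.clip(cy - rh // 2, 0, h), np.clip(cy + rh // 2, 0, h)
    x1, x2 = np.clip(cx - rw // 2, 0, w), np.clip(cx + rw // 2, 0, w)
    x[:, :, y1:y2, x1:x2] = x[index, :, y1:y2, x1:x2]
    # Correct lambda to the area fraction that was actually kept.
    lam = 1.0 - (y2 - y1) * (x2 - x1) / (h * w)
    return x, y, y[index], lam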

3-2. RAdam Optimizer

python main.py --checkpoint_name baseline_RAdam_warmup_cosine_labelsmooth --optimizer RADAM --learning_rate 0.0001 --decay_type cosine_warmup --label_smooth 0.1;
python main.py --checkpoint_name baseline_RAdam_warmup_cosine_cutmix --optimizer RADAM --learning_rate 0.0001 --decay_type cosine_warmup --cutmix_alpha 1.0 --cutmix_prob 1.0;
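
RAdam rectifies the variance of Adam's adaptive term during the first optimization steps. Recent PyTorch ships it directly, so an equivalent setup is a one-line change; this is a sketch, and at the time the repo presumably relied on a standalone RAdam implementation:

import torch

model = torch.nn.Linear(10, 6)  # stand-in for the real network
# torch.optim.RAdam is available in PyTorch >= 1.10.
optimizer = torch.optim.RAdam(model.parameters(), lr=1e-4)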

3-3. RandAugment

python main.py --checkpoint_name baseline_Adam_warmup_cosine_labelsmooth_randaug --optimizer ADAM --learning_rate 0.0001 --decay_type cosine_warmup --label_smooth 0.1 --randaugment;
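
RandAugment applies a fixed number of randomly chosen ops per image at a shared magnitude. A hedged sketch using torchvision's built-in transform (the repo's --randaugment flag presumably wires in its own implementation):

import torchvision.transforms as T

# T.RandAugment is available in torchvision >= 0.11; it expects PIL images,
# so it goes before ToTensor in the pipeline.
train_tf = T.Compose([
    T.Resize((256, 256)),
    T.RandomHorizontalFlip(),
    T.RandAugment(num_ops=2, magnitude=9),
    T.ToTensor(),
])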

3-4. EvoNorm

python main.py --checkpoint_name baseline_Adam_warmup_cosine_labelsmmoth_evonorm --optimizer ADAM --learning_rate 0.0001 --decay_type cosine_warmup --label_smooth 0.1 --norm evonorm;
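
EvoNorm replaces the usual BatchNorm + ReLU pair with a single searched normalization-activation op. A sketch of the sample-based EvoNorm-S0 variant, which I assume is what --norm evonorm selects:

import torch
import torch.nn as nn

class EvoNormS0(nn.Module):
    # EvoNorm-S0: y = x * sigmoid(v * x) / group_std(x) * gamma + beta
    def __init__(self, channels, groups=32, eps=1e-5):
        super().__init__()
        self.groups, self.eps = groups, eps
        self.gamma = nn.Parameter(torch.ones(1, channels, 1, 1))
        self.beta = nn.Parameter(torch.zeros(1, channels, 1, 1))
        self.v = nn.Parameter(torch.ones(1, channels, 1, 1))

    def forward(self, x):
        n, c, h, w = x.shape
        grouped = x.view(n, self.groups, c // self.groups, h, w)
        std = torch.sqrt(grouped.var(dim=(2, 3, 4), keepdim=True) + self.eps)
        std = std.expand_as(grouped).reshape(n, c, h, w)
        return x * torch.sigmoid(self.v * x) / std * self.gamma + self.beta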

3-5. Other Architecture (EfficientNet, RegNet)

  • I use EfficientNet-B2, which reaches accuracy similar to ResNet-50
    • However, because of GPU memory limits, I use a smaller batch size (48)...
  • I use RegNetY-1.6GF, which has FLOPS and accuracy similar to ResNet-50 (see the sketch after the commands below)
python main.py --checkpoint_name efficientnet_Adam_warmup_cosine_labelsmooth --model EfficientNet --optimizer ADAM --learning_rate 0.0001 --decay_type cosine_warmup --label_smooth 0.1;
python main.py --checkpoint_name efficientnet_Adam_warmup_cosine_labelsmooth_mixup --model EfficientNet --optimizer ADAM --learning_rate 0.0001 --decay_type cosine_warmup --label_smooth 0.1 --mixup 0.2;
python main.py --checkpoint_name efficientnet_Adam_warmup_cosine_cutmix --model EfficientNet --optimizer ADAM --learning_rate 0.0001 --decay_type cosine_warmup --cutmix_alpha 1.0 --cutmix_prob 1.0;
python main.py --checkpoint_name efficientnet_RAdam_warmup_cosine_labelsmooth --model EfficientNet --optimizer RADAM --learning_rate 0.0001 --decay_type cosine_warmup --label_smooth 0.1;
python main.py --checkpoint_name efficientnet_RAdam_warmup_cosine_cutmix --model EfficientNet --optimizer RADAM --learning_rate 0.0001 --decay_type cosine_warmup --cutmix_alpha 1.0 --cutmix_prob 1.0;
python main.py --checkpoint_name regnet_Adam_warmup_cosine_labelsmooth --model RegNet --optimizer ADAM --learning_rate 0.0001 --decay_type cosine_warmup --label_smooth 0.1;
python main.py --checkpoint_name regnet_Adam_warmup_cosine_labelsmooth_mixup --model RegNet --optimizer ADAM --learning_rate 0.0001 --decay_type cosine_warmup --label_smooth 0.1 --mixup 0.2;
python main.py --checkpoint_name regnet_Adam_warmup_cosine_cutmix --model RegNet --optimizer ADAM --learning_rate 0.0001 --decay_type cosine_warmup --cutmix_alpha 1.0 --cutmix_prob 1.0;
python main.py --checkpoint_name regnet_RAdam_warmup_cosine_labelsmooth --model RegNet --optimizer RADAM --learning_rate 0.0001 --decay_type cosine_warmup --label_smooth 0.1;
python main.py --checkpoint_name regnet_RAdam_warmup_cosine_cutmix --model RegNet --optimizer RADAM --learning_rate 0.0001 --decay_type cosine_warmup --cutmix_alpha 1.0 --cutmix_prob 1.0;
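
For reference, a hedged sketch of instantiating both backbones with ImageNet weights through the timm library; the repo downloads its own weight files, so these registry names are assumptions on my part:

import timm

# 6-class heads for the Intel Image dataset.
efficientnet_b2 = timm.create_model("efficientnet_b2", pretrained=True, num_classes=6)
regnety_16gf = timm.create_model("regnety_016", pretrained=True, num_classes=6)  # RegNetY-1.6GF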

4. Performance Table

  • B : Baseline
  • A : Adam Optimizer
  • W : Warmup
  • C : Cosine Annealing
  • S : Label Smoothing
  • M : MixUp Augmentation
  • CM : CutMix Augmentation
  • R : RAdam Optimizer
  • RA : RandAugment
  • E : EvoNorm
  • EN : EfficientNet
  • RN : RegNet

Algorithm                   Test Accuracy (%)
B (from scratch)            86.47
B                           89.07
B + A                       94.13
B + A + W                   94.57
B + A + W + C               94.20
B + A + W + S               93.67
B + A + W + C + S           93.67
B + A + W + M               94.03
B + A + W + S + M           94.27
B + A + W + C + S + M       93.73

B + A + W + C + CM          94.20
B + W + C + S + R           93.97
B + A + W + C + S + RA      93.93
B + A + W + C + S + E       93.53
B + W + C + CM + R          94.27

EN + A + W + C + S + M      94.07
EN + A + W + C + CM         94.33
EN + W + C + S + R          94.50
EN + W + C + CM + R         94.33

RN + A + W + C + S + M      94.57
RN + A + W + C + CM         94.83
RN + W + C + S + R          94.37
RN + W + C + CM + R         94.90

5. How to Run All of the Experiments?

6. Code Reference
