  • Language: Python
  • License: MIT License

Official code for "Efficient Training of Visual Transformers with Small Datasets" (NeurIPS 2021).

Efficient Training of Visual Transformers with Small Datasets


To appear in NeurIPS 2021.

[paper][Poster & Video][arXiv][code] [reviews]
Yahui Liu (1,3), Enver Sangineto (1), Wei Bi (2), Nicu Sebe (1), Bruno Lepri (3), Marco De Nadai (3)
(1) University of Trento, Italy; (2) Tencent AI Lab, China; (3) Bruno Kessler Foundation, Italy.

Data preparation

Dataset           Download Link
ImageNet          train, val
CIFAR-10          all
CIFAR-100         all
SVHN              train, test, extra
Oxford-Flower102  images, labels, splits
Clipart           images, train_list, test_list
Infograph         images, train_list, test_list
Painting          images, train_list, test_list
Quickdraw         images, train_list, test_list
Real              images, train_list, test_list
Sketch            images, train_list, test_list
  • Download the datasets and preprocess some of them (i.e., ImageNet and DomainNet) using the code in the scripts folder.
  • The datasets are prepared with the following structure (except CIFAR-10/100 and SVHN):
dataset_name
  |__train
  |    |__category1
  |    |    |__xxx.jpg
  |    |    |__...
  |    |__category2
  |    |    |__xxx.jpg
  |    |    |__...
  |    |__...
  |__val
       |__category1
       |    |__xxx.jpg
       |    |__...
       |__category2
       |    |__xxx.jpg
       |    |__...
       |__...
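
Before training, it can help to confirm that a prepared dataset really follows the tree above. The following is a minimal sketch using only the Python standard library. The function names `summarize_split` and `check_dataset` and the set of accepted image extensions are assumptions made for illustration; they are not part of this repository.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match your data.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def summarize_split(split_dir):
    """Return {category_name: image_count} for one split directory (e.g. train/)."""
    split_dir = Path(split_dir)
    counts = {}
    for category in sorted(p for p in split_dir.iterdir() if p.is_dir()):
        counts[category.name] = sum(
            1
            for f in category.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts


def check_dataset(root, splits=("train", "val")):
    """Check that root/<split>/<category>/<image> exists for every split.

    Returns a list of human-readable problems; an empty list means the
    layout matches the expected tree.
    """
    root = Path(root)
    problems = []
    summaries = {}
    for split in splits:
        split_dir = root / split
        if not split_dir.is_dir():
            problems.append(f"missing split directory: {split_dir}")
            continue
        summaries[split] = summarize_split(split_dir)
        if not summaries[split]:
            problems.append(f"no category folders in: {split_dir}")
        for name, count in summaries[split].items():
            if count == 0:
                problems.append(f"no images in: {split_dir / name}")
    # Every split should expose the same set of category folders.
    if len(summaries) == len(splits):
        names = [set(s) for s in summaries.values()]
        if any(n != names[0] for n in names[1:]):
            problems.append("category folders differ between splits")
    return problems


if __name__ == "__main__":
    import sys

    # Usage: python check_layout.py /path/to/dataset_name
    for problem in check_dataset(sys.argv[1]):
        print(problem)
```

This one-folder-per-category layout is also the layout that `torchvision.datasets.ImageFolder` reads, so a dataset that passes this check should be loadable that way.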

Training

After preparing the datasets, start training on 8 NVIDIA V100 GPUs:

sh train.sh

Evaluation

To load a pretrained model and test its performance, run:

sh eval.sh

Pretrained models

For fast evaluation, we report the results of Swin-T trained for 100 epochs on various datasets as an example. Note that the model is saved every 5 epochs during training, so the attached best models may differ slightly from the reported performance.

Dataset     Baseline  Ours
CIFAR-10    59.47     83.89
CIFAR-100   53.28     66.23
SVHN        71.60     94.23
Flowers102  34.51     39.37
Clipart     38.05     47.47
Infograph    8.20     10.16
Painting    35.92     41.86
Quickdraw   24.08     69.41
Real        73.47     75.59
Sketch      11.97     38.55
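
To make the gap between the two columns easier to read, the short sketch below computes the absolute difference (Ours minus Baseline) for each dataset. The numbers are copied from the table above; the helper name `absolute_gains` is introduced here for illustration only.

```python
# (baseline, ours) pairs copied from the results table.
RESULTS = {
    "CIFAR-10": (59.47, 83.89),
    "CIFAR-100": (53.28, 66.23),
    "SVHN": (71.60, 94.23),
    "Flowers102": (34.51, 39.37),
    "Clipart": (38.05, 47.47),
    "Infograph": (8.20, 10.16),
    "Painting": (35.92, 41.86),
    "Quickdraw": (24.08, 69.41),
    "Real": (73.47, 75.59),
    "Sketch": (11.97, 38.55),
}


def absolute_gains(results):
    """Return {dataset: ours - baseline}, rounded to two decimals."""
    return {
        name: round(ours - baseline, 2)
        for name, (baseline, ours) in results.items()
    }


if __name__ == "__main__":
    gains = absolute_gains(RESULTS)
    # Print datasets from the largest to the smallest gain.
    for name, gain in sorted(gains.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{name:<12} +{gain:.2f}")
```

The "Ours" column is higher on all ten datasets. The largest gap is on Quickdraw (45.33 points) and the smallest on Infograph (1.96 points).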

We provide a demo to download the pretrained models from Google Drive directly:

python3 ./scripts/collect_models.py
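
As noted earlier in this section, checkpoints are saved every 5 epochs, so a released "best" model is the best among the saved epochs and not necessarily the best epoch of the whole run. The sketch below illustrates that effect. The accuracy values are invented for illustration, and the 1-based epoch counting is an assumption; the repository's own saving logic may differ.

```python
def saved_epochs(total_epochs, save_every):
    """Epochs (1-based, an assumption) at which a checkpoint is written."""
    return list(range(save_every, total_epochs + 1, save_every))


def best_epoch(acc_by_epoch, epochs=None):
    """Return (epoch, accuracy) with the highest accuracy.

    If `epochs` is given, only those epochs are considered.
    """
    if epochs is None:
        candidates = dict(acc_by_epoch)
    else:
        candidates = {e: acc_by_epoch[e] for e in epochs}
    epoch = max(candidates, key=candidates.get)
    return epoch, candidates[epoch]


if __name__ == "__main__":
    # Hypothetical validation accuracies for a 10-epoch run (not real results).
    acc = {1: 20.0, 2: 31.5, 3: 40.2, 4: 47.9, 5: 52.3,
           6: 56.8, 7: 60.1, 8: 62.4, 9: 63.0, 10: 62.7}
    print("best overall:        ", best_epoch(acc))
    print("best saved (every 5):", best_epoch(acc, saved_epochs(10, 5)))
```

In this made-up run the peak falls on epoch 9, which is never saved, so the best available checkpoint (epoch 10) is slightly below the peak. With 100 epochs and a save interval of 5, `saved_epochs(100, 5)` yields 20 checkpoints.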

Related Work:

Acknowledgments

This code is heavily based on Swin-Transformer. Thanks to the contributors of that project.

Citation

@InProceedings{liu2021efficient,
    author    = {Liu, Yahui and Sangineto, Enver and Bi, Wei and Sebe, Nicu and Lepri, Bruno and De Nadai, Marco},
    title     = {Efficient Training of Visual Transformers with Small Datasets},
    booktitle = {Conference on Neural Information Processing Systems (NeurIPS)},
    year      = {2021}
}

If you have any questions, please do not hesitate to contact me (yahui.cvrs AT gmail.com).
