• Stars: 128
• Rank: 281,044 (Top 6%)
• Language: Python
• License: Apache License 2.0
• Created: over 4 years ago
• Updated: over 1 year ago


Repository Details

Robustness and adaptation of ImageNet scale models. Pre-Release, stay tuned for updates.


Robustness evaluation and Adaptation of ImageNet models

This repo contains a growing collection of helper functions, tools and methods for robustness evaluation and adaptation of ImageNet scale models. The focus is on simple methods that work at scale.

We currently have the following features available:

  • examples/batchnorm: A reference implementation of batch norm adaptation used by Schneider, Rusak et al. (NeurIPS 2020)
  • examples/selflearning: A reference implementation of self-learning with robust pseudo-labeling used by Rusak, Schneider et al. (arXiv 2021)
  • examples/imagenet_d: Example runs on the ImageNet-D dataset used by Rusak, Schneider et al. (arXiv 2021)

Planned features for future releases are (please open an issue if you can think of additional interesting parts to add):

  • Helper functions for robustness datasets like ImageNet-A, ImageNet-R and ImageNet-C
  • examples/clip: Robustness evaluation for CLIP, Radford et al. (2021)
  • examples/dino: Robustness evaluation for DINO, Caron et al. (2021)

News

  • May '21: We will present our work on self-learning as a contributed talk at the WeaSuL 2021 workshop at ICLR.
  • April '21: The pre-print for "Adapting ImageNet-scale models to complex distribution shifts with self-learning" is now available on arXiv: arxiv.org/abs/2104.12928
  • September 2020: The BatchNorm adaptation paper was accepted for poster presentation at NeurIPS 2020.
  • July '20: A shorter workshop version of the paper was accepted for oral presentation at the Uncertainty & Robustness in Deep Learning Workshop at ICML 2020.
  • June '20: The pre-print for "Improving robustness against common corruptions by covariate shift adaptation" is available on arXiv: arxiv.org/abs/2006.16971

☕ The robusta toolbox for Robustness and Adaptation

Motivation

Besides reference implementations, this repo is mainly intended to provide a quick and easy way to adapt your own code. In particular, when developing new methods for improving the robustness of deep learning models, we find it valuable to also report results after adapting the model to the test dataset. This paints a more holistic picture of model robustness: some readers might be interested in ad-hoc model performance, while others might be interested in the performance obtained in a transductive inference setting.

Note that the package is not intended for general purpose domain adaptation. Instead, we focus on providing simple methods that prove to be effective for ImageNet scale model adaptation at test time. The package provides helper functions that are "minimally invasive" and can easily be added to existing source code for model evaluation.

Quick Start

You can install the package by running

pip install robusta

Depending on your system setup, it can make sense to first manually install the correct torch and torchvision versions as described on the PyTorch website.

Here is an example of how to use robusta for batch norm adaptation and robust pseudo-labeling.

    import torch
    import torchvision
    from torchvision import transforms

    import robusta

    model = torchvision.models.resnet50(pretrained=True)

    # We provide implementations for ImageNet-val, ImageNetC, ImageNetR,
    # ImageNetA and ImageNetD:
    val_dataset = robusta.datasets.imagenetc.ImageNetC(
        root=dataset_folder, corruption="gaussian_blur", severity=1,
        transform=transforms.Compose([transforms.ToTensor()])
        )
    val_loader = torch.utils.data.DataLoader(
        val_dataset, batch_size=batch_size, shuffle=True)

    # We offer different options for batch norm adaptation;
    # alternatives are "ema", "batch_wise_prior", ...
    robusta.batchnorm.adapt(model, adapt_type="batch_wise")

    # The accuracy metric can be specific to the dataset:
    # For example, ImageNet-R requires remapping into 200 classes.
    accuracy_metric = val_dataset.accuracy

    # You can also easily use self-learning in your model.
    # Self-learning adaptation can be combined with batch norm adaptation, example:
    parameters = robusta.selflearning.adapt(model, adapt_type="affine")
    optimizer = torch.optim.SGD(parameters, lr=1e-3)

    # You can choose from a set of adaptation losses (GCE, Entropy, ...)
    rpl_loss = robusta.selflearning.GeneralizedCrossEntropy(q=0.8)

    acc1_sum, acc5_sum, num_samples = 0., 0., 0.
    for epoch in range(num_epochs):
        for images, targets in val_loader:

            logits = model(images)
            predictions = logits.argmax(dim=1)

            # Predictions are optional. If you do not specify them,
            # they will be computed within the loss function.
            loss = rpl_loss(logits, predictions)

            # When using self-learning, you need to add an additional optimizer
            # step in your evaluation loop.
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

            # Top-5 accuracy needs the full logits, not the argmax predictions.
            acc1, acc5 = accuracy_metric(logits, targets, topk=(1, 5))
            acc1_sum += acc1
            acc5_sum += acc5
            num_samples += len(targets)
            print(f"Top-1: {acc1_sum / num_samples}, Top-5: {acc5_sum / num_samples}")
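The generalized cross entropy (GCE) loss used for robust pseudo-labeling above can be written in a few lines of plain PyTorch. This is a minimal sketch of the loss from Zhang & Sabuncu (2018), not the package's actual implementation; the function name and the standalone usage are illustrative.

```python
import torch
import torch.nn.functional as F

def generalized_cross_entropy(logits, targets, q=0.8):
    """GCE loss: L_q = (1 - p_y^q) / q, which interpolates between
    cross entropy (q -> 0) and the mean absolute error (q = 1)."""
    probs = F.softmax(logits, dim=1)
    # Probability assigned to the (pseudo-)label of each sample.
    p_y = probs.gather(1, targets.unsqueeze(1)).squeeze(1)
    return ((1.0 - p_y.pow(q)) / q).mean()

logits = torch.randn(4, 10)
targets = logits.argmax(dim=1)  # pseudo-labels from the model itself
loss = generalized_cross_entropy(logits, targets, q=0.8)
```

Because q < 1 down-weights low-confidence samples, the gradient of GCE is less sensitive to wrong pseudo-labels than standard cross entropy, which is what makes it attractive for self-learning at test time.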

Example Implementations

Batch Norm Adaptation


[Paper] [Web] [README] [Implementation]

We propose to go beyond the assumption of a single sample from the target domain when evaluating robustness. When more than a single sample is available, re-computing BatchNorm statistics is a simple baseline algorithm that reduces the corruption error by up to 14 percentage points across a wide range of models.
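The core idea, replacing the stored BatchNorm running statistics with statistics computed from the test batch, can be sketched in a few lines of PyTorch. This is a hypothetical minimal version for illustration, not the robusta API; the function name adapt_batchnorm is an assumption.

```python
import torch
import torch.nn as nn

def adapt_batchnorm(model):
    """Make every BatchNorm layer use test-batch statistics instead of
    the running averages accumulated during training (minimal sketch)."""
    for module in model.modules():
        if isinstance(module, nn.modules.batchnorm._BatchNorm):
            module.train()                      # use batch statistics in forward()
            module.track_running_stats = False  # do not update stored stats
            module.running_mean = None          # forward() then ignores them
            module.running_var = None
    return model

model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8), nn.ReLU())
model.eval()                  # rest of the model stays in eval mode
adapt_batchnorm(model)
out = model(torch.randn(4, 3, 16, 16))
```

Note that only the BatchNorm layers are switched to batch statistics; dropout and other train/eval-dependent layers are left in evaluation mode.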

Self-Learning


[Paper] [Web] [README] [Implementation]

Test-time adaptation with self-learning improves robustness of large-scale computer vision models on ImageNet-C, -R, and -A.

Robustness evaluation of DINO models

[Blog Post] [Implementation coming soon]

License

Unless noted otherwise, code in this repo is released under an Apache 2.0 license. Some parts of the implementation use third party code. We typically indicate this in the file header or in the methods directly, and include the original license in the NOTICE file.

This repo does not contain the full code-base used in Rusak, Schneider et al. (2021) and is instead currently limited to a reference re-implementation of robust pseudo-labeling and entropy minimization. A full version of the codebase might be independently released in the future.

If you want to use part of this code commercially, please carefully check the involved parts. Some of the third-party implementations might be released under licenses with a non-commercial use clause such as CC-NC. If in doubt, please reach out.

Contact

Please reach out for feature requests. Contributions welcome!

Note: The current version of this code base is a work in progress. We still decided to do this pre-release since the core methods (batch norm adaptation, self-learning, ...) are conceptually easy to use in your own code, and the current state might already be a useful place to start.
