  • Stars: 130
  • Rank: 277,575 (Top 6%)
  • Language: Python
  • License: MIT License
  • Created: about 4 years ago
  • Updated: over 3 years ago

Repository Details

SSD: A Unified Framework for Self-Supervised Outlier Detection [ICLR 2021]

Paper: https://openreview.net/forum?id=v5gjXpmR8J

Code for our ICLR 2021 paper on outlier detection, titled SSD, which does not require class labels of in-distribution training data. We leverage recent advances in self-supervised representation learning followed by cluster-based outlier detection to achieve competitive performance. This repository supports both self-supervised training of networks and outlier detection evaluation of pre-trained networks. It also includes code for the two extensions proposed in the paper: 1) few-shot outlier detection and 2) extending SSD by incorporating class labels, when available.
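For intuition, the detection step boils down to scoring each test sample by its distance to clusters of in-distribution features produced by the trained encoder. Below is a minimal sketch of such cluster-conditioned Mahalanobis scoring using numpy and scikit-learn; the function names (fit_clusters, ood_score) are placeholders for illustration, not the actual API of eval_ssd.py.

import numpy as np
from sklearn.cluster import KMeans

def fit_clusters(train_features, n_clusters=1):
    # Fit k-means on in-distribution features and estimate per-cluster
    # Gaussian parameters (mean and inverse covariance).
    kmeans = KMeans(n_clusters=n_clusters, n_init=10).fit(train_features)
    params = []
    for c in range(n_clusters):
        feats = train_features[kmeans.labels_ == c]
        mean = feats.mean(axis=0)
        cov = np.cov(feats, rowvar=False) + 1e-6 * np.eye(feats.shape[1])
        params.append((mean, np.linalg.inv(cov)))
    return params

def ood_score(test_features, params):
    # Outlier score = squared Mahalanobis distance to the closest cluster;
    # higher scores indicate more likely out-of-distribution samples.
    dists = []
    for mean, prec in params:
        diff = test_features - mean
        dists.append(np.einsum("nd,dk,nk->n", diff, prec, diff))
    return np.min(np.stack(dists, axis=1), axis=1)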

Getting started

Let's start by installing all dependencies.

pip install -r requirement.txt

Outlier detection with a pre-trained classifier

This is how we can evaluate the performance of a pre-trained ResNet50 classifier trained using SimCLR on the CIFAR-10 dataset.

CUDA_VISIBLE_DEVICES=$gpus_ids python -u eval_ssd.py --arch resnet50 --training-mode SimCLR --dataset cifar10 --ckpt checkpoint_path --normalize --exp-name name_of_this_experiment

  • --training-mode: Choose from ("SimCLR", "SupCon", "SupCE"). This selects the right network modules for the checkpoint.
  • --arch: Choose from the available architectures in models.py.
  • --dataset: Choose from ("cifar10", "cifar100", "svhn", "stl").
  • --normalize: If set, input images are normalized. Use it only if inputs were also normalized during training.
  • --exp-name: Experiment name. Results are logged to a text file with this name.

The steps to evaluate with $SSD_k$ are exactly the same, except that you now also have to provide values for --k and --copies. --k is the number of outliers available from each class of the targeted OOD dataset, while --copies is the number of transformed instances created per available outlier image.

CUDA_VISIBLE_DEVICES=$gpu_id python -u eval_ssdk.py --arch resnet50 --training-mode SimCLR --dataset cifar10 --ckpt checkpoint_path --normalize --k 5 --copies 10
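For intuition, the copies can be thought of as augmented views of each of the k known outliers. The sketch below illustrates that idea with torchvision transforms; the specific augmentations and the expand_outliers helper are assumptions made for illustration, not the pipeline in eval_ssdk.py.

import torch
from torchvision import transforms

# Illustrative augmentation pipeline (an assumption, not the exact one used in the repo).
augment = transforms.Compose([
    transforms.RandomResizedCrop(32, scale=(0.5, 1.0)),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(0.4, 0.4, 0.4, 0.1),
    transforms.ToTensor(),
])

def expand_outliers(pil_images, copies=10):
    # Return a (len(pil_images) * copies, C, H, W) tensor of augmented views,
    # i.e. `copies` transformed instances per known outlier image.
    views = [augment(img) for img in pil_images for _ in range(copies)]
    return torch.stack(views)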

Training a classifier using self-supervised/supervised learning

We also support training a classifier using self-supervised, supervised or a combination of both training methods. Here is an example script to train a ResNet50 network on the CIFAR-10 dataset using SimCLR.

CUDA_VISIBLE_DEVICES=$gpus_ids python -u train.py --arch resnet50 --training-mode SimCLR --dataset cifar10 --results-dir directory_to_save_checkpoint --exp-name name_of_this_experiment --warmup --normalize

  • --training-mode: Choose from ("SimCLR", "SupCon", "SupCE"). This will choose appropriate network modules, loss functions, and trainers.
  • --warmup: We recommend using warmup when the batch size is large, which is often the case for self-supervised methods.

Choices for other arguments are similar to what we mentioned earlier in the evaluation section.
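As background, SimCLR training optimizes a contrastive (NT-Xent) loss over two augmented views of each image. Here is a minimal sketch of that loss; the repository selects its own loss module based on --training-mode, so treat this as an illustration rather than the exact implementation.

import torch
import torch.nn.functional as F

def nt_xent(z1, z2, temperature=0.5):
    # z1, z2: (N, d) projections of two augmented views of the same N images.
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # (2N, d), unit-norm rows
    sim = z @ z.t() / temperature                         # pairwise similarities (2N, 2N)
    n = z1.size(0)
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask, float("-inf"))            # drop self-similarity
    # The positive for view i is the other augmented view of the same image.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)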

Pre-trained models

Here is the link to pre-trained models on the CIFAR-10 dataset: https://drive.google.com/drive/folders/1Nx5tYGecvwagVz7_y8Z3FPk-ZtYttM4k?usp=sharing

These models aren't exactly identical to the ones in the paper, but they give fairly similar results. Here is my attempt at OOD detection with the SimCLR-trained model on CIFAR-10:

CUDA_VISIBLE_DEVICES=0 python -u eval_ssd.py --arch resnet50 --training-mode SimCLR --dataset cifar10 --normalize --ckpt ./cifar10/base1/SimCLR_cifar10_resnet50_lr_0.5_decay_0.0001_bsz_900_temp_0.5_trial_0_ps__cosine_warm/last.pth

It gives the following results, which are fairly similar to the ones in the paper. The most likely mistake, which leads to suboptimal results, is omitting --normalize when the model was trained with it.

In-data = cifar10, OOD = cifar100, Clusters = 1, FPR95 = 0.5078, AUROC = 0.9063240349999999, AUPR = 0.8919609510086947
In-data = cifar10, OOD = svhn, Clusters = 1, FPR95 = 0.020666871542716656, AUROC = 0.9962383988936693, AUPR = 0.9985624119973668
In-data = cifar10, OOD = texture, Clusters = 1, FPR95 = 0.14645390070921985, AUROC = 0.9761002304964539, AUPR = 0.9556574287665671
In-data = cifar10, OOD = blobs, Clusters = 1, FPR95 = 0.0467, AUROC = 0.9879078399999999, AUPR = 0.9843056376364349
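For reference, these metrics can be recomputed from per-sample OOD scores with standard tooling. Below is a minimal sketch using scikit-learn, where label 1 marks OOD samples and higher scores mean more anomalous; it is not necessarily how eval_ssd.py computes them.

import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score, roc_curve

def ood_metrics(scores_in, scores_ood):
    # Concatenate in-distribution and OOD scores; label 1 = OOD.
    scores = np.concatenate([scores_in, scores_ood])
    labels = np.concatenate([np.zeros(len(scores_in)), np.ones(len(scores_ood))])
    auroc = roc_auc_score(labels, scores)
    aupr = average_precision_score(labels, scores)
    fpr, tpr, _ = roc_curve(labels, scores)
    fpr95 = fpr[np.argmax(tpr >= 0.95)]   # FPR at the first threshold reaching 95% TPR
    return fpr95, auroc, aupr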

Reference

If you find this work helpful, consider citing it.

@inproceedings{sehwag2021ssd,
  title={SSD: A Unified Framework for Self-Supervised Outlier Detection},
  author={Vikash Sehwag and Mung Chiang and Prateek Mittal},
  booktitle={International Conference on Learning Representations},
  year={2021},
  url={https://openreview.net/forum?id=v5gjXpmR8J}
}

More Repositories

1. ModelPoisoning (Python, 148 stars): Code for "Analyzing Federated Learning through an Adversarial Lens" https://arxiv.org/abs/1811.12470
2. adv-patch-paper-list (123 stars): A paper list for localized adversarial patch research
3. membership-inference-evaluation (Python, 116 stars): Systematic Evaluation of Membership Inference Privacy Risks of Machine Learning Models
4. hydra (Python, 88 stars): Code and checkpoints of compressed networks for the paper "HYDRA: Pruning Adversarially Robust Neural Networks" (NeurIPS 2020) https://arxiv.org/abs/2002.10509
5. PatchGuard (Python, 62 stars): Code for the paper "PatchGuard: A Provably Robust Defense against Adversarial Patches via Small Receptive Fields and Masking"
6. privacy-vs-robustness (Python, 44 stars): Privacy Risks of Securing Machine Learning Models against Adversarial Examples
7. advml-traffic-sign (Jupyter Notebook, 35 stars): Code for the "DARTS: Deceiving Autonomous Cars with Toxic Signs" paper
8. PatchCleanser (Python, 34 stars): Code for "PatchCleanser: Certifiably Robust Defense against Adversarial Patches for Any Image Classifier"
9. patch-defense-leaderboard (19 stars): A leaderboard for certifiable robustness against adversarial patch attacks
10. MVG-Mechansim (Python, 18 stars): A module for the Matrix-Variate Gaussian (MVG) mechanism for differential privacy under matrix-valued queries
11. unlearning-verification (Python, 17 stars): Verifying machine unlearning by backdooring
12. DetectorGuard (Python, 14 stars): Code for "DetectorGuard: Provably Securing Object Detectors against Localized Patch Hiding Attacks"
13. MIAdefenseSELENA (Python, 13 stars): [USENIX Security 2022] Mitigating Membership Inference Attacks by Self-Distillation Through a Novel Ensemble Architecture
14. robustness-via-transport (Python, 12 stars)
15. OOD-Attacks (Python, 12 stars): Attacks using out-of-distribution adversarial examples
16. DP-RandP (Python, 11 stars): [NeurIPS 2023] Differentially Private Image Classification by Learning Priors from Random Processes
17. tta_risk (Python, 9 stars)
18. ObjectSeeker (Python, 9 stars): Code for "ObjectSeeker: Certifiably Robust Object Detection against Patch Hiding Attacks via Patch-agnostic Masking"
19. variation-regularization (Jupyter Notebook, 7 stars): Official code for the paper "Formulating Robustness Against Unforeseen Attacks"
20. robust_representation_similarity (Python, 6 stars): Understanding robust learning through the lens of representation similarity
21. RON-Gauss (Python, 6 stars): Implementation of the RON-Gauss system for non-interactive differentially private data release
22. Rotation_BD (Python, 5 stars): Code for "Just Rotate it: Deploying Backdoor Attacks via Rotation Transformation"
23. sybilfuse (C++, 4 stars)
24. comps (Python, 4 stars): Exploring intentional connection migration for privacy
25. SICO-tools (Python, 3 stars): Code for several of the tools used in the ACM CCS paper "SICO: Surgical Interception Attacks by Manipulating BGP Communities"
26. Root-ORAM (MATLAB, 3 stars)
27. PatchCURE (Python, 2 stars)
28. certificate-database (SQLPL, 2 stars): A MySQL dump backup of a database of 1.8 million certificates and corresponding BGP data from when those certificates were issued
29. RobustRAG (Python, 2 stars)
30. proxy-distributions (1 star): Official repository for the paper "Robust Learning Meets Generative Models: Can Proxy Distributions Improve Adversarial Robustness?"
31. advml_website (HTML, 1 star)
32. CA-vantage-point-selection (Python, 1 star): An algorithm designed to select the best vantage points for use by CAs
33. Counter-Raptor-Tor-Client (C, 1 star)
34. PinMe (Python, 1 star): Repository for the PinMe project, http://arsalanmosenia.com/papers/Pinme_preprint.pdf
35. routing-aware-dns (Python, 1 star): A program to resolve DNS based on BGP route age
36. BGP-age-false-positive-study (Python, 1 star): Computes the false positives of various age-based BGP monitoring systems for use by certificate authorities
37. quicstep (Python, 1 star)
38. LinkMirage (Python, 1 star)
39. LabelDP (Python, 1 star): [PETS 2022] Machine Learning with Differentially Private Labels: Mechanisms and Frameworks