Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples

Above is an adversarial example: the slightly perturbed image of the cat fools an InceptionV3 classifier into classifying it as "guacamole". Such "fooling images" are easy to synthesize using gradient descent (Szegedy et al. 2013).

In our recent paper, we evaluate the robustness of nine papers accepted to ICLR 2018 as non-certified white-box-secure defenses to adversarial examples. We find that seven of the nine defenses provide a limited increase in robustness and can be broken by improved attack techniques we develop.

Below is Table 1 from our paper, where we show the robustness of each accepted defense to the adversarial examples we can construct:

Defense                     Dataset     Distance        Accuracy
Buckman et al. (2018)       CIFAR       0.031 (linf)    0%*
Ma et al. (2018)            CIFAR       0.031 (linf)    5%
Guo et al. (2018)           ImageNet    0.05 (l2)       0%*
Dhillon et al. (2018)       CIFAR       0.031 (linf)    0%
Xie et al. (2018)           ImageNet    0.031 (linf)    0%*
Song et al. (2018)          CIFAR       0.031 (linf)    9%*
Samangouei et al. (2018)    MNIST       0.005 (l2)      55%**
Madry et al. (2018)         CIFAR       0.031 (linf)    47%
Na et al. (2018)            CIFAR       0.015 (linf)    15%

(Defenses denoted with * also propose combining adversarial training; we report here the defense alone. See our paper, Section 5 for full numbers. The fundamental principle behind the defense denoted with ** has 0% accuracy; in practice, defense imperfections cause the theoretically optimal attack to fail. See Section 5.4.2 for details.)

The only defense we observe that significantly increases robustness to adversarial examples within the threat model proposed is "Towards Deep Learning Models Resistant to Adversarial Attacks" (Madry et al. 2018); we were unable to defeat this defense without stepping outside the threat model. Even then, this technique has been shown to be difficult to scale to ImageNet (Kurakin et al. 2016).

The remainder of the papers (besides the paper by Na et al., which provides limited robustness) rely, either inadvertently or intentionally, on what we call obfuscated gradients. Standard attacks generate an adversarial example by applying gradient descent to maximize the network's loss on a given image. Such optimization methods require a useful gradient signal to succeed. When a defense obfuscates gradients, it breaks this gradient signal and causes optimization-based methods to fail.
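To make the "standard attack" concrete, here is a minimal sketch of an iterative gradient-based attack under an linf bound. It is an illustration only and not the code in this repository: it assumes a toy linear logistic classifier in place of a deep network, and the names `loss_and_grad` and `pgd_linf`, along with the step size and iteration count, are hypothetical choices.

```python
import numpy as np

def loss_and_grad(w, b, x, y):
    """Logistic loss of a toy linear classifier and its gradient w.r.t. the input x."""
    z = np.dot(w, x) + b
    p = 1.0 / (1.0 + np.exp(-z))
    loss = -(y * np.log(p + 1e-12) + (1.0 - y) * np.log(1.0 - p + 1e-12))
    grad_x = (p - y) * w  # d(loss)/dx for the logistic loss
    return loss, grad_x

def pgd_linf(w, b, x, y, eps=0.031, step=0.005, iters=40):
    """Iteratively ascend the loss while staying within an linf ball of radius eps."""
    x_adv = x.copy()
    for _ in range(iters):
        _, g = loss_and_grad(w, b, x_adv, y)
        x_adv = x_adv + step * np.sign(g)         # step in the direction that raises the loss
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project back into the linf ball
        x_adv = np.clip(x_adv, 0.0, 1.0)          # keep a valid pixel range
    return x_adv
```

The attack depends entirely on `grad_x` being informative. If the gradient were zero, random, or undefined, the loop above would make no useful progress, which is the failure mode that obfuscated gradients induce.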

We identify three ways in which defenses cause obfuscated gradients, and construct attacks to bypass each of these cases. Our attacks are generally applicable to any defense that includes, either intentionally or unintentionally, a non-differentiable operation or otherwise prevents gradient signal from flowing through the network. We hope future work will be able to use our approaches to perform a more thorough security evaluation.
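One technique the paper describes for non-differentiable operations is Backward Pass Differentiable Approximation (BPDA): run the forward pass through the real defense, but substitute a differentiable approximation on the backward pass. The sketch below is a toy illustration under stated assumptions, not the repository's implementation: the defense is a hypothetical input quantizer, the classifier is a linear logistic model, the backward pass approximates the quantizer by the identity, and all function names and parameter values are made up for the example.

```python
import numpy as np

def quantize(x, levels=32):
    """Hypothetical non-differentiable defense: its gradient is zero almost everywhere."""
    return np.round(x * levels) / levels

def defended_loss(w, b, x, y):
    """Logistic loss of the toy classifier applied to the quantized input."""
    z = np.dot(w, quantize(x)) + b
    p = 1.0 / (1.0 + np.exp(-z))
    return -(y * np.log(p + 1e-12) + (1.0 - y) * np.log(1.0 - p + 1e-12))

def bpda_grad(w, b, x, y):
    """Forward through the real defense; backward treats quantize() as the identity."""
    z = np.dot(w, quantize(x)) + b
    p = 1.0 / (1.0 + np.exp(-z))
    return (p - y) * w  # gradient w.r.t. quantize(x), reused as the gradient w.r.t. x

def bpda_attack(w, b, x, y, eps=0.05, step=0.01, iters=10):
    """Iterative linf attack driven by the approximate BPDA gradient."""
    x_adv = x.copy()
    for _ in range(iters):
        g = bpda_grad(w, b, x_adv, y)
        x_adv = np.clip(x_adv + step * np.sign(g), x - eps, x + eps)
        x_adv = np.clip(x_adv, 0.0, 1.0)
    return x_adv
```

Differentiating `defended_loss` exactly would yield a zero gradient almost everywhere because of the rounding step. The identity approximation is reasonable here only because `quantize(x)` stays close to `x`; a defense that changes its input more drastically would need a different approximation.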

Paper

Abstract:

We identify obfuscated gradients, a kind of gradient masking, as a phenomenon that leads to a false sense of security in defenses against adversarial examples. While defenses that cause obfuscated gradients appear to defeat iterative optimization-based attacks, we find defenses relying on this effect can be circumvented. We describe characteristic behaviors of defenses exhibiting the effect, and for each of the three types of obfuscated gradients we discover, we develop attack techniques to overcome it. In a case study, examining non-certified white-box-secure defenses at ICLR 2018, we find obfuscated gradients are a common occurrence, with 7 of 9 defenses relying on obfuscated gradients. Our new attacks successfully circumvent 6 completely, and 1 partially, in the original threat model each paper considers.

For details, read our paper.

Source code

This repository contains our instantiations of the general attack techniques described in our paper, breaking 7 of the ICLR 2018 defenses. Some of the defenses didn't release source code (at the time we did this work), so we had to reimplement them.

Citation

@inproceedings{obfuscated-gradients,
  author = {Anish Athalye and Nicholas Carlini and David Wagner},
  title = {Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples},
  booktitle = {Proceedings of the 35th International Conference on Machine Learning, {ICML} 2018},
  year = {2018},
  month = jul,
  url = {https://arxiv.org/abs/1802.00420},
}
