FGSM (Fast Gradient Sign Method)
Overview
A simple PyTorch implementation of FGSM and I-FGSM.
(FGSM: Explaining and Harnessing Adversarial Examples, Goodfellow et al.)
(I-FGSM: Adversarial Examples in the Physical World, Kurakin et al.)
FGSM
I-FGSM
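For reference, the two update rules can be written as below. This is a sketch in assumed notation, not taken from this repository: x is the clean input, y its true label, J the classifier's loss with parameters θ, ε the perturbation budget, and α a per-step size whose value in this implementation is not stated here. The I-FGSM form follows the basic iterative method described by Kurakin et al.

```latex
% FGSM: a single step of size \epsilon in the direction that increases the loss
x^{adv} = x + \epsilon \cdot \operatorname{sign}\big(\nabla_x J(\theta, x, y)\big)

% I-FGSM: repeated smaller steps, clipped back into the \epsilon-ball around x
x^{adv}_0 = x, \qquad
x^{adv}_{t+1} = \operatorname{Clip}_{x,\epsilon}\Big\{ x^{adv}_t
    + \alpha \cdot \operatorname{sign}\big(\nabla_x J(\theta, x^{adv}_t, y)\big) \Big\}
```

With a single iteration and α = ε, the iterative rule reduces to plain FGSM.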
Dependencies
Python 3.6.4
PyTorch 0.3.1.post2
visdom (optional)
tensorboardX (optional)
TensorFlow (optional)
Usage
- train a simple MNIST classifier
python main.py --mode train --env_name [NAME]
- load the trained classifier, generate adversarial examples, and inspect the results in the output directory
python main.py --mode generate --iteration 1 --epsilon 0.03 --env_name [NAME] --load_ckpt best_acc.tar
- for a targeted attack, specify the target class number with the --target argument (the default, -1, means a non-targeted attack)
python main.py --mode generate --iteration 1 --epsilon 0.03 --target 3 --env_name [NAME] --load_ckpt best_acc.tar
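How the iteration, epsilon, and target options could interact is sketched below. This is a framework-free illustration, not the code in main.py: it swaps the MNIST network for a toy linear softmax classifier with an analytic input gradient, where the repository would presumably rely on PyTorch autograd. The names `fgsm`, `input_gradient`, and `predict`, and the step size `alpha = epsilon / iteration`, are assumptions made for this example.

```python
import math


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logits_of(W, b, x):
    return [sum(w * v for w, v in zip(row, x)) + bias for row, bias in zip(W, b)]


def predict(W, b, x):
    z = logits_of(W, b, x)
    return z.index(max(z))


def input_gradient(W, b, x, label):
    """Gradient of the cross-entropy loss w.r.t. x for a linear softmax model."""
    p = softmax(logits_of(W, b, x))
    # dL/dz_k = p_k - 1[k == label], then chain through z = W x + b
    dz = [p_k - (1.0 if k == label else 0.0) for k, p_k in enumerate(p)]
    return [sum(dz[k] * W[k][j] for k in range(len(W))) for j in range(len(x))]


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(W, b, x, label, epsilon, iteration=1, target=-1, alpha=None):
    """Non-targeted when target == -1, otherwise push x toward class `target`."""
    if alpha is None:
        alpha = epsilon / iteration  # assumed step size, not taken from the repo
    x_adv = list(x)
    for _ in range(iteration):
        if target == -1:
            # ascend the loss of the true label
            grad, direction = input_gradient(W, b, x_adv, label), 1.0
        else:
            # descend the loss of the target label
            grad, direction = input_gradient(W, b, x_adv, target), -1.0
        x_adv = [xa + direction * alpha * sign(g) for xa, g in zip(x_adv, grad)]
        # keep the result within epsilon of x and inside the pixel range [0, 1]
        x_adv = [min(max(xa, xo - epsilon, 0.0), xo + epsilon, 1.0)
                 for xa, xo in zip(x_adv, x)]
    return x_adv


if __name__ == "__main__":
    W, b, x = [[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0], [0.6, 0.4]
    print(predict(W, b, x), predict(W, b, fgsm(W, b, x, label=0, epsilon=0.2)))
```

The only difference between the two modes is which label's loss is differentiated and the sign of the step, which is why a single sentinel value such as -1 is enough to switch between them.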
Results
Non-targeted attack
From left to right: legitimate examples, perturbed examples, and a marker showing which perturbed images changed the classifier's prediction.
- non-targeted attack, iteration : 1, epsilon : 0.03
- non-targeted attack, iteration : 5, epsilon : 0.03
- non-targeted attack, iteration : 1, epsilon : 0.5
Targeted attack
From left to right: legitimate examples, perturbed examples, and a marker showing which perturbed images led the classifier to predict the target class.
- targeted attack (target class 9), iteration : 1, epsilon : 0.03
- targeted attack (target class 9), iteration : 5, epsilon : 0.03
- targeted attack (target class 9), iteration : 1, epsilon : 0.5
References
- Explaining and Harnessing Adversarial Examples, Goodfellow et al.
- Adversarial Examples in the Physical World, Kurakin et al.