  • Stars: 344
  • Rank: 118,574 (top 3%)
  • Language: Python
  • License: MIT License
  • Created over 6 years ago
  • Updated over 5 years ago

PyTorch implementation of convolutional neural network adversarial attack techniques

Convolutional Neural Network Adversarial Attacks

Note: I am aware that there are some issues with the code. I will update this repository soon (and will also move away from cv2 to PIL).

This repo branched off from CNN Visualisations because that repository was starting to get bloated. It contains the following CNN adversarial attacks implemented in PyTorch:

  • Fast Gradient Sign, Untargeted [1]
  • Fast Gradient Sign, Targeted [1]
  • Gradient Ascent, Adversarial Images [2]
  • Gradient Ascent, Fooling Images (Unrecognizable images predicted as classes with high confidence) [2]

More adversarial attack and defense techniques will be added in the future.

The code uses the pretrained AlexNet from the model zoo. You can swap in your own model, but don't forget to change the target class parameters as well.

All images are pre-processed with the mean and std of the ImageNet dataset before being fed to the model. None of the code uses the GPU, as these operations are quite fast for a single image; you can add GPU support with very little effort. The examples below include a number in brackets after the description, like Mastiff (243). This number is the class id in the ImageNet dataset.
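
The mean/std pre-processing can be illustrated on a single pixel. This is a minimal sketch, not the repository's code: it assumes the commonly used ImageNet channel statistics (the README does not list the values), RGB channel order, and pixel values scaled to [0, 1] before normalization. The helper names `normalize_pixel` and `denormalize_pixel` are hypothetical.

```python
# Commonly used ImageNet channel statistics (assumed here; RGB order, inputs in [0, 1]).
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb):
    """Map an (R, G, B) pixel in [0, 255] to normalized model-input values."""
    return tuple(
        (channel / 255.0 - mean) / std
        for channel, mean, std in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)
    )


def denormalize_pixel(values):
    """Invert normalize_pixel, e.g. before saving an adversarial image."""
    return tuple(
        round((value * std + mean) * 255.0)
        for value, mean, std in zip(values, IMAGENET_MEAN, IMAGENET_STD)
    )


normalized = normalize_pixel((124, 116, 104))
restored = denormalize_pixel(normalized)
```

Any perturbation computed on the normalized input has to go through the inverse mapping before it can be viewed or saved as an ordinary image. Note that cv2 loads images in BGR order, so the channels would need reordering before these RGB statistics apply.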

I tried to comment the code as much as possible. If you have any issues understanding or porting it, don't hesitate to reach out.

Below are some sample results for each operation.

Fast Gradient Sign - Untargeted

In this operation we update the original image with the signs of the gradient received at the first layer, that is, the gradient that backpropagation delivers to the input image. The untargeted version aims to reduce the confidence of the initial class. The loop stops as soon as the image is no longer classified as the original label.

  • Original: predicted as Eel (390), confidence 0.96. With adversarial noise: predicted as Blowfish (397), confidence 0.81.
  • Original: predicted as Snowbird (13), confidence 0.99. With adversarial noise: predicted as Chickadee (19), confidence 0.95.
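
The untargeted update can be sketched without any dependencies. This is an illustrative toy, not the repository's PyTorch code: a small linear softmax classifier stands in for AlexNet, and its analytic input gradient replaces autograd. `WEIGHTS`, `alpha`, and all function names are assumptions made for the example.

```python
import math


def softmax(logits):
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights, x):
    """Class probabilities of a linear softmax classifier (toy stand-in for a CNN)."""
    return softmax([sum(w * v for w, v in zip(row, x)) for row in weights])


def loss_gradient(weights, x, label):
    """Gradient of the cross-entropy loss for `label` with respect to the input x."""
    probs = predict(weights, x)
    errors = [p - (1.0 if k == label else 0.0) for k, p in enumerate(probs)]
    return [sum(e * row[j] for e, row in zip(errors, weights)) for j in range(len(x))]


def sign(value):
    return (value > 0) - (value < 0)


def argmax(values):
    return max(range(len(values)), key=values.__getitem__)


def fgs_untargeted(weights, x, alpha=0.05, max_steps=100):
    """Step along the gradient sign until the predicted label changes."""
    original = argmax(predict(weights, x))
    adv = list(x)
    for step in range(1, max_steps + 1):
        grad = loss_gradient(weights, adv, original)
        # Ascend the loss of the original label to lower its confidence.
        adv = [v + alpha * sign(g) for v, g in zip(adv, grad)]
        if argmax(predict(weights, adv)) != original:
            return adv, step
    return adv, max_steps


# Hypothetical 3-class, 2-feature model and input.
WEIGHTS = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
adversarial, steps = fgs_untargeted(WEIGHTS, [0.62, 0.4])
```

Because only the sign of the gradient is used, every input value moves by at most `alpha` per step, which bounds the total perturbation by `alpha * steps`.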

Fast Gradient Sign - Targeted

The targeted version of FGS works almost the same as the untargeted version. The only difference is that we do not try to minimize the confidence of the original label but instead maximize the confidence of the target label. The loop stops as soon as the image is predicted as the target class.

  • Original: predicted as Apple (948), confidence 0.95. With adversarial noise: predicted as Rock python (62), confidence 0.16.
  • Original: predicted as Apple (948), confidence 0.95. With adversarial noise: predicted as Mud turtle (35), confidence 0.54.
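
The targeted variant differs only in the label used for the loss and in the direction of the step. The following is a dependency-free sketch under the same kind of assumption: a toy linear softmax classifier stands in for AlexNet, and `WEIGHTS`, `alpha`, and the function names are hypothetical.

```python
import math


def softmax(logits):
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights, x):
    """Class probabilities of a linear softmax classifier (toy stand-in for a CNN)."""
    return softmax([sum(w * v for w, v in zip(row, x)) for row in weights])


def loss_gradient(weights, x, label):
    """Gradient of the cross-entropy loss for `label` with respect to the input x."""
    probs = predict(weights, x)
    errors = [p - (1.0 if k == label else 0.0) for k, p in enumerate(probs)]
    return [sum(e * row[j] for e, row in zip(errors, weights)) for j in range(len(x))]


def sign(value):
    return (value > 0) - (value < 0)


def argmax(values):
    return max(range(len(values)), key=values.__getitem__)


def fgs_targeted(weights, x, target, alpha=0.1, max_steps=100):
    """Step against the gradient sign of the target loss until `target` is predicted."""
    adv = list(x)
    for step in range(1, max_steps + 1):
        grad = loss_gradient(weights, adv, target)
        # Descend the loss of the target label to raise its confidence.
        adv = [v - alpha * sign(g) for v, g in zip(adv, grad)]
        if argmax(predict(weights, adv)) == target:
            return adv, step
    return adv, max_steps


# Hypothetical 3-class, 2-feature model; push an input of class 0 towards class 2.
WEIGHTS = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
adversarial, steps = fgs_targeted(WEIGHTS, [0.62, 0.4], target=2)
```

Stopping at the first step where the target becomes the top prediction explains why the final confidence can be low, as in the Rock python sample above: the target only has to beat every other class, not reach a high probability.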

Gradient Ascent - Fooling Image Generation

In this operation we start with a random image, continuously update it with targeted backpropagation (for a certain class), and stop when we reach the target confidence for that class. All of the images below were generated with a pretrained AlexNet in order to fool it.

Predicted as Zebra (340)
Confidence: 0.94
Predicted as Bow tie (457)
Confidence: 0.95
Predicted as Castle (483)
Confidence: 0.99
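
The loop described in this section (random start, repeated targeted updates, stop at a target confidence) can be sketched as follows. This is a toy under stated assumptions, not the repository's implementation: a linear softmax classifier replaces AlexNet, the ascent is performed on the log-probability of the target class, and `WEIGHTS`, `lr`, `confidence`, and `fooling_input` are hypothetical names and values.

```python
import math
import random


def softmax(logits):
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights, x):
    """Class probabilities of a linear softmax classifier (toy stand-in for a CNN)."""
    return softmax([sum(w * v for w, v in zip(row, x)) for row in weights])


def fooling_input(weights, target, confidence=0.95, lr=0.5, max_steps=2000, seed=0):
    """Gradient ascent on log p[target], starting from a random input."""
    rng = random.Random(seed)
    x = [rng.random() for _ in weights[0]]  # random starting "image"
    for _ in range(max_steps):
        probs = predict(weights, x)
        if probs[target] >= confidence:
            break
        # Ascent direction for this model: W^T (onehot(target) - probs).
        residual = [(1.0 if k == target else 0.0) - p for k, p in enumerate(probs)]
        x = [
            v + lr * sum(r * row[j] for r, row in zip(residual, weights))
            for j, v in enumerate(x)
        ]
    return x, predict(weights, x)[target]


# Hypothetical 3-class, 2-feature model.
WEIGHTS = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
fooled, reached = fooling_input(WEIGHTS, target=1)
```

Nothing in the loop ties the result to a natural input, which is why the generated images can be unrecognizable while still being classified with high confidence.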

Gradient Ascent - Adversarial Image Generation

This operation works exactly the same as the previous one. The only important difference is that the learning rate is kept a bit smaller so that the image does not receive huge updates and continues to look like the original. As the samples show, on some images it is almost impossible to see the difference between the two images, while on others it is clearly visible that something is wrong. All of the examples below were created from and tested on AlexNet to fool it.

  • Original: predicted as Eel (390), confidence 0.96. Adversarial: predicted as Apple (948), confidence 0.95.
  • Original: predicted as Snowbird (13), confidence 0.99. Adversarial: predicted as Banjo (420), confidence 0.99.
  • Original: predicted as Abacus (398), confidence 0.99. Adversarial: predicted as Dumbbell (543), confidence 1.
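
The effect of the learning rate can be shown on a toy example. The sketch below is an assumption-laden illustration rather than the repository's code: it starts from a given input instead of a random one, uses a two-class linear softmax classifier in place of AlexNet, and compares how far a small and a large learning rate move the input before the target confidence is reached. `WEIGHTS`, `original`, and the learning rates are hypothetical.

```python
import math


def softmax(logits):
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights, x):
    """Class probabilities of a linear softmax classifier (toy stand-in for a CNN)."""
    return softmax([sum(w * v for w, v in zip(row, x)) for row in weights])


def gradient_ascent_attack(weights, x, target, lr, confidence=0.9, max_steps=1000):
    """Gradient ascent on log p[target], starting from an existing input."""
    adv = list(x)
    for _ in range(max_steps):
        probs = predict(weights, adv)
        if probs[target] >= confidence:
            break
        # Ascent direction for this model: W^T (onehot(target) - probs).
        residual = [(1.0 if k == target else 0.0) - p for k, p in enumerate(probs)]
        adv = [
            v + lr * sum(r * row[j] for r, row in zip(residual, weights))
            for j, v in enumerate(adv)
        ]
    return adv


def max_change(a, b):
    """Largest absolute per-value difference between two inputs."""
    return max(abs(p - q) for p, q in zip(a, b))


# Hypothetical 2-class, 2-feature model and an input initially classified as class 0.
WEIGHTS = [[1.0, 0.0], [0.0, 1.0]]
original = [0.62, 0.4]
small = gradient_ascent_attack(WEIGHTS, original, target=1, lr=0.1)
large = gradient_ascent_attack(WEIGHTS, original, target=1, lr=4.0)
```

Both runs reach the target confidence, but the large learning rate overshoots in a single step, while the small one stops close to the point where the confidence threshold is first crossed and therefore changes the input less.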

Requirements:

torch >= 0.2.0.post4
torchvision >= 0.1.9
numpy >= 1.13.0
opencv >= 3.1.0

References:

[1] I. J. Goodfellow, J. Shlens, C. Szegedy. Explaining and Harnessing Adversarial Examples https://arxiv.org/abs/1412.6572

[2] A. Nguyen, J. Yosinski, J. Clune. Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images https://arxiv.org/abs/1412.1897
