AutoEncoder with SSIM loss
This is a third-party implementation of the paper "Improving Unsupervised Defect Segmentation by Applying Structural Similarity to Autoencoders".
Requirements
tensorflow==2.2.0
scikit-image (imported as skimage)
Datasets
MVTec AD datasets https://www.mvtec.com/company/research/datasets/mvtec-ad/
Code examples
Step 1. Set the DATASET_PATH variable.
Set the DATASET_PATH to the root path of the downloaded MVTec AD dataset.
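As a quick sanity check before training, the sketch below verifies that a category folder under the dataset root contains the sub-folders that the MVTec AD archive ships with (train/good, test, ground_truth). The helper name missing_dirs and the example path are hypothetical and not part of this repository.

```python
from pathlib import Path

# Hypothetical value: replace with the folder that contains bottle/, cable/, ...
DATASET_PATH = Path("/path/to/mvtec_anomaly_detection")

# Sub-folders found in every MVTec AD category directory.
EXPECTED_SUBDIRS = ["train/good", "test", "ground_truth"]


def missing_dirs(root, category):
    """Return the expected sub-folders of `category` that are absent under `root`."""
    base = Path(root) / category
    return [sub for sub in EXPECTED_SUBDIRS if not (base / sub).is_dir()]
```

Calling missing_dirs(DATASET_PATH, "bottle") returns an empty list when the layout is complete, and the names of the missing folders otherwise.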
Step 2. Train SSIM-AE and Test.
- bottle object
python train.py --name bottle --loss ssim_loss --im_resize 266 --patch_size 256 --z_dim 500 --do_aug --p_rotate 0.
python test.py --name bottle --loss ssim_loss --im_resize 266 --patch_size 256 --z_dim 500 --bg_mask W
- cable object
python train.py --name cable --loss ssim_loss --im_resize 266 --patch_size 256 --z_dim 500 --do_aug --p_rotate 0. --p_horizonal_flip 0. --p_vertical_flip 0.
python test.py --name cable --loss ssim_loss --im_resize 266 --patch_size 256 --z_dim 500
- capsule object
python train.py --name capsule --loss ssim_loss --im_resize 266 --patch_size 256 --z_dim 500 --do_aug --p_rotate 0. --p_horizonal_flip 0. --p_vertical_flip 0.
python test.py --name capsule --loss ssim_loss --im_resize 266 --patch_size 256 --z_dim 500 --bg_mask W
- carpet texture
python train.py --name carpet --loss ssim_loss --im_resize 512 --patch_size 128 --z_dim 100 --do_aug --rotate_angle_vari 10
python test.py --name carpet --loss ssim_loss --im_resize 512 --patch_size 128 --z_dim 100
- grid texture
python train.py --name grid --loss ssim_loss --im_resize 256 --patch_size 128 --z_dim 100 --grayscale --do_aug
python test.py --name grid --loss ssim_loss --im_resize 256 --patch_size 128 --z_dim 100 --grayscale
- hazelnut object
python train.py --name hazelnut --loss ssim_loss --im_resize 266 --patch_size 256 --z_dim 500 --do_aug --p_rotate_crop 0.
python test.py --name hazelnut --loss ssim_loss --im_resize 266 --patch_size 256 --z_dim 500 --bg_mask B
- leather texture
python train.py --name leather --loss ssim_loss --im_resize 256 --patch_size 128 --z_dim 100 --do_aug
python test.py --name leather --loss ssim_loss --im_resize 256 --patch_size 128 --z_dim 100
- metal_nut object
python train.py --name metal_nut --loss ssim_loss --im_resize 266 --patch_size 256 --z_dim 500 --do_aug --p_rotate_crop 0. --p_horizonal_flip 0. --p_vertical_flip 0.
python test.py --name metal_nut --loss ssim_loss --im_resize 266 --patch_size 256 --z_dim 500 --bg_mask B
- pill object
python train.py --name pill --loss ssim_loss --im_resize 266 --patch_size 256 --z_dim 500 --do_aug --p_rotate 0. --p_horizonal_flip 0. --p_vertical_flip 0.
python test.py --name pill --loss ssim_loss --im_resize 266 --patch_size 256 --z_dim 500 --bg_mask B
- screw object
python train.py --name screw --loss ssim_loss --im_resize 266 --patch_size 256 --z_dim 500 --grayscale --do_aug --p_rotate 0.
python test.py --name screw --loss ssim_loss --im_resize 266 --patch_size 256 --z_dim 500 --grayscale --bg_mask W
- tile texture
python train.py --name tile --loss ssim_loss --im_resize 256 --patch_size 128 --z_dim 100 --do_aug
python test.py --name tile --loss ssim_loss --im_resize 256 --patch_size 128 --z_dim 100
- toothbrush object
python train.py --name toothbrush --loss ssim_loss --im_resize 266 --patch_size 256 --z_dim 500 --do_aug --p_rotate 0. --p_vertical_flip 0.
python test.py --name toothbrush --loss ssim_loss --im_resize 266 --patch_size 256 --z_dim 500
- transistor object
python train.py --name transistor --loss ssim_loss --im_resize 266 --patch_size 256 --z_dim 500 --do_aug --p_rotate 0. --p_vertical_flip 0.
python test.py --name transistor --loss ssim_loss --im_resize 266 --patch_size 256 --z_dim 500
- wood texture
python train.py --name wood --loss ssim_loss --im_resize 256 --patch_size 128 --z_dim 100 --do_aug --rotate_angle_vari 15
python test.py --name wood --loss ssim_loss --im_resize 256 --patch_size 128 --z_dim 100
- zipper object
python train.py --name zipper --loss ssim_loss --im_resize 266 --patch_size 256 --z_dim 500 --grayscale --do_aug --p_rotate 0.
python test.py --name zipper --loss ssim_loss --im_resize 266 --patch_size 256 --z_dim 500 --grayscale
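Several categories share most of their flags, so the commands can also be generated rather than typed. The sketch below rebuilds the train and test commands for leather, tile, and wood exactly as listed above. The helper names build_commands and run_category are hypothetical, and the other categories would need their own flag sets.

```python
import subprocess

# Flags shared by the leather, tile, and wood commands listed above.
TEXTURE_FLAGS = ["--loss", "ssim_loss", "--im_resize", "256",
                 "--patch_size", "128", "--z_dim", "100"]

# Extra training-only flags per category, copied from the commands above.
EXTRA_TRAIN_FLAGS = {
    "leather": [],
    "tile": [],
    "wood": ["--rotate_angle_vari", "15"],
}


def build_commands(name):
    """Return (train_cmd, test_cmd) as argument lists for one texture category."""
    common = ["--name", name] + TEXTURE_FLAGS
    train = ["python", "train.py"] + common + ["--do_aug"] + EXTRA_TRAIN_FLAGS[name]
    test = ["python", "test.py"] + common
    return train, test


def run_category(name):
    """Train, then test, one category; stops if either command fails."""
    for cmd in build_commands(name):
        subprocess.run(cmd, check=True)
```

For example, run_category("wood") runs the two wood commands shown above, one after the other.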
Overview of Results
Classification
At test time, I simply classify an image as defective if there is any anomalous response on the residual map. This rule is strict for anomaly-free images and leads to relatively low accuracy in the ok column of the table below.
Note that the threshold strongly affects the outcome and should be selected carefully.
category | ok (%) | nok (%) | average (%) |
bottle | 90.0 | 98.4 | 96.4 |
cable | 0.0 | 45.7 | 28.0 |
capsule | 34.8 | 89.6 | 78.0 |
carpet | 42.9 | 98.9 | 88.9 |
grid | 100 | 94.7 | 96.2 |
hazelnut | 55.0 | 98.6 | 82.7 |
leather | 71.9 | 92.4 | 87.1 |
metal nut | 22.7 | 67.7 | 59.1 |
pill | 11.5 | 75.9 | 65.9 |
screw | 0.5 | 90.0 | 68.1 |
tile | 100.0 | 3.6 | 30.8 |
toothbrush | 83.3 | 100 | 95.2 |
transistor | 23.3 | 97.5 | 53.0 |
wood | 89.5 | 76.7 | 79.7 |
zipper | 68.8 | 81.5 | 78.8 |
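The image-level decision rule described in this section can be sketched as follows. This is a minimal illustration, not the repository's test code: the function name classify_image is hypothetical, and the residual map and threshold are assumed to be on the same scale.

```python
import numpy as np


def classify_image(residual_map, threshold):
    """Return "nok" if any pixel of the residual map exceeds the threshold, else "ok"."""
    anomalous = np.asarray(residual_map) > threshold
    return "nok" if anomalous.any() else "ok"
```

Because a single pixel above the threshold is enough to flag the whole image, a slightly too low threshold turns many anomaly-free images into false alarms, which is why the ok accuracy is so sensitive to this value.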
Discussion
- SSIM + L1 metrics
Since SSIM measures similarity between grayscale images only, it can miss color defects in some cases. I therefore use SSIM + L1 distance for anomaly segmentation.
- VAE
I also tried a VAE and observed no performance improvement.
- InstanceNorm
I also tried adding an instance normalization (IN) layer to accelerate convergence, but droplet artifacts appear in some cases. This effect is also mentioned and discussed in the StyleGAN2 paper.
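To illustrate the SSIM + L1 combination from the first point, here is a minimal NumPy sketch of a per-pixel residual map. It is not the code used in this repository: the uniform (non-Gaussian) window, the channel-mean grayscale conversion, the unweighted sum, and the names ssim_map and residual_map are all assumptions made for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def ssim_map(x, y, win=11, data_range=1.0):
    """Per-pixel SSIM of two grayscale images, uniform window, valid region only."""
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    wx = sliding_window_view(x, (win, win))
    wy = sliding_window_view(y, (win, win))
    mu_x = wx.mean(axis=(-2, -1))
    mu_y = wy.mean(axis=(-2, -1))
    var_x = wx.var(axis=(-2, -1))
    var_y = wy.var(axis=(-2, -1))
    cov = (wx * wy).mean(axis=(-2, -1)) - mu_x * mu_y
    num = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return num / den


def residual_map(image, recon, win=11):
    """SSIM + L1 residual for float RGB images in [0, 1] with shape (H, W, 3)."""
    if win % 2 == 0:
        raise ValueError("win must be odd so both terms can be aligned")
    # Structural term: computed on grayscale (channel mean), so it ignores hue.
    dssim = 1.0 - ssim_map(image.mean(axis=-1), recon.mean(axis=-1), win)
    # Color term: mean absolute difference over the three channels.
    l1 = np.abs(image - recon).mean(axis=-1)
    pad = win // 2
    l1 = l1[pad:-pad, pad:-pad]  # crop to the valid SSIM region
    return dssim + l1
```

A defect that shifts the color while keeping the channel mean unchanged leaves the SSIM term close to zero, and only the L1 term responds to it.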
Supplementary materials
My notes https://www.yuque.com/books/share/8c7613f7-7571-4bfa-865a-689de3763c59?#
password ixgg
References
@inproceedings{inproceedings, author = {Bergmann, Paul and Löwe, Sindy and Fauser, Michael and Sattlegger, David and Steger, Carsten}, year = {2019}, month = {01}, pages = {372-380}, title = {Improving Unsupervised Defect Segmentation by Applying Structural Similarity to Autoencoders}, doi = {10.5220/0007364503720380} }
Paul Bergmann, Michael Fauser, David Sattlegger, Carsten Steger. MVTec AD - A Comprehensive Real-World Dataset for Unsupervised Anomaly Detection; in: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2019