SUNet: Swin Transformer with UNet for Image Denoising
[ISCAS 2022]Chi-Mao Fan, Tsung-Jung Liu, Kuan-Hsien Liu
Abstract : Image restoration is a challenging ill-posed problem which also has been a long-standing issue. In the past few years, the convolution neural networks (CNNs) almost dominated the computer vision and had achieved considerable success in different levels of vision tasks including image restoration. However, recently the Swin Transformer-based model also shows impressive performance, even surpasses the CNN-based methods to become the state-of-the-art on high-level vision tasks. In this paper, we proposed a restoration model called SUNet which uses the Swin Transformer layer as our basic block and then is applied to UNet architecture for image denoising.
Network Architecture
Overall Framework of SUNet |
|
Swin Transformer Layer |
Dual up-sample |
Quick Run
You can directly run personal noised images on my space of HuggingFce.
To test the pre-trained models of denoising on your own 256x256 images, run
python demo.py --input_dir images_folder_path --result_dir save_images_here --weights path_to_models
Here is an example command:
python demo.py --input_dir './demo_samples/' --result_dir './demo_results' --weights './pretrained_model/denoising_model.pth'
To test the pre-trained models of denoising on your arbitrary resolution images, run
python demo_any_resolution.py --input_dir images_folder_path --stride shifted_window_stride --result_dir save_images_here --weights path_to_models
SUNset could only handle the fixed size input which the resolution in training phase same as the mostly transformer-based methods because of the attention masks are fixed. If we want to denoise the arbitrary resolution input, the shifted-window method will be applied to avoid border effect. The code of demo_any_resolution.py
is supported to fix the problem.
Train
To train the restoration models of Denoising. You should check the following components:
-
training.yaml
:# Training configuration GPU: [0,1,2,3] VERBOSE: False SWINUNET: IMG_SIZE: 256 PATCH_SIZE: 4 WIN_SIZE: 8 EMB_DIM: 96 DEPTH_EN: [8, 8, 8, 8] HEAD_NUM: [8, 8, 8, 8] MLP_RATIO: 4.0 QKV_BIAS: True QK_SCALE: 8 DROP_RATE: 0. ATTN_DROP_RATE: 0. DROP_PATH_RATE: 0.1 APE: False PATCH_NORM: True USE_CHECKPOINTS: False FINAL_UPSAMPLE: 'Dual up-sample' MODEL: MODE: 'Denoising' # Optimization arguments. OPTIM: BATCH: 4 EPOCHS: 500 # EPOCH_DECAY: [10] LR_INITIAL: 2e-4 LR_MIN: 1e-6 # BETA1: 0.9 TRAINING: VAL_AFTER_EVERY: 1 RESUME: False TRAIN_PS: 256 VAL_PS: 256 TRAIN_DIR: './datasets/Denoising_DIV2K/train' # path to training data VAL_DIR: './datasets/Denoising_DIV2K/test' # path to validation data SAVE_DIR: './checkpoints' # path to save models and images
-
Dataset:
The preparation of dataset in more detail, see datasets/README.md. -
Train:
If the above path and data are all correctly setting, just simply run:python train.py
Result
Visual Comparison
Citation
If you use SUNet, please consider citing:
@inproceedings{fan2022sunet,
title={SUNet: swin transformer UNet for image denoising},
author={Fan, Chi-Mao and Liu, Tsung-Jung and Liu, Kuan-Hsien},
booktitle={2022 IEEE International Symposium on Circuits and Systems (ISCAS)},
pages={2333--2337},
year={2022},
organization={IEEE}
}