FreeMask
This codebase provides the official PyTorch implementation of our NeurIPS 2023 paper:
FreeMask: Synthetic Images with Dense Annotations Make Stronger Segmentation Models
Lihe Yang, Xiaogang Xu, Bingyi Kang, Yinghuan Shi, Hengshuang Zhao
In Conference on Neural Information Processing Systems (NeurIPS), 2023
[Paper] [Datasets] [Models] [Logs] [BibTeX]
TL;DR
We generate diverse synthetic images from semantic masks, and use these synthetic pairs to boost the fully-supervised semantic segmentation performance.
Results
ADE20K
Model | Backbone | Real Images | + Synthetic Images | Gain | Download |
---|---|---|---|---|---|
Mask2Former | Swin-T | 48.7 | 52.0 | +3.3 | ckpt / log |
Mask2Former | Swin-S | 51.6 | 53.3 | +1.7 | ckpt / log |
Mask2Former | Swin-B | 52.4 | 53.7 | +1.3 | ckpt / log |
SegFormer | MiT-B2 | 45.6 | 47.9 | +2.3 | ckpt / log |
SegFormer | MiT-B4 | 48.5 | 50.6 | +2.1 | ckpt / log |
Segmenter | ViT-S | 46.2 | 47.9 | +1.7 | ckpt / log |
Segmenter | ViT-B | 49.6 | 51.1 | +1.5 | ckpt / log |
COCO-Stuff-164K
Model | Backbone | Real Images | + Synthetic Images | Gain | Download |
---|---|---|---|---|---|
Mask2Former | Swin-T | 44.5 | 46.4 | +1.9 | ckpt / log |
Mask2Former | Swin-S | 46.8 | 47.6 | +0.8 | ckpt / log |
SegFormer | MiT-B2 | 43.5 | 44.2 | +0.7 | ckpt / log |
SegFormer | MiT-B4 | 45.8 | 46.6 | +0.8 | ckpt / log |
Segmenter | ViT-S | 43.5 | 44.8 | +1.3 | ckpt / log |
Segmenter | ViT-B | 46.0 | 47.5 | +1.5 | ckpt / log |
High-Quality Synthetic Datasets
We share our processed synthetic ADE20K and COCO-Stuff-164K datasets below. The ADE20K-Synthetic dataset is 20x larger than its real counterpart, and the COCO-Synthetic dataset is 6x larger than its real counterpart.
Getting Started
Installation
Install MMSegmentation:
pip install -U openmim
mim install mmengine
mim install "mmcv>=2.0.0"
pip install "mmsegmentation>=1.0.0"
pip install "mmdet>=3.0.0rc4"
Download Real Datasets
Follow the instructions to download the ADE20K and COCO-Stuff-164K real datasets. The COCO annotations need to be pre-processed following the instructions.
Download Synthetic Datasets
Please see above.
Note:
- Please modify the dataset paths data_root (real images) and data_root_syn (synthetic images) in the config files.
- If you use SegFormer, please convert the pre-trained MiT backbones following this, and put mit_b2.pth and mit_b4.pth under the pretrain directory.
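As a rough illustration of the first note, the two path variables might look like the fragment below. MMSegmentation configs are Python files; both directory values here are hypothetical placeholders, and where exactly these variables sit in each config is something to check in the config file itself.

```python
# Hypothetical placeholder paths; replace them with your own locations.
data_root = 'path/to/real_dataset'            # real images and annotations
data_root_syn = 'path/to/synthetic_dataset'   # synthetic images and masks
```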
Usage
bash dist_train.sh <config> 8
Generate and Pre-process Synthetic Images (Optional)
We have provided the processed synthetic images above, and you can use them directly to train a stronger segmentation model. If you want to generate additional images yourself, the generation and pre-processing steps are described below.
Generate Synthetic Images
We strictly follow FreestyleNet for initial image generation. Please refer to their instructions. You can change the random seed to produce multiple synthetic images from a semantic mask.
Pre-process Synthetic Images
Our work focuses on this part.
Filter out Noisy Synthetic Regions
python preprocess/filter.py <config> <checkpoint> --real-img-path <> --real-mask-path <> --syn-img-path <> --syn-mask-path <> --filtered-mask-path <>
We use the pre-trained SegFormer-B4 model to calculate class-wise mean loss on real images and then filter out noisy synthetic regions.
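The filtering idea described above can be sketched in a few lines of NumPy. This is a minimal illustration and not the code in preprocess/filter.py: the function names, the `tolerance` margin, and the use of 255 as the ignore label are all assumptions made for the example, and the per-pixel loss maps are taken as given (in practice they would come from the pre-trained segmentation model).

```python
import numpy as np

IGNORE_INDEX = 255  # assumed "ignore" label for filtered-out pixels


def class_mean_losses(loss_maps, masks, num_classes):
    """Average the per-pixel loss of every class over a set of real images."""
    loss_sum = np.zeros(num_classes, dtype=np.float64)
    pixel_count = np.zeros(num_classes, dtype=np.int64)
    for loss, mask in zip(loss_maps, masks):
        valid = mask != IGNORE_INDEX
        labels = mask[valid]
        loss_sum += np.bincount(labels, weights=loss[valid], minlength=num_classes)
        pixel_count += np.bincount(labels, minlength=num_classes)
    # Classes that never appear keep a mean loss of 0.
    return loss_sum / np.maximum(pixel_count, 1)


def filter_synthetic_mask(loss_map, mask, mean_losses, tolerance=1.25):
    """Set synthetic pixels to IGNORE_INDEX when their loss exceeds
    tolerance * (mean loss of their class on real images).
    The tolerance value is a hypothetical default."""
    filtered = mask.copy()
    valid = mask != IGNORE_INDEX
    threshold = np.zeros(mask.shape, dtype=np.float64)
    threshold[valid] = mean_losses[mask[valid]] * tolerance
    noisy = valid & (loss_map > threshold)
    filtered[noisy] = IGNORE_INDEX
    return filtered
```

Writing the filtered pixels as an ignore label means they contribute nothing to the training loss, while the remaining pixels of the same synthetic image are still used.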
Re-sample Synthetic Images based on Mask-level Hardness
python preprocess/resample.py --real-mask-path <> --syn-img-path <> --syn-mask-path <> --resampled-syn-img-path <> --resampled-syn-mask-path <>
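One plausible reading of hardness-based re-sampling is sketched below; it is not taken from preprocess/resample.py. It assumes that each class has a hardness score (for example, its mean loss on real images), that a mask's hardness is the pixel-weighted average of those scores, and that harder masks receive proportionally more synthetic images. The function names and the proportional allocation rule are hypothetical.

```python
import numpy as np

IGNORE_INDEX = 255  # assumed "ignore" label


def mask_hardness(mask, class_hardness):
    """Pixel-weighted mean of class hardness over the valid pixels of one mask."""
    valid = mask != IGNORE_INDEX
    if not valid.any():
        return 0.0
    return float(class_hardness[mask[valid]].mean())


def resample_counts(hardness, total):
    """Split `total` synthetic images across masks in proportion to hardness.
    Assumes at least one mask has positive hardness."""
    hardness = np.asarray(hardness, dtype=np.float64)
    weights = hardness / hardness.sum()
    counts = np.floor(weights * total).astype(np.int64)
    # Give the images lost to rounding to the masks with the largest remainders.
    remainder = weights * total - counts
    for idx in np.argsort(-remainder)[: total - counts.sum()]:
        counts[idx] += 1
    return counts
```

With class hardness `[1.0, 3.0]`, a mask containing only class 0 gets hardness 1.0, and calling `resample_counts([1.0, 3.0], 8)` assigns 2 images to the easier mask and 6 to the harder one.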
Acknowledgment
We thank FreestyleNet for providing their mask-to-image synthesis models.
Citation
If you find this project useful, please consider citing:
@inproceedings{freemask,
title={FreeMask: Synthetic Images with Dense Annotations Make Stronger Segmentation Models},
author={Yang, Lihe and Xu, Xiaogang and Kang, Bingyi and Shi, Yinghuan and Zhao, Hengshuang},
booktitle={NeurIPS},
year={2023}
}