[ICCV2023] DiffuMask: Synthesizing Images with Pixel-level Annotations for Semantic Segmentation Using Diffusion Models

DiffuMask (ICCV 2023)


🛠️ Getting Started with DiffuMask

Conda env installation

conda create -n DiffuMask python=3.8

conda activate DiffuMask
# install pydensecrf (https://github.com/lucasb-eyer/pydensecrf)
pip install git+https://github.com/lucasb-eyer/pydensecrf.git

pip install -r requirements.txt
If you encounter the error "cannot import name 'autocast' from 'torch'", please refer to:

https://github.com/pesser/stable-diffusion/issues/14

1. Data and mask generation

# Generate data and attention maps with Stable Diffusion.
# Before generating the data, replace the "huggingface key" in the "VOC_data_generation.sh" script with your own key.
sh ./script/DiffusionGeneration/VOC_data_generation.sh
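
The attention maps generated in this step serve as the starting point for the coarse masks refined in the next step. The snippet below is a minimal sketch of that idea, assuming an attention map is available as a 2-D NumPy array. The function name `attention_to_mask` and the threshold of 0.5 are illustrative assumptions and are not taken from the repository scripts.

```python
import numpy as np

def attention_to_mask(attn, threshold=0.5):
    """Min-max normalise a 2-D attention map and binarise it.

    Hypothetical helper: the real pipeline may normalise or threshold differently.
    """
    attn = np.asarray(attn, dtype=np.float32)
    lo, hi = attn.min(), attn.max()
    if hi - lo < 1e-8:
        # A flat map carries no foreground evidence, so return an empty mask.
        return np.zeros(attn.shape, dtype=np.uint8)
    norm = (attn - lo) / (hi - lo)
    return (norm >= threshold).astype(np.uint8)

# Toy 3x3 attention map with a bright region in the lower right.
attn = np.array([[0.1, 0.2, 0.1],
                 [0.2, 0.9, 0.8],
                 [0.1, 0.7, 0.2]])
mask = attention_to_mask(attn)
print(mask)
```

A fixed threshold like this usually gives only a rough mask, which is why a refinement stage follows.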

2. Refine Mask with AffinityNet (Coarse Mask)

We also provide AffinityNet weights for the 'dog' class and the 'bird' class on Google Drive.

# prepare training data for affinity net
sh ./script/prepare_aff_data.sh

# train affinity net
# Before training, download the ResNet-38 ImageNet pre-trained weights and place them in the "./pretrained_model" directory.
sh ./script/train_affinity.sh

# run inference with affinity net
sh ./script/infer_aff.sh

# generate accurate pseudo label with CRF
sh ./script/curve_threshold.sh

3. Noise Learning (Cross Validation)

At this stage, it is necessary to train Mask2Former using cross-validation to filter out noisy data. Before training Mask2Former, data augmentation needs to be performed on the dataset.

sh ./script/augmentation_VOC.sh
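
The augmentation step produces concatenated images. As a rough illustration, and not the logic of `augmentation_VOC.sh` itself, the sketch below tiles four equally sized images and their masks into a 2x2 mosaic. The function name `concat_2x2` and the 2x2 layout are assumptions made for this example.

```python
import numpy as np

def concat_2x2(images, masks):
    """Tile four equally sized images (H, W, 3) and masks (H, W) into one mosaic.

    Hypothetical helper: the repository script may use a different layout.
    """
    if len(images) != 4 or len(masks) != 4:
        raise ValueError("expected exactly four images and four masks")
    # Build the top and bottom rows, then stack them vertically.
    top_img = np.concatenate([images[0], images[1]], axis=1)
    bot_img = np.concatenate([images[2], images[3]], axis=1)
    top_msk = np.concatenate([masks[0], masks[1]], axis=1)
    bot_msk = np.concatenate([masks[2], masks[3]], axis=1)
    big_img = np.concatenate([top_img, bot_img], axis=0)
    big_msk = np.concatenate([top_msk, bot_msk], axis=0)
    return big_img, big_msk

# Toy data: four constant 2x2 "images", with masks alternating between 0 and 1.
images = [np.full((2, 2, 3), i, dtype=np.uint8) for i in range(4)]
masks = [np.full((2, 2), i % 2, dtype=np.uint8) for i in range(4)]
big_img, big_mask = concat_2x2(images, masks)
print(big_img.shape, big_mask.shape)
```

The same concatenation is applied to the image and its mask so that pixel-level labels stay aligned.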

To start training the model, please note the following points:

  • In this repository, only data for one class (airplanes) is generated. During the training process, it is recommended to include other categories as negative samples. You can generate data for other categories or directly use other categories from the VOC dataset. The corresponding masks are not required, and the labels for other categories can be set to 0.

  • During training, use both the augmented data (concatenated images) and the original data (original images).
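
The cross-validation filtering described in this section can be sketched as follows. This is a simplified illustration under stated assumptions: `train_and_predict` is a hypothetical stand-in for training Mask2Former on the training folds and predicting masks for the held-out fold, and the IoU criterion with a threshold of 0.5 is an assumed filtering rule, not a value taken from the repository.

```python
import numpy as np

def mask_iou(pred, label):
    """Intersection over union of two binary masks."""
    pred, label = np.asarray(pred, dtype=bool), np.asarray(label, dtype=bool)
    union = np.logical_or(pred, label).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return float(np.logical_and(pred, label).sum() / union)

def filter_noisy(labels, train_and_predict, k=3, iou_threshold=0.5):
    """Return sorted indices of samples whose label agrees with held-out predictions.

    train_and_predict(train_idx, test_idx) is a hypothetical callable that
    trains on train_idx and returns one predicted mask per index in test_idx.
    """
    indices = np.arange(len(labels))
    keep = []
    for test_idx in np.array_split(indices, k):
        train_idx = np.setdiff1d(indices, test_idx)
        preds = train_and_predict(train_idx, test_idx)
        for i, pred in zip(test_idx, preds):
            if mask_iou(pred, labels[i]) >= iou_threshold:
                keep.append(int(i))
    return sorted(keep)

# Toy data: six samples, one of which (index 4) has a wrong pseudo label.
true_mask = np.zeros((4, 4), dtype=np.uint8)
true_mask[1:3, 1:3] = 1
noisy_mask = np.zeros((4, 4), dtype=np.uint8)
noisy_mask[0, 0] = 1
labels = [true_mask] * 6
labels[4] = noisy_mask

def dummy_train_and_predict(train_idx, test_idx):
    # Stand-in model that always predicts the true mask.
    return [true_mask for _ in test_idx]

kept = filter_noisy(labels, dummy_train_and_predict)
print(kept)
```

Each sample is scored only by a model that never saw it during training, so a label that disagrees with the prediction is more likely to be noisy than memorised.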

4. Training Mask2Former with clean data

We provide synthetic data for the "dog" category on Baidu Drive (password: 53rb) and for the "bird" category on Baidu Drive (password: 8v7q). Feel free to use it.

Citation

@article{wu2023diffumask,
  title={Diffumask: Synthesizing images with pixel-level annotations for semantic segmentation using diffusion models},
  author={Wu, Weijia and Zhao, Yuzhong and Shou, Mike Zheng and Zhou, Hong and Shen, Chunhua},
  journal={Proc. Int. Conf. Computer Vision (ICCV 2023)},
  year={2023}
}
