[CVPR 2023] Official repository of Generative Semantic Segmentation

Generative Semantic Segmentation

Paper

Generative Semantic Segmentation,
Jiaqi Chen, Jiachen Lu, Xiatian Zhu, and Li Zhang
CVPR 2023

Abstract

We present Generative Semantic Segmentation (GSS), a generative framework for semantic segmentation. Unlike previous methods addressing a per-pixel classification problem, we cast semantic segmentation as an image-conditioned mask generation problem. This is achieved by replacing the conventional per-pixel discriminative learning with a latent prior learning process. Specifically, we model the variational posterior distribution of latent variables given the segmentation mask. This is done by expressing the segmentation mask with a special type of image (dubbed maskige). This posterior distribution allows segmentation masks to be generated unconditionally. To implement semantic segmentation, we further introduce a conditioning network (e.g., an encoder-decoder Transformer) optimized by minimizing the divergence between the posterior distribution of maskige (i.e., segmentation masks) and the latent prior distribution of input images on the training set. Extensive experiments on standard benchmarks show that our GSS can perform competitively with prior art alternatives in the standard semantic segmentation setting, whilst achieving a new state of the art in the more challenging cross-domain setting.
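The divergence-minimization objective described in the abstract can be illustrated with a toy calculation. The sketch below assumes discrete latent codes and uses the KL divergence as the divergence measure; the abstract states neither choice, and the probabilities are made-up numbers rather than outputs of GSS.

```python
import math


def kl_divergence(q, p, eps=1e-12):
    """Return KL(q || p) for two categorical distributions given as lists."""
    if len(q) != len(p):
        raise ValueError("distributions must have the same number of categories")
    total = 0.0
    for qi, pi in zip(q, p):
        if qi > 0.0:  # terms with q_i = 0 contribute nothing
            total += qi * math.log(qi / max(pi, eps))
    return total


# Hypothetical distributions over 4 latent codes at one spatial position.
posterior_from_mask = [0.7, 0.1, 0.1, 0.1]      # stands in for the maskige posterior
prior_uninformed = [0.25, 0.25, 0.25, 0.25]     # image-conditioned prior before training
prior_closer = [0.6, 0.2, 0.1, 0.1]             # prior after moving toward the posterior

print(kl_divergence(posterior_from_mask, prior_uninformed))
print(kl_divergence(posterior_from_mask, prior_closer))
```

The second value is smaller than the first, which is the direction a conditioning network would be pushed in under this assumed objective.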

Results

Cityscapes dataset

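| Name | Backbone | Iterations | mIoU | mAcc | Config | Checkpoint |
|------|----------|------------|------|------|--------|------------|
| GSS-FF | R101 | 80k | 77.76 | 85.9 | config | google drive |
| GSS-FF | Swin-L | 80k | 78.90 | 87.03 | config | google drive |
| GSS-FT-W | ResNet | 80k | 78.46 | 85.92 | config | google drive |
| GSS-FT-W | Swin-L | 80k | 80.05 | 87.32 | config | google drive |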

ADE20K dataset

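| Name | Backbone | Iterations | mIoU | mAcc | Config | Checkpoint |
|------|----------|------------|------|------|--------|------------|
| GSS-FF | Swin-L | 160k | 46.29 | 57.84 | config | google drive |
| GSS-FT-W | Swin-L | 160k | 48.54 | 58.94 | config | google drive |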

MSeg dataset

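| Name | Backbone | Iterations | h.mean | Config | Checkpoint |
|------|----------|------------|--------|--------|------------|
| GSS-FF | HRNet-W48 | 160k | 52.60 | config | google drive |
| GSS-FF | Swin-L | 160k | 59.49 | config | google drive |
| GSS-FT-W | HRNet-W48 | 160k | 55.20 | config | google drive |
| GSS-FT-W | Swin-L | 160k | 61.94 | config | google drive |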
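For readers unfamiliar with the metrics reported above, the following sketch shows how mIoU and mAcc are conventionally computed from a confusion matrix. It also assumes that `h.mean` in the MSeg table denotes a harmonic mean of per-dataset scores, which this README does not state. The function names and the toy confusion matrix are illustrative and not taken from the repository.

```python
def per_class_stats(confusion):
    """Return (ious, accs) from a square confusion matrix.

    confusion[i][j] counts pixels with ground-truth class i predicted as class j.
    """
    n = len(confusion)
    ious, accs = [], []
    for c in range(n):
        tp = confusion[c][c]
        gt = sum(confusion[c])                         # pixels labelled c
        pred = sum(confusion[r][c] for r in range(n))  # pixels predicted as c
        union = gt + pred - tp
        ious.append(tp / union if union else float("nan"))
        accs.append(tp / gt if gt else float("nan"))
    return ious, accs


def nanmean(values):
    """Mean over the entries that are not NaN (classes absent from the data)."""
    valid = [v for v in values if v == v]  # NaN is the only value not equal to itself
    return sum(valid) / len(valid) if valid else float("nan")


def harmonic_mean(values):
    """Harmonic mean of strictly positive scores."""
    return len(values) / sum(1.0 / v for v in values)


# Toy two-class example with made-up pixel counts.
ious, accs = per_class_stats([[8, 2], [1, 9]])
print("mIoU:", nanmean(ious))
print("mAcc:", nanmean(accs))
print("harmonic mean of two hypothetical dataset scores:", harmonic_mean([50.0, 100.0]))
```

The harmonic mean is dominated by the lowest score, so a model cannot obtain a high aggregate by excelling on one dataset while failing on another.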

Get Started

Install

This implementation is built upon mmsegmentation. Please follow the steps in INSTALL.md to prepare the environment.

Data

Please follow the steps in DATA.md to prepare the dataset.

Train

The training process is divided into three stages:

  1. latent posterior learning of $\mathcal{X}$;
  2. latent prior learning (Train GSS-FF);
  3. latent posterior learning of $\mathcal{X}^{-1}$ (Train GSS-FT-W).

See TRAIN.md for more information.
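Stages 1 and 3 refer to $\mathcal{X}$ and $\mathcal{X}^{-1}$ without defining them here. One plausible reading, treated as an assumption in the sketch below, is that $\mathcal{X}$ maps a class-index mask to a three-channel maskige and $\mathcal{X}^{-1}$ maps a maskige back to a mask. The fixed palette and nearest-colour decoding are illustrative stand-ins, not the transforms used in this repository.

```python
# Hypothetical palette: one RGB colour per class index.
PALETTE = [(0, 0, 0), (255, 0, 0), (0, 255, 0), (0, 0, 255)]


def mask_to_maskige(mask, palette=PALETTE):
    """Encode a 2D list of class indices as a 2D list of RGB tuples."""
    return [[palette[c] for c in row] for row in mask]


def maskige_to_mask(maskige, palette=PALETTE):
    """Decode RGB tuples back to class indices by nearest palette colour."""
    def nearest(pixel):
        return min(
            range(len(palette)),
            key=lambda k: sum((a - b) ** 2 for a, b in zip(pixel, palette[k])),
        )
    return [[nearest(p) for p in row] for row in maskige]


mask = [[0, 1], [2, 3]]
maskige = mask_to_maskige(mask)
print(maskige_to_mask(maskige) == mask)

# A slightly perturbed colour, as a generated maskige might contain.
print(maskige_to_mask([[(250, 10, 5)]]))
```

Nearest-colour decoding tolerates small colour errors in a generated maskige, which is one reason well-separated class colours are attractive in a scheme like this.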

Eval

Please download the pre-trained model weights and put them in the ./<ckp_dir> folder. The following script evaluates GSS:

bash tools/dist_test.sh ${CONFIG_FILE} ${CHECKPOINT_FILE} ${GPU_NUM} --eval mIoU

For example, to evaluate the GSS-FF model on the Cityscapes dataset, run:

# test with 8 GPUs
bash tools/dist_test.sh configs/gss/cityscapes/gss-ff_r101_768x768_80k_cityscapes.py ./<ckp_dir>/gss-ff_r101_768x768_80k_cityscapes_iter_80000.pth 8 --eval mIoU
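Because the config and the checkpoint are passed as separate arguments, it is easy to pair the wrong files. The sketch below assembles the evaluation command and checks that the checkpoint file name starts with the config file's stem. That naming convention is an assumption read off the example file names, and the helper names and the `./checkpoints` directory are hypothetical.

```python
import shlex
from pathlib import PurePosixPath


def build_eval_command(config, checkpoint, gpus, metric="mIoU"):
    """Return the argument list for the distributed evaluation script."""
    return ["bash", "tools/dist_test.sh", config, checkpoint, str(gpus), "--eval", metric]


def names_agree(config, checkpoint):
    """True if the checkpoint file name begins with the config file's stem."""
    stem = PurePosixPath(config).stem
    return PurePosixPath(checkpoint).name.startswith(stem)


config = "configs/gss/cityscapes/gss-ff_r101_768x768_80k_cityscapes.py"
# Hypothetical local path; substitute wherever the weights were saved.
checkpoint = "./checkpoints/gss-ff_r101_768x768_80k_cityscapes_iter_80000.pth"

if names_agree(config, checkpoint):
    print(shlex.join(build_eval_command(config, checkpoint, gpus=8)))
else:
    print("config and checkpoint appear to describe different models")
```

The argument list can be passed to `subprocess.run` from the repository root once the paths point at real files.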

Reference

@inproceedings{chen2023generative,
  title={Generative Semantic Segmentation},
  author={Chen, Jiaqi and Lu, Jiachen and Zhu, Xiatian and Zhang, Li},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2023}
}
