
[CVPR 2021] Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers

SEgmentation TRansformers -- SETR

SETR

Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers, Sixiao Zheng, Jiachen Lu, Hengshuang Zhao, Xiatian Zhu, Zekun Luo, Yabiao Wang, Yanwei Fu, Jianfeng Feng, Tao Xiang, Philip HS Torr, Li Zhang, CVPR 2021

Vision Transformers: From Semantic Segmentation to Dense Prediction, Li Zhang, Jiachen Lu, Sixiao Zheng, Xinxuan Zhao, Xiatian Zhu, Yanwei Fu, Tao Xiang, Jianfeng Feng

SETR

Cityscapes

| Method | Crop size | Batch size | Iterations | Set | mIoU | Model | Config |
| --- | --- | --- | --- | --- | --- | --- | --- |
| SETR-Naive | 768x768 | 8 | 40k | val | 77.37 | google drive | config |
| SETR-Naive | 768x768 | 8 | 80k | val | 77.90 | google drive | config |
| SETR-MLA | 768x768 | 8 | 40k | val | 76.65 | google drive | config |
| SETR-MLA | 768x768 | 8 | 80k | val | 77.24 | google drive | config |
| SETR-PUP | 768x768 | 8 | 40k | val | 78.39 | google drive | config |
| SETR-PUP | 768x768 | 8 | 80k | val | 79.34 | google drive | config |
| SETR-Naive-Base | 768x768 | 8 | 40k | val | 75.54 | google drive | config |
| SETR-Naive-Base | 768x768 | 8 | 80k | val | 76.25 | google drive | config |
| SETR-Naive-DeiT | 768x768 | 8 | 40k | val | 77.85 | google drive | config |
| SETR-Naive-DeiT | 768x768 | 8 | 80k | val | 78.66 | google drive | config |
| SETR-MLA-DeiT | 768x768 | 8 | 40k | val | 78.04 | google drive | config |
| SETR-MLA-DeiT | 768x768 | 8 | 80k | val | 78.98 | google drive | config |
| SETR-PUP-DeiT | 768x768 | 8 | 40k | val | 78.79 | google drive | config |
| SETR-PUP-DeiT | 768x768 | 8 | 80k | val | 79.45 | google drive | config |

ADE20K

| Method | Crop size | Batch size | Iterations | Set | mIoU | mIoU (ms+flip) | Model | Config |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SETR-Naive | 512x512 | 16 | 160k | val | 48.06 | 48.80 | google drive | config |
| SETR-MLA | 512x512 | 8 | 160k | val | 47.79 | 50.03 | google drive | config |
| SETR-MLA | 512x512 | 16 | 160k | val | 48.64 | 50.28 | google drive | config |
| SETR-MLA-DeiT | 512x512 | 16 | 160k | val | 46.15 | 47.71 | google drive | config |
| SETR-PUP | 512x512 | 16 | 160k | val | 48.62 | 50.09 | google drive | config |
| SETR-PUP-DeiT | 512x512 | 16 | 160k | val | 46.34 | 47.30 | google drive | config |

Pascal Context

| Method | Crop size | Batch size | Iterations | Set | mIoU | mIoU (ms+flip) | Model | Config |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SETR-Naive | 480x480 | 16 | 80k | val | 52.89 | 53.61 | google drive | config |
| SETR-MLA | 480x480 | 8 | 80k | val | 54.39 | 55.39 | google drive | config |
| SETR-MLA | 480x480 | 16 | 80k | val | 55.01 | 55.83 | google drive | config |
| SETR-MLA-DeiT | 480x480 | 16 | 80k | val | 52.91 | 53.74 | google drive | config |
| SETR-PUP | 480x480 | 16 | 80k | val | 54.37 | 55.27 | google drive | config |
| SETR-PUP-DeiT | 480x480 | 16 | 80k | val | 52.00 | 52.50 | google drive | config |

HLG

ImageNet-1K

HLG classification is under folder hlg-classification/.

| Model | Resolution | Params | FLOPs | Top-1 (%) | Config | Pretrained model |
| --- | --- | --- | --- | --- | --- | --- |
| HLG-Tiny | 224 | 11M | 2.1G | 81.1 | hlg_tiny_224.yaml | google drive |
| HLG-Small | 224 | 24M | 4.7G | 82.3 | hlg_small_224.yaml | google drive |
| HLG-Medium | 224 | 43M | 9.0G | 83.6 | hlg_medium_224.yaml | google drive |
| HLG-Large | 224 | 84M | 15.9G | 84.1 | hlg_large_224.yaml | google drive |

Cityscapes

HLG segmentation shares the same folder as SETR.

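| Method | Crop size | Batch size | Iterations | Set | mIoU | Config |
| --- | --- | --- | --- | --- | --- | --- |
| SETR-HLG-Small | 768x768 | 16 | 40k | val | 81.8 | config |
| SETR-HLG-Medium | 768x768 | 16 | 40k | val | 82.5 | config |
| SETR-HLG-Large | 768x768 | 16 | 40k | val | 82.9 | config |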

ADE20K

HLG segmentation shares the same folder as SETR.

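| Method | Crop size | Batch size | Iterations | Set | mIoU | Config |
| --- | --- | --- | --- | --- | --- | --- |
| SETR-HLG-Small | 512x512 | 16 | 160k | val | 47.3 | config |
| SETR-HLG-Medium | 512x512 | 16 | 160k | val | 49.3 | config |
| SETR-HLG-Large | 512x512 | 16 | 160k | val | 49.8 | config |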

COCO

HLG detection is under folder hlg-detection/.

| Backbone | Lr schd | box AP | Config |
| --- | --- | --- | --- |
| SETR-HLG-Small | 1x | 44.4 | config |
| SETR-HLG-Medium | 1x | 46.6 | config |
| SETR-HLG-Large | 1x | 47.7 | config |

Installation

Our project is developed based on MMsegmentation. Please follow the official MMsegmentation INSTALL.md and getting_started.md for installation and dataset preparation.

🔥🔥 SETR is on MMsegmentation. 🔥🔥

A from-scratch setup script

Linux

Here is a full script for setting up SETR with conda and linking the dataset path (supposing that your dataset path is $DATA_ROOT).

conda create -n open-mmlab python=3.7 -y
conda activate open-mmlab

conda install pytorch=1.6.0 torchvision cudatoolkit=10.1 -c pytorch -y
pip install mmcv-full==1.2.2 -f https://download.openmmlab.com/mmcv/dist/cu101/torch1.6.0/index.html
git clone https://github.com/fudan-zvg/SETR.git
cd SETR
pip install -e .  # or "python setup.py develop"
pip install -r requirements/optional.txt

mkdir data
ln -s $DATA_ROOT data

Windows(Experimental)

Here is a full script for setting up SETR with conda and linking the dataset path (supposing that your dataset path is %DATA_ROOT%; note that it must be an absolute path).

conda create -n open-mmlab python=3.7 -y
conda activate open-mmlab

conda install pytorch=1.6.0 torchvision cudatoolkit=10.1 -c pytorch
set PATH=full\path\to\your\cpp\compiler;%PATH%
pip install mmcv

git clone https://github.com/fudan-zvg/SETR.git
cd SETR
pip install -e .  # or "python setup.py develop"
pip install -r requirements/optional.txt

mklink /D data %DATA_ROOT%
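
After either setup script finishes, it can help to confirm that the pinned packages are the ones Python actually imports. The following is a minimal sketch, not part of the SETR repository: the helper names `installed_version` and `check_environment` are hypothetical, and the expected versions are simply copied from the Linux script above (the Windows script does not pin mmcv, so adjust the table to your setup).

```python
import importlib

# Version pins copied from the Linux setup script (an assumption for your install).
EXPECTED_VERSIONS = {"torch": "1.6.0", "mmcv": "1.2.2"}


def installed_version(module_name):
    """Return the module's __version__ as a string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return str(getattr(module, "__version__", "unknown"))


def check_environment(expected=None):
    """Map each module name to a tuple (expected, found, matches)."""
    expected = EXPECTED_VERSIONS if expected is None else expected
    report = {}
    for name, wanted in expected.items():
        found = installed_version(name)
        # startswith tolerates local build suffixes such as "1.6.0+cu101".
        matches = found is not None and found.startswith(wanted)
        report[name] = (wanted, found, matches)
    return report
```

Calling `check_environment()` inside the `open-mmlab` environment returns one entry per package; an entry whose last element is `False` points at a missing or mismatched install.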

Get Started

Pre-trained model

The pre-trained model will be automatically downloaded and placed in a suitable location when you run the training command. If the automatic download fails because of network issues, you can download the pre-trained model manually from here (ViT) and here (DeiT).

Train

./tools/dist_train.sh ${CONFIG_FILE} ${GPU_NUM} 
# For example, train a SETR-PUP on Cityscapes dataset with 8 GPUs
./tools/dist_train.sh configs/SETR/SETR_PUP_768x768_40k_cityscapes_bs_8.py 8
  • Tensorboard

    If you want to use TensorBoard, run pip install tensorboard and uncomment line 6, dict(type='TensorboardLoggerHook'), in SETR/configs/_base_/default_runtime.py.
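
For orientation, the logging block in configs/_base_/default_runtime.py would look roughly like the fragment below once the hook is uncommented. This is a sketch that assumes the file follows the standard MMSegmentation layout; the interval value and the text hook arguments are assumptions, so keep whatever your checkout already contains and only remove the comment marker.

```python
# Assumed shape of log_config in configs/_base_/default_runtime.py
# (standard MMSegmentation layout; verify against your checkout).
log_config = dict(
    interval=50,  # logging interval in iterations (assumed value)
    hooks=[
        dict(type='TextLoggerHook', by_epoch=False),
        dict(type='TensorboardLoggerHook'),  # uncommented to enable TensorBoard
    ])
```

With the hook enabled, the event files should appear inside the run's work directory, which you can then pass to tensorboard --logdir.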

Single-scale testing

./tools/dist_test.sh ${CONFIG_FILE} ${CHECKPOINT_FILE} ${GPU_NUM}  [--eval ${EVAL_METRICS}]
# For example, test a SETR-PUP on Cityscapes dataset with 8 GPUs
./tools/dist_test.sh configs/SETR/SETR_PUP_768x768_40k_cityscapes_bs_8.py \
work_dirs/SETR_PUP_768x768_40k_cityscapes_bs_8/iter_40000.pth \
8 --eval mIoU
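
The example above follows a naming pattern: the checkpoint lives in work_dirs/ under the config file's base name, and the file name carries the iteration count from the schedule tag (40k becomes iter_40000.pth). A small sketch of that mapping is shown here; `default_checkpoint` is a hypothetical helper, and it assumes that training used the default work directory and that the config name contains a tag such as `_40k_`.

```python
import re
from pathlib import PurePosixPath


def default_checkpoint(config_path, work_root="work_dirs"):
    """Guess the final checkpoint path for a config, based on its file name.

    Assumes the default work directory and a schedule tag like "_40k_"
    in the config name; neither is guaranteed for custom setups.
    """
    stem = PurePosixPath(config_path).stem
    match = re.search(r"_(\d+)k_", stem)
    if match is None:
        raise ValueError(f"no schedule tag such as '_40k_' in {stem!r}")
    iterations = int(match.group(1)) * 1000
    return str(PurePosixPath(work_root) / stem / f"iter_{iterations}.pth")
```

Passing the Cityscapes config from the example reproduces the checkpoint path used in the command above.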

Multi-scale testing

Use the config file ending in _MS.py in configs/SETR.

./tools/dist_test.sh ${CONFIG_FILE} ${CHECKPOINT_FILE} ${GPU_NUM}  [--eval ${EVAL_METRICS}]
# For example, test a SETR-PUP on Cityscapes dataset with 8 GPUs
./tools/dist_test.sh configs/SETR/SETR_PUP_768x768_40k_cityscapes_bs_8_MS.py \
work_dirs/SETR_PUP_768x768_40k_cityscapes_bs_8/iter_40000.pth \
8 --eval mIoU
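
Note that only the config changes between the two testing modes; the checkpoint stays the same. The `_MS.py` naming convention can be sketched as a small path helper. `multi_scale_config` is a hypothetical name, and the sketch only rewrites the path: it does not check that a matching file exists in configs/SETR.

```python
from pathlib import PurePosixPath


def multi_scale_config(config_path):
    """Return the path of the "_MS" variant of a single-scale config."""
    path = PurePosixPath(config_path)
    if path.stem.endswith("_MS"):
        return str(path)  # already a multi-scale config
    return str(path.with_name(path.stem + "_MS" + path.suffix))
```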

Generate the png files to be submitted to the official evaluation server

  • Cityscapes

    First, add the following to the config file configs/SETR/SETR_PUP_768x768_40k_cityscapes_bs_8.py:

    data = dict(
        test=dict(
            img_dir='leftImg8bit/test',
            ann_dir='gtFine/test'))

    Then run test

    ./tools/dist_test.sh configs/SETR/SETR_PUP_768x768_40k_cityscapes_bs_8.py \
        work_dirs/SETR_PUP_768x768_40k_cityscapes_bs_8/iter_40000.pth \
        8 --format-only --eval-options "imgfile_prefix=./SETR_PUP_768x768_40k_cityscapes_bs_8_test_results"

    You will get png files under the ./SETR_PUP_768x768_40k_cityscapes_bs_8_test_results directory. Run zip -r SETR_PUP_768x768_40k_cityscapes_bs_8_test_results.zip SETR_PUP_768x768_40k_cityscapes_bs_8_test_results/ and submit the zip file to the evaluation server.

  • ADE20k

    The ADE20K dataset can be downloaded from this link.

    First, add the following to the config file configs/SETR/SETR_PUP_512x512_160k_ade20k_bs_16.py:

    data = dict(
        test=dict(
            img_dir='images/testing',
            ann_dir='annotations/testing'))

    Then run test

    ./tools/dist_test.sh configs/SETR/SETR_PUP_512x512_160k_ade20k_bs_16.py \
        work_dirs/SETR_PUP_512x512_160k_ade20k_bs_16/iter_160000.pth \
        8 --format-only --eval-options "imgfile_prefix=./SETR_PUP_512x512_160k_ade20k_bs_16_test_results"

    You will get png files under the ./SETR_PUP_512x512_160k_ade20k_bs_16_test_results directory. Run zip -r SETR_PUP_512x512_160k_ade20k_bs_16_test_results.zip SETR_PUP_512x512_160k_ade20k_bs_16_test_results/ and submit the zip file to the evaluation server.
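
On systems without the zip command (for example the experimental Windows setup), the packaging step for either dataset can be done from Python instead. The sketch below uses the standard-library shutil.make_archive; `zip_results` is a hypothetical helper, not something shipped with SETR, and it mirrors zip -r by keeping the results folder as the top-level entry of the archive.

```python
import shutil
from pathlib import Path


def zip_results(results_dir):
    """Pack a results directory into "<results_dir>.zip" next to it.

    Mirrors "zip -r NAME.zip NAME/": the archive keeps the directory
    itself as its top-level folder. Returns the archive path.
    """
    results_dir = Path(results_dir).resolve()
    if not results_dir.is_dir():
        raise NotADirectoryError(str(results_dir))
    return shutil.make_archive(
        base_name=str(results_dir),        # ".zip" is appended automatically
        format="zip",
        root_dir=str(results_dir.parent),  # archive paths are relative to this
        base_dir=results_dir.name,         # keep the folder name inside the zip
    )
```

For example, zip_results("./SETR_PUP_768x768_40k_cityscapes_bs_8_test_results") would write SETR_PUP_768x768_40k_cityscapes_bs_8_test_results.zip in the current directory.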

Please see getting_started.md for more basic usage of training and testing.

Reference

@inproceedings{SETR,
    title={Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers}, 
    author={Zheng, Sixiao and Lu, Jiachen and Zhao, Hengshuang and Zhu, Xiatian and Luo, Zekun and Wang, Yabiao and Fu, Yanwei and Feng, Jianfeng and Xiang, Tao and Torr, Philip H.S. and Zhang, Li},
    booktitle={CVPR},
    year={2021}
}
@article{SETR-HLG,
    title={Vision Transformers: From Semantic Segmentation to Dense Prediction}, 
    author={Zhang, Li and Lu, Jiachen and Zheng, Sixiao and Zhao, Xinxuan and Zhu, Xiatian and Fu, Yanwei and Xiang, Tao and Feng, Jianfeng},
    journal={arXiv},
    year={2023}
}

License

MIT

Acknowledgement

Thanks to the previous open-source repos: MMsegmentation and pytorch-image-models.
