Language: Python
License: Apache License 2.0

Self-supervised Augmentation Consistency for Adapting Semantic Segmentation (CVPR 2021)

This repository contains the official implementation of our paper:

Self-supervised Augmentation Consistency for Adapting Semantic Segmentation
Nikita Araslanov and Stefan Roth
CVPR 2021. [pdf] [supp] [arXiv]

We obtain state-of-the-art accuracy of adapting semantic
segmentation by enforcing consistency across photometric
and similarity transformations. We use neither style transfer
nor adversarial training.

Contact: Nikita Araslanov fname.lname (at) visinf.tu-darmstadt.de


Installation

Requirements. To reproduce our results, we recommend Python >=3.6, PyTorch >=1.4, CUDA >=10.0. At least two Titan X GPUs (12 GB) or equivalent are required for VGG-16; ResNet-101 and VGG-16/FCN need four.

  1. Create a conda environment:
     conda create --name da-sac
     source activate da-sac
  2. Install PyTorch >=1.4 (see PyTorch instructions). For example:
     conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch
  3. Install the dependencies:
     pip install -r requirements.txt
  4. Download the data (Cityscapes, GTA5, SYNTHIA) and create symlinks in the ./data folder, as follows:
./data/cityscapes -> <symlink to Cityscapes>
./data/cityscapes/gtFine2/
./data/cityscapes/leftImg8bit/

./data/game -> <symlink to GTA>
./data/game/labels_cs
./data/game/images

./data/synthia  -> <symlink to SYNTHIA>
./data/synthia/labels_cs
./data/synthia/RGB
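
Before training, it can help to confirm that the symlinks resolve to the expected folders. The following is a minimal sketch, not part of the repository: the helper name missing_dirs and the EXPECTED table are assumptions made for illustration, and the directory names are copied from the layout above.

```python
from pathlib import Path

# Expected sub-directories per dataset, copied from the layout above.
EXPECTED = {
    "cityscapes": ["gtFine2", "leftImg8bit"],
    "game": ["labels_cs", "images"],
    "synthia": ["labels_cs", "RGB"],
}


def missing_dirs(data_root="./data"):
    """Return the expected directories that do not exist under data_root."""
    root = Path(data_root)
    missing = []
    for dataset, subdirs in EXPECTED.items():
        for sub in subdirs:
            path = root / dataset / sub
            # is_dir() follows symlinks, so a symlinked dataset root is fine.
            if not path.is_dir():
                missing.append(str(path))
    return missing


if __name__ == "__main__":
    for path in missing_dirs():
        print("missing:", path)
```

An empty result means every folder in the layout was found; a broken symlink shows up as a missing directory.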

Note that all ground-truth label IDs (Cityscapes, GTA5 and SYNTHIA) should be converted to Cityscapes train IDs. The label directories in the above example (gtFine2, labels_cs) therefore refer not to the original labels, but to these converted semantic maps.
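
The conversion itself is a per-pixel table lookup. The sketch below illustrates it on a label map given as nested lists, assuming the input uses the original Cityscapes label IDs; the mapping shown is the standard Cityscapes one for the 19 evaluated classes. The function name to_train_ids is hypothetical and this is not the repository's conversion script. SYNTHIA ships with its own ID scheme and would need a different table, which is not shown here.

```python
IGNORE = 255  # train ID that Cityscapes assigns to pixels excluded from evaluation

# Cityscapes label ID -> train ID for the 19 evaluated classes.
ID_TO_TRAIN_ID = {
    7: 0, 8: 1, 11: 2, 12: 3, 13: 4, 17: 5, 19: 6, 20: 7, 21: 8, 22: 9,
    23: 10, 24: 11, 25: 12, 26: 13, 27: 14, 28: 15, 31: 16, 32: 17, 33: 18,
}


def to_train_ids(label_map):
    """Map every pixel of a 2D label map (list of rows) to its train ID.

    Label IDs without a train ID are mapped to IGNORE.
    """
    return [[ID_TO_TRAIN_ID.get(pixel, IGNORE) for pixel in row]
            for row in label_map]


if __name__ == "__main__":
    # road (7), sidewalk (8), unlabeled (0) / car (26), bicycle (33), dynamic (5)
    print(to_train_ids([[7, 8, 0], [26, 33, 5]]))
    # [[0, 1, 255], [13, 18, 255]]
```

For full-resolution images the same table is usually applied as an array lookup rather than a Python loop.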

Training

Training from ImageNet initialisation proceeds in three steps:

  1. Training the baseline (ABN)
  2. Generating the weights for importance sampling
  3. Training with augmentation consistency from the ABN baseline

1. Training the baseline (ABN)

The inputs here are ImageNet models available from the official PyTorch repository. We provide links to those models for convenience.

| Backbone | Link |
|---|---|
| ResNet-101 | resnet101-5d3b4d8f.pth (171M) |
| VGG-16 | vgg16_bn-6c64b313.pth (528M) |

By default, these models should be placed in ./models/pretrained/ (though configurable with MODEL.INIT_MODEL).

To run the training, use:

bash ./launch/train.sh [gta|synthia] [resnet101|vgg16|vgg16fcn] base

where the first argument specifies the source domain and the second determines the network architecture. The third argument, base, selects training of the baseline.

If you would like to skip this step, you can use our pre-trained models:

Source domain: GTA5

| Backbone | Arch. | IoU (val) | Link | MD5 |
|---|---|---|---|---|
| ResNet-101 | DeepLabv2 | 40.8 | baseline_abn_e040.pth (336M) | 9fe17[...]c11fc |
| VGG-16 | DeepLabv2 | 37.1 | baseline_abn_e115.pth (226M) | d4ffc[...]ef755 |
| VGG-16 | FCN | 36.7 | baseline_abn_e040.pth (1.1G) | aa2e9[...]bae53 |

Source domain: SYNTHIA

| Backbone | Arch. | IoU (val) | Link | MD5 |
|---|---|---|---|---|
| ResNet-101 | DeepLabv2 | 36.3 | baseline_abn_e090.pth (336M) | b3431[...]d1a83 |
| VGG-16 | DeepLabv2 | 34.4 | baseline_abn_e070.pth (226M) | 3af24[...]5b24e |
| VGG-16 | FCN | 31.6 | baseline_abn_e040.pth (1.1G) | 5f457[...]e4b3a |

Tip: You can download these files (as well as the final models below) with tools/download_baselines.sh:

cp tools/download_baselines.sh snapshots/cityscapes/baselines/
cd snapshots/cityscapes/baselines/
bash ./download_baselines.sh
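
After downloading, the files can be compared against the MD5 column of the tables. The sketch below uses only the Python standard library; the helper names md5_of_file and matches_truncated are assumptions made for illustration and are not part of the repository. Because the tables list the checksums in shortened form (head[...]tail), the comparison covers only the visible head and tail, which is weaker than comparing the full digest.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_truncated(md5_hex, truncated):
    """Compare a full MD5 hex digest with a 'head[...]tail' string.

    Only the visible head and tail are checked; a string without
    '[...]' is treated as a prefix of the digest.
    """
    head, _, tail = truncated.partition("[...]")
    return md5_hex.startswith(head) and md5_hex.endswith(tail)


# Usage (the file name is taken from the table above, its location is up to you):
# ok = matches_truncated(md5_of_file("baseline_abn_e040.pth"), "9fe17[...]c11fc")
```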

2. Generating weights for importance sampling

To generate the weights you need to

  1. generate mask predictions with your baseline (see inference below);
  2. run tools/compute_image_weights.py, which reads those predictions and counts the predictions per class.
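
To illustrate what counting predictions per class means, here is a minimal sketch that operates on predicted masks given as nested lists of train IDs. It is not the code of tools/compute_image_weights.py, and it stops at the counts: how the script turns them into sampling weights is not shown. The function names, the 19-class setting, and the ignore value 255 are assumptions made for illustration.

```python
from collections import Counter

NUM_CLASSES = 19  # assumption: Cityscapes train IDs 0..18
IGNORE = 255      # assumption: value marking pixels without a prediction


def class_pixel_counts(mask):
    """Count predicted pixels per class in one mask (list of rows)."""
    counts = Counter(p for row in mask for p in row if p != IGNORE)
    return [counts.get(c, 0) for c in range(NUM_CLASSES)]


def total_class_counts(masks):
    """Sum the per-class pixel counts over a collection of masks."""
    totals = [0] * NUM_CLASSES
    for mask in masks:
        for c, n in enumerate(class_pixel_counts(mask)):
            totals[c] += n
    return totals


if __name__ == "__main__":
    mask = [[0, 0, 1], [255, 18, 1]]
    counts = class_pixel_counts(mask)
    print(counts[0], counts[1], counts[18])  # 2 2 1
```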

If you would like to skip this step, you can use the weights we computed for the ABN baselines above:

| Backbone | Arch. | Source: GTA5 | Source: SYNTHIA |
|---|---|---|---|
| ResNet-101 | DeepLabv2 | cs_weights_resnet101_gta.data | cs_weights_resnet101_synthia.data |
| VGG-16 | DeepLabv2 | cs_weights_vgg16_gta.data | cs_weights_vgg16_synthia.data |
| VGG-16 | FCN | cs_weights_vgg16fcn_gta.data | cs_weights_vgg16fcn_synthia.data |

Tip: The bash script data/download_weights.sh will download all these importance sampling weights in the current directory.

3. Training with augmentation consistency

To train the model with augmentation consistency, we use the same shell script as in step 1, but without the argument base:

bash ./launch/train.sh [gta|synthia] [resnet101|vgg16|vgg16fcn]

Make sure to specify your baseline snapshot with the RESUME bash variable, set either in the environment (export RESUME=...) or directly in the shell script (commented out by default).
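
If you launch the training from Python rather than from a shell, the same call can be assembled as below. This is a sketch under assumptions: build_launch is a hypothetical helper, the snapshot path is a placeholder, and the exact form that launch/train.sh expects in RESUME should be checked in the script itself.

```python
import os

SOURCES = ("gta", "synthia")
BACKBONES = ("resnet101", "vgg16", "vgg16fcn")


def build_launch(source, backbone, resume=None, baseline=False):
    """Return (command, environment) for launch/train.sh.

    baseline=True appends the 'base' argument (step 1);
    resume sets the RESUME variable read by the script (step 3).
    """
    if source not in SOURCES:
        raise ValueError(f"unknown source domain: {source}")
    if backbone not in BACKBONES:
        raise ValueError(f"unknown backbone: {backbone}")
    cmd = ["bash", "./launch/train.sh", source, backbone]
    if baseline:
        cmd.append("base")
    env = dict(os.environ)  # copy, so the current process is not modified
    if resume is not None:
        env["RESUME"] = resume
    return cmd, env


# Usage (placeholder snapshot path, run from the repository root):
# import subprocess
# cmd, env = build_launch("gta", "resnet101", resume="path/to/baseline.pth")
# subprocess.run(cmd, env=env, check=True)
```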

We provide our final models for download.

Source domain: GTA5

| Backbone | Arch. | IoU (val) | IoU (test) | Link | MD5 |
|---|---|---|---|---|---|
| ResNet-101 | DeepLabv2 | 53.8 | 55.7 | final_e136.pth (504M) | 59c16[...]5a32f |
| VGG-16 | DeepLabv2 | 49.8 | 51.0 | final_e184.pth (339M) | 0accb[...]d5881 |
| VGG-16 | FCN | 49.9 | 50.4 | final_e112.pth (1.6G) | e69f8[...]f729b |

Source domain: SYNTHIA

| Backbone | Arch. | IoU (val) | IoU (test) | Link | MD5 |
|---|---|---|---|---|---|
| ResNet-101 | DeepLabv2 | 52.6 | 52.7 | final_e164.pth (504M) | a7682[...]db742 |
| VGG-16 | DeepLabv2 | 49.1 | 48.3 | final_e164.pth (339M) | c5b31[...]5fdb7 |
| VGG-16 | FCN | 46.8 | 45.8 | final_e098.pth (1.6G) | efb74[...]845cc |

Inference and evaluation

Inference

To run single-scale inference from your snapshot, use infer_val.py. The bash script launch/infer_val.sh provides an easy way to run the inference by specifying a few variables:

# validation/training set
FILELIST=[val_cityscapes|train_cityscapes] 
# configuration used for training
CONFIG=configs/[deeplabv2_vgg16|deeplab_resnet101|fcn_vgg16]_train.yaml
# the following 3 variables effectively specify the path to the snapshot
EXP=...
RUN_ID=...
SNAPSHOT=...
# the snapshot path is defined as
# SNAPSHOT_PATH=snapshots/cityscapes/${EXP}/${RUN_ID}/${SNAPSHOT}.pth
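
To double-check which file the three variables point to, the path rule from the comment above can be reproduced in a few lines. This is a sketch: snapshot_path is a hypothetical helper, and the example values are placeholders rather than names used by the repository.

```python
from pathlib import PurePosixPath


def snapshot_path(exp, run_id, snapshot, root="snapshots/cityscapes"):
    """Compose the snapshot path as <root>/<EXP>/<RUN_ID>/<SNAPSHOT>.pth."""
    return str(PurePosixPath(root) / exp / run_id / f"{snapshot}.pth")


if __name__ == "__main__":
    # Placeholder values for EXP, RUN_ID and SNAPSHOT.
    print(snapshot_path("my_exp", "my_run", "my_snapshot"))
    # snapshots/cityscapes/my_exp/my_run/my_snapshot.pth
```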

Evaluation

Please use the official Cityscapes evaluation tool evalPixelLevelSemanticLabeling from the Cityscapes scripts for evaluating your results.
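
For a quick sanity check during development, the IoU metric reported in the tables can be sketched as follows. This is not a replacement for the official tool and is not taken from the repository: it only illustrates the definition (intersection over union per class, averaged over classes) on masks given as nested lists of train IDs. It assumes that ground-truth pixels with value 255 are ignored and that all predictions lie in 0..num_classes-1.

```python
def confusion_matrix(pred, target, num_classes=19, ignore=255):
    """Accumulate conf[t][p]: pixels of true class t predicted as class p."""
    conf = [[0] * num_classes for _ in range(num_classes)]
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if t == ignore:
                continue  # ignored ground-truth pixels do not count
            conf[t][p] += 1
    return conf


def iou_per_class(conf):
    """Return IoU = TP / (TP + FP + FN) per class, None if the class is absent."""
    ious = []
    for c in range(len(conf)):
        tp = conf[c][c]
        fn = sum(conf[c]) - tp
        fp = sum(row[c] for row in conf) - tp
        denom = tp + fp + fn
        ious.append(tp / denom if denom else None)
    return ious


def mean_iou(ious):
    """Average the IoU over classes that occur in prediction or ground truth."""
    valid = [v for v in ious if v is not None]
    return sum(valid) / len(valid) if valid else float("nan")


if __name__ == "__main__":
    pred = [[0, 1], [1, 1]]
    target = [[0, 1], [0, 255]]
    ious = iou_per_class(confusion_matrix(pred, target, num_classes=2))
    print(ious, mean_iou(ious))  # [0.5, 0.5] 0.5
```

In a real evaluation the confusion matrix is accumulated over the whole validation set before the IoU is computed, rather than averaged per image.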

Citation

We hope you find our work useful. If you would like to acknowledge it in your project, please use the following citation:

@inproceedings{Araslanov:2021:DASAC,
    author    = {Araslanov, Nikita and Roth, Stefan},
    title     = {Self-Supervised Augmentation Consistency for Adapting Semantic Segmentation},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {15384-15394}
}
