• Stars: 379
• Rank: 113,004 (Top 3%)
• Language: Python
• License: Apache License 2.0
• Created: over 4 years ago
• Updated: about 3 years ago


Repository Details

Single-Stage Semantic Segmentation from Image Labels (CVPR 2020)


This repository contains the original implementation of our paper:

Single-stage Semantic Segmentation from Image Labels
Nikita Araslanov and Stefan Roth
CVPR 2020. [pdf] [supp] [arXiv]

Contact: Nikita Araslanov [email protected]

We attain competitive results by training a single network model
for segmentation in a self-supervised fashion using only
image-level annotations (one run of 20 epochs on Pascal VOC).

Setup

  1. Minimum requirements. This project was originally developed with Python 3.6, PyTorch 1.0, and CUDA 9.0. Training requires at least two Titan X GPUs (12 GB of memory each).

  2. Set up your Python environment. Please clone the repository and install the dependencies. We recommend using the Anaconda 3 distribution:

    conda create -n <environment_name> --file requirements.txt
    
  3. Download and link the datasets. We train our model on the original Pascal VOC 2012 augmented with the SBD data (10K images in total). Download both datasets and link them into the project:

    ln -s <your_path_to_voc> <project>/data/voc
    ln -s <your_path_to_sbd> <project>/data/sbd
    

    Make sure that the first directory in data/voc is VOCdevkit; the first directory in data/sbd is benchmark_RELEASE.

  4. Download pre-trained models. Download the initial weights (pre-trained on ImageNet) for the backbones you plan to use and place them into <project>/models/weights/ (a consolidated setup sketch follows the table below).

    Backbone      Initial Weights                                Comment
    WideResNet38  ilsvrc-cls_rna-a1_cls1000_ep-0001.pth (402M)   Converted from mxnet
    VGG16         vgg16_20M.pth (79M)                            Converted from Caffe
    ResNet50      resnet50-19c8e357.pth                          PyTorch official
    ResNet101     resnet101-5d3b4d8f.pth                         PyTorch official
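
Putting the setup steps together, here is a consolidated sketch of the sequence above; the environment name, the dataset paths, and the choice of backbone weights file are placeholders to substitute with your own:

    # 2. create and activate the Python environment (the name is arbitrary)
    conda create -n single-stage-seg --file requirements.txt
    conda activate single-stage-seg

    # 3. link the downloaded datasets into the project
    #    (data/voc must contain VOCdevkit; data/sbd must contain benchmark_RELEASE)
    ln -s <your_path_to_voc> <project>/data/voc
    ln -s <your_path_to_sbd> <project>/data/sbd

    # 4. place the ImageNet-pretrained backbone weights (WideResNet38 shown as an example)
    mkdir -p <project>/models/weights
    mv ilsvrc-cls_rna-a1_cls1000_ep-0001.pth <project>/models/weights/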

Training, Inference and Evaluation

The directory launch contains template bash scripts for training, inference and evaluation.

Training. For each run, you need to specify the values of two variables, for example:

EXP=baselines
RUN_ID=v01

Running bash ./launch/run_voc_resnet38.sh will create a directory ./logs/pascal_voc/baselines/v01 with tensorboard events and will save snapshots into ./snapshots/pascal_voc/baselines/v01.
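
A minimal end-to-end training call then looks like the sketch below; whether EXP and RUN_ID are exported in the shell or set inside the template script depends on how you adapt the template, so treat the exact mechanism as an assumption:

    EXP=baselines
    RUN_ID=v01
    bash ./launch/run_voc_resnet38.sh
    # tensorboard events: ./logs/pascal_voc/baselines/v01
    # snapshots:          ./snapshots/pascal_voc/baselines/v01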

Inference. To generate the final masks, please use the script ./launch/infer_val.sh. You will need to specify (an example invocation follows the list):

  • EXP and RUN_ID you used for training;
  • OUTPUT_DIR, the path where the masks will be saved;
  • FILELIST, the file listing the data split;
  • SNAPSHOT, the model suffix in the format e000Xs0.000, for example e020Xs0.928;
  • (optionally) EXTRA_ARGS, additional arguments passed to the inference script.
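
An example invocation, reusing the training variables from above; OUTPUT_DIR and FILELIST are illustrative placeholders rather than paths from the repository, and the snapshot suffix is that of the released model below:

    EXP=baselines
    RUN_ID=v01
    OUTPUT_DIR=./masks/pascal_voc/baselines/v01   # placeholder output path
    FILELIST=./data/val_voc.txt                   # placeholder split file
    SNAPSHOT=e020Xs0.928
    bash ./launch/infer_val.sh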

Evaluation. To compute the IoU of the masks, please run ./launch/eval_seg.sh. You will need to specify SAVE_DIR, the directory that contains the masks, and FILELIST, the file specifying the split for evaluation.
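
For example (a sketch with placeholder paths matching the inference example above):

    SAVE_DIR=./masks/pascal_voc/baselines/v01   # directory with the predicted masks
    FILELIST=./data/val_voc.txt                 # placeholder split file
    bash ./launch/eval_seg.sh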

Pre-trained model

For testing, we provide our pre-trained WideResNet38 model:

Backbone      Val   Val (+ CRF)  Link
WideResNet38  59.7  62.7         model_enc_e020Xs0.928.pth (527M)

We also release the masks predicted by this model:

Split                  IoU   IoU (+ CRF)  Link                             Comment
train-clean (VOC+SBD)  64.7  66.9         train_results_clean.tgz (2.9G)   Reported IoU is for VOC
val-clean              63.4  65.3         val_results_clean.tgz (423M)
val                    59.7  62.7         val_results.tgz (427M)
test                   62.7  64.3         test_results.tgz (368M)

The suffix -clean means that we used the ground-truth image-level labels to remove masks of categories not present in the image. These masks are commonly used as pseudo ground truth to train another segmentation model in a fully supervised regime.

Acknowledgements

We thank the PyTorch team, and Jiwoon Ahn for releasing his code, which helped in the early stages of this project.

Citation

We hope that you find this work useful. If you would like to acknowledge us, please use the following citation:

@InProceedings{Araslanov:2020:SSS,
  author    = {Araslanov, Nikita and Roth, Stefan},
  title     = {Single-Stage Semantic Segmentation From Image Labels},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month     = {June},
  pages     = {4253--4262},
  year      = {2020}
}

More Repositories

1. n3net (Python, 284 stars): Neural Nearest Neighbors Networks (NIPS*2018)
2. self-mono-sf (Python, 248 stars): Self-Supervised Monocular Scene Flow Estimation (CVPR 2020)
3. irr (Python, 192 stars): Iterative Residual Refinement for Joint Optical Flow and Occlusion Estimation (CVPR 2019)
4. dense-ulearn-vos (Python, 181 stars): Dense Unsupervised Learning for Video Segmentation (NeurIPS*2021)
5. da-sac (Python, 148 stars): Self-supervised Augmentation Consistency for Adapting Semantic Segmentation (CVPR 2021)
6. dpp (Cuda, 115 stars): Detail-Preserving Pooling in Deep Networks (CVPR 2018)
7. multi-mono-sf (Python, 99 stars): Self-Supervised Multi-Frame Monocular Scene Flow (CVPR 2021)
8. ppac_refinement (Python, 77 stars): Probabilistic Pixel-Adaptive Refinement Networks (CVPR 2020)
9. cos-cvae (Jupyter Notebook, 37 stars): Diverse Image Captioning with Context-Object Split Latent Spaces (NeurIPS 2020)
10. lnfmm (Python, 33 stars): Latent Normalizing Flows for Many-to-Many Cross Domain Mappings (ICLR 2020)
11. adapter_plus (Python, 29 stars): [CVPR 2024] Official implementation of "Adapters Strike Back"
12. cad (Python, 23 stars): Content-Adaptive Downsampling in Convolutional Neural Networks (CVPR 2023 Workshop on Efficient Deep Learning for Computer Vision)
13. veto (Jupyter Notebook, 21 stars): Vision Relation Transformer for Unbiased Scene Graph Generation (ICCV 2023)
14. funnybirds (JavaScript, 19 stars): FunnyBirds: A Synthetic Vision Dataset for a Part-Based Analysis of Explainable AI Methods (ICCV 2023)
15. acis (Lua, 19 stars): Actor-Critic Instance Segmentation (CVPR 2019)
16. deblur-devil (Python, 17 stars): Deep Video Deblurring: The Devil is in the Details (ICCV Workshop 2019)
17. self-adaptive (Python, 17 stars): Semantic Self-adaptation: Enhancing Generalization with a Single Sample
18. fast-axiomatic-attribution (Jupyter Notebook, 15 stars): Fast Axiomatic Attribution for Neural Networks (NeurIPS*2021)
19. mar-scf (Python, 15 stars): Normalizing Flows with Multi-Scale Autoregressive Priors (CVPR 2020)
20. funnybirds-framework (Python, 13 stars): FunnyBirds: A Synthetic Vision Dataset for a Part-Based Analysis of Explainable AI Methods (ICCV 2023)
21. fldr-vfi (Python, 11 stars): Efficient Feature Extraction for High-resolution Video Frame Interpolation (BMVC 2022)
22. primaps (11 stars)
23. pixelpyramids (Python, 10 stars): PixelPyramids: Exact Inference Models from Lossless Image Pyramids (ICCV 2021)
24. s2-flow (Python, 7 stars): S2-Flow: Joint Semantic and Style Editing of Facial Images (BMVC 2022)
25. style-seqcvae (Python, 6 stars): Diverse Image Captioning with Grounded Style (GCPR 2021)
26. semantic_lattice (Python, 5 stars): Semantic Lattice (GCPR 2019)
27. jwae (Python, 5 stars): Joint Wasserstein Autoencoders for Aligning Multimodal Embeddings (ICCV 2019 Workshop on Cross-Modal Learning in Real World)
28. probflow (MATLAB, 4 stars): ProbFlow: Joint Optical Flow and Uncertainty Estimation (ICCV 2017)
29. DIAGen (Python, 4 stars): DIAGen: Semantically Diverse Image Augmentation with Generative Models for Few-Shot Learning (GCPR 2024)
30. benchmarking-synthetic-clones (3 stars): Is Synthetic Data all We Need? Benchmarking the Robustness of Models Trained with Synthetic Images (CVPRW 2024)
31. svigl (MATLAB, 2 stars): Stochastic Variational Inference with Gradient Linearization (CVPR 2018)
32. playing-for-data (C++, 2 stars): Playing for data: Ground Truth from Computer Games (ECCV 2016)
33. mirrorflow (C++, 1 star): MirrorFlow: Exploiting Symmetries in Joint Optical Flow and Occlusion Estimation (ICCV 2017)
34. matryoshka (Python, 1 star): Matryoshka Networks: Predicting 3D Geometry via Nested Shape Layers