  • Stars: 181
  • Rank: 212,110 (Top 5 %)
  • Language: Python
  • License: Apache License 2.0
  • Created about 3 years ago
  • Updated about 3 years ago

Repository Details

Dense Unsupervised Learning for Video Segmentation (NeurIPS*2021)

This repository contains the official implementation of our paper:

Dense Unsupervised Learning for Video Segmentation
Nikita Araslanov, Simone Schaub-Meyer and Stefan Roth
To appear at NeurIPS*2021. [paper] [supp] [talk] [example results] [arXiv]

We efficiently learn spatio-temporal correspondences
without any supervision, and achieve state-of-the-art
accuracy of video object segmentation.

Contact: Nikita Araslanov fname.lname (at) visinf.tu-darmstadt.de


Installation

Requirements. To reproduce our results, we recommend Python >=3.6, PyTorch >=1.4, CUDA >=10.0. At least one Titan X GPU (12GB) or equivalent is required. The code was primarily developed under PyTorch 1.8 on a single A100 GPU.
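
Once PyTorch is installed, the minimum versions listed above can be checked with a short script. The following is a minimal sketch, not part of the repository: the helper names (parse_version, meets_minimum, check_environment) are illustrative assumptions. It relies only on sys.version_info, torch.__version__, torch.version.cuda and torch.cuda.is_available(), and reports PyTorch as missing if it cannot be imported.

```python
import re
import sys


def parse_version(text):
    """Turn a version string such as '1.8.0+cu111' into a tuple like (1, 8, 0)."""
    numeric = re.match(r"\d+(\.\d+)*", text)
    if numeric is None:
        raise ValueError(f"cannot parse version: {text!r}")
    return tuple(int(part) for part in numeric.group(0).split("."))


def meets_minimum(found, minimum):
    """Compare dotted versions numerically, so that '1.10' counts as newer than '1.4'."""
    a, b = parse_version(found), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))  # pad so that '10' equals '10.0'
    b += (0,) * (width - len(b))
    return a >= b


def check_environment():
    """Return a dict: requirement -> bool, for the versions recommended above."""
    report = {"python>=3.6": sys.version_info >= (3, 6)}
    try:
        import torch
    except ImportError:
        # PyTorch is not installed (yet): everything that depends on it fails.
        report.update({"pytorch>=1.4": False, "cuda>=10.0": False, "gpu available": False})
        return report
    report["pytorch>=1.4"] = meets_minimum(torch.__version__, "1.4")
    cuda = torch.version.cuda  # None for CPU-only builds
    report["cuda>=10.0"] = cuda is not None and meets_minimum(cuda, "10.0")
    report["gpu available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'ok' if ok else 'NOT satisfied'}")
```

The script does not check GPU memory; the 12GB recommendation still has to be verified separately.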

The following steps will set up a local copy of the repository.

  1. Create conda environment:
     conda create --name dense-ulearn-vos
     source activate dense-ulearn-vos
  2. Install PyTorch >=1.4 (see PyTorch instructions). For example on Linux, run:
     conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
  3. Install the dependencies:
     pip install -r requirements.txt
  4. Download the data:

     Dataset      | Website | Target directory with video sequences
     YouTube-VOS  | Link    | data/ytvos/train/JPEGImages/
     OxUvA        | Link    | data/OxUvA/images/dev/
     TrackingNet  | Link    | data/tracking/train/jpegs/
     Kinetics-400 | Link    | data/kinetics400/video_jpeg/train/

     The last column in this table specifies a path to subdirectories (relative to the project root) containing images of video frames. You can obviously use a different path structure. In this case, you will need to adjust the paths in data/filelists/ for every dataset accordingly.

  5. Download filelists:
     cd data/filelists
     bash download.sh

This will download lists of training and validation paths for all datasets.
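
Before training, it can help to confirm that the frame directories from the table above are in place. The sketch below is an illustration only and not part of the repository: the function name dataset_status is an assumption, and the check only tests whether each default target directory exists and is non-empty. If you use a different path structure, adjust FRAME_DIRS accordingly.

```python
from pathlib import Path

# Default target directories from the table above, relative to the project root.
FRAME_DIRS = {
    "YouTube-VOS": "data/ytvos/train/JPEGImages",
    "OxUvA": "data/OxUvA/images/dev",
    "TrackingNet": "data/tracking/train/jpegs",
    "Kinetics-400": "data/kinetics400/video_jpeg/train",
}


def dataset_status(project_root="."):
    """Map each dataset name to 'missing', 'empty' or 'ok'."""
    status = {}
    for name, rel_path in FRAME_DIRS.items():
        frame_dir = Path(project_root) / rel_path
        if not frame_dir.is_dir():
            status[name] = "missing"
        elif not any(frame_dir.iterdir()):
            status[name] = "empty"
        else:
            status[name] = "ok"
    return status


if __name__ == "__main__":
    # Run from the project root.
    for name, state in dataset_status().items():
        print(f"{name:13s} {state}")
```

Only the dataset you intend to train on needs to report ok.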

Training

The following bash script trains a ResNet-18 model from scratch on one of the four supported datasets (see above):

bash ./launch/train.sh [ytvos|oxuva|track|kinetics]

We also provide our final models for download.

Dataset      | Mean J&F (DAVIS-2017) | Link                             | MD5
OxUvA        | 65.3                  | oxuva_e430_res4.pth (132M)       | af541[...]d09b3
YouTube-VOS  | 69.3                  | ytvos_e060_res4.pth (132M)       | c3ae3[...]55faf
TrackingNet  | 69.4                  | trackingnet_e088_res4.pth (88M)  | 3e7e9[...]95fa9
Kinetics-400 | 68.7                  | kinetics_e026_res4.pth (88M)     | 086db[...]a7d98
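
A downloaded snapshot can be compared against the MD5 column with Python's hashlib. The sketch below is an illustration, not part of the repository; the helper names are assumptions. Because the checksums in the table are abbreviated as head[...]tail, the comparison only covers the visible leading and trailing characters and is therefore a sanity check rather than a full verification.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_abbreviated(digest, abbreviated):
    """Check a full hex digest against a 'head[...]tail' abbreviation."""
    head, _, tail = abbreviated.partition("[...]")
    return digest.startswith(head) and digest.endswith(tail)
```

For example, assuming the OxUvA model was saved in the current directory, matches_abbreviated(md5_of_file("oxuva_e430_res4.pth"), "af541[...]d09b3") should return True for an intact download.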

Inference and evaluation

Inference

To run inference, use launch/infer_vos.sh:

bash ./launch/infer_vos.sh [davis|ytvos]

The first argument selects the validation dataset to use (davis for DAVIS-2017; ytvos for YouTube-VOS). The bash variables declared in the script further help to set up the paths for reading the data and the pre-trained models as well as the output directory:

  • EXP, RUN_ID and SNAPSHOT determine the pre-trained model to load.
  • VER specifies a suffix for the output directory (in case you would like to experiment with different configurations for label propagation).

Please refer to launch/infer_vos.sh for the usage of these variables.

The inference script will create two directories with the results: [res3|res4|key]_vos and [res3|res4|key]_vis, where the prefix corresponds to the codename of the output CNN layer used in the evaluation (selected in infer_vos.sh using the KEY variable). The vos-directory contains the segmentation result ready for evaluation; the vis-directory contains the results for visualisation purposes. You can optionally disable generating the visualisation by setting VERBOSE=False in infer_vos.py.
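
The naming scheme of the two result directories can be captured in a small helper, for instance when scripting the evaluation. This is a sketch under assumptions: the function name result_dirs is hypothetical, and output_root stands for whichever output directory launch/infer_vos.sh is configured to write to, which is not spelled out here.

```python
from pathlib import Path

# Layer codenames that can be selected with the KEY variable.
LAYER_KEYS = ("res3", "res4", "key")


def result_dirs(output_root, key):
    """Return the (<key>_vos, <key>_vis) directory paths below output_root."""
    if key not in LAYER_KEYS:
        raise ValueError(f"unknown layer codename {key!r}; expected one of {LAYER_KEYS}")
    root = Path(output_root)
    return root / f"{key}_vos", root / f"{key}_vis"
```

The first returned path is the one to pass to the evaluation; the second only exists if visualisation was left enabled.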

Evaluation: DAVIS-2017

Please use the official evaluation package. Install the repository, then simply run:

python evaluation_method.py --task semi-supervised --davis_path data/davis2017 --results_path <path-to-vos-directory>

Evaluation: YouTube-VOS 2018

Please use the official CodaLab evaluation server. To create the submission, rename the vos-directory to Annotations and compress it to Annotations.zip for uploading.
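
The renaming and compression step can also be scripted. The sketch below is one possible way to do it and is not part of the repository; the function name make_submission is an assumption. Instead of renaming the vos-directory, it writes its files into the archive under a top-level Annotations/ folder, which is our reading of the instruction above. Please check the resulting layout against the server's requirements before uploading.

```python
import zipfile
from pathlib import Path


def make_submission(vos_dir, zip_path="Annotations.zip"):
    """Zip the files in vos_dir under a top-level 'Annotations/' folder."""
    vos_dir = Path(vos_dir)
    if not vos_dir.is_dir():
        raise FileNotFoundError(f"not a directory: {vos_dir}")
    # Collect the files first, so the archive never ends up listing itself.
    files = sorted(p for p in vos_dir.rglob("*") if p.is_file())
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in files:
            # <vos_dir>/<sequence>/<frame> is stored as Annotations/<sequence>/<frame>
            arcname = Path("Annotations") / path.relative_to(vos_dir)
            archive.write(path, arcname.as_posix())
    return Path(zip_path)
```

The original vos-directory is left untouched, so the same results can still be inspected or re-packaged later.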

Acknowledgements

We thank PyTorch contributors and Allan Jabri for releasing their implementation of the label propagation.

Citation

We hope you find our work useful. If you would like to acknowledge it in your project, please use the following citation:

@inproceedings{Araslanov:2021:DUL,
  author    = {Araslanov, Nikita and Schaub-Meyer, Simone and Roth, Stefan},
  title     = {Dense Unsupervised Learning for Video Segmentation},
  booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
  volume    = {34},
  year      = {2021}
}
