• Stars: 392
  • Rank: 109,735 (Top 3%)
  • Language: Python
  • License: Other
  • Created: over 4 years ago
  • Updated: over 3 years ago


Repository Details

[ECCV 2020] Learning stereo from single images using monocular depth estimation networks

Learning Stereo from Single Images

Jamie Watson, Oisin Mac Aodha, Daniyar Turmukhambetov, Gabriel J. Brostow and Michael Firman – ECCV 2020 (Oral presentation)

Link to paper

2-minute ECCV presentation video link

10-minute ECCV presentation video link

Training data and results qualitative comparison

Supervised deep networks are among the best methods for finding correspondences in stereo image pairs. Like all supervised approaches, these networks require ground truth data during training. However, collecting large quantities of accurate dense correspondence data is very challenging. We propose that it is unnecessary to have such a high reliance on ground truth depths or even corresponding stereo pairs.

Overview of our stereo data generation approach

Inspired by recent progress in monocular depth estimation, we generate plausible disparity maps from single images. In turn, we use those flawed disparity maps in a carefully designed pipeline to generate stereo training pairs. Training in this manner makes it possible to convert any collection of single RGB images into stereo training data. This results in a significant reduction in human effort, with no need to collect real depths or to hand-design synthetic data. We can consequently train a stereo matching network from scratch on datasets like COCO, which were previously hard to exploit for stereo.
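
To make the pair-generation step concrete, here is a minimal NumPy sketch of forward-warping a left image into a synthetic right view using a per-pixel disparity map. This is an illustrative simplification written for this description, not the released pipeline: the function name is ours, and the paper's occlusion filling and disparity sharpening steps are omitted.

import numpy as np

def synthesise_right_view(left, disparity):
    # Forward-warp a left image into a plausible right view by shifting
    # each pixel leftwards by its disparity. Where several source pixels
    # land on the same target pixel, keep the one with the largest
    # disparity (the nearest surface); unfilled pixels are returned as holes.
    h, w, _ = left.shape
    right = np.zeros_like(left)
    warped_disp = np.full((h, w), -np.inf)
    ys, xs = np.mgrid[0:h, 0:w]
    target_xs = np.round(xs - disparity).astype(int)
    valid = (target_xs >= 0) & (target_xs < w)
    for y, x, tx in zip(ys[valid], xs[valid], target_xs[valid]):
        if disparity[y, x] > warped_disp[y, tx]:
            warped_disp[y, tx] = disparity[y, x]
            right[y, tx] = left[y, x]
    holes = ~np.isfinite(warped_disp)  # disoccluded pixels, left unfilled here
    return right, holes

The pixels flagged as holes correspond to disocclusions; the full pipeline fills these regions rather than leaving them empty.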

Depth maps produced by stereo networks trained with Sceneflow and our method

Through extensive experiments we show that our approach outperforms stereo networks trained with standard synthetic datasets, when evaluated on KITTI, ETH3D, and Middlebury.

Quantitative comparison of stereo networks trained with Sceneflow and our method

✏️ 📄 Citation

If you find our work useful or interesting, please consider citing our paper:

@inproceedings{watson-2020-stereo-from-mono,
 title   = {Learning Stereo from Single Images},
 author  = {Jamie Watson and
            Oisin Mac Aodha and
            Daniyar Turmukhambetov and
            Gabriel J. Brostow and
            Michael Firman
           },
 booktitle = {European Conference on Computer Vision ({ECCV})},
 year = {2020}
}

📊 Evaluation

We evaluate our performance on several datasets: KITTI (2015 and 2012), Middlebury (full resolution), and ETH3D (low-resolution two-view). To run inference on these datasets, first download them, then update paths_config.yaml to point to their locations.
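
The exact keys expected in paths_config.yaml are defined in the repository; as a rough illustration only (the key names below are placeholders, not necessarily the real ones), the config can be loaded and sanity-checked with PyYAML:

import os
import yaml  # PyYAML

with open("paths_config.yaml") as f:
    paths = yaml.safe_load(f)

# Placeholder dataset keys; use the names the repository actually expects.
for name in ("kitti_2015", "kitti_2012", "middlebury", "eth3d"):
    root = paths.get(name)
    if root is None or not os.path.isdir(root):
        print(f"Missing or invalid path for {name}: {root}")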

Note that we report scores on the training sets of each dataset since we never see these images during training.

Run evaluation using:

CUDA_VISIBLE_DEVICES=X python main.py \
  --mode inference \
  --load_path <downloaded_model_path>

optionally setting --test_data_types and --save_disparities.

A trained model can be found HERE.

🎯 Training

To train a new model, you will need to download several datasets: ADE20K, DIODE, Depth in the Wild, Mapillary and COCO. After doing so, update paths_config.yaml to point to these directories.

Additionally, you will need precomputed monocular depth estimates for these images. We provide these for MiDaS: ADE20K, DIODE, Depth in the Wild, Mapillary and COCO. Download these and put them in the corresponding data paths (i.e. the paths specified in paths_config.yaml).
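
If you want to apply the same recipe to your own image collection, you will need to generate the monocular depth predictions yourself. A rough sketch using the public MiDaS torch.hub entry point is below; this is our illustration rather than the repository's own preprocessing script, and the output file naming and format expected by the training code are not shown here.

import cv2
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load MiDaS and its matching input transform from torch.hub.
midas = torch.hub.load("intel-isl/MiDaS", "MiDaS").to(device).eval()
transforms = torch.hub.load("intel-isl/MiDaS", "transforms")
transform = transforms.default_transform

img = cv2.cvtColor(cv2.imread("example.jpg"), cv2.COLOR_BGR2RGB)
with torch.no_grad():
    prediction = midas(transform(img).to(device))
    # Resize the prediction back to the input resolution.
    prediction = torch.nn.functional.interpolate(
        prediction.unsqueeze(1), size=img.shape[:2],
        mode="bicubic", align_corners=False).squeeze()

depth = prediction.cpu().numpy()  # relative inverse depth; save as required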

Now you can train a new model using:

CUDA_VISIBLE_DEVICES=X python main.py --mode train \
  --log_path <where_to_save_your_model> \
  --model_name <name_of_your_model>

Please see options.py for the full list of training options.

👩‍⚖️ License

Copyright © Niantic, Inc. 2020. Patent Pending. All rights reserved. Please see the license file for terms.

More Repositories

1. monodepth2 (Jupyter Notebook, 4,086 stars): [ICCV 2019] Monocular depth estimation from a single image
2. simplerecon (Python, 1,304 stars): [ECCV 2022] SimpleRecon: 3D Reconstruction Without 3D Convolutions
3. acezero (Python, 623 stars): [ECCV 2024 - Oral] ACE0 is a learning-based structure-from-motion approach that estimates camera parameters of sets of images by learning a multi-view consistent, implicit scene representation.
4. manydepth (Python, 620 stars): [CVPR 2021] Self-supervised depth estimation from short sequences
5. mickey (Python, 417 stars): [CVPR 2024 - Oral] Matching 2D Images in 3D: Metric Relative Pose from Metric Correspondences
6. ace (Python, 353 stars): [CVPR 2023 - Highlight] Accelerated Coordinate Encoding (ACE): Learning to Relocalize in Minutes using RGB and Poses
7. diffusionerf (Python, 286 stars): [CVPR 2023] DiffusioNeRF: Regularizing Neural Radiance Fields with Denoising Diffusion Models
8. map-free-reloc (Python, 248 stars): [ECCV 2022] Map-free Visual Relocalization: Metric Pose Relative to a Single Image
9. wavelet-monodepth (Jupyter Notebook, 226 stars): [CVPR 2021] Monocular depth estimation using wavelets for efficiency
10. footprints (Python, 220 stars): [CVPR 2020] Estimation of the visible and hidden traversable space from a single color image
11. depth-hints (Jupyter Notebook, 185 stars): [ICCV 2019] Depth Hints are complementary depth suggestions which improve monocular depth estimation algorithms trained from stereo pairs
12. doubletake (Python, 134 stars): [ECCV 2024] DoubleTake: Geometry Guided Depth Estimation
13. marepo (Python, 126 stars): [CVPR 2024 Highlight] Map-Relative Pose Regression for Visual Re-Localization
14. nerf-object-removal (Python, 100 stars): [CVPR 2023] Removing Objects From Neural Radiance Fields
15. implicit-depth (Python, 79 stars): [CVPR 2023] Virtual Occlusions Through Implicit Depth
16. scoring-without-correspondences (Python, 79 stars): [CVPR 2023] Two-view Geometry Scoring Without Correspondences
17. rectified-features (63 stars): [ECCV 2020] Single image depth prediction allows us to rectify planar surfaces in images and extract view-invariant local features for better feature matching
18. image-box-overlap (Jupyter Notebook, 53 stars): [ECCV 2020] Training neural networks to predict visual overlap of images, through interpretable non-metric box embeddings
19. airplanes (Python, 49 stars): [CVPR 2024] AirPlanes: Accurate Plane Estimation via 3D-Consistent Embeddings
20. panoptic-forecasting (Python, 47 stars): [CVPR 2021] Forecasting the panoptic segmentation of future video frames
21. relpose-gnn (Python, 39 stars): [3DV21] Visual Camera Re-Localization Using Graph Neural Networks and Relative Pose Supervision, M. Türkoğlu et al.
22. modron (JavaScript, 32 stars): Modron - Cloud security compliance
23. nianticlabs.github.io (HTML, 6 stars)
24. time-repeatability (6 stars): [ICRA 2021] Learning to Predict Repeatability of Interest Points
25. metagame-balance (Python, 5 stars): [AAMAS 2023] Bilevel Entropy based Mechanism Design for Balancing Meta in Video Games
26. nagatha (1 star): Nagatha - Alerts without fatigue