
Self-Supervised Monocular Depth Hints

Jamie Watson, Michael Firman, Gabriel J. Brostow and Daniyar Turmukhambetov – ICCV 2019

[Link to paper]

[example input/output GIF]

We introduce Depth Hints, which improve monocular depth estimation algorithms trained from stereo pairs.

We find that photometric reprojection losses used with self-supervised learning typically have multiple local minima.
This can restrict what a regression network learns, for example causing artifacts around thin structures.
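
For concreteness, this is the kind of per-pixel photometric reprojection loss in question: a minimal, monodepth2-style SSIM + L1 mix (a sketch only; tensor names are illustrative, not the repository's API):

import torch
import torch.nn.functional as F

def photometric_loss(warped, target, alpha=0.85):
    """Monodepth2-style photometric loss: 0.85 * SSIM + 0.15 * L1.

    warped/target: (B, 3, H, W) images; 'warped' is the source view
    resampled into the target view using a candidate depth. Repeated
    textures and occlusions mean several different depths can score
    similarly, i.e. the loss has multiple local minima per pixel.
    """
    l1 = (warped - target).abs().mean(1, True)

    # Simplified single-scale SSIM with 3x3 average pooling.
    mu_x = F.avg_pool2d(warped, 3, 1, 1)
    mu_y = F.avg_pool2d(target, 3, 1, 1)
    var_x = F.avg_pool2d(warped ** 2, 3, 1, 1) - mu_x ** 2
    var_y = F.avg_pool2d(target ** 2, 3, 1, 1) - mu_y ** 2
    cov = F.avg_pool2d(warped * target, 3, 1, 1) - mu_x * mu_y
    c1, c2 = 0.01 ** 2, 0.03 ** 2
    ssim = ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))
    dssim = torch.clamp((1 - ssim) / 2, 0, 1).mean(1, True)

    return alpha * dssim + (1 - alpha) * l1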

Depth Hints are complementary depth suggestions obtained from simple off-the-shelf stereo algorithms, e.g. Semi-Global Matching. These hints are used during training to guide the network to learn better weights. They require no additional data, and are assumed to be right only sometimes.
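
In outline, the gating works like this (a sketch under our reading of the paper; variable names are illustrative, not the training code's exact API):

import torch

def depth_hint_term(repro_loss_pred, repro_loss_hint, depth_pred, depth_hint):
    """Trust a depth hint only where it wins the photometric check.

    repro_loss_pred: per-pixel reprojection loss when warping with the
                     network's predicted depth.
    repro_loss_hint: the same loss when warping with the hint's depth.
    Where the hint reprojects better, a log-L1 regression term pulls the
    prediction towards it; elsewhere the hint is simply ignored.
    """
    valid = depth_hint > 0                                 # stereo leaves holes
    use_hint = (repro_loss_hint < repro_loss_pred) & valid
    log_l1 = torch.log(torch.abs(depth_pred - depth_hint) + 1.0)
    return (log_l1 * use_hint.float()).mean()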

Combined with other good practices, Depth Hints give state-of-the-art depth predictions on the KITTI benchmark (see the images above and the results table below). We show additional monocular depth estimation results on the SceneFlow dataset:

[example input/output GIF]

โœ๏ธ ๐Ÿ“„ Citation

If you find our work useful or interesting, please consider citing our paper:

@inproceedings{watson-2019-depth-hints,
  title     = {Self-Supervised Monocular Depth Hints},
  author    = {Jamie Watson and
               Michael Firman and
               Gabriel J. Brostow and
               Daniyar Turmukhambetov},
  booktitle = {The International Conference on Computer Vision (ICCV)},
  month     = {October},
  year      = {2019}
}

📈 KITTI Results

| Model name | Training modality | ImageNet pretrained | Resolution | Abs rel | Sq rel | δ < 1.25 |
| --- | --- | --- | --- | --- | --- | --- |
| Ours Resnet50 | Stereo | Yes | 640 x 192 | 0.102 | 0.762 | 0.880 |
| Ours Resnet50 no pt | Stereo | No | 640 x 192 | 0.118 | 0.941 | 0.850 |
| Ours HR Resnet50 | Stereo | Yes | 1024 x 320 | 0.096 | 0.710 | 0.890 |
| Ours HR Resnet50 no pt | Stereo | No | 1024 x 320 | 0.112 | 0.857 | 0.861 |
| Ours HR | Mono + Stereo | Yes | 1024 x 320 | 0.098 | 0.702 | 0.887 |
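
For reference, the error and accuracy metrics in the table header are the standard KITTI ones, computed roughly as follows (a sketch, not the evaluation script; gt and pred are matched, masked depth arrays):

import numpy as np

def kitti_metrics(gt, pred):
    abs_rel = np.mean(np.abs(gt - pred) / gt)      # Abs rel
    sq_rel = np.mean(((gt - pred) ** 2) / gt)      # Sq rel
    ratio = np.maximum(gt / pred, pred / gt)
    a1 = (ratio < 1.25).mean()                     # δ < 1.25 accuracy
    return abs_rel, sq_rel, a1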

Please see the paper for full results. To download the weights and predictions for each model, follow the links below:

| Model name | Training modality | ImageNet pretrained | Resolution | Weights | Eigen Predictions |
| --- | --- | --- | --- | --- | --- |
| Ours Resnet50 | Stereo | Yes | 640 x 192 | Download | Download |
| Ours Resnet50 no pt | Stereo | No | 640 x 192 | Download | Download |
| Ours HR Resnet50 | Stereo | Yes | 1024 x 320 | Download | Download |
| Ours HR Resnet50 no pt | Stereo | No | 1024 x 320 | Download | Download |
| Ours HR | Mono + Stereo | Yes | 1024 x 320 | Download | Download |

โš™๏ธ Code

The code for Depth Hints builds upon monodepth2. If you have questions about running the code, please see the issues in that repository first.

To train using depth hints:

  • Clone this repository
  • Run python precompute_depth_hints.py --data_path <your_KITTI_path>, optionally setting --save_path (defaults to <data_path>/depth_hints) and --filenames (defaults to the training and validation images of the Eigen split). This creates the "fused" depth hints referenced in the paper; the process takes approximately 4 hours on a GPU. A minimal SGM example follows this list.
  • Add the flag --use_depth_hints to your usual monodepth2 training command, optionally also setting --depth_hint_path (defaults to <data_path>/depth_hints). See below for a full command.
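
For intuition, a single SGM hint can be produced with off-the-shelf OpenCV as below. This is a minimal sketch only: precompute_depth_hints.py fuses several hyperparameter settings per pixel, and the file paths and KITTI calibration numbers here are illustrative.

import cv2
import numpy as np

left = cv2.imread('left.png', cv2.IMREAD_GRAYSCALE)    # hypothetical paths
right = cv2.imread('right.png', cv2.IMREAD_GRAYSCALE)

# One Semi-Global (Block) Matching configuration.
sgbm = cv2.StereoSGBM_create(minDisparity=0, numDisparities=96, blockSize=9)
disp = sgbm.compute(left, right).astype(np.float32) / 16.0  # SGBM output is 16x fixed point

# Disparity -> depth, given focal length fx (pixels) and baseline (metres).
fx, baseline = 721.5, 0.54        # typical KITTI values, for illustration
valid = disp > 0
depth = np.zeros_like(disp)
depth[valid] = fx * baseline / disp[valid]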

🎉 And that's it! 🎉

👀 Reproducing Paper Results

To recreate the results from our paper, run:

python train.py
  --data_path <your_KITTI_path>
  --log_dir <your_save_path>
  --model_name stereo_depth_hints
  --use_depth_hints
  --depth_hint_path <your_depth_hint_path>
  --frame_ids 0  --use_stereo
  --scheduler_step_size 5
  --split eigen_full
  --disparity_smoothness 0

Additionally:

  • For Resnet50 models, add --num_layers 50
  • Add --height 320 --width 1024 for High Resolution models (you may also have to set --batch_size 6 depending on the size of your GPU)
  • For Mono + Stereo add --frame_ids 0 -1 1 and remove --split eigen_full
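
Putting those flags together, a High Resolution Mono + Stereo run would look something like the following (the model name and paths are placeholders):

python train.py
  --data_path <your_KITTI_path>
  --log_dir <your_save_path>
  --model_name mono_stereo_depth_hints_hr
  --use_depth_hints
  --depth_hint_path <your_depth_hint_path>
  --frame_ids 0 -1 1  --use_stereo
  --scheduler_step_size 5
  --disparity_smoothness 0
  --height 320 --width 1024
  --batch_size 6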

The results above and in the main paper arise from evaluating on the KITTI sparse LiDAR point cloud, using the Eigen Test split.

To test on KITTI, run:

python evaluate_depth.py
  --data_path <your_KITTI_path>
  --load_weights_folder <your_model_path>
  --use_stereo

Make sure you have run export_gt_depth.py to extract ground truth files.
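
If you haven't, that is (following monodepth2's convention for the script):

python export_gt_depth.py
  --data_path <your_KITTI_path>
  --split eigen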

Additionally, if you see ValueError: Object arrays cannot be loaded when allow_pickle=False (recent numpy versions default np.load to allow_pickle=False), then either downgrade numpy, or change line 166 in evaluate_depth.py to

gt_depths = np.load(gt_path, fix_imports=True, encoding='latin1', allow_pickle=True)["data"]

🖼 Running on your own images

To run on your own images, run:

python test_simple.py
  --image_path <your_image_path>
  --model_path <your_model_path>
  --num_layers <18 or 50>

This will save a numpy array of depths and a colormapped depth image.
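
The saved array can then be inspected directly, e.g. as below (the .npy path is whatever test_simple.py reports; matplotlib is used only for visualisation):

import numpy as np
import matplotlib.pyplot as plt

depth = np.load('<your_saved_output>.npy')   # path written by test_simple.py
plt.imshow(np.squeeze(depth), cmap='magma')
plt.colorbar(label='depth')
plt.show()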

๐Ÿ‘ฉโ€โš–๏ธ License

Copyright © Niantic, Inc. 2020. Patent Pending. All rights reserved. Please see the license file for terms.
