R3D3: Dense 3D Reconstruction of Dynamic Scenes from Multiple Cameras [ICCV 2023]

Project Page | Paper | Data

Abstract

Dense 3D reconstruction and ego-motion estimation are key challenges in autonomous driving and robotics. Compared to the complex, multi-modal systems deployed today, multi-camera systems provide a simpler, low-cost alternative. However, camera-based 3D reconstruction of complex dynamic scenes has proven extremely difficult, as existing solutions often produce incomplete or incoherent results. We propose R3D3, a multi-camera system for dense 3D reconstruction and ego-motion estimation. Our approach iterates between geometric estimation that exploits spatial-temporal information from multiple cameras, and monocular depth refinement. We integrate multi-camera feature correlation and dense bundle adjustment operators that yield robust geometric depth and pose estimates. To improve reconstruction where geometric depth is unreliable, e.g. for moving objects or low-textured regions, we introduce learnable scene priors via a depth refinement network. We show that this design enables a dense, consistent 3D reconstruction of challenging, dynamic outdoor environments. Consequently, we achieve state-of-the-art dense depth prediction on the DDAD and nuScenes benchmarks.

Getting Started

  1. Clone the repository and its submodules using the --recurse-submodules flag:
git clone --recurse-submodules https://github.com/AronDiSc/r3d3.git
cd r3d3
  2. Create a new anaconda environment from the provided .yaml file:
conda env create --file environment.yaml
conda activate r3d3
  3. Compile the extensions (takes about 10 minutes):
python setup.py install
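
To sanity-check the installation, a quick look at whether PyTorch sees a GPU can help. This is a generic check, assuming the environment ships a CUDA-enabled PyTorch (the compiled extensions require one):

# Generic install sanity check; not part of the repository
import torch
print("torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())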

Datasets

The datasets should be placed at data/datasets/<dataset>.

DDAD

Download the DDAD dataset and place it at data/datasets/DDAD. We use the masks provided by SurroundDepth. Place them at data/datasets/DDAD/<scene>/occl_mask/<cam>/mask.png. The DDAD data structure should look as follows:

R3D3
    ├ data
        ├ datasets
            ├ DDAD
                ├ <scene>
                    ├ calibration
                        └ ....json
                    ├ point_cloud
                        └ <cam>
                            └ ....npz
                    ├ occl_mask
                        └ <cam>
                            └ ....png
                    ├ rgb
                        └ <cam>
                            └ ....png
                    └ scene_....json
                └ ...
            └ ...
        └ ...
    └ ...
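
Before running anything, a small script like the one below can verify that each scene folder matches this layout. The expected subfolder names come from the tree above; the script itself is only an illustrative helper, not part of the repository:

# Illustrative DDAD layout check; not repository code
from pathlib import Path

root = Path("data/datasets/DDAD")
expected = ["calibration", "point_cloud", "occl_mask", "rgb"]
for scene in sorted(p for p in root.iterdir() if p.is_dir()):
    missing = [d for d in expected if not (scene / d).is_dir()]
    if missing:
        print(f"{scene.name}: missing {missing}")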

nuScenes

Download the nuScenes dataset and place it at data/datasets/nuScenes. We use the provided self-occlusion masks. Place them at data/datasets/nuScenes/mask/<cam>.png. The nuScenes data structure should look as follows:

R3D3
    ├ data
        ├ datasets
            ├ nuScenes
                ├ mask
                    └ CAM_....png
                ├ samples
                    ├ CAM_...
                        └ ....jpg
                    └ LIDAR_TOP
                        └ ....pcd.bin
                ├ sweeps
                    └ CAM_...
                        └ ....jpg
                ├ v1.0-trainval
                    └ ...
                └ ...
            └ ...
        └ ...
    └ ...
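
If the nuscenes-devkit is installed (it is not required by this repository), loading the metadata is a quick way to confirm the layout; a minimal sketch:

# Optional sanity check via the official nuscenes-devkit
# (pip install nuscenes-devkit); not required by this repository
from nuscenes.nuscenes import NuScenes

nusc = NuScenes(version="v1.0-trainval", dataroot="data/datasets/nuScenes")
print("scenes:", len(nusc.scene))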

Models

VKITTI2 Finetuned Feature-Matching

Download the weights for the feature- and context-encoders as well as the GRU from here: r3d3_finetuned.ckpt. Place it at:

R3D3
    ├ data
        ├ models
            ├ r3d3
                └ r3d3_finetuned.ckpt
            └ ...
        └ ...
    └ ...
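
To confirm the download is intact, the checkpoint can be opened with torch.load. The snippet below only prints the top-level keys, since the exact key layout is not documented here:

# Illustrative checkpoint integrity check; not repository code
import torch

ckpt = torch.load("data/models/r3d3/r3d3_finetuned.ckpt", map_location="cpu")
print(type(ckpt))
if isinstance(ckpt, dict):
    print(list(ckpt.keys())[:10])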

Completion Network

We provide completion network weights for the DDAD and nuScenes datasets.

Dataset     Abs Rel   Sq Rel   RMSE     delta < 1.25   Download
DDAD        0.162     3.019    11.408   0.811          completion_ddad.ckpt
nuScenes    0.253     4.759    7.150    0.729          completion_nuscenes.ckpt

Place them at:

R3D3
    ├ data
        ├ models
            ├ completion
                ├ completion_ddad.ckpt
                └ completion_nuscenes.ckpt
            └ ...
        └ ...
    └ ...
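
For reference, the metrics in the table above follow the standard depth-evaluation definitions. A minimal NumPy sketch of those formulas (for reference only, not code from this repository):

# Standard monocular-depth error metrics; reference sketch only
import numpy as np

def depth_metrics(pred, gt):
    valid = gt > 0
    pred, gt = pred[valid], gt[valid]
    abs_rel = np.mean(np.abs(pred - gt) / gt)
    sq_rel = np.mean((pred - gt) ** 2 / gt)
    rmse = np.sqrt(np.mean((pred - gt) ** 2))
    delta = np.mean(np.maximum(pred / gt, gt / pred) < 1.25)
    return abs_rel, sq_rel, rmse, delta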

Training

Droid-SLAM Finetuning

We finetune the provided droid.pth checkpoint on VKITTI2 using the Droid-SLAM codebase.

Completion Network

1. Generate Training Data

# DDAD
python evaluate.py \
    --config configs/evaluation/dataset_generation/dataset_generation_ddad.yaml \
    --r3d3_weights=data/models/r3d3/r3d3_finetuned.ckpt \
    --r3d3_image_size 384 640 \
    --r3d3_n_warmup=5 \
    --r3d3_optm_window=5 \
    --r3d3_corr_impl=lowmem \
    --r3d3_graph_type=droid_slam \
    --training_data_path=./data/datasets/DDAD 

# nuScenes
python evaluate.py \
    --config configs/evaluation/dataset_generation/dataset_generation_nuscenes.yaml \
    --r3d3_weights=data/models/r3d3/r3d3_finetuned.ckpt \
    --r3d3_image_size 448 768 \
    --r3d3_n_warmup=5 \
    --r3d3_optm_window=5 \
    --r3d3_corr_impl=lowmem \
    --r3d3_graph_type=droid_slam \
    --training_data_path=./data/datasets/nuScenes 

2. Completion Network Training

# DDAD
python train.py configs/training/depth_completion/r3d3_completion_ddad_stage_1.yaml
python train.py configs/evaluation/depth_completion/r3d3_completion_ddad_inf_depth.yaml --arch.model.checkpoint=<path to stage 1 model>.ckpt
python train.py configs/training/depth_completion/r3d3_completion_ddad_stage_2.yaml --arch.model.checkpoint=<path to stage 1 model>.ckpt

# nuScenes
python train.py configs/training/depth_completion/r3d3_completion_nuscenes_stage_1.yaml
python train.py configs/evaluation/depth_completion/r3d3_completion_nuscenes_inf_depth.yaml --arch.model.checkpoint=<path to stage 1 model>.ckpt
python train.py configs/training/depth_completion/r3d3_completion_nuscenes_stage_2.yaml --arch.model.checkpoint=<path to stage 1 model>.ckpt

Evaluation

# DDAD
python evaluate.py \
    --config configs/evaluation/r3d3/r3d3_evaluation_ddad.yaml \
    --r3d3_weights data/models/r3d3/r3d3_finetuned.ckpt \
    --r3d3_image_size 384 640 \
    --r3d3_init_motion_only \
    --r3d3_n_edges_max=84 

# nuScenes
python evaluate.py \
    --config configs/evaluation/r3d3/r3d3_evaluation_nuscenes.yaml \
    --r3d3_weights data/models/r3d3/r3d3_finetuned.ckpt \
    --r3d3_image_size 448 768 \
    --r3d3_init_motion_only \
    --r3d3_dt_inter=0 \
    --r3d3_n_edges_max=72 

Citation

If you find the code helpful in your research or work, please cite the following paper.

@inproceedings{r3d3,
  title={R3D3: Dense 3D Reconstruction of Dynamic Scenes from Multiple Cameras},
  author={Schmied, Aron and Fischer, Tobias and Danelljan, Martin and Pollefeys, Marc and Yu, Fisher},
  booktitle={Proceedings of the IEEE International Conference on Computer Vision},
  year={2023}
}

Acknowledgements

  • This repository is based on Droid-SLAM.
  • The implementation of the completion network is based on Monodepth2.
  • The vidar framework is used for training, evaluation and logging results.
