  • Language: Python
  • License: MIT License
  • Created about 6 years ago
  • Updated over 5 years ago

Geometry meets semantics for semi-supervised monocular depth estimation - ACCV 2018

Semantic-Mono-Depth

This repository contains the source code of Semantic-Mono-Depth, proposed in the paper "Geometry meets semantics for semi-supervised monocular depth estimation", ACCV 2018. If you use this code in your projects, please cite our paper:

@inproceedings{ramirez2018,
  title     = {Geometry meets semantics for semi-supervised monocular depth estimation},
  author    = {Zama Ramirez, Pierluigi and
                Poggi, Matteo and
                Tosi, Fabio and
                Mattoccia, Stefano and
                Di Stefano, Luigi},
  booktitle = {14th Asian Conference on Computer Vision (ACCV)},
  year = {2018}
}

Abstract

Depth estimation from a single image represents a very exciting challenge in computer vision. While other image-based depth sensing techniques leverage the geometry between different viewpoints (e.g., stereo or structure from motion), the lack of these cues within a single image renders the monocular depth estimation task ill-posed. For inference, state-of-the-art encoder-decoder architectures for monocular depth estimation rely on effective feature representations learned at training time. For unsupervised training of these models, geometry has been effectively exploited by suitable image warping losses computed from views acquired by a stereo rig or a moving camera. In this paper, we make a further step forward, showing that learning semantic information from images also makes it possible to effectively improve monocular depth estimation. In particular, by leveraging semantically labeled images together with unsupervised signals gained from geometry through an image warping loss, we propose a deep learning approach aimed at joint semantic segmentation and depth estimation. Our overall learning framework is semi-supervised, as we deploy ground-truth data only in the semantic domain. At training time, our network learns a common feature representation for both tasks, and a novel cross-task loss function is proposed. The experimental findings show that jointly tackling depth prediction and semantic segmentation improves depth estimation accuracy. In particular, on the KITTI dataset our network outperforms state-of-the-art methods for monocular depth estimation.

For more details: arXiv
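
The image warping loss mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the loss implemented in this repository: it assumes rectified grayscale stereo images and a left-view disparity map in pixels, and it uses a plain L1 photometric error. The function names `warp_right_to_left` and `photometric_l1` are hypothetical and chosen only for illustration.

```python
import numpy as np


def warp_right_to_left(right, disp_left):
    """Reconstruct the left view by sampling the right image at x - d(y, x).

    Assumption: `right` and `disp_left` are 2-D float arrays of equal shape,
    and the stereo pair is rectified (matches lie on the same row).
    """
    h, w = disp_left.shape
    # Source column in the right image for every left pixel.
    xs = np.arange(w, dtype=np.float64)[None, :] - disp_left
    # Clamp to the image so border pixels reuse the nearest valid column.
    xs = np.clip(xs, 0.0, w - 1.0)
    x0 = np.floor(xs).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    frac = xs - x0
    rows = np.arange(h)[:, None]
    # Linear interpolation along the row between the two nearest columns.
    return (1.0 - frac) * right[rows, x0] + frac * right[rows, x1]


def photometric_l1(left, right, disp_left):
    """Mean absolute difference between the left image and its reconstruction."""
    return float(np.mean(np.abs(left - warp_right_to_left(right, disp_left))))
```

A disparity map that explains the scene well yields a low photometric error, which is the unsupervised signal the abstract refers to; no ground-truth depth is needed to compute it.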

Requirements

  • TensorFlow 1.5 or higher (recommended)
  • Python packages such as OpenCV and matplotlib

Download pretrained models

Checkpoints can be downloaded from here

Inference and evaluation

python monodepth_main.py --dataset kitti --mode test --data_path $DATA_PATH --output_dir $OUTPUT_DIR --filename ./utils/filenames/kitti_semantic_stereo_2015_test_split.txt --task depth --checkpoint_path $checkpoint_path --encoder $ENCODER

python ./utils/evaluate_kitti.py --split kitti_test --predicted_disp_path $OUTPUT_DIR/disparities_pp.npy --gt_path $DATA_PATH 

where:

  • DATA_PATH: path to the dataset
  • OUTPUT_DIR: path to the output folder
  • ENCODER: vgg or resnet
  • checkpoint_path: path to the checkpoint to load
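
The two commands above can also be assembled from Python, which may help when scripting several runs. The sketch below only builds the argument lists, using exactly the flags shown above; the helper name `build_commands` and the example paths are hypothetical, and nothing is executed until the lists are passed to something like `subprocess.run`.

```python
def build_commands(data_path, output_dir, checkpoint_path, encoder="vgg"):
    """Return (inference_cmd, evaluation_cmd) as argument lists.

    Hypothetical helper: it mirrors the README commands and does not
    inspect the repository or validate that the paths exist.
    """
    if encoder not in ("vgg", "resnet"):
        raise ValueError("encoder must be 'vgg' or 'resnet'")

    inference_cmd = [
        "python", "monodepth_main.py",
        "--dataset", "kitti",
        "--mode", "test",
        "--data_path", data_path,
        "--output_dir", output_dir,
        "--filename", "./utils/filenames/kitti_semantic_stereo_2015_test_split.txt",
        "--task", "depth",
        "--checkpoint_path", checkpoint_path,
        "--encoder", encoder,
    ]
    evaluation_cmd = [
        "python", "./utils/evaluate_kitti.py",
        "--split", "kitti_test",
        # The evaluation reads the disparities written to the output folder.
        "--predicted_disp_path", f"{output_dir}/disparities_pp.npy",
        "--gt_path", data_path,
    ]
    return inference_cmd, evaluation_cmd
```

To run both steps in order from the repository root, each list can be passed to `subprocess.run(cmd, check=True)`, so that a failed inference stops the script before evaluation starts.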
