Official implementation of "Decoder Modulation for Indoor Depth Completion" https://arxiv.org/abs/2005.08607

Decoder Modulation for Indoor Depth Completion
Dmitry Senushkin, Ilia Belikov, Anton Konushin
Samsung Research
https://arxiv.org/abs/2005.08607

Abstract: Accurate depth map estimation is an essential step in scene spatial mapping for AR applications and 3D modeling. Current depth sensors provide time-synchronized depth and color images in real-time, but have limited range and suffer from missing and erroneous depth values on transparent or glossy surfaces. We investigate the task of depth completion that aims at improving the accuracy of depth measurements and recovering the missing depth values using additional information from corresponding color images. Surprisingly, we find that a simple baseline model based on modern encoder-decoder architecture for semantic segmentation achieves state-of-the-art accuracy on standard depth completion benchmarks. Then, we show that the accuracy can be further improved by taking into account a mask of missing depth values. The main contributions of our work are two-fold. First, we propose a modified decoder architecture, where features from raw depth and color are modulated by features from the mask via Spatially-Adaptive Denormalization (SPADE). Second, we introduce a new loss function for depth estimation based on direct comparison of log depth prediction with ground truth values. The resulting model outperforms current state-of-the-art by a large margin on the challenging Matterport3D dataset.
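The loss described in the abstract compares the predicted log depth directly with the ground truth. The snippet below is a minimal sketch of that idea in plain Python, not the repository's LogDepthL1Loss implementation. It assumes that the network outputs log depth, that the ground truth is given as metric depth, and that non-positive ground-truth values mark pixels without a measurement. The function name `log_depth_l1` is hypothetical.

```python
import math


def log_depth_l1(pred_log_depth, gt_depth):
    """Mean absolute difference between predicted log depth and log(ground truth).

    pred_log_depth: flat sequence of predicted log-depth values.
    gt_depth: flat sequence of ground-truth depths; values <= 0 are treated
              as missing and skipped (an assumption of this sketch).
    """
    total = 0.0
    count = 0
    for pred, gt in zip(pred_log_depth, gt_depth):
        if gt <= 0:
            continue  # no ground truth for this pixel
        total += abs(pred - math.log(gt))
        count += 1
    # Return 0.0 when no pixel has ground truth, to avoid dividing by zero.
    return total / count if count else 0.0
```

A real training loop would express the same computation with tensor operations so that gradients can flow through it.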

Installation

This implementation is based on Python 3+ and PyTorch 1.4+. We provide two ways of setting up an environment. If you are using Anaconda, the following commands perform the necessary installation:

conda env create -f environment.yaml
conda activate depth-completion
python setup.py install

The same procedure can be done with pip:

pip3 install -r requirements.txt
python setup.py install
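After either installation route, the stated requirements (Python 3+ and PyTorch 1.4+) can be checked with a short script. This is a sketch and not part of the repository; the helper names `version_tuple` and `check_environment` are hypothetical.

```python
import re
import sys


def version_tuple(version):
    """Parse the leading numeric components, e.g. '1.4.0+cu101' -> (1, 4, 0)."""
    parts = []
    for piece in version.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component such as 'dev2020'
        parts.append(int(match.group()))
    return tuple(parts)


def check_environment():
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    if sys.version_info < (3,):
        problems.append("Python 3+ is required")
    try:
        import torch
    except ImportError:
        problems.append("PyTorch is not installed")
    else:
        found = str(torch.__version__)
        if version_tuple(found) < (1, 4):
            problems.append("PyTorch 1.4+ is required, found " + found)
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```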

Training

We provide code for training on Matterport3D. Download the Matterport3D dataset and reorganize your root folder as follows:

ROOT/
  ├── data/
  └── splits/
        ├── train.txt
        ├── val.txt
        └── test.txt 

The data directory should follow this layout. Make sure that the ROOT path in matterport.py is valid. You can then start training with one of the following commands:

# for LRN decoder with efficientnet-b4 backbone
python train_matterport.py --default_cfg='LRN' --config_file='../configs/LRN_efficientnet-b4_lena.yaml' --postfix='example_lrn' 
# for DM-LRN decoder with efficientnet-b4 backbone
python train_matterport.py --default_cfg='DM-LRN' --config_file='../configs/DM-LRN_efficientnet-b4_pepper.yaml' --postfix='example_dm_lrn' 
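Before launching training, the folder layout described above can be verified with a few lines of Python. This is a sketch under the assumption that only the entries shown in the tree (data/ and the three split files) are required. The helper `missing_entries` is hypothetical and does not inspect the contents of data/.

```python
from pathlib import Path

# Split files listed in the expected layout.
EXPECTED_SPLITS = ("train.txt", "val.txt", "test.txt")


def missing_entries(root):
    """Return the expected paths (relative to root) that are absent."""
    root = Path(root)
    missing = []
    if not (root / "data").is_dir():
        missing.append("data")
    for name in EXPECTED_SPLITS:
        path = root / "splits" / name
        if not path.is_file():
            missing.append(path.relative_to(root).as_posix())
    return missing
```

Calling `missing_entries` with the same path that is set as ROOT in matterport.py should return an empty list when the layout is complete.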

Evaluation

We provide scripts for evaluation on Matterport3D. If you need to test on NYUv2, consult the code directly, since it may change in the future. The following commands run evaluation on the Matterport3D test set:

# for LRN decoder with efficientnet-b4 backbone
python test_net.py --default_cfg='LRN' --config_file='../configs/LRN_efficientnet-b4_lena.yaml' --weights=<path to lrn_b4.pth>
# for DM-LRN decoder with efficientnet-b4 backbone
python test_net.py --default_cfg='DM-LRN' --config_file='../configs/DM-LRN_efficientnet-b4_pepper.yaml' --weights=<path to dm-lrn_b4.pth>
# to visualize the results, add the --save_dir argument
python test_net.py --default_cfg='DM-LRN' --config_file='../configs/DM-LRN_efficientnet-b4_pepper.yaml' --weights=<path to dm-lrn_b4.pth> --save_dir=<path to existing folder>

Model ZOO

This repository includes all models mentioned in the original paper.

Backbone         Decoder type  Encoder input  Training loss   Link                 Config
efficientnet-b0  LRN           RGBD           LogDepthL1loss  lrn_b0.pth           LRN_efficientnet-b0_suzy.yaml
efficientnet-b1  LRN           RGBD           LogDepthL1loss  lrn_b1.pth           LRN_efficientnet-b1_anabel.yaml
efficientnet-b2  LRN           RGBD           LogDepthL1loss  lrn_b2.pth           LRN_efficientnet-b2_irina.yaml
efficientnet-b3  LRN           RGBD           LogDepthL1loss  lrn_b3.pth           LRN_efficientnet-b3_sara.yaml
efficientnet-b4  LRN           RGBD           LogDepthL1loss  lrn_b4.pth           LRN_efficientnet-b4_lena.yaml
efficientnet-b4  LRN           RGBD           BerHu           lrn_b4_berhu.pth     LRN_efficientnet-b4_helga.yaml
efficientnet-b4  LRN           RGBD+M         LogDepthL1loss  lrn_b4_mask.pth      LRN_efficientnet-b4_simona.yaml
efficientnet-b0  DM-LRN        RGBD           LogDepthL1Loss  dm-lrn_b0.pth        DM_LRN_efficientnet-b0_camila.yaml
efficientnet-b1  DM-LRN        RGBD           LogDepthL1Loss  dm-lrn_b1.pth        DM_LRN_efficientnet-b1_pamela.yaml
efficientnet-b2  DM-LRN        RGBD           LogDepthL1Loss  dm-lrn_b2.pth        DM_LRN_efficientnet-b2_rosaline.yaml
efficientnet-b3  DM-LRN        RGBD           LogDepthL1Loss  dm-lrn_b3.pth        DM_LRN_efficientnet-b3_jenifer.yaml
efficientnet-b4  DM-LRN        RGBD           LogDepthL1Loss  dm-lrn_b4.pth        DM_LRN_efficientnet-b4_pepper.yaml
efficientnet-b4  DM-LRN        RGBD           BerHu           dm-lrn_b4_berhu.pth  DM_LRN_efficientnet-b4_amelia.yaml
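To avoid retyping long evaluation commands, the table can be turned into a small lookup that assembles the test_net.py invocation for a chosen checkpoint. This is only a sketch: the dictionary `MODEL_ZOO` and the helper `build_test_command` are hypothetical, and it assumes that --default_cfg matches the decoder type for every row. Config names are copied from the table as listed; note that the table spells the DM-LRN configs with DM_LRN_ while the example commands use DM-LRN_, so check the actual file names in the configs folder.

```python
import shlex

# weights file stem -> (decoder type passed as --default_cfg, config file)
MODEL_ZOO = {
    "lrn_b0": ("LRN", "LRN_efficientnet-b0_suzy.yaml"),
    "lrn_b1": ("LRN", "LRN_efficientnet-b1_anabel.yaml"),
    "lrn_b2": ("LRN", "LRN_efficientnet-b2_irina.yaml"),
    "lrn_b3": ("LRN", "LRN_efficientnet-b3_sara.yaml"),
    "lrn_b4": ("LRN", "LRN_efficientnet-b4_lena.yaml"),
    "lrn_b4_berhu": ("LRN", "LRN_efficientnet-b4_helga.yaml"),
    "lrn_b4_mask": ("LRN", "LRN_efficientnet-b4_simona.yaml"),
    "dm-lrn_b0": ("DM-LRN", "DM_LRN_efficientnet-b0_camila.yaml"),
    "dm-lrn_b1": ("DM-LRN", "DM_LRN_efficientnet-b1_pamela.yaml"),
    "dm-lrn_b2": ("DM-LRN", "DM_LRN_efficientnet-b2_rosaline.yaml"),
    "dm-lrn_b3": ("DM-LRN", "DM_LRN_efficientnet-b3_jenifer.yaml"),
    "dm-lrn_b4": ("DM-LRN", "DM_LRN_efficientnet-b4_pepper.yaml"),
    "dm-lrn_b4_berhu": ("DM-LRN", "DM_LRN_efficientnet-b4_amelia.yaml"),
}


def build_test_command(name, weights_dir, config_dir="../configs", save_dir=None):
    """Assemble the evaluation command line for one model zoo entry."""
    default_cfg, config = MODEL_ZOO[name]
    args = [
        "python",
        "test_net.py",
        "--default_cfg=" + default_cfg,
        "--config_file=" + config_dir + "/" + config,
        "--weights=" + weights_dir + "/" + name + ".pth",
    ]
    if save_dir is not None:
        # Optional folder for visualized results; it must already exist.
        args.append("--save_dir=" + save_dir)
    # Quote each argument so that paths with spaces survive the shell.
    return " ".join(shlex.quote(arg) for arg in args)
```

For example, `print(build_test_command("lrn_b4", "/path/to/weights"))` prints a command that can be pasted into a shell.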

License

The code is released under the MPL 2.0 License. MPL is a copyleft license that is easy to comply with. You must make the source code for any of your changes available under MPL, but you can combine the MPL software with proprietary code, as long as you keep the MPL code in separate files.

Citation

If you find this work useful for your research, please cite our paper:

@article{dmidc2020,
  title={Decoder Modulation for Indoor Depth Completion},
  author={Dmitry Senushkin and Ilia Belikov and Anton Konushin},
  journal={arXiv preprint arXiv:2005.08607},
  year={2020}
}
