• Stars
    star
    138
  • Rank 264,508 (Top 6 %)
  • Language
    Python
  • License
    Other
  • Created almost 2 years ago
  • Updated about 1 year ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

[ICIP2023] TR3D: Towards Real-Time Indoor 3D Object Detection

PWC PWC PWC

TR3D: Towards Real-Time Indoor 3D Object Detection

News:

  • 🔥 June, 2023. TR3D is accepted at ICIP2023.
  • 🚀 June, 2023. We add ScanNet-pretrained S3DIS model and log significantly pushing forward state-of-the-art.
  • February, 2023. TR3D on all 3 datasets is now supported in mmdetection3d as a project.
  • 🔥 February, 2023. TR3D is now state-of-the-art on paperswithcode on SUN RGB-D and S3DIS.

This repository contains an implementation of TR3D, a 3D object detection method introduced in our paper:

TR3D: Towards Real-Time Indoor 3D Object Detection
Danila Rukhovich, Anna Vorontsova, Anton Konushin
Samsung Research
https://arxiv.org/abs/2302.02858

Installation

For convenience, we provide a Dockerfile.

Alternatively, you can install all required packages manually. This implementation is based on mmdetection3d framework. Please refer to the original installation guide getting_started.md, including MinkowskiEngine installation, replacing open-mmlab/mmdetection3d with samsunglabs/tr3d.

Most of the TR3D-related code locates in the following files: detectors/mink_single_stage.py, detectors/tr3d_ff.py, dense_heads/tr3d_head.py, necks/tr3d_neck.py.

Getting Started

Please see getting_started.md for basic usage examples. We follow the mmdetection3d data preparation protocol described in scannet, sunrgbd, and s3dis.

Training

To start training, run train with TR3D configs:

python tools/train.py configs/tr3d/tr3d_scannet-3d-18class.py

Testing

Test pre-trained model using test with TR3D configs:

python tools/test.py configs/tr3d/tr3d_scannet-3d-18class.py \
    work_dirs/tr3d_scannet-3d-18class/latest.pth --eval mAP

Visualization

Visualizations can be created with test script. For better visualizations, you may set score_thr in configs to 0.3:

python tools/test.py configs/tr3d/tr3d_scannet-3d-18class.py \
    work_dirs/tr3d_scannet-3d-18class/latest.pth --eval mAP --show \
    --show-dir work_dirs/tr3d_scannet-3d-18class

Models

The metrics are obtained in 5 training runs followed by 5 test runs. We report both the best and the average values (the latter are given in round brackets). Inference speed (scenes per second) is measured on a single NVidia RTX 4090. Please, note that ScanNet-pretrained S3DIS model was actually trained in the original openmmlab/mmdetection3d codebase.

TR3D 3D Detection

Dataset [email protected] [email protected] Scenes
per sec.
Download
ScanNet 72.9 (72.0) 59.3 (57.4) 23.7 model | log | config
SUN RGB-D 67.1 (66.3) 50.4 (49.6) 27.5 model | log | config
S3DIS 74.5 (72.1) 51.7 (47.6) 21.0 model | log | config
S3DIS
ScanNet-pretrained
75.9 (75.1) 56.6 (54.8) 21.0 model | log | config

RGB + PC 3D Detection on SUN RGB-D

Model [email protected] [email protected] Scenes
per sec.
Download
ImVoteNet 63.4 - 14.8 instruction
VoteNet+FF 64.5 (63.7) 39.2 (38.1) - model | log | config
TR3D+FF 69.4 (68.7) 53.4 (52.4) 17.5 model | log | config

Example Detections

drawing

Citation

If you find this work useful for your research, please cite our paper:

@misc{rukhovich2023tr3d,
  doi = {10.48550/ARXIV.2302.02858},
  url = {https://arxiv.org/abs/2302.02858},
  author = {Rukhovich, Danila and Vorontsova, Anna and Konushin, Anton},
  title = {TR3D: Towards Real-Time Indoor 3D Object Detection},
  publisher = {arXiv},
  year = {2023}
}

More Repositories

1

ritm_interactive_segmentation

Reviving Iterative Training with Mask Guidance for Interactive Segmentation
Python
622
star
2

fbrs_interactive_segmentation

[CVPR2020] f-BRS: Rethinking Backpropagating Refinement for Interactive Segmentation https://arxiv.org/abs/2001.10331
Jupyter Notebook
581
star
3

NeuralHaircut

Neural Haircut: Prior-Guided Strand-Based Hair Reconstruction. ICCV 2023
Python
510
star
4

rome

Realistic mesh-based avatars. ECCV 2022
Python
424
star
5

adaptis

[ICCV19] AdaptIS: Adaptive Instance Selection Network, https://arxiv.org/abs/1909.07829
Jupyter Notebook
335
star
6

imvoxelnet

[WACV2022] ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection
Python
271
star
7

image_harmonization

[WACV2021] Foreground-aware Semantic Representations for Image Harmonization https://arxiv.org/abs/2006.00809
Python
266
star
8

pytorch-ensembles

Pitfalls of In-Domain Uncertainty Estimation and Ensembling in Deep Learning, ICLR 2020
Jupyter Notebook
236
star
9

fcaf3d

[ECCV2022] FCAF3D: Fully Convolutional Anchor-Free 3D Object Detection
Python
223
star
10

iterdet

[S+SSPR2020] IterDet: Iterative Scheme for Object Detection in Crowded Environments
Python
206
star
11

FineControlNet

Official Pytorch Implementation of "FineControlNet: Fine-level Text Control for Image Generation with Spatially Aligned Text Control Injection", 2023
Python
177
star
12

SPIn-NeRF

3D Scene Inpainting with NeRFs
Jupyter Notebook
167
star
13

TwiTi

This is a project of "#Twiti: Social Listening for Threat Intelligence" (TheWebConf 2021)
Python
167
star
14

zero-cost-nas

Zero-Cost Proxies for Lightweight NAS
Jupyter Notebook
141
star
15

BayesDLL

Python
141
star
16

ASAM

Implementation of ASAM: Adaptive Sharpness-Aware Minimization for Scale-Invariant Learning of Deep Neural Networks, ICML 2021.
Python
138
star
17

MLI

Novel View Synthesis with multiplane/multilayer representation: CVPR2022, WACV2023
Python
136
star
18

td3d

[WACV'24] TD3D: Top-Down Beats Bottom-Up in 3D Instance Segmentation
Python
131
star
19

day-to-night

Python
106
star
20

saic_depth_completion

Official implementation of "Decoder Modulation for Indoor Depth Completion" https://arxiv.org/abs/2005.08607
Python
105
star
21

Butterfly_Acc

The codes and artifacts associated with our MICRO'22 paper titled: "Adaptable Butterfly Accelerator for Attention-based NNs via Hardware and Algorithm Co-design"
Verilog
103
star
22

DINAR

Inference code for "DINAR: Diffusion Inpainting of Neural Textures for One-Shot Human Avatars"
Python
98
star
23

tqc_pytorch

Implementation of Truncated Quantile Critics method for continuous reinforcement learning. https://bayesgroup.github.io/tqc/
Python
87
star
24

SummaryMixing

This repository implements SummaryMixing, a simpler, faster and much cheaper replacement to self-attention for automatic speech recognition (see: https://arxiv.org/abs/2307.07421). The code is ready to be used with the SpeechBrain toolkit).
Python
86
star
25

style-people

Inference code for "StylePeople: A Generative Model of Fullbody Human Avatars" paper
Python
72
star
26

MTL

Python
71
star
27

RAMP

[IROS 2023] RAMP: Hierarchical Reactive Motion Planning for Manipulation Tasks Using Implicit Signed Distance Functions
Python
51
star
28

ffc_se

Code for the paper "FFC-SE: Fast Fourier Convolution for Speech Enhancement" (published at Interspeech 2022 conference)
Python
48
star
29

hifi_plusplus

HiFi++: a Unified Framework for Bandwidth Extension and Speech Enhancement (ICASSP 2023)
Python
47
star
30

deep-weight-prior

The Deep Weight Prior, ICLR 2019
Jupyter Notebook
44
star
31

odometry

Training Deep SLAM on Single Frames https://arxiv.org/abs/1912.05405
Python
43
star
32

eagle

Measuring and predicting on-device metrics (latency, power, etc.) of machine learning models
Python
42
star
33

point_based_clothing

Official PyTorch implementation of ICCV'21 paper Point-Based Modeling of Human Clothing
Jupyter Notebook
41
star
34

HandNeRF

Official Pytorch Implementation of "HandNeRF: Learning to Reconstruct Hand-Object Interaction Scene from a Single RGB Image", ICRA 2024
Python
39
star
35

HIO-SDF

[ICRA 2024] HIO-SDF: Hierarchical Incremental Online Signed Distance Fields
Python
39
star
36

gps-augment

Simple but high-performing method for learning a policy of test-time augmentation
Jupyter Notebook
38
star
37

Noise2NoiseFlow

Python
36
star
38

cloud_transformers

[ICCV, 2021] Cloud Transformers: A Universal Approach To Point Cloud Processing Tasks https://arxiv.org/abs/2007.11679
Python
33
star
39

SceneGrasp

[IROS 2023] Real-time Simultaneous Multi-Object 3D Shape Reconstruction, 6DoF Pose Estimation and Dense Grasp Prediction
Python
32
star
40

Sparse-Multi-DNN-Scheduling

Open-source artifacts and codes of our MICRO'23 paper titled “Sparse-DySta: Sparsity-Aware Dynamic and Static Scheduling for Sparse Multi-DNN Workloads”.
Python
32
star
41

Drop-DTW

Python
30
star
42

ltmnet

Learning Tone Curves for Local Image Enhancement
Python
30
star
43

semi-supervised-NFs

Code for the paper Semi-Conditional Normalizing Flows for Semi-Supervised Learning
Python
28
star
44

W2E

This is a project of "Cybersecurity Event Detection with New and Re-emerging Words". (ASIACCS 2020)
28
star
45

FastFlow

FastFlow is a system that automatically detects CPU bottlenecks in deep learning training pipelines and resolves the bottlenecks with data pipeline offloading to remote resources .
Python
25
star
46

geometry-preserving-de

Towards General Purpose, Geometry Preserving Single-View Depth Estimation https://arxiv.org/abs/2009.12419
Python
22
star
47

neural-textures

Inference code for "StylePeople: A Generative Model of Fullbody Human Avatars" paper. This code is for the part of the paper describing video-based avatars.
Python
22
star
48

graphics2raw

Code associated with our paper "Graphics2RAW: Mapping Computer Graphics Images to Sensor RAW Images". The paper has been accepted to the International Conference on Computer Vision (ICCV'23).
Python
22
star
49

nb-asr

Python
21
star
50

FACaP

[IROS 2022]Floorplan-Aware Camera Poses Refinement
Python
21
star
51

content-aware-metadata

Python
20
star
52

coordinate_based_inpainting

[CVPR2019] Coordinate-based texture inpainting for pose-guided human image generation https://arxiv.org/abs/1811.11459
Jupyter Notebook
18
star
53

Genie

Official Implementation of "Genie: Show Me the Data for Quantization" (CVPR 2023)
Python
17
star
54

blox

Macro Neural Architecture Search Benchmark
Python
16
star
55

StepFormer

Python
16
star
56

hierarchical-act

This supplementary code is for IROS 2024 paper "Hierarchical Action Chunk Transformer: Learning Temporal Multimodality from Demonstrations with Fast Imitation Behavior"
Python
14
star
57

Undiff

Test code disclosure for the research paper "UnDiff: Unsupervised Voice Restoration with Unconditional Diffusion Model", as a supplementary material for the paper accepted to the upcoming Interspeech2023 conference.
Python
14
star
58

EdgeViTs

[ECCV 2022] EdgeViTs: Competing Light-weight CNNs on Mobile Devices with Vision Transformers
Python
13
star
59

genren

Implementation of 2D-3D Cyclic Generative Renderer (3DV-2020).
Python
13
star
60

awesomeyaml

Utility library to help parsing, transforming and querying yaml-based configs
Python
12
star
61

pyworkers

Abstraction over threading, multiprocessing and TCP-based RPC
Python
11
star
62

ShellRecontruction

[IROS 2022] Object Shell Reconstruction: Camera-centric Object Representation for Robotic Grasping
Python
11
star
63

StereoLayers

11
star
64

PALinux

In-Kernel Control-Flow Integrity on Commodity OSes using ARM Pointer Authentication
11
star
65

c2g-HOF

[ICRA 2021, IROS 2021] Cost-to-Go Function Generating Networks for High Dimensional Motion Planning
Python
11
star
66

two-camera-white-balance

Python
10
star
67

hole-robust-wf

Data and code for the WACV 2022 paper, "Hole-robust Wireframe Detection"
Python
10
star
68

video-retrieval-sampler

The official implementation for the paper 'mmSampler: Efficient Frame Sampler for Multimodal Video Retrieval'.
Python
9
star
69

ordered_dropout

Technique of Ordered Dropout as used in the paper "Fjord: Fair and accurate federated learning under heterogeneous targets with ordered dropout", NeurIPS'21
Jupyter Notebook
9
star
70

myQASR

Open source the codebase related to the paper: E. Fish, U. Michieli, M. Ozay, "A Model for Every User and Budget: Label-Free and Personalized Mixed-Precision Quantization", 2023. The paper has been accepted for publication at the INTERSPEECH 2023 Conference.
Jupyter Notebook
8
star
71

FedorAS

FedorAS: Federated Architecture Search under system heterogeneity
Python
8
star
72

AdaCLIP

This repository contains the code for AdaCLIP, a computation and latency-aware system for pragmatic multimodal video retrieval.
Python
8
star
73

prime-count

This repository contains codes for Prime+Count paper.
C
7
star
74

appbuddy

Python
7
star
75

RIC

RIC: Rotate-Inpaint-Complete for Generalizable Scene Reconstruction
Python
7
star
76

X-MRS

Food image / recipe (text) cross-modal representation learning, retrieval and (image) synthesis. Code from ACM-Multimedia 2021 "Cross-Modal Retrieval and Synthesis (X-MRS): Closing the Modality Gap in Shared Representation Learning"
Python
7
star
77

FineControlNet-project-page

Project webpage of "FineControlNet: Fine-level Text Control for Image Generation with Spatially Aligned Text Control Injection", 2023
JavaScript
7
star
78

RGBD-FGN

RGBD Fusion Grasp Network with Large-Scale Tableware Grasp Dataset
Python
6
star
79

smoke-bomb

SmokeBomb: Effective Mitigation Method against Cache Side-channel Attacks on the ARM Architecture
C
6
star
80

fastflow-tensorflow

A customized Tensorflow with partial offloading and profiling features for FastFlow project.
C++
5
star
81

NASR

Jupyter Notebook
5
star
82

Z-Fold

Official Implementation of "Z-Fold: A Frustratingly Easy Post-Training Quantization Scheme for LLMs" (EMNLP 2023)
Python
5
star
83

procedure-planning

Python
4
star
84

NAFLD

Two-dimensional convolutional neural network using quantitative US for non-invasive assessment of hepatic steatosis in NAFLD
Python
4
star
85

transpr

Python
4
star
86

ExpandersPruning

This respository contains the code and experiments for the paper "Data-Free Model Pruning at Initialization via Expanders", appearing at the Efficient Deep Learning for Computer Vision CVPR Workshop, 2023. Authors: James Stewart, Umberto Michieli, and Mete Ozay.
Python
4
star
87

NB-MLM

Python
3
star
88

SAGE

Python
3
star
89

WatchYourSteps

3D scenes editing using NeRFs
Python
3
star
90

saic-is

Python
2
star
91

MotionID

Python
2
star
92

MoRF-project-page

JavaScript
2
star
93

Multitask-RFG

Code to reproduce experiments for End-to-end recipe flow graph parsing
Python
2
star
94

viola-project-page

Project webpage for "VioLA: Aligning Videos to 2D LiDAR Scans"
JavaScript
2
star
95

Metis

[ATC '24] Metis: Fast automatic distributed training on heterogeneous GPUs (https://www.usenix.org/conference/atc24/presentation/um)
1
star
96

SwissDINO

Code release of our paper: "Swiss DINO: Efficient and Versatile Vision Framework for On-device Personal Object Search"; Kirill Paramonov, Jia-Xing Zhong, Umberto Michieli, Jijoong Moon, Mete Ozay; IROS 2024.
Python
1
star
97

iiTransformer

Code for "iiTransformer: A Unified Approach to Exploiting Local and Non-Local Information for Image Restoration" (Kang et al., BMVC 2022)
Python
1
star
98

FROST

Codebase release for our accepted paper at ICASSP 2024.
1
star
99

HandNeRF-project-page

Project webpage of "HandNeRF: Learning to Reconstruct Hand-Object Interaction Scene from a Single RGB Image", ICRA 2024
JavaScript
1
star
100

HIO-SDF-project-page

Project page for "HIO-SDF: Hierarchical Incremental Online Signed Distance Fields"
JavaScript
1
star