Repository Details

Segmentation-driven 6D Object Pose Estimation. CVPR 2019.

Overview

Please find the latest version at WDR-Pose.

This repository contains the code for the paper "Segmentation-driven 6D Object Pose Estimation" by Yinlin Hu, Joachim Hugonot, Pascal Fua, and Mathieu Salzmann, CVPR 2019.

The most recent trend in estimating the 6D pose of rigid objects has been to train deep networks to either directly regress the pose from the image or to predict the 2D locations of 3D keypoints, from which the pose can be obtained using a PnP algorithm. In both cases, the object is treated as a global entity, and a single pose estimate is computed. As a consequence, the resulting techniques can be vulnerable to large occlusions.

In this paper, we introduce a segmentation-driven 6D pose estimation framework where each visible part of the objects contributes a local pose prediction in the form of 2D keypoint locations. We then use a predicted measure of confidence to combine these pose candidates into a robust set of 3D-to-2D correspondences, from which a reliable pose estimate can be obtained. We outperform the state-of-the-art on the challenging Occluded-LINEMOD and YCB-Video datasets, which is evidence that our approach deals well with multiple poorly-textured objects occluding each other. Furthermore, it relies on a simple enough architecture to achieve real-time performance.
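The confidence-based combination described above can be illustrated with a minimal sketch. It assumes that each grid cell assigned to an object yields one 2D candidate per keypoint together with a confidence score, and that only the most confident candidates per keypoint are kept. The function name `select_correspondences`, the tuple layout, and the `top_n` parameter are hypothetical and are not taken from this repository.

```python
from collections import defaultdict


def select_correspondences(candidates, keypoints_3d, top_n=10):
    """Keep the top_n most confident 2D candidates for each 3D keypoint.

    candidates: iterable of (keypoint_index, (u, v), confidence) tuples,
        already restricted to the grid cells assigned to a single object
        (assumed layout, for illustration only).
    keypoints_3d: list of (x, y, z) model keypoints.
    Returns a list of ((x, y, z), (u, v)) 3D-to-2D correspondences.
    """
    # Group the candidates by the keypoint they predict.
    per_keypoint = defaultdict(list)
    for kp_index, point_2d, confidence in candidates:
        per_keypoint[kp_index].append((confidence, point_2d))

    correspondences = []
    for kp_index in sorted(per_keypoint):
        # Rank by confidence only, highest first.
        ranked = sorted(per_keypoint[kp_index],
                        key=lambda item: item[0], reverse=True)
        for _, point_2d in ranked[:top_n]:
            correspondences.append((keypoints_3d[kp_index], point_2d))
    return correspondences


example_candidates = [
    (0, (10.0, 12.0), 0.9),
    (0, (50.0, 60.0), 0.1),   # low-confidence outlier, dropped with top_n=2
    (0, (11.0, 12.5), 0.8),
    (1, (30.0, 31.0), 0.7),
]
example_keypoints_3d = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0)]
example_result = select_correspondences(
    example_candidates, example_keypoints_3d, top_n=2)
```

The resulting correspondences could then be passed to a PnP solver, for example OpenCV's `cv2.solvePnPRansac`, to obtain the final pose.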

Figure 1: Overall workflow of our method. Our architecture has two streams: one for object segmentation and the other to regress 2D keypoint locations. These two streams share a common encoder, but the decoders are separate. Each one produces a tensor whose spatial resolution defines an SxS grid over the image. The segmentation stream predicts the label of the object observed at each grid location. The regression stream predicts the 2D keypoint locations for that object.
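To make the SxS grid concrete, here is a small sketch of how a grid cell could be mapped back to image coordinates. It assumes that the regression stream outputs, for each cell, keypoint offsets in pixels relative to the cell center. That encoding, the 13x13 grid on a 416x416 image, and the function names are assumptions made for illustration and may differ from the repository's code.

```python
def cell_center(row, col, grid_size, image_width, image_height):
    """Return the (x, y) pixel coordinates of the center of a grid cell."""
    cell_w = image_width / grid_size
    cell_h = image_height / grid_size
    return ((col + 0.5) * cell_w, (row + 0.5) * cell_h)


def decode_keypoints(row, col, offsets, grid_size, image_width, image_height):
    """Turn per-cell offsets into absolute 2D keypoint locations.

    offsets: list of (dx, dy) pixel offsets relative to the cell center
        (assumed encoding, for illustration only).
    """
    cx, cy = cell_center(row, col, grid_size, image_width, image_height)
    return [(cx + dx, cy + dy) for dx, dy in offsets]


# Hypothetical example: a 13x13 grid over a 416x416 image.
example_points = decode_keypoints(1, 2, [(4.0, -2.0), (0.0, 0.0)], 13, 416, 416)
```

Each cell labeled as belonging to an object by the segmentation stream would contribute one such set of 2D keypoint candidates.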

Figure 2: Occluded-LINEMOD results. In each column, we show, from top to bottom: the foreground segmentation mask, all 2D reprojection candidates, the selected 2D reprojections, and the final pose results. Our method generates accurate pose estimates, even in the presence of large occlusions. Furthermore, it can process multiple objects in real time.

How to Use

Step 1

Download the datasets.

Occluded-LINEMOD: https://hci.iwr.uni-heidelberg.de/vislearn/iccv2015-occlusion-challenge/

YCB-Video: https://rse-lab.cs.washington.edu/projects/posecnn/

Step 2

Download the pretrained model.

Occluded-LINEMOD: https://1drv.ms/u/s!ApOY_gOHw8hLbbdmVZgnqk30I5A

YCB-Video: https://1drv.ms/u/s!ApOY_gOHw8hLbLl4i8CAXD6LGuU

Put the downloaded models into the ./model directory.

Due to commercial restrictions, we can only provide the inference code. However, the training part should be straightforward to implement from our paper and this repository.

Step 3

Prepare the input file list using gen_filelist.py.
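For orientation, a file list of this kind is often a plain text file with one image path per line. The sketch below shows that general idea only. It is not the repository's gen_filelist.py, whose actual output format should be checked in the script itself. The function name `write_file_list` and the extension filter are hypothetical.

```python
from pathlib import Path


def write_file_list(image_dir, output_path,
                    extensions=(".png", ".jpg", ".jpeg")):
    """Write the sorted paths of all images under image_dir, one per line.

    Returns the number of paths written. The one-path-per-line format is
    an assumption, not the format confirmed by gen_filelist.py.
    """
    image_dir = Path(image_dir)
    paths = sorted(
        p for p in image_dir.rglob("*")
        if p.is_file() and p.suffix.lower() in extensions
    )
    with open(output_path, "w") as handle:
        for p in paths:
            handle.write(f"{p}\n")
    return len(paths)
```

A call such as `write_file_list("./data/images", "./filelist.txt")` would produce the list, where both paths are placeholders.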

Step 4

Run test.py and explore it.

Citing

@inproceedings{hu2019segpose,
  title={Segmentation-driven 6D Object Pose Estimation},
  author={Yinlin Hu and Joachim Hugonot and Pascal Fua and Mathieu Salzmann},
  booktitle={CVPR},
  year={2019}
}
