• Stars: 104
• Rank: 330,538 (Top 7 %)
• Language: Python
• License: GNU General Publi...
• Created over 4 years ago
• Updated almost 3 years ago

Repository Details

Single-Stage 6D Object Pose Estimation, CVPR 2020

Overview

This repository contains the code for the paper "Single-Stage 6D Object Pose Estimation" by Yinlin Hu, Pascal Fua, Wei Wang and Mathieu Salzmann, published at CVPR 2020.

Most recent 6D pose estimation frameworks first rely on a deep network to establish correspondences between 3D object keypoints and 2D image locations and then use a variant of a RANSAC-based Perspective-n-Point (PnP) algorithm. This two-stage process, however, is suboptimal: First, it is not end-to-end trainable. Second, training the deep network relies on a surrogate loss that does not directly reflect the final 6D pose estimation task.

In this work, we introduce a deep architecture that directly regresses 6D poses from correspondences. It takes as input a group of candidate correspondences for each 3D keypoint and accounts for the fact that the order of the correspondences within each group is irrelevant, while the order of the groups, that is, of the 3D keypoints, is fixed. Our architecture is generic and can thus be exploited in conjunction with existing correspondence-extraction networks so as to yield single-stage 6D pose estimation frameworks. Our experiments demonstrate that these single-stage frameworks consistently outperform their two-stage counterparts in terms of both accuracy and speed.
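The ordering property described above can be illustrated with a small sketch. It is not the repository's network: the tiny linear+ReLU layer, the choice of max-pooling as the aggregation, the 2D-only inputs, and all function names are assumptions made for illustration. It shows that applying a shared layer to every correspondence and pooling within each group yields a feature that ignores the order inside a group but depends on the order of the groups.

```python
import random

def make_shared_layer(in_dim, out_dim, seed=0):
    """Return a tiny linear+ReLU layer with fixed random weights (illustrative only)."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]

    def layer(x):
        return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]

    return layer

def aggregate_group(points, layer):
    """Apply the shared layer to every correspondence, then max-pool over the group."""
    features = [layer(p) for p in points]
    return [max(column) for column in zip(*features)]

def encode_groups(groups, layer):
    """Concatenate the per-group features in the fixed keypoint order."""
    encoded = []
    for points in groups:
        encoded.extend(aggregate_group(points, layer))
    return encoded

rng = random.Random(1)
# Assumed layout: 8 keypoints, each with 5 candidate 2D image locations.
groups = [[[rng.random(), rng.random()] for _ in range(5)] for _ in range(8)]
layer = make_shared_layer(in_dim=2, out_dim=4)

reference = encode_groups(groups, layer)

# Shuffling the correspondences inside each group leaves the encoding unchanged.
shuffled_within = [rng.sample(points, len(points)) for points in groups]
same = encode_groups(shuffled_within, layer) == reference

# Reversing the order of the groups changes the encoding.
different = encode_groups(groups[::-1], layer) != reference

print(same, different)
```

Max-pooling is only one order-invariant choice; any symmetric reduction over the group (for example a sum or a mean) would give the same invariance in this sketch.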

Figure 1: Motivation. Consider the modern 6D pose estimation algorithm of SegDrivenPose that uses a deep network to predict several 2D correspondences for each of the eight 3D corners of the pitcher's bounding box. (a) Because it minimizes the average 2D error of these correspondences, two instances of such a framework could produce correspondences that differ but have the same average accuracy, such as the green and the red ones. As evidenced by the projected green and red reference frames, applying a RANSAC-based PnP algorithm to these two sets of correspondences can yield substantially different poses. (b) Even when using only the set of green correspondences, simply changing their order causes a RANSAC-based PnP algorithm to return different solutions.

Figure 2: Overall architecture for single-stage 6D object pose estimation. After a segmentation-driven CNN for 6D pose establishes 3D-to-2D correspondences, we use three main modules to infer the pose directly from these correspondence clusters: a local feature extraction module with shared network parameters, a feature aggregation module operating within the different clusters, and a global inference module consisting of simple fully-connected layers that estimates the final pose as a quaternion and a translation. The color in the CNN outputs indicates the direction of the 2D offset from the grid cell center to the corresponding projected 3D bounding box corner.
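Since the final pose is expressed as a quaternion and a translation, the following sketch shows one way such an output could be turned into a rigid transform. The 7-value layout, the (w, x, y, z) component order, and the function names are assumptions for illustration and are not taken from the repository. The quaternion is normalized first, because a raw regression output is not guaranteed to have unit length.

```python
import math

def quaternion_to_rotation(q):
    """Normalize q = (w, x, y, z) and return the matching 3x3 rotation matrix."""
    norm = math.sqrt(sum(c * c for c in q))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero quaternion")
    w, x, y, z = (c / norm for c in q)
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]

def split_pose(raw):
    """Split a hypothetical 7-vector (quaternion, then translation) into (R, t)."""
    if len(raw) != 7:
        raise ValueError("expected 4 quaternion values followed by 3 translation values")
    return quaternion_to_rotation(raw[:4]), list(raw[4:])

def transform_point(rotation, translation, point):
    """Map a 3D point from the object frame to the camera frame: R @ p + t."""
    return [
        sum(r * p for r, p in zip(row, point)) + t
        for row, t in zip(rotation, translation)
    ]

# Unnormalized quaternion for a 90 degree rotation about the z axis,
# followed by a translation (values chosen for illustration).
raw_output = [2.0, 0.0, 0.0, 2.0, 0.1, -0.2, 1.5]
rotation, translation = split_pose(raw_output)
moved = transform_point(rotation, translation, [1.0, 0.0, 0.0])
print(moved)  # approximately [0.1, 0.8, 1.5]
```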

How to Use

This repository contains the code for the core network that infers the pose from correspondences. It is straightforward to combine with correspondence-extraction networks such as SegDrivenPose or PVNet to obtain an end-to-end 6D pose framework.

Citing

@inproceedings{hu2020singlestagepose,
  title={Single-Stage 6D Object Pose Estimation},
  author={Yinlin Hu and Pascal Fua and Wei Wang and Mathieu Salzmann},
  booktitle={CVPR},
  year={2020}
}

More Repositories

1. gaussian-splatting-web (TypeScript, 536 stars)
2. LIFT (Python, 485 stars): Code release for the ECCV 2016 paper
3. disk (Python, 303 stars): Disk code release
4. EPnP (MATLAB, 263 stars): EPnP: Efficient Perspective-n-Point Camera Pose Estimation
5. MeshSDF (Python, 220 stars): Code for "MeshSDF: Differentiable Iso-Surface Extraction", NeurIPS2020, SpotLight
6. tf-lift (Python, 196 stars): Tensorflow port of LIFT (ECCV 2016), with training code.
7. segmentation-driven-pose (Python, 184 stars): Segmentation-driven 6D Object Pose Estimation. CVPR 2019.
8. MeshUDF (Cython, 136 stars): Fast and Differentiable Meshing of Unsigned Distance Field Networks
9. voxel2mesh (Python, 113 stars): Voxel2Mesh: 3D Mesh Model Generation from Volumetric Data
10. sketch2mesh (Python, 78 stars): Reconstructing and Editing 3D Shapes from Sketches
11. Power-Iteration-SVD (Python, 72 stars): Backpropagation-Friendly-Eigendecomposition
12. pyKSP (C++, 66 stars): This is a Python wrapper for the K-Shortest Path tracking algorithm.
13. social-scene-understanding (Python, 63 stars): Source code for the CVPR 2017 paper
14. wide-depth-range-pose (Python, 61 stars): Wide-Depth-Range 6D Object Pose Estimation in Space, CVPR 2021
15. log-polar-descriptors (Jupyter Notebook, 60 stars): Public implementation of "Beyond Cartesian Representations for Local Descriptors", ICCV 2019
16. detecting-the-unexpected (Python, 56 stars): Detecting the Unexpected via Image Resynthesis
17. balltracking (MATLAB, 46 stars): Tracking of the ball and the players in team sports
18. perspective-flow-aggregation (Python, 38 stars): Perspective Flow Aggregation for Data-Limited 6D Object Pose Estimation. ECCV 2022.
19. LabelGrab (Python, 28 stars): Annotation tool for semantic and instance segmentation, with automated help from the GrabCut implemented in OpenCV.
20. densecrf (C++, 27 stars): a fork of the densecrf package implementing alternative inference scheme
21. multiview_calib (Jupyter Notebook, 26 stars): Single and multiple view camera calibration tool
22. deepdesc-release (Lua, 25 stars): Code for the ICCV 2015 paper "Discriminative Learning of Deep Convolutional Feature Point Descriptors"
23. adv_param_pose_prior (Python, 23 stars): Adversarial Parametric Pose Prior
24. multicam-gt (JavaScript, 20 stars): Our Webapp to annotate multi-camera pedestrian detection datasets.
25. diff-nrsfm (MATLAB, 18 stars)
26. cvlab-kubernetes-guide (Python, 15 stars): Instructions and utilities for use of EPFL's compute cluster.
27. MVFlow (Python, 13 stars)
28. iter_unc (Jupyter Notebook, 12 stars): Official code for "Enabling Uncertainty Estimation in Iterative Neural Networks" (ICML 2024)
29. gecco (Python, 11 stars): Code release for GECCO: Geometrically-Conditioned Point Diffusion Models
30. zigzag (Jupyter Notebook, 10 stars): Official code for "ZigZag: Universal Sampling-free Uncertainty Estimation Through Two-Step Inference" (TMLR 2024)
31. PA-net (Python, 9 stars): Probabilistic Atlases to Enforce Topological Constraints
32. mot3d (Jupyter Notebook, 7 stars): Fast Single View and Multiview Multi Object Tracking Using Minimum Cost Maximum Flow Formulation
33. MVAug (Python, 7 stars)
34. erasing-road-obstacles (6 stars): Detecting Road Obstacles by Erasing Them
35. UCLID-Net (Python, 5 stars): Implementation of UCLID-Net (NeurIPS 2020)
36. n-queens-benchmark (C++, 3 stars)
37. mf-mrf (HTML, 2 stars): Parallel mean-field inference web page
38. MARMOT (Jupyter Notebook, 1 star): Multi-Aspect Reconstruction and Multi-Object Tracking
39. MAGE (Python, 1 star): Multi-Aspect Groundplane Estimation
40. UDA-Hand-Object (Python, 1 star): Unsupervised Domain Adaptation with Temporal Consistency for 3D Joint Hand-Object Reconstruction