Code for "PVNet: Pixel-wise Voting Network for 6DoF Pose Estimation" CVPR 2019 oral

PVNet: Pixel-wise Voting Network for 6DoF Pose Estimation

PVNet: Pixel-wise Voting Network for 6DoF Pose Estimation
Sida Peng, Yuan Liu, Qixing Huang, Xiaowei Zhou, Hujun Bao
CVPR 2019 oral
Project Page

Questions and discussions are welcome!

Introduction

Thanks to Haotong Lin for providing the clean version of PVNet and reproducing the results.

The structure of this project is described in project_structure.md.

Installation

One way is to set up the environment with Docker. See this.

Thanks to Floris Gaisser for providing the Docker implementation.

Another way is to use the following commands.

  1. Set up the python environment:
    conda create -n pvnet python=3.7
    conda activate pvnet
    
    # install torch 1.1 built for CUDA 9.0
    pip install torch==1.1.0 -f https://download.pytorch.org/whl/cu90/stable
    
    pip install Cython==0.28.2
    sudo apt-get install libglfw3-dev libglfw3
    pip install -r requirements.txt
    
  2. Compile the CUDA extensions under lib/csrc:
    ROOT=/path/to/clean-pvnet
    cd $ROOT/lib/csrc
    export CUDA_HOME="/usr/local/cuda-9.0"
    cd ransac_voting
    python setup.py build_ext --inplace
    cd ../nn
    python setup.py build_ext --inplace
    cd ../fps
    python setup.py build_ext --inplace
    
    # If you want to run PVNet with a detector
    cd ../dcn_v2
    python setup.py build_ext --inplace
    
    # If you want to use the uncertainty-driven PnP
    cd ../uncertainty_pnp
    sudo apt-get install libgoogle-glog-dev
    sudo apt-get install libsuitesparse-dev
    sudo apt-get install libatlas-base-dev
    python setup.py build_ext --inplace
    
  3. Set up datasets:
    ROOT=/path/to/clean-pvnet
    cd $ROOT/data
    ln -s /path/to/linemod linemod
    ln -s /path/to/linemod_orig linemod_orig
    ln -s /path/to/occlusion_linemod occlusion_linemod
    
    # the following is used for tless
    ln -s /path/to/tless tless
    ln -s /path/to/cache cache
    ln -s /path/to/SUN2012pascalformat sun
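
Step 2 above repeats the same build command in several directories. As a rough sketch, the directory list could be generated in Python instead of typed by hand. The function name `plan_builds` and its two flags are hypothetical and not part of this repository; only the directory names and the `setup.py build_ext --inplace` command come from the steps above.

```python
from pathlib import Path

# Directory names are copied from step 2. Grouping them into "required"
# and "optional" is this sketch's reading of the comments in that step.
REQUIRED_EXTENSIONS = ["ransac_voting", "nn", "fps"]
DETECTOR_EXTENSION = "dcn_v2"
UNCERTAINTY_PNP_EXTENSION = "uncertainty_pnp"

BUILD_COMMAND = ["python", "setup.py", "build_ext", "--inplace"]


def plan_builds(root, with_detector=False, with_uncertainty_pnp=False):
    """Return (working_dir, command) pairs, one per extension to compile.

    Hypothetical helper: it only plans the builds and runs nothing.
    """
    csrc = Path(root) / "lib" / "csrc"
    names = list(REQUIRED_EXTENSIONS)
    if with_detector:
        names.append(DETECTOR_EXTENSION)
    if with_uncertainty_pnp:
        names.append(UNCERTAINTY_PNP_EXTENSION)
    return [(csrc / name, list(BUILD_COMMAND)) for name in names]


if __name__ == "__main__":
    # Print the plan; /path/to/clean-pvnet is the placeholder used above.
    for workdir, command in plan_builds("/path/to/clean-pvnet"):
        print(workdir, " ".join(command))
```

Each pair could then be passed to `subprocess.run(command, cwd=workdir, check=True)`. `CUDA_HOME` still has to be exported first, and the `apt-get` packages for the uncertainty-driven PnP still have to be installed separately.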
    

Download datasets which are formatted for this project:

  1. linemod
  2. linemod_orig: The dataset includes the depth for each image.
  3. occlusion linemod
  4. truncation linemod: Check TRUNCATION_LINEMOD.md for the information about the Truncation LINEMOD dataset.
  5. Tless: cat tlessa* | tar xvf - -C ..
  6. Tless cache data: It is used for training and testing on Tless.
  7. SUN2012pascalformat
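
Once the datasets are downloaded and linked, it can help to confirm that every link under the data directory resolves before running anything. The following is a minimal sketch, not part of this repository: `missing_links` is a hypothetical helper, and the two name lists simply mirror the `ln -s` targets from step 3.

```python
from pathlib import Path

# Link names as created in step 3 of the installation.
LINEMOD_LINKS = ["linemod", "linemod_orig", "occlusion_linemod"]
TLESS_LINKS = ["tless", "cache", "sun"]


def missing_links(data_dir, names):
    """Return the names under data_dir that do not resolve to a directory.

    Path.is_dir() follows symbolic links, so a dangling link is reported
    as missing just like an absent one.
    """
    data_dir = Path(data_dir)
    return [name for name in names if not (data_dir / name).is_dir()]


if __name__ == "__main__":
    # "data" assumes the script is started from the repository root.
    for name in missing_links("data", LINEMOD_LINKS + TLESS_LINKS):
        print("not found:", name)
```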

Testing

Testing on Linemod

We provide pretrained models for the Linemod objects, which can be found here.

Take testing on the cat object as an example.

  1. Prepare the data related to cat:
    python run.py --type linemod cls_type cat
    
  2. Download the pretrained model of cat and place it at $ROOT/data/model/pvnet/cat/199.pth.
  3. Test:
    python run.py --type evaluate --cfg_file configs/linemod.yaml model cat cls_type cat
    python run.py --type evaluate --cfg_file configs/linemod.yaml test.dataset LinemodOccTest model cat cls_type cat
    
  4. Test with icp:
    python run.py --type evaluate --cfg_file configs/linemod.yaml model cat cls_type cat test.icp True
    python run.py --type evaluate --cfg_file configs/linemod.yaml test.dataset LinemodOccTest model cat cls_type cat test.icp True
    
  5. Test with the uncertainty-driven PnP:
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:./lib/csrc/uncertainty_pnp/lib
    python run.py --type evaluate --cfg_file configs/linemod.yaml model cat cls_type cat test.un_pnp True
    python run.py --type evaluate --cfg_file configs/linemod.yaml test.dataset LinemodOccTest model cat cls_type cat test.un_pnp True
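
The evaluation commands in steps 3 to 5 differ only in a few trailing key/value options. The sketch below assembles them from keyword arguments, which may be convenient when looping over several objects. `build_eval_command` is a hypothetical helper, not part of this repository. Whether `test.icp` and `test.un_pnp` can be enabled together is not stated above, so the sketch does not assume either way and simply appends whatever is requested.

```python
def build_eval_command(cls_type, model=None, dataset=None, icp=False,
                       un_pnp=False, cfg_file="configs/linemod.yaml"):
    """Assemble the argument list for one Linemod evaluation run.

    Hypothetical helper. The option names (test.dataset, model, cls_type,
    test.icp, test.un_pnp) are copied from the commands shown above.
    """
    if model is None:
        # The examples above use the object name as the model name.
        model = cls_type
    command = ["python", "run.py", "--type", "evaluate", "--cfg_file", cfg_file]
    if dataset is not None:
        command += ["test.dataset", dataset]
    command += ["model", model, "cls_type", cls_type]
    if icp:
        command += ["test.icp", "True"]
    if un_pnp:
        command += ["test.un_pnp", "True"]
    return command


if __name__ == "__main__":
    print(" ".join(build_eval_command("cat", dataset="LinemodOccTest", icp=True)))
```

The returned list can be passed to `subprocess.run` as is. For the uncertainty-driven PnP, `LD_LIBRARY_PATH` still has to be extended as shown in step 5.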
    

Testing on Tless

We provide pretrained models for the Tless objects, which can be found here.

  1. Download the pretrained models and place them in $ROOT/data/model/pvnet/.
  2. Test:
    python run.py --type evaluate --cfg_file configs/tless/tless_01.yaml
    # or
    python run.py --type evaluate --cfg_file configs/tless/tless_01.yaml test.vsd True
    

Visualization

Demo

python run.py --type demo --cfg_file configs/linemod.yaml demo_path demo_images/cat

Visualization on Linemod

Take the cat as an example.

  1. Prepare the data related to cat:
    python run.py --type linemod cls_type cat
    
  2. Download the pretrained model of cat and place it at $ROOT/data/model/pvnet/cat/199.pth.
  3. Visualize:
    python run.py --type visualize --cfg_file configs/linemod.yaml model cat cls_type cat
    

If set up correctly, the output will look like this:

[Image: cat]

  4. Visualize with a detector:

    Download the pretrained models here and place them at $ROOT/data/model/pvnet/pvnet_cat/59.pth and $ROOT/data/model/ct/ct_cat/9.pth.

    python run.py --type detector_pvnet --cfg_file configs/ct_linemod.yaml
    

Visualization on Tless

Visualize:

python run.py --type visualize --cfg_file configs/tless/tless_01.yaml
# or
python run.py --type visualize --cfg_file configs/tless/tless_01.yaml test.det_gt True

Training

Training on Linemod

  1. Prepare the data related to cat:
    python run.py --type linemod cls_type cat
    
  2. Train:
    python train_net.py --cfg_file configs/linemod.yaml model mycat cls_type cat
    

The training parameters can be found in project_structure.md.

Training on Tless

Train:

python train_net.py --cfg_file configs/tless/tless_01.yaml

Tensorboard

tensorboard --logdir data/record/pvnet

If set up correctly, the output will look like this:

[Image: tensorboard]

Training on a custom object

An example dataset can be downloaded here.

  1. Create a dataset using https://github.com/F2Wang/ObjectDatasetTools
  2. Organize the dataset in the following structure:
    ├── /path/to/dataset
    │   ├── model.ply
    │   ├── camera.txt
    │   ├── diameter.txt  // the object diameter, in meters
    │   ├── rgb/
    │   │   ├── 0.jpg
    │   │   ├── ...
    │   │   ├── 1234.jpg
    │   │   ├── ...
    │   ├── mask/
    │   │   ├── 0.png
    │   │   ├── ...
    │   │   ├── 1234.png
    │   │   ├── ...
    │   ├── pose/
    │   │   ├── pose0.npy
    │   │   ├── ...
    │   │   ├── pose1234.npy
    │   │   ├── ...
    │   │   └──
    
  3. Create a soft link pointing to the dataset:
    ln -s /path/to/custom_dataset data/custom
    
  4. Process the dataset:
    python run.py --type custom
    
  5. Train:
    python train_net.py --cfg_file configs/custom.yaml train.batch_size 4
    
  6. Watch the training curve:
    tensorboard --logdir data/record/pvnet
    
  7. Visualize:
    python run.py --type visualize --cfg_file configs/custom.yaml
    
  8. Test:
    python run.py --type evaluate --cfg_file configs/custom.yaml
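
Before processing a custom dataset in step 4, it may be worth checking that the directory matches the layout from step 2. The sketch below is one way to do that and is not part of this repository: `check_custom_dataset` is a hypothetical helper, and it assumes, based only on the tree shown in step 2, that frame `i` is stored as `rgb/i.jpg`, `mask/i.png`, and `pose/posei.npy`.

```python
from pathlib import Path


def check_custom_dataset(root):
    """Return a list of problems found in a custom dataset directory.

    Hypothetical helper. An empty list means the layout matches the
    structure described in step 2; file contents are not inspected.
    """
    root = Path(root)
    problems = []

    for name in ("model.ply", "camera.txt", "diameter.txt"):
        if not (root / name).is_file():
            problems.append(f"missing file: {name}")
    for name in ("rgb", "mask", "pose"):
        if not (root / name).is_dir():
            problems.append(f"missing directory: {name}/")

    # Each rgb/<i>.jpg is expected to have mask/<i>.png and pose/pose<i>.npy.
    rgb_dir = root / "rgb"
    images = sorted(rgb_dir.glob("*.jpg")) if rgb_dir.is_dir() else []
    for image in images:
        index = image.stem
        if not (root / "mask" / f"{index}.png").is_file():
            problems.append(f"missing mask for frame {index}")
        if not (root / "pose" / f"pose{index}.npy").is_file():
            problems.append(f"missing pose for frame {index}")
    return problems


if __name__ == "__main__":
    # data/custom is the soft link created in step 3.
    for problem in check_custom_dataset("data/custom"):
        print(problem)
```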
    

Citation

If you find this code useful for your research, please use the following BibTeX entry.

@inproceedings{peng2019pvnet,
  title={PVNet: Pixel-wise Voting Network for 6DoF Pose Estimation},
  author={Peng, Sida and Liu, Yuan and Huang, Qixing and Zhou, Xiaowei and Bao, Hujun},
  booktitle={CVPR},
  year={2019}
}

Acknowledgement

This work is affiliated with ZJU-SenseTime Joint Lab of 3D Vision, and its intellectual property belongs to SenseTime Group Ltd.

Copyright (c) ZJU-SenseTime Joint Lab of 3D Vision. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
