

Network estimating 3D Handpose from single color images

ColorHandPose3D network


ColorHandPose3D is a Convolutional Neural Network estimating 3D Hand Pose from a single RGB Image. See the project page for the dataset used and additional information.

Usage: Forward pass

The network ships with a minimal example that performs a forward pass and shows the predictions.

  • Download the data and unzip it into the project's root folder (this will create 3 folders: "data", "results" and "weights")
  • Run run.py to perform a forward pass of the network on the provided examples

You can compare your results to the content of the folder "results", which shows the predictions we get on our system.

Recommended system

Recommended system (tested):

  • Ubuntu 16.04.2 (xenial)
  • TensorFlow 1.3.0 GPU build with CUDA 8.0.44 and cuDNN 5.1
  • Python 3.5.2

Python packages used by the example provided and their recommended version:

  • tensorflow==1.3.0
  • numpy==1.13.0
  • scipy==0.18.1
  • matplotlib==1.5.3
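
To check an environment against the versions listed above, a small comparison helper can be used. This is a minimal sketch, not part of the repository: the names RECOMMENDED, installed_versions and find_mismatches are hypothetical, and it assumes each package exposes a `__version__` attribute.

```python
import importlib

# Recommended versions copied from the list above.
RECOMMENDED = {
    "tensorflow": "1.3.0",
    "numpy": "1.13.0",
    "scipy": "0.18.1",
    "matplotlib": "1.5.3",
}


def installed_versions(names):
    """Return {package: version} for every package in `names` that can be imported."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            continue  # package is not installed; it is simply left out
        # Assumption: the package exposes its version as `__version__`.
        versions[name] = getattr(module, "__version__", None)
    return versions


def find_mismatches(installed, recommended=RECOMMENDED):
    """Return {package: (installed_or_None, recommended)} for every differing package."""
    return {
        name: (installed.get(name), wanted)
        for name, wanted in recommended.items()
        if installed.get(name) != wanted
    }
```

Calling `find_mismatches(installed_versions(RECOMMENDED))` returns an empty dictionary when the environment matches the list exactly. A mismatch does not necessarily mean the example will fail; it only shows where the environment differs from the tested one.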

Preprocessing for training and evaluation

To use the training and evaluation scripts, you need to download and preprocess the datasets.

Rendered Hand Pose Dataset (RHD)

  • Download the dataset accompanying this publication: RHD dataset v. 1.1

  • Set the variable 'path_to_db' to where the dataset is located on your machine

  • Optionally set the 'set' variable to training or evaluation

  • Run

      python3.5 create_binary_db.py
    
  • This will create a binary file in ./data/bin according to how 'set' was configured

Stereo Tracking Benchmark Dataset (STB)

  • For eval3d_full.py it is necessary to get the dataset presented in Zhang et al., ‘3d Hand Pose Tracking and Estimation Using Stereo Matching’, 2016

  • After unzipping the dataset run

      cd ./data/stb/
      matlab -nodesktop -nosplash -r "create_db"
    
  • This will create the binary file ./data/stb/stb_evaluation.bin

Network training

We provide scripts to train HandSegNet and PoseNet on the Rendered Hand Pose Dataset (RHD). If you want to retrain the networks on new data, you can adapt the provided code to your needs.

The following steps guide you through training HandSegNet and PoseNet on the Rendered Hand Pose Dataset (RHD).

  • Make sure you followed the steps in the section 'Preprocessing'
  • Start training of HandSegNet with training_handsegnet.py
  • Start training of PoseNet with training_posenet.py
  • Set USE_RETRAINED = True on line 32 in eval2d_gt_cropped.py
  • Run eval2d_gt_cropped.py to evaluate the retrained PoseNet on RHD-e
  • Set USE_RETRAINED = True on line 31 in eval2d.py
  • Run eval2d.py to evaluate the retrained HandSegNet + PoseNet on RHD-e

You should be able to obtain results that roughly match the following numbers, which we obtained with TensorFlow v1.3:

eval2d_gt_cropped.py yields:

Evaluation results:
Average mean EPE: 7.630 pixels
Average median EPE: 3.939 pixels
Area under curve: 0.771

eval2d.py yields:

Evaluation results:
Average mean EPE: 15.469 pixels
Average median EPE: 4.374 pixels
Area under curve: 0.715

Because training is not a deterministic process, results will differ between runs. Note that these results are not listed in the paper.
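
The three reported measures can be illustrated with a short sketch. It assumes that EPE (end-point error) is the Euclidean distance in pixels between a predicted and a ground-truth keypoint, and that "area under curve" is the normalized area under the curve of the fraction of keypoints whose error is within a threshold. The function names and the `max_threshold` and `num_steps` parameters are hypothetical and are not taken from the evaluation scripts, so this sketch is not expected to reproduce the numbers above.

```python
import numpy as np


def endpoint_errors(pred, gt):
    """Euclidean distance per keypoint.

    pred, gt: arrays of shape (num_samples, num_keypoints, dims).
    Returns an array of shape (num_samples, num_keypoints).
    """
    diff = np.asarray(pred, dtype=np.float64) - np.asarray(gt, dtype=np.float64)
    return np.linalg.norm(diff, axis=-1)


def summarize(pred, gt, max_threshold=30.0, num_steps=20):
    """Return mean EPE, median EPE and a normalized area under the threshold curve."""
    errors = endpoint_errors(pred, gt)

    # Mean and median are taken per keypoint first, then averaged over keypoints.
    mean_epe = errors.mean(axis=0).mean()
    median_epe = np.median(errors, axis=0).mean()

    # Fraction of all keypoints whose error is within each threshold.
    thresholds = np.linspace(0.0, max_threshold, num_steps)
    within = np.array([(errors <= t).mean() for t in thresholds])

    # Trapezoidal integration, normalized so a perfect result gives 1.0.
    area = np.sum((within[1:] + within[:-1]) / 2.0 * np.diff(thresholds))
    auc = area / (thresholds[-1] - thresholds[0])

    return {"mean_epe": float(mean_epe), "median_epe": float(median_epe), "auc": float(auc)}
```

The median is much less sensitive to a few grossly wrong predictions than the mean, which is one plausible reading of why the mean EPE of eval2d.py is far higher than that of eval2d_gt_cropped.py while the medians stay close.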

Evaluation

There are four scripts that evaluate different parts of the architecture:

  1. eval2d_gt_cropped.py: Evaluates PoseNet on 2D keypoint localization using ground truth annotations to create hand-cropped images (section 6.1, Table 1 of the paper)
  2. eval2d.py: Evaluates HandSegNet and PoseNet on 2D keypoint localization (section 6.1, Table 1 of the paper)
  3. eval3d.py: Evaluates different approaches on lifting 2D predictions into 3D (section 6.2.1, Table 2 of the paper)
  4. eval3d_full.py: Evaluates our full pipeline on 3D keypoint localization from RGB (section 6.2.1, Table 2 of the paper)

These scripts make it possible to reproduce the results from the paper that are based on the RHD dataset.

License and Citation

This project is licensed under the terms of the GPL v2 license. By using the software, you are agreeing to the terms of the license agreement.

Please cite us in your publications if it helps your research:

@InProceedings{zb2017hand,
  author    = {Christian Zimmermann and Thomas Brox},
  title     = {Learning to Estimate 3D Hand Pose from Single RGB Images},
  booktitle    = "IEEE International Conference on Computer Vision (ICCV)",
  year      = {2017},
  note         = "https://arxiv.org/abs/1705.01389",
  url          = "https://lmb.informatik.uni-freiburg.de/projects/hand3d/"
}

Known issues

  • There is an issue with the results in section 6.1, Table 1, which report the performance of 2D keypoint localization on full-scale images (eval2d.py). PoseNet was trained to predict the "palm center", but the evaluation script compares against the "wrist". This results in a systematic error, so the reported results are significantly worse than under a correct evaluation setting. Using the correct setting during evaluation improves results by approximately 2-10% (depending on the measure).
  • The numbers reported for the "Bottleneck" approach in Table 2 of the paper are not correct. The actual results are approximately 8% worse.
  • There is a minor issue with the first version of RHD. A rounding/casting problem caused image values to be off by one every now and then compared to the version used in the paper. The difference is small and not visually noticeable, but it prevents reaching the reported numbers exactly.
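
The last item can be illustrated with a generic example of how casting and rounding diverge. The exact cause in the RHD pipeline is not described here, so this is only an assumption about one common way such off-by-one differences arise; the function name cast_difference is hypothetical.

```python
import numpy as np


def cast_difference(values):
    """Per-value difference between rounding and plain casting to uint8.

    Casting a float array to an integer type discards the fractional part,
    while rounding first moves to the nearest integer.
    """
    values = np.asarray(values, dtype=np.float64)
    truncated = values.astype(np.uint8)          # 127.6 becomes 127
    rounded = np.rint(values).astype(np.uint8)   # 127.6 becomes 128
    # Use a signed type so the subtraction cannot wrap around.
    return rounded.astype(np.int16) - truncated.astype(np.int16)
```

For non-negative values in the 8-bit range the two conversions never differ by more than one intensity level, which is consistent with a difference that is invisible to the eye but still changes exact numerical results.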
