
Official PyTorch implementation of "Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image", ICCV 2019

RootNet of "Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image"

Introduction

This repo is the official PyTorch implementation of Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image (ICCV 2019). It contains the RootNet part.

What this repo provides:

Dependencies

This code was tested on Ubuntu 16.04 with CUDA 9.0 and cuDNN 7.1, using two NVIDIA 1080Ti GPUs.

Python 3.6.5 with Anaconda 3 was used for development.

Quick demo

You can try a quick demo in the demo folder.

  • Download the pre-trained RootNet from here.
  • Place input.jpg and the pre-trained snapshot in the demo folder.
  • Set bbox_list here.
  • Run python demo.py --gpu 0 --test_epoch 18 to run on GPU 0.
  • The script writes output_root_2d.jpg and prints the root joint depths.
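As a rough illustration of preparing bbox_list, the sketch below converts corner-style detector boxes into a list of boxes. It assumes each entry is [xmin, ymin, width, height] in pixels; that layout and the helper name corners_to_xywh are assumptions, so check the comment next to bbox_list in demo.py before relying on it.

```python
def corners_to_xywh(boxes):
    """Convert (xmin, ymin, xmax, ymax) boxes to [xmin, ymin, width, height].

    The xywh layout is an assumption about what demo.py expects.
    """
    converted = []
    for xmin, ymin, xmax, ymax in boxes:
        if xmax <= xmin or ymax <= ymin:
            raise ValueError("box has non-positive width or height")
        converted.append([float(xmin), float(ymin),
                          float(xmax - xmin), float(ymax - ymin)])
    return converted


# Hypothetical detector output for two people in input.jpg.
detections = [(10, 20, 110, 220), (300, 50, 380, 260)]
bbox_list = corners_to_xywh(detections)
print(bbox_list)
```

One box per person is needed, since RootNet estimates one root joint per cropped person.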

Directory

Root

The ${POSE_ROOT} directory is organized as follows.

${POSE_ROOT}
|-- data
|-- demo
|-- common
|-- main
|-- output
  • data contains data loading code and soft links to the image and annotation directories.
  • demo contains demo code.
  • common contains core code for the 3D multi-person pose estimation system.
  • main contains high-level code for training or testing the network.
  • output contains logs, trained models, visualized outputs, and test results.

Data

The data directory must follow the structure below.

${POSE_ROOT}
|-- data
|   |-- Human36M
|   |   |-- bbox
|   |   |   |-- bbox_human36m_output.json
|   |   |-- images
|   |   |-- annotations
|   |-- MPII
|   |   |-- images
|   |   |-- annotations
|   |-- MSCOCO
|   |   |-- images
|   |   |   |-- train2017
|   |   |   |-- val2017
|   |   |-- annotations
|   |-- MuCo
|   |   |-- data
|   |   |   |-- augmented_set
|   |   |   |-- unaugmented_set
|   |   |   |-- MuCo-3DHP.json
|   |-- MuPoTS
|   |   |-- bbox
|   |   |   |-- bbox_mupots_output.json
|   |   |-- data
|   |   |   |-- MultiPersonTestSet
|   |   |   |-- MuPoTS-3D.json
|   |-- PW3D
|   |   |-- data
|   |   |   |-- 3DPW_train.json
|   |   |   |-- 3DPW_validation.json
|   |   |   |-- 3DPW_test.json
|   |   |-- imageFiles
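Before training, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the repo: the function name missing_paths is hypothetical, and the path list is simply copied from the tree above.

```python
from pathlib import Path

# Paths relative to ${POSE_ROOT}, copied from the directory tree above.
EXPECTED = [
    "data/Human36M/bbox/bbox_human36m_output.json",
    "data/Human36M/images",
    "data/Human36M/annotations",
    "data/MPII/images",
    "data/MPII/annotations",
    "data/MSCOCO/images/train2017",
    "data/MSCOCO/images/val2017",
    "data/MSCOCO/annotations",
    "data/MuCo/data/augmented_set",
    "data/MuCo/data/unaugmented_set",
    "data/MuCo/data/MuCo-3DHP.json",
    "data/MuPoTS/bbox/bbox_mupots_output.json",
    "data/MuPoTS/data/MultiPersonTestSet",
    "data/MuPoTS/data/MuPoTS-3D.json",
    "data/PW3D/data/3DPW_train.json",
    "data/PW3D/data/3DPW_validation.json",
    "data/PW3D/data/3DPW_test.json",
    "data/PW3D/imageFiles",
]


def missing_paths(pose_root, expected=EXPECTED):
    """Return the expected entries that do not exist under pose_root."""
    root = Path(pose_root)
    # Path.exists() follows soft links, so linked dataset folders count as present.
    return [rel for rel in expected if not (root / rel).exists()]
```

Calling missing_paths(".") from ${POSE_ROOT} returns an empty list when everything is in place. If you only train on a subset of the datasets, pass a shorter list as expected.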

To download multiple files from Google Drive without compressing them, try this. If you hit a 'Download limit' error when downloading a dataset from a Google Drive link, try this trick:

  • Go to the shared folder that contains the files you want to copy.
  • Select all the files you want to copy.
  • In the upper right corner, click the three vertical dots and select "make a copy".
  • The files are then copied to your personal Google Drive account, and you can download them from there.

Output

The output folder must follow the structure below.

${POSE_ROOT}
|-- output
|   |-- log
|   |-- model_dump
|   |-- result
|   |-- vis
  • Creating the output folder as a soft link rather than a regular folder is recommended, because its contents can take up a lot of storage.
  • log contains the training log file.
  • model_dump contains the checkpoints saved at each epoch.
  • result contains the final estimation files generated in the testing stage.
  • vis contains visualized results.
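One way to set up the soft-linked output folder is sketched below. The helper name link_output and the idea of a separate storage directory are assumptions for illustration; the repo itself does not prescribe how the link is created.

```python
import os
from pathlib import Path

SUBDIRS = ("log", "model_dump", "result", "vis")


def link_output(pose_root, storage_dir):
    """Create the four sub-folders in storage_dir and link ${POSE_ROOT}/output to it."""
    storage = Path(storage_dir)
    for name in SUBDIRS:
        (storage / name).mkdir(parents=True, exist_ok=True)

    link = Path(pose_root) / "output"
    # Leave an existing folder or link untouched rather than overwriting it.
    if not link.exists() and not link.is_symlink():
        os.symlink(storage.resolve(), link, target_is_directory=True)
    return link
```

For example, link_output(".", "/path/to/large/disk/rootnet_output") run from ${POSE_ROOT} keeps checkpoints on the larger disk while the code still writes to output/. The storage path here is a placeholder.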

Running 3DMPPE_ROOTNET

Start

  • In main/config.py, you can change model settings such as the dataset to use, the network backbone, and the input size.
  • YOU MUST SET bbox_real according to the unit of each dataset. For example, Human3.6M uses millimeters, so bbox_real = (2000, 2000); 3DPW uses meters, so bbox_real = (2, 2).
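To see why the unit of bbox_real matters, consider the distance measure described in the paper, which as I read it has the form k = sqrt(fx * fy * A_real / A_img), where A_real is the assumed real-world area of a person box and A_img is the box area in pixels. The sketch below is an illustration under that assumption; the function name initial_depth is hypothetical and this is not the repo's code.

```python
import math


def initial_depth(focal, bbox_real, bbox_img_wh):
    """Distance measure k = sqrt(fx * fy * A_real / A_img).

    focal: (fx, fy) in pixels.
    bbox_real: assumed real-world box size, in the dataset's length unit.
    bbox_img_wh: (width, height) of the detected box in pixels.
    The result is expressed in the same unit as bbox_real.
    """
    fx, fy = focal
    area_real = bbox_real[0] * bbox_real[1]
    area_img = bbox_img_wh[0] * bbox_img_wh[1]
    return math.sqrt(fx * fy * area_real / area_img)


# Same 300x300 pixel box, two unit conventions.
k_mm = initial_depth((1500, 1500), (2000, 2000), (300, 300))  # millimeters
k_m = initial_depth((1500, 1500), (2, 2), (300, 300))         # meters
print(k_mm, k_m)
```

The two values describe the same distance (10000 mm and 10 m). Using the wrong bbox_real for a dataset would put the estimate off by a factor of 1000 relative to its annotations.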

Train

In the main folder, run

python train.py --gpu 0-1

to train the network on GPUs 0 and 1.

To continue an experiment, run

python train.py --gpu 0-1 --continue

--gpu 0,1 can be used instead of --gpu 0-1.
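The equivalence of the two --gpu forms can be pictured with a small parser. This is a sketch of how such a flag might be normalized, assuming a range like 0-1 is inclusive; the function name normalize_gpu_ids is hypothetical and the repo's own parsing may differ.

```python
def normalize_gpu_ids(spec):
    """Expand an inclusive range such as '0-1' to '0,1'; pass other forms through."""
    if "-" in spec:
        first, last = (int(part) for part in spec.split("-"))
        return ",".join(str(i) for i in range(first, last + 1))
    return spec


print(normalize_gpu_ids("0-1"))  # range form
print(normalize_gpu_ids("0,1"))  # comma form is already normalized
```

A comma-separated string like this is the format the CUDA_VISIBLE_DEVICES environment variable accepts, which is a common way to restrict a PyTorch process to specific GPUs.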

Test

Place the trained model in output/model_dump/.

In the main folder, run

python test.py --gpu 0-1 --test_epoch 20

to test the network on GPUs 0 and 1 with the model trained for 20 epochs. --gpu 0,1 can be used instead of --gpu 0-1.

Results

  • Pre-trained model of RootNet here.
  • Bounding boxes (from DetectNet, not extended) of the Human3.6M and MuPoTS-3D datasets here. You can use these to test RootNet.
  • Bounding boxes (from DetectNet, extended) and root joint coordinates (from RootNet) of the Human3.6M, MSCOCO, and MuPoTS-3D datasets here. Do not use the bounding boxes in this file to test RootNet, because the boxes are extended. Use the file listed directly above instead (bounding boxes from DetectNet without bbox extension).
  • Bounding boxes (GT) and root joint coordinates (from RootNet) of the 3DPW dataset (test set only) here. The result is obtained from RootNet trained on MuCo+MSCOCO (without the 3DPW training set).

For evaluation, you can run test.py or use the evaluation code provided for Human36M and MuPoTS.

Human3.6M dataset using protocol 2 (millimeter)

Method   MRPE   MRPE_x  MRPE_y  MRPE_z
RootNet  120.0  23.3    23.0    108.1

MuPoTS-3D dataset (percentage)

Method AP_25
RootNet 31.0

3DPW dataset (test set, meter)

Method   MRPE   MRPE_x  MRPE_y  MRPE_z
RootNet  0.386  0.045   0.094   0.353
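For readers unfamiliar with the metric, MRPE (mean root position error) is, as I understand it, the average Euclidean distance between predicted and ground-truth root positions, and MRPE_x, MRPE_y, MRPE_z are the mean absolute errors along each axis. The sketch below illustrates that definition only; it is not the repo's evaluation code, and the function name mrpe is hypothetical.

```python
import math


def mrpe(pred, gt):
    """Return (MRPE, (MRPE_x, MRPE_y, MRPE_z)) for lists of (x, y, z) roots.

    Units follow the inputs (millimeters for Human3.6M, meters for 3DPW).
    """
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and the same length")
    n = len(pred)
    total = 0.0
    per_axis = [0.0, 0.0, 0.0]
    for p, g in zip(pred, gt):
        diff = [pi - gi for pi, gi in zip(p, g)]
        total += math.sqrt(sum(d * d for d in diff))  # Euclidean error
        for axis in range(3):
            per_axis[axis] += abs(diff[axis])          # per-axis absolute error
    return total / n, tuple(v / n for v in per_axis)


# Toy data: one perfect prediction and one that is off by (3, 4, 0).
error, (ex, ey, ez) = mrpe([(0, 0, 0), (3, 4, 0)], [(0, 0, 0), (0, 0, 0)])
print(error, ex, ey, ez)
```

In the tables above, MRPE_z dominates the total error, which reflects that depth is the hardest component to recover from a single RGB image.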

MSCOCO dataset

We additionally provide estimated 3D human root coordinates on the MSCOCO dataset. The coordinates are in the 3D camera coordinate system, and the focal lengths are set to 1500mm for both the x and y axes. You can change the focal length and the corresponding distance using equation 2 of the paper or the equation in its supplementary material.
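As a rough guide to changing the focal length, the sketch below rescales a provided depth under the assumption that the distance is proportional to sqrt(fx * fy), as in the paper's distance measure, and that everything else stays fixed. The function name rescale_depth is hypothetical; consult the paper's equation before using this for anything quantitative.

```python
import math


def rescale_depth(depth, old_focal, new_focal):
    """Rescale a root depth computed with old_focal to match new_focal.

    Assumes depth is proportional to sqrt(fx * fy); focal lengths are (fx, fy).
    """
    scale = math.sqrt((new_focal[0] * new_focal[1]) /
                      (old_focal[0] * old_focal[1]))
    return depth * scale


# A root depth provided under the default focal length of 1500 on both axes,
# converted to a camera whose focal length is half as large.
new_depth = rescale_depth(3000.0, (1500, 1500), (750, 750))
print(new_depth)
```

Halving both focal lengths halves the depth here: with a shorter focal length, a person must be closer to the camera to occupy the same area in the image.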

Reference

@InProceedings{Moon_2019_ICCV_3DMPPE,
author = {Moon, Gyeongsik and Chang, Juyong and Lee, Kyoung Mu},
title = {Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
year = {2019}
}
