Bottom-Up Human Pose Estimation by Ranking Heatmap-Guided Adaptive Keypoint Estimates

News

[2021/04/07] This repository has been replaced by the DEKR (CVPR 2021) repository: https://github.com/HRNet/DEKR.

Introduction

In this work, we present several schemes that have rarely or not thoroughly been studied for improving keypoint detection and grouping (keypoint regression) performance. First, we exploit the keypoint heatmaps for pixel-wise keypoint regression, instead of keeping the two separate, to improve keypoint regression. Second, we adopt a pixel-wise spatial transformer network to learn adaptive representations that handle scale and orientation variance, which further improves keypoint regression quality. Last, we present a joint shape and heatvalue scoring scheme to promote the estimated poses that are more likely to be true poses. Together with the tradeoff heatmap estimation loss, which balances background and keypoint pixels and thus improves heatmap estimation quality, we obtain state-of-the-art bottom-up human pose estimation results.
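The tradeoff heatmap loss mentioned above can be pictured as a per-pixel weighted squared error in which background pixels count less than keypoint pixels. The following is a minimal sketch, not the repository's implementation: the function name tradeoff_heatmap_loss, the background weight of 0.1 (suggested only by the bg01 tag in the config names), and the rule that a pixel is background when its target value is zero are all assumptions made for illustration.

```python
def tradeoff_heatmap_loss(pred, target, bg_weight=0.1):
    """Mean weighted squared error over a 2-D heatmap given as nested lists.

    Assumption: pixels whose target value is 0 are background and get
    bg_weight; all other pixels get weight 1.0.
    """
    total = 0.0
    count = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            weight = bg_weight if t == 0 else 1.0
            total += weight * (p - t) ** 2
            count += 1
    return total / count


if __name__ == "__main__":
    target = [[0.0, 0.0], [0.0, 1.0]]
    pred = [[0.5, 0.0], [0.0, 0.5]]
    # The background error at (0, 0) is down-weighted; the keypoint error is not.
    print(tradeoff_heatmap_loss(pred, target))
```

With a background weight below 1, the many near-empty background pixels contribute less to the loss than the few pixels around keypoints, which is the balancing effect the introduction describes.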

Main Results

Results on COCO val2017 without multi-scale test

Backbone	Input size	#Params	GFLOPs	AP	AP .5	AP .75	AP (M)	AP (L)	AR	AR .5	AR .75	AR (M)	AR (L)
pose_hrnet_w32	512x512	30.8M	64.9	0.678	0.868	0.740	0.620	0.764	0.723	0.898	0.776	0.656	0.820
pose_hrnet_w48	640x640	67.0M	174.3	0.701	0.881	0.760	0.656	0.772	0.748	0.913	0.798	0.692	0.829
pose_higher_hrnet_w48	640x640	67.0M	181.9	0.713	0.884	0.770	0.675	0.773	0.758	0.916	0.810	0.709	0.831

Results on COCO val2017 with multi-scale test

Backbone	Input size	#Params	GFLOPs	AP	AP .5	AP .75	AP (M)	AP (L)	AR	AR .5	AR .75	AR (M)	AR (L)
pose_hrnet_w32	512x512	30.8M	64.9	0.707	0.880	0.769	0.661	0.777	0.758	0.919	0.812	0.702	0.838
pose_hrnet_w48	640x640	67.0M	174.3	0.725	0.889	0.787	0.689	0.782	0.777	0.929	0.832	0.728	0.847
pose_higher_hrnet_w48	640x640	67.0M	181.9	0.729	0.892	0.788	0.693	0.785	0.782	0.931	0.834	0.732	0.854

Results on COCO test-dev2017 without multi-scale test

Backbone	Input size	#Params	GFLOPs	AP	AP .5	AP .75	AP (M)	AP (L)	AR	AR .5	AR .75	AR (M)	AR (L)
pose_hrnet_w32	512x512	30.8M	64.9	0.666	0.878	0.728	0.611	0.745	0.714	0.908	0.770	0.646	0.808
pose_hrnet_w48	640x640	67.0M	174.3	0.694	0.889	0.762	0.649	0.757	0.743	0.921	0.801	0.685	0.822
pose_higher_hrnet_w48	640x640	67.0M	181.9	0.702	0.895	0.773	0.665	0.756	0.751	0.926	0.811	0.701	0.821

Results on COCO test-dev2017 with multi-scale test

Backbone	Input size	#Params	GFLOPs	AP	AP .5	AP .75	AP (M)	AP (L)	AR	AR .5	AR .75	AR (M)	AR (L)
pose_hrnet_w32	512x512	30.8M	64.9	0.694	0.889	0.762	0.649	0.758	0.749	0.928	0.810	0.691	0.829
pose_hrnet_w48	640x640	67.0M	174.3	0.714	0.898	0.783	0.678	0.768	0.769	0.937	0.830	0.717	0.841
pose_higher_hrnet_w48	640x640	67.0M	181.9	0.718	0.902	0.787	0.683	0.768	0.774	0.941	0.835	0.724	0.843

Results on CrowdPose test without multi-scale test

Method	AP	AP .5	AP .75	AP (E)	AP (M)	AP (H)
pose_hrnet_w32	64.9	84.5	69.6	72.7	65.5	56.1
pose_hrnet_w48	66.1	84.6	71.2	73.4	66.9	57.1
pose_higher_hrnet_w48	66.2	84.9	71.4	73.6	67.0	57.6

Results on CrowdPose test with multi-scale test

Method	AP	AP .5	AP .75	AP (E)	AP (M)	AP (H)
pose_hrnet_w32	67.5	86.1	72.6	75.5	68.2	58.2
pose_hrnet_w48	68.2	85.7	73.4	75.9	69.0	58.9
pose_higher_hrnet_w48	68.2	86.2	73.6	75.8	69.1	59.1

Note:

Flip test is used.
GFLOPs is for convolution and linear layers only.
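Flip test generally means running the network on the image and on its horizontal mirror, then averaging the two predictions. A minimal sketch under stated assumptions: heatmaps are nested lists indexed as [joint][row][column], and flip_pairs (a hypothetical name here) lists the left/right joint indices that trade places when the image is mirrored. It is not taken from this repository's code.

```python
def flip_test_average(heatmaps, flipped_heatmaps, flip_pairs):
    """Average heatmaps from an image and its horizontal mirror.

    heatmaps / flipped_heatmaps: lists indexed as [joint][row][col].
    flip_pairs: (left, right) joint index pairs that trade places when mirrored.
    """
    # Mirror each flipped heatmap back into the original image frame.
    restored = [[row[::-1] for row in joint_map] for joint_map in flipped_heatmaps]
    # In the mirrored image a "left" channel actually holds the "right" joint.
    for left, right in flip_pairs:
        restored[left], restored[right] = restored[right], restored[left]
    return [
        [[(a + b) / 2.0 for a, b in zip(row_a, row_b)]
         for row_a, row_b in zip(map_a, map_b)]
        for map_a, map_b in zip(heatmaps, restored)
    ]
```

The channel swap matters: without it, the mirrored prediction for a left joint would be averaged into the right joint's heatmap.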

Environment

The code was developed using Python 3.6 on Ubuntu 16.04 and requires NVIDIA GPUs. It was developed and tested using 4 NVIDIA V100 GPU cards for HRNet-w32 and 8 NVIDIA V100 GPU cards for HRNet-w48. Other platforms are not fully tested.

Quick start

Installation

Clone this repo; we'll refer to the cloned directory as ${POSE_ROOT}.
Install dependencies:
```
pip install -r requirements.txt
```

Install COCOAPI:

# COCOAPI=/path/to/clone/cocoapi
git clone https://github.com/cocodataset/cocoapi.git $COCOAPI
cd $COCOAPI/PythonAPI
# Install into global site-packages
make install
# Alternatively, if you do not have permissions or prefer
# not to install the COCO API into global site-packages
python3 setup.py install --user

Note that instructions like # COCOAPI=/path/to/clone/cocoapi indicate that you should pick a path where you'd like to have the software cloned and then set an environment variable (COCOAPI in this case) accordingly.
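After installing, it can help to confirm that the package is visible to the Python interpreter you will train with. A small sketch using only the standard library; the helper name missing_modules is hypothetical, and the check assumes COCOAPI installs under the module name pycocotools.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names the import system cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumption: COCOAPI is importable as `pycocotools`.
    missing = missing_modules(["pycocotools"])
    print("missing:", missing if missing else "none")
```

The same helper can be pointed at other dependencies by adding their module names to the list.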

Install CrowdPoseAPI in exactly the same way as COCOAPI.
- There is a bug in the CrowdPoseAPI; please revert this commit: https://github.com/Jeff-sjtu/CrowdPose/commit/785e70d269a554b2ba29daf137354103221f479e

Build dcn model:
```
python setup.py develop
```

Create the output (training model output) and log (TensorBoard log) directories:

mkdir output
mkdir log

Your directory tree should look like this:

${POSE_ROOT}
├── data
├── model
├── experiments
├── lib
├── tools 
├── log
├── output
├── README.md
├── requirements.txt
└── setup.py

Download the pretrained models and our well-trained models from the model zoo (OneDrive) and make the model directory look like this:

${POSE_ROOT}
|-- model
`-- |-- imagenet
    |   |-- hrnet_w32-36af842e.pth
    |   `-- hrnetv2_w48_imagenet_pretrained.pth
    |-- pose_coco
    |   |-- pose_hrnet_w32_reg_delaysep_bg01_stn_512_adam_lr1e-3_coco_x140.pth
    |   |-- pose_hrnet_w48_reg_delaysep_bg01_stn_640_adam_lr1e-3_coco_x140.pth
    |   `-- pose_higher_hrnet_w48_reg_delaysep_bg01_0025_stn_640_adam_lr1e-3_coco_x140.pth
    |-- pose_crowdpose
    |   |-- pose_hrnet_w32_reg_delaysep_bg01_stn_512_adam_lr1e-3_crowdpose_x300.pth
    |   |-- pose_hrnet_w48_reg_delaysep_bg01_stn_640_adam_lr1e-3_crowdpose_x300.pth
    |   `-- pose_higher_hrnet_w48_reg_delaysep_bg01_0025_stn_640_adam_lr1e-3_crowdpose_x300.pth
    `-- rescore
        |-- final_rescore_coco_kpt.pth
        `-- final_rescore_crowd_pose_kpt.pth
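A quick way to catch a misplaced download is to check the expected paths before launching a long job. The sketch below is an illustration only: the helper name missing_files and the choice of which three files to check are assumptions; the paths themselves are copied from the tree above.

```python
from pathlib import Path

# Relative paths copied from the tree above; trim the list to the models you need.
EXPECTED_WEIGHTS = [
    "model/imagenet/hrnet_w32-36af842e.pth",
    "model/pose_coco/pose_hrnet_w32_reg_delaysep_bg01_stn_512_adam_lr1e-3_coco_x140.pth",
    "model/rescore/final_rescore_coco_kpt.pth",
]


def missing_files(root, relative_paths):
    """Return the relative paths that do not exist as files under root."""
    root = Path(root)
    return [rel for rel in relative_paths if not (root / rel).is_file()]


if __name__ == "__main__":
    # Assumption: the script is run from ${POSE_ROOT}.
    for rel in missing_files(".", EXPECTED_WEIGHTS):
        print("missing:", rel)
```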

Data preparation

For COCO data, please download from COCO download; 2017 Train/Val is needed for COCO keypoints training and validation. Download and extract them under ${POSE_ROOT}/data, and make them look like this:

${POSE_ROOT}
|-- data
`-- |-- coco
    `-- |-- annotations
        |   |-- person_keypoints_train2017.json
        |   `-- person_keypoints_val2017.json
        `-- images
            |-- train2017.zip
            `-- val2017.zip

For CrowdPose data, please download from CrowdPose download; Train/Val is needed for CrowdPose keypoints training. Download and extract them under ${POSE_ROOT}/data, and make them look like this:

${POSE_ROOT}
|-- data
`-- |-- crowdpose
    `-- |-- json
        |   |-- crowdpose_train.json
        |   |-- crowdpose_val.json
        |   |-- crowdpose_trainval.json (generated by tools/crowdpose_concat_train_val.py)
        |   `-- crowdpose_test.json
        `-- images.zip

After downloading the data, run python tools/crowdpose_concat_train_val.py under ${POSE_ROOT} to create the trainval set.
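For readers curious what building a trainval file involves, the following sketch merges two annotation files. It is not the repository's tools/crowdpose_concat_train_val.py; it assumes COCO-style JSON with "images" and "annotations" lists and non-colliding ids across the two splits, and the function names are hypothetical.

```python
import json


def concat_coco_style(train, val):
    """Merge two COCO-style annotation dicts into one.

    Assumption: both dicts carry "images" and "annotations" lists and the
    image/annotation ids do not collide between the two splits.
    """
    # Keep shared metadata (for example "categories") from the train split.
    merged = {key: value for key, value in train.items()
              if key not in ("images", "annotations")}
    merged["images"] = train["images"] + val["images"]
    merged["annotations"] = train["annotations"] + val["annotations"]
    return merged


def concat_files(train_path, val_path, out_path):
    """Read two annotation files, merge them, and write the result."""
    with open(train_path) as f:
        train = json.load(f)
    with open(val_path) as f:
        val = json.load(f)
    with open(out_path, "w") as f:
        json.dump(concat_coco_style(train, val), f)
```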

To train the rescoring (learning-to-score) model, first generate training data with your own pose model. The commands below collect that data from the COCO train2017 set and the CrowdPose trainval set, respectively.

# Replace the config and model paths with your own files.
python tools/rescore_data.py \
    --cfg experiments/coco/w32/w32_4x_reg_delaysep_bg01_stn_512_adam_lr1e-3_coco_x140.yaml \
    TEST.MODEL_FILE model/pose_coco/pose_hrnet_w32_reg_delaysep_bg01_stn_512_adam_lr1e-3_coco_x140.pth \
    DATASET.TEST train2017 \
    DATASET.DATASET_TEST cocoscore \
    DATASET.GET_RESCORE_DATA True \
    RESCORE.USE False

# Replace the config and model paths with your own files.
python tools/rescore_data.py \
    --cfg experiments/crowdpose/w32/w32_4x_reg_delaysep_bg01_stn_512_adam_lr1e-3_crowdpose_x300.yaml \
    TEST.MODEL_FILE model/pose_crowdpose/pose_hrnet_w32_reg_delaysep_bg01_stn_512_adam_lr1e-3_crowdpose_x300.pth \
    DATASET.TEST trainval \
    DATASET.DATASET_TEST crowdposescore \
    DATASET.GET_RESCORE_DATA True \
    RESCORE.USE False

Note:

A rescoring model trained on data generated by one pose model also works with other pose models.
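The trailing KEY VALUE pairs in the commands above (for example DATASET.TEST train2017) override entries of the YAML config. As a rough illustration of that convention, the sketch below folds dotted overrides into a nested dictionary. The helper name apply_overrides is hypothetical, values are kept as strings, and the repository's own config handling may differ.

```python
def apply_overrides(config, pairs):
    """Apply ["A.B", "value", ...] overrides to a nested dict, in place."""
    if len(pairs) % 2 != 0:
        raise ValueError("overrides must come in KEY VALUE pairs")
    for key, value in zip(pairs[0::2], pairs[1::2]):
        node = config
        *parents, leaf = key.split(".")
        # Walk (or create) the intermediate sections named by the dotted key.
        for name in parents:
            node = node.setdefault(name, {})
        node[leaf] = value  # kept as a string; type conversion is out of scope
    return config


if __name__ == "__main__":
    cfg = {"DATASET": {"TEST": "val2017"}, "RESCORE": {"USE": "True"}}
    apply_overrides(cfg, ["DATASET.TEST", "train2017", "RESCORE.USE", "False"])
    print(cfg)
```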

Training and Testing

Testing on COCO val2017 dataset without multi-scale test using well-trained pose model

python tools/valid.py \
    --cfg experiments/coco/w32/w32_4x_reg_delaysep_bg01_stn_512_adam_lr1e-3_coco_x140.yaml \
    TEST.MODEL_FILE model/pose_coco/pose_hrnet_w32_reg_delaysep_bg01_stn_512_adam_lr1e-3_coco_x140.pth

Testing on COCO val2017 dataset with multi-scale test using well-trained pose model

python tools/valid.py \
    --cfg experiments/coco/w32/w32_4x_reg_delaysep_bg01_stn_512_adam_lr1e-3_coco_x140.yaml \
    TEST.MODEL_FILE model/pose_coco/pose_hrnet_w32_reg_delaysep_bg01_stn_512_adam_lr1e-3_coco_x140.pth \
    TEST.SCALE_FACTOR 0.5,1,2
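TEST.SCALE_FACTOR 0.5,1,2 asks for inference at several input scales. As a rough sketch of the idea, the per-scale input sizes could be computed as follows. The assumptions are a square base input of 512 pixels (as in the w32 config name), sizes rounded up to a multiple of 64, and the hypothetical helper names used here. How the repository resizes images and merges the per-scale outputs is not described in this README, so that part is left out.

```python
import math


def parse_scale_factors(text):
    """Turn a string such as "0.5,1,2" into a list of floats."""
    return [float(item) for item in text.split(",")]


def scaled_input_sizes(base_size, scale_factors, multiple=64):
    """Return one input size per scale factor, rounded up to a multiple."""
    sizes = []
    for scale in scale_factors:
        raw = base_size * scale
        sizes.append(int(math.ceil(raw / multiple)) * multiple)
    return sizes


if __name__ == "__main__":
    factors = parse_scale_factors("0.5,1,2")
    print(scaled_input_sizes(512, factors))
```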

Testing on CrowdPose test dataset without multi-scale test using well-trained pose model

python tools/valid.py \
    --cfg experiments/crowdpose/w32/w32_4x_reg_delaysep_bg01_stn_512_adam_lr1e-3_crowdpose_x300.yaml \
    TEST.MODEL_FILE model/pose_crowdpose/pose_hrnet_w32_reg_delaysep_bg01_stn_512_adam_lr1e-3_crowdpose_x300.pth

Testing on CrowdPose test dataset with multi-scale test using well-trained pose model

python tools/valid.py \
    --cfg experiments/crowdpose/w32/w32_4x_reg_delaysep_bg01_stn_512_adam_lr1e-3_crowdpose_x300.yaml \
    TEST.MODEL_FILE model/pose_crowdpose/pose_hrnet_w32_reg_delaysep_bg01_stn_512_adam_lr1e-3_crowdpose_x300.pth \
    TEST.SCALE_FACTOR 0.5,1,2

Training on COCO train2017 dataset

python tools/train.py \
    --cfg experiments/coco/w32/w32_4x_reg_delaysep_bg01_stn_512_adam_lr1e-3_coco_x140.yaml

Training on CrowdPose trainval dataset

python tools/train.py \
    --cfg experiments/crowdpose/w32/w32_4x_reg_delaysep_bg01_stn_512_adam_lr1e-3_crowdpose_x300.yaml

Training your rescore model and testing it

python tools/rescore_train.py --cfg experiments/crowdpose/rescore_crowdpose.yaml 
python tools/rescore_train.py --cfg experiments/coco/rescore_coco.yaml

python tools/valid.py \
    --cfg experiments/coco/w32/w32_4x_reg_delaysep_bg01_stn_512_adam_lr1e-3_coco_x140.yaml \
    TEST.MODEL_FILE model/pose_coco/pose_hrnet_w32_reg_delaysep_bg01_stn_512_adam_lr1e-3_coco_x140.pth \
    RESCORE.MODEL_FILE model/rescore/final_rescore_coco_kpt.pth
python tools/valid.py \
    --cfg experiments/crowdpose/w32/w32_4x_reg_delaysep_bg01_stn_512_adam_lr1e-3_crowdpose_x300.yaml \
    TEST.MODEL_FILE model/pose_crowdpose/pose_hrnet_w32_reg_delaysep_bg01_stn_512_adam_lr1e-3_crowdpose_x300.pth \
    RESCORE.MODEL_FILE model/rescore/final_rescore_crowd_pose_kpt.pth
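The rescoring step corresponds to the joint shape and heatvalue scoring scheme from the introduction. Purely as an illustration of what "shape plus heat value" features could look like, the sketch below turns one pose into a flat feature vector that a small scoring model could consume. The skeleton edges, the normalization by the longest bone, and the function name pose_features are assumptions, not the repository's feature definition.

```python
import math


def pose_features(keypoints, heat_values, bones):
    """Build a feature vector from bone lengths and keypoint heat values.

    keypoints: list of (x, y) tuples; heat_values: one score per keypoint;
    bones: (i, j) index pairs joining two keypoints (a hypothetical skeleton).
    """
    lengths = [
        math.hypot(keypoints[i][0] - keypoints[j][0],
                   keypoints[i][1] - keypoints[j][1])
        for i, j in bones
    ]
    # Normalize by the longest bone so the shape part ignores overall scale.
    longest = max(lengths) if lengths and max(lengths) > 0 else 1.0
    shape = [length / longest for length in lengths]
    return shape + list(heat_values)
```

Combining both kinds of features lets a scorer penalize poses that have confident keypoints but an implausible shape, and vice versa.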

Using inference demo

python tools/inference_demo.py --cfg experiments/inference_demo_coco.yaml \
    --videoFile ../multi_people.mp4 \
    --outputDir output \
    --visthre 0.3 \
    TEST.MODEL_FILE model/pose_coco/pose_hrnet_w32_reg_delaysep_bg01_stn_512_adam_lr1e-3_coco_x140.pth
python tools/inference_demo.py --cfg experiments/inference_demo_crowdpose.yaml \
    --videoFile ../multi_people.mp4 \
    --outputDir output \
    --visthre 0.3 \
    TEST.MODEL_FILE model/pose_crowdpose/pose_hrnet_w32_reg_delaysep_bg01_stn_512_adam_lr1e-3_crowdpose_x300.pth

The above commands will create a video under the output directory and many pose images under the output/pose directory.
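The --visthre 0.3 flag reads like a visibility threshold. Assuming it means that only keypoints whose confidence exceeds the threshold are drawn (an interpretation of the flag name, not something stated in this README), the filtering step could look like this sketch. The (x, y, score) tuple layout and the function name are also assumptions.

```python
def visible_keypoints(pose, threshold=0.3):
    """Keep the (x, y, score) keypoints whose score is above the threshold."""
    return [(x, y, score) for x, y, score in pose if score > threshold]


if __name__ == "__main__":
    pose = [(10, 20, 0.9), (30, 40, 0.1), (50, 60, 0.3)]
    # Only keypoints scoring strictly above 0.3 remain.
    print(visible_keypoints(pose))
```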

Acknowledgement

Our code is mainly based on HigherHRNet.

We adopted dcn (deformable convolution network) implemented in MMDetection.

Citation

@article{SunGMXLZW20,
  title={Bottom-Up Human Pose Estimation by Ranking Heatmap-Guided Adaptive Keypoint Estimates},
  author={Ke Sun and Zigang Geng and Depu Meng and Bin Xiao and Dong Liu and Zhaoxiang Zhang and Jingdong Wang},
  journal={arXiv preprint arXiv:},
  year={2020}
}

@inproceedings{SunXLW19,
  title={Deep High-Resolution Representation Learning for Human Pose Estimation},
  author={Ke Sun and Bin Xiao and Dong Liu and Jingdong Wang},
  booktitle={CVPR},
  year={2019}
}

@article{WangSCJDZLMTWLX19,
  title={Deep High-Resolution Representation Learning for Visual Recognition},
  author={Jingdong Wang and Ke Sun and Tianheng Cheng and 
          Borui Jiang and Chaorui Deng and Yang Zhao and Dong Liu and Yadong Mu and 
          Mingkui Tan and Xinggang Wang and Wenyu Liu and Bin Xiao},
  journal={TPAMI},
  year={2019}
}
