
SMPLer-X: Scaling Up Expressive Human Pose and Shape Estimation

[Teaser figure]

Useful links

[Homepage]    [arXiv]    [Video]    [MMHuman3D]

News

  • [2023-10-23] Support visualization through SMPL-X mesh overlay and add inference docker.
  • [2023-10-02] arXiv preprint is online!
  • [2023-09-28] Homepage and Video are online!
  • [2023-07-19] Pretrained models are released.
  • [2023-06-15] Training and testing code is released.

Gallery

[Visualization GIFs]

Install

conda create -n smplerx python=3.8 -y
conda activate smplerx
conda install pytorch==1.12.0 torchvision==0.13.0 torchaudio==0.12.0 cudatoolkit=11.3 -c pytorch -y
pip install mmcv-full==1.7.1 -f https://download.openmmlab.com/mmcv/dist/cu113/torch1.12.0/index.html
pip install -r requirements.txt

# install mmpose
cd main/transformer_utils
pip install -v -e .
cd ../..
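
To confirm that the pinned PyTorch/CUDA/mmcv combination resolved correctly, a quick sanity check (a minimal sketch, not part of the repo):

import torch
import mmcv

# Versions pinned by the install commands above.
print("torch:", torch.__version__)          # expect 1.12.0
print("mmcv:", mmcv.__version__)            # expect 1.7.1
print("CUDA available:", torch.cuda.is_available())
print("CUDA version:", torch.version.cuda)  # expect 11.3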

Docker Support (Early Stage)

docker pull wcwcw/smplerx_inference:v0.2
docker run  --gpus all -v <vid_input_folder>:/smplerx_inference/vid_input \
        -v <vid_output_folder>:/smplerx_inference/vid_output \
        wcwcw/smplerx_inference:v0.2 --vid <video_name>.mp4
# Currently, any customization needs to be applied to /smplerx_inference/smplerx/inference_docker.py
  • We recently released an inference Docker image on Docker Hub.
  • The image uses SMPLer-X-H32 as the inference baseline and was tested on an RTX 3090 under WSL2 (Ubuntu 20.04).
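
If you drive the container from Python, a hypothetical convenience wrapper (not shipped with the image; the folder and video names below are example values) that assembles the docker run command above:

import subprocess

def run_smplerx_docker(vid_input: str, vid_output: str, video_name: str) -> None:
    """Run SMPLer-X inference on <vid_input>/<video_name> inside the container."""
    cmd = [
        "docker", "run", "--gpus", "all",
        "-v", f"{vid_input}:/smplerx_inference/vid_input",
        "-v", f"{vid_output}:/smplerx_inference/vid_output",
        "wcwcw/smplerx_inference:v0.2",
        "--vid", video_name,
    ]
    subprocess.run(cmd, check=True)

run_smplerx_docker("/data/videos_in", "/data/videos_out", "test_video.mp4")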

Pretrained Models

Model         Backbone  #Datasets  #Inst.  #Params  MPE   Download  FPS
SMPLer-X-S32  ViT-S     32         4.5M    32M      82.6  model     36.17
SMPLer-X-B32  ViT-B     32         4.5M    103M     74.3  model     33.09
SMPLer-X-L32  ViT-L     32         4.5M    327M     66.2  model     24.44
SMPLer-X-H32  ViT-H     32         4.5M    662M     63.0  model     17.47
  • MPE (Mean Primary Error): the average of the primary errors on five benchmarks (AGORA, EgoBody, UBody, 3DPW, and EHF)
  • FPS (Frames Per Second): the average inference speed on a single Tesla V100 GPU, batch size = 1
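
To sanity-check a downloaded checkpoint against the #Params column, a hedged sketch (the checkpoint's internal layout, e.g. a "network" entry, is an assumption; inspect yours first):

import torch

# Load the torch-saved checkpoint and count tensor elements in its state dict.
ckpt = torch.load("../pretrained_models/smpler_x_h32.pth.tar", map_location="cpu")
state = ckpt.get("network", ckpt) if isinstance(ckpt, dict) else ckpt
n_params = sum(v.numel() for v in state.values() if hasattr(v, "numel"))
print(f"{n_params / 1e6:.0f}M parameters")  # expect roughly 662M for H32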

Preparation

The file structure should be as follows:

SMPLer-X/
├── common/
│   └── utils/
│       └── human_model_files/  # body model
│           ├── smpl/
│           │   ├── SMPL_NEUTRAL.pkl
│           │   ├── SMPL_MALE.pkl
│           │   └── SMPL_FEMALE.pkl
│           └── smplx/
│               ├── MANO_SMPLX_vertex_ids.pkl
│               ├── SMPL-X__FLAME_vertex_ids.npy
│               ├── SMPLX_NEUTRAL.pkl
│               ├── SMPLX_to_J14.pkl
│               ├── SMPLX_NEUTRAL.npz
│               ├── SMPLX_MALE.npz
│               └── SMPLX_FEMALE.npz
├── data/
├── main/
├── demo/
│   ├── videos/
│   ├── images/
│   └── results/
├── pretrained_models/  # pretrained ViT-Pose, SMPLer_X and mmdet models
│   ├── mmdet/
│   │   ├── faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth
│   │   └── mmdet_faster_rcnn_r50_fpn_coco.py
│   ├── smpler_x_s32.pth.tar
│   ├── smpler_x_b32.pth.tar
│   ├── smpler_x_l32.pth.tar
│   ├── smpler_x_h32.pth.tar
│   ├── vitpose_small.pth
│   ├── vitpose_base.pth
│   ├── vitpose_large.pth
│   └── vitpose_huge.pth
└── dataset/
    ├── AGORA/
    ├── ARCTIC/
    ├── BEDLAM/
    ├── Behave/
    ├── CHI3D/
    ├── CrowdPose/
    ├── EgoBody/
    ├── EHF/
    ├── FIT3D/
    ├── GTA_Human2/
    ├── Human36M/
    ├── HumanSC3D/
    ├── InstaVariety/
    ├── LSPET/
    ├── MPII/
    ├── MPI_INF_3DHP/
    ├── MSCOCO/
    ├── MTP/
    ├── MuCo/
    ├── OCHuman/
    ├── PoseTrack/
    ├── PROX/
    ├── PW3D/
    ├── RenBody/
    ├── RICH/
    ├── SPEC/
    ├── SSP3D/
    ├── SynBody/
    ├── Talkshow/
    ├── UBody/
    ├── UP3D/
    └── preprocessed_datasets/  # HumanData files
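
Before running anything, a quick existence check over the tree above (a minimal sketch; extend the list to the models you actually plan to use):

from pathlib import Path

# A few representative paths from the tree above; run from the SMPLer-X root.
required = [
    "common/utils/human_model_files/smplx/SMPLX_NEUTRAL.npz",
    "pretrained_models/smpler_x_h32.pth.tar",
    "pretrained_models/vitpose_huge.pth",
    "pretrained_models/mmdet/mmdet_faster_rcnn_r50_fpn_coco.py",
]
missing = [p for p in required if not Path(p).exists()]
print("missing:", missing or "none")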

Inference

  • Place the video for inference under SMPLer-X/demo/videos
  • Prepare the pretrained models to be used for inference under SMPLer-X/pretrained_models
  • Prepare the mmdet pretrained model and config under SMPLer-X/pretrained_models
  • Inference output will be saved in SMPLer-X/demo/results
cd main
sh slurm_inference.sh {VIDEO_FILE} {FORMAT} {FPS} {PRETRAINED_CKPT} 

# For running inference on test_video.mp4 (24 FPS) with smpler_x_h32
sh slurm_inference.sh test_video mp4 24 smpler_x_h32
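
To poke at the saved outputs from Python, a hedged sketch (the per-frame file layout and the smplx subfolder name are assumptions; list SMPLer-X/demo/results yourself to confirm):

import glob
import numpy as np

# Print the array names and shapes stored in each per-frame result file.
for path in sorted(glob.glob("../demo/results/test_video/smplx/*.npz")):
    params = np.load(path)
    print(path, {k: params[k].shape for k in params.files})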

2D SMPL-X Overlay

We provide a lightweight visualization script for mesh overlay based on pyrender.

  • Use ffmpeg to split the video into images
  • The visualization script takes the inference results (see above) as input.
ffmpeg -i {VIDEO_FILE} -f image2 -vf fps=30 \
        {SMPLERX INFERENCE DIR}/{VIDEO NAME (no extension)}/orig_img/%06d.jpg \
        -hide_banner  -loglevel error

cd main && python render.py \
            --data_path {SMPLERX INFERENCE DIR} --seq {VIDEO NAME} \
            --image_path {SMPLERX INFERENCE DIR}/{VIDEO NAME} \
            --render_biggest_person False
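
For reference, the core of a pyrender-based overlay looks roughly like the sketch below. This is not the repo's render.py; the inputs (SMPL-X vertices in camera coordinates, faces, and the focal/principal point used at inference) are assumptions:

import numpy as np
import pyrender  # headless use may require PYOPENGL_PLATFORM=egl or osmesa
import trimesh

def overlay(img, vertices, faces, focal, center):
    """Render a mesh over an RGB uint8 frame using the given intrinsics."""
    h, w = img.shape[:2]
    mesh = trimesh.Trimesh(vertices, faces, process=False)
    # Flip from the computer-vision convention (y down, z forward) to
    # pyrender's OpenGL convention (y up, camera looking down -z).
    mesh.apply_transform(trimesh.transformations.rotation_matrix(np.pi, [1, 0, 0]))
    scene = pyrender.Scene(ambient_light=(0.3, 0.3, 0.3))
    scene.add(pyrender.Mesh.from_trimesh(mesh))
    camera = pyrender.IntrinsicsCamera(fx=focal[0], fy=focal[1], cx=center[0], cy=center[1])
    scene.add(camera, pose=np.eye(4))
    scene.add(pyrender.DirectionalLight(intensity=3.0), pose=np.eye(4))
    renderer = pyrender.OffscreenRenderer(w, h)
    color, depth = renderer.render(scene, flags=pyrender.RenderFlags.RGBA)
    renderer.delete()
    out = img.copy()
    mask = depth > 0                   # pixels covered by the mesh
    out[mask] = color[..., :3][mask]   # paste the rendered pixels over the frame
    return out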

Training

cd main
sh slurm_train.sh {JOB_NAME} {NUM_GPU} {CONFIG_FILE}

# For training SMPLer-X-H32 with 16 GPUs
sh slurm_train.sh smpler_x_h32 16 config_smpler_x_h32.py
  • CONFIG_FILE is the file name under SMPLer-X/main/config
  • Logs and checkpoints will be saved to SMPLer-X/output/train_{JOB_NAME}_{DATE_TIME}

Testing

# To evaluate the model ../output/{TRAIN_OUTPUT_DIR}/model_dump/snapshot_{CKPT_ID}.pth.tar
# with config ../output/{TRAIN_OUTPUT_DIR}/code/config_base.py
cd main
sh slurm_test.sh {JOB_NAME} {NUM_GPU} {TRAIN_OUTPUT_DIR} {CKPT_ID}
  • NUM_GPU = 1 is recommended for testing
  • Logs and results will be saved to SMPLer-X/output/test_{JOB_NAME}_ep{CKPT_ID}_{TEST_DATASET}

FAQ

  • RuntimeError: Subtraction, the '-' operator, with a bool tensor is not supported. If you are trying to invert a mask, use the '~' or 'logical_not()' operator instead.

    Follow this post and modify torchgeometry as sketched below.
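
    A minimal repro of the failure and the commonly cited fix; the mask names mirror those in torchgeometry's rotation_matrix_to_quaternion (an assumption about your installed copy; check the traceback for the exact file):

    import torch

    mask_d2 = torch.tensor([True, False, True])
    mask_d0_d1 = torch.tensor([False, True, True])

    # mask_c1 = mask_d2 * (1 - mask_d0_d1)   # raises the RuntimeError above
    mask_c1 = mask_d2 * (~mask_d0_d1)        # use logical not instead
    print(mask_c1)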

  • KeyError: 'SinePositionalEncoding is already registered in position encoding' or any other similar KeyErrors due to duplicate module registration.

    Manually add force=True to the respective module registrations under main/transformer_utils/mmpose/models/utils, e.g. @POSITIONAL_ENCODING.register_module(force=True) in this file.

  • How do I animate my virtual characters with SMPLer-X output (like that in the demo video)?

    • We are working on that, please stay tuned! Currently, this repo supports SMPL-X estimation and a simple visualization (overlay of SMPL-X vertices).

References