EfficientPose
Publicly accessible scalable single-person pose estimation as introduced in "EfficientPose: Scalable single-person pose estimation". We provide a simple, intuitive interface for high-precision movement extraction from 2D images, videos, or directly from your webcam.
NOTE: All data remains safely on your computer during use.
Live demo
1. Plug
Assuming you have Python (>= 3.7) and FFMPEG (>= 4.4) preinstalled, simply run:
pip install -r requirements.txt
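If you are unsure whether your environment qualifies, a quick check along these lines can confirm both prerequisites (a minimal sketch; not part of the package):

```python
import shutil
import subprocess
import sys

# Verify the Python interpreter meets the minimum version.
assert sys.version_info >= (3, 7), "Python >= 3.7 is required"

# Verify FFMPEG is available on PATH and print its version banner.
assert shutil.which("ffmpeg") is not None, "FFMPEG >= 4.4 is required"
print(subprocess.check_output(["ffmpeg", "-version"]).decode().splitlines()[0])
```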
2. Play
Say the magical two words:
python track.py
Explore
Did I forget to mention flexibility? Indeed, there is plenty!
You are provided with these options (which can be combined seamlessly):
- Path (--path, -p): Tell the program which file (i.e., video or image) you want to analyze. Omit this option for camera-based tracking. For example:
python track.py --path=utils/MPII.jpg
- Model (--model, -m): Choose among the models (EfficientPose RT through IV, or EfficientPose RT Lite through II Lite) depending on your computational resources and precision requirements. For details, see the performance comparison below. For example:
python track.py --model=II_Lite
- Framework (--framework, -f): Have a specific preference of deep learning framework? We provide models in Keras, TensorFlow, TFLite, and PyTorch. In general, TensorFlow is recommended for maximal precision with low computational overhead on GPU, while TFLite (and PyTorch in the case of ARM CPUs with QNNPACK) supports resource-constrained applications such as smartphones (see the inference sketch after this list). For example:
python track.py --framework=tensorflow
- Visualize predictions (--visualize, -v): Overlays the keypoint predictions on the image/video you provided and stores the annotated file in the folder of the original file. For example:
python track.py --path=utils/MPII.jpg --visualize
- Save predictions (--store, -s): Stores the predicted coordinates of the 16 keypoints (top of head, upper neck, shoulders, elbows, wrists, thorax, pelvis, hips, knees, and ankles) from the image/video/camera as a CSV file (see the parsing sketch after this list). Run:
python track.py --store
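As an illustration of how a TFLite variant could be run outside of track.py, here is a minimal sketch using TensorFlow's tf.lite.Interpreter. The model filename and the dummy input are assumptions for illustration; they are not prescribed by this repository, and track.py handles the real preprocessing for you.

```python
import numpy as np
import tensorflow as tf

# Hypothetical filename; point this at the TFLite model you obtained.
interpreter = tf.lite.Interpreter(model_path="EfficientPoseRT.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Feed a dummy frame shaped like the model input (e.g., 1x224x224x3 for RT).
frame = np.random.rand(*input_details[0]["shape"]).astype(np.float32)
interpreter.set_tensor(input_details[0]["index"], frame)
interpreter.invoke()

# The raw output is a stack of confidence maps, one per keypoint.
heatmaps = interpreter.get_tensor(output_details[0]["index"])
print(heatmaps.shape)
```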
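Once predictions have been stored with --store, the CSV can be consumed with a few lines of Python. This is a minimal sketch: the output filename and the *_x/*_y column naming are assumptions for illustration, so inspect the header of your generated CSV to confirm the exact layout.

```python
import csv

# Filename and column naming are assumptions; inspect the CSV header
# produced by --store to confirm the exact layout.
with open("MPII_coordinates.csv", newline="") as f:
    for row in csv.DictReader(f):
        # Pair up *_x and *_y columns into (keypoint, x, y) triples.
        for column in row:
            if column.endswith("_x"):
                keypoint = column[:-2]
                print(keypoint, float(row[column]), float(row[keypoint + "_y"]))
```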
Evidence
Model | Resolution | Parameters | FLOPs | PCKh@50 (MPII val) | PCKh@10 (MPII val) | PCKh@50 (MPII test) | PCKh@10 (MPII test)
---|---|---|---|---|---|---|---
EfficientPose RT Lite* | 224x224 | 0.40M | 0.86G | 80.6 | 23.1 | - | - |
EfficientPose RT | 224x224 | 0.46M | 0.87G | 82.9 | 23.6 | 84.8 | 24.2 |
EfficientPose I Lite* | 256x256 | 0.59M | 1.54G | 83.7 | 27.7 | - | - |
EfficientPose I | 256x256 | 0.72M | 1.67G | 85.2 | 26.5 | - | - |
EfficientPose II Lite* | 368x368 | 1.46M | 7.25G | 87.1 | 30.8 | - | - |
EfficientPose II | 368x368 | 1.73M | 7.70G | 88.2 | 30.2 | - | - |
EfficientPose III | 480x480 | 3.23M | 23.35G | 89.5 | 30.9 | - | - |
EfficientPose IV | 600x600 | 6.56M | 72.89G | 89.8 | 35.6 | 91.2 | 36.0 |
OpenPose (Cao et al.) | 368x368 | 25.94M | 160.36G | 87.6 | 22.8 | 88.8 | 22.5 |
*EfficientPose Lite models modify the original EfficientPose architecture to target deployment on edge devices through low-latency inference. In particular, the EfficientNet backbones are replaced by EfficientNet Lite models of similar scaling, parallel feature extraction and cross-resolution features are omitted, squeeze-and-excitation modules are removed, and E-swish activations are replaced by ReLU6.
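For reference, the two activations mentioned above are simple to state: E-swish (Alcaide, 2018) is beta * x * sigmoid(x), and ReLU6 clamps its input to [0, 6]. A minimal NumPy sketch (the beta value shown is illustrative, not a claim about the trained models):

```python
import numpy as np

def e_swish(x, beta=1.25):
    # E-swish (Alcaide, 2018): beta * x * sigmoid(x).
    # The beta shown here is illustrative, not a claim about the trained models.
    return beta * x / (1.0 + np.exp(-x))

def relu6(x):
    # ReLU6, as used in the Lite variants: clamp activations to [0, 6].
    return np.minimum(np.maximum(x, 0.0), 6.0)

x = np.linspace(-4.0, 4.0, 9)
print(e_swish(x))
print(relu6(x))
```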
All models were trained with a similar optimization procedure, and precision was evaluated on the single-person MPII benchmark in terms of PCKh@50 and PCKh@10. Due to the restricted number of submission attempts on the MPII test set, only EfficientPose RT and IV, and the baseline method OpenPose, were officially evaluated on it.
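In PCKh, a predicted keypoint counts as correct if it falls within a given fraction of the head segment length from the ground truth (50% for PCKh@50, 10% for PCKh@10). A minimal sketch of the computation, assuming coordinates and head sizes are provided as NumPy arrays:

```python
import numpy as np

def pckh(pred, gt, head_sizes, fraction=0.5):
    """Share of keypoints predicted within `fraction` of the head segment
    length from the ground truth (0.5 for PCKh@50, 0.1 for PCKh@10).

    pred, gt: (num_examples, num_keypoints, 2) coordinate arrays
    head_sizes: (num_examples,) head segment lengths
    """
    distances = np.linalg.norm(pred - gt, axis=-1)   # (N, K)
    thresholds = fraction * head_sizes[:, None]      # (N, 1)
    return float(np.mean(distances <= thresholds))

# Toy example: with head size 10, the first keypoint is within 5 pixels
# of the ground truth and counts as correct; the second is not.
pred = np.array([[[0.0, 0.0], [10.0, 10.0]]])
gt = np.array([[[1.0, 1.0], [10.0, 18.0]]])
print(pckh(pred, gt, np.array([10.0])))  # -> 0.5
```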
Guidelines
To achieve the optimal precision provided by the software, please adhere to the following three principles:
- Ensure there is only one person present in the image/video
- Ensure that the full body of the person is clearly visible, is near the centre of the image/video frame, and is occupying a sufficient portion of the camera view (portrait mode is recommended when feasible)
- Avoid occlusion of the subject by other objects; even partial occlusion is discouraged
Acknowledgment
This work was conducted as a collaboration between the Department of Neuromedicine and Movement Science and the Department of Computer Science at the Norwegian University of Science and Technology. State-of-the-art computational infrastructure was provided by the Norwegian Open AI Lab. We are also grateful to TensorFlow and Data Scientist Pavel Yakubovskiy for making the EfficientNet and EfficientNet Lite models more widely available; this provided a great starting point for our research.
Citation
If you enjoyed this project or found the work helpful in your research, please cite the following:
@article{groos2021efficientpose,
  title={EfficientPose: Scalable single-person pose estimation},
  author={Groos, Daniel and Ramampiaro, Heri and Ihlen, Espen AF},
  journal={Applied Intelligence},
  volume={51},
  number={4},
  pages={2518--2533},
  year={2021},
  publisher={Springer}
}
Main updates
- April 2021: Provided support for TensorFlow 2
- November 2020: Launched EfficientPose Lite models in Keras, TensorFlow and TFLite
- April 2020: Launched EfficientPose models in Keras, TensorFlow, TFLite and PyTorch
Relevant resources
- Model quantization (OpenVINO, TFLite, TensorFlow.js, TensorRT, Core ML, and ONNX): EfficientPose in PINTO_model_zoo
- TensorFlow.js implementation: EfficientPose for TFJS