# CNN-Inference-Engine-Quick-View
A quick view of high-performance convolutional neural network (CNN) inference engines on mobile devices.
Contents:

- Runtime-speed Comparisons
- Data-flow / Graph Optimization

## FLOAT32-Support
Framework | Main Platform | Model Compatibility | Detection-Support | Speed Benchmarks |
---|---|---|---|---|
Bolt | CPU (ARM optimized) / x86 / Mali GPU | Caffe / Tensorflow / PyTorch / onnx | Y | Link |
TNN | CPU (ARM optimized) / Mali / Adreno / Apple GPU | Caffe / Tensorflow / PyTorch | Y | Link |
PPLNN | CPU (ARM/x86 optimized) / Nvidia GPU | onnx | Y | Link / Link |
Paddle-Light | CPU (ARM optimized) / Mali GPU / FPGA / NPU | Paddle / Caffe / onnx | Y | Link |
MNN | CPU (ARM optimized) / Mali GPU | Caffe / Tensorflow / onnx | Y | Link |
NCNN | CPU (ARM optimized) / Mali GPU | Caffe / PyTorch / mxnet / onnx | Y | 3rd party Link / Official Link |
MACE | CPU (ARM optimized) / Mali GPU / DSP | Caffe / Tensorflow / onnx | Y | Link |
TEngine | CPU (ARM A72 optimized) | Caffe / mxnet | Y | Link |
AutoKernel | CPU / GPU / NPU | Caffe / mxnet / Tensorflow / PyTorch / Darknet | Y | Link |
Synet | CPU (ARM optimized) / x86 | Caffe / PyTorch / Tensorflow / mxnet / onnx | Y | - |
MsnhNet | CPU (ARM optimized) / Mali GPU / x86 / TensorRT | PyTorch | Y | Link |
ONNX-Runtime | CPU / Nvidia GPU | onnx | Y | - |
HiAI | Kirin CPU / NPU | Caffe / Tensorflow | Y | - |
NNIE | NPU | Caffe | Y | 1 TOPS |
Intel-Caffe | CPU (Intel optimized) | Caffe | Y | Link |
FeatherCNN | CPU (ARM optimized) | Caffe | N | Link / unofficial Link |
Tensorflow Lite | CPU (Android optimized) | Caffe2 / Tensorflow / onnx | Y | Link |
TensorRT | GPU (Volta optimized) | Caffe / Tensorflow / onnx | Y | Link |
TVM | CPU (ARM optimized) / Mali GPU / FPGA | onnx | Y | - |
SNPE | CPU (Qualcomm optimized) / GPU / DSP | Caffe / Caffe2 / Tensorflow / onnx | Y | Link |
Pocket-Tensor | CPU (ARM/x86 optimized) | Keras | N | Link |
ZQCNN | CPU | Caffe / mxnet | Y | Link |
ARM-NEON-to-x86-SSE | CPU (Intel optimized) | Intrinsics-Level | - | - |
Simd | CPU (all platform optimized) | Intrinsics-Level | - | - |
clDNN | Intel® Processor Graphics / Iris™ Pro Graphics | Caffe / Tensorflow / mxnet / onnx | Y | Link |
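The FP32 speed benchmarks above are usually reported per model, so it helps to relate them to a model's arithmetic cost. The helper below is a generic illustration (not taken from any listed engine) of how the multiply-accumulate count of a 2-D convolution layer is estimated:

```python
def conv2d_macs(c_in, c_out, k, h_out, w_out, groups=1):
    """Multiply-accumulate (MAC) count of a 2-D convolution layer.

    Each of the c_out * h_out * w_out output values needs
    k * k * (c_in / groups) multiply-adds; FLOPs ~= 2 * MACs.
    """
    return c_out * h_out * w_out * k * k * (c_in // groups)

# Example: first layer of ResNet-50 (7x7 conv, 3 -> 64 channels,
# 112x112 output) costs about 118M MACs (~0.24 GFLOPs).
print(conv2d_macs(c_in=3, c_out=64, k=7, h_out=112, w_out=112))  # 118013952
```

Dividing a model's total MACs by a measured frames-per-second figure gives a rough effective-throughput number for comparing engines on the same hardware.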
## FIX16-Support
Framework | Main Platform | Model Compatibility | Detection-Support | Speed Benchmarks |
---|---|---|---|---|
Bolt | CPU (ARM optimized) / x86 / Mali GPU | Caffe / Tensorflow / PyTorch | Y | Link |
ARM32-SGEMM-LIB | CPU (ARM optimized) | GEMM Library | N | Link |
TNN | CPU (ARM optimized) / Mali / Adreno / Apple GPU | Caffe / Tensorflow / PyTorch | Y | Link |
Yolov2-Xilinx-PYNQ | FPGA (Xilinx PYNQ) | Yolov2-only | Y | Link |
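FIX16 here appears to denote 16-bit fixed-point arithmetic, which replaces floating-point multiply-adds with integer operations plus shifts. A minimal Q8.8 sketch (the format choice and helper names are illustrative; real engines pick per-layer fractional bit widths):

```python
FRAC_BITS = 8  # Q8.8: 8 integer bits, 8 fractional bits

def to_fix16(x):
    """Quantize a float to Q8.8 fixed point (stored as an int)."""
    return int(round(x * (1 << FRAC_BITS)))

def fix16_mul(a, b):
    """Multiply two Q8.8 values; the raw product carries 16
    fractional bits, so shift right to return to Q8.8."""
    return (a * b) >> FRAC_BITS

def from_fix16(q):
    """Convert a Q8.8 value back to float."""
    return q / (1 << FRAC_BITS)

a, b = to_fix16(1.5), to_fix16(-2.25)
print(from_fix16(fix16_mul(a, b)))  # -3.375
```

The appeal on ARM and FPGA targets is that the multiply and shift map directly onto cheap integer hardware, at the cost of a fixed dynamic range.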
## INT8-Support
Framework | Main Platform | Model Compatibility | Detection-Support | Speed Benchmarks |
---|---|---|---|---|
Bolt | CPU (ARM optimized) / x86 / Mali GPU | Caffe / Tensorflow / PyTorch | Y | Link |
Intel-Caffe | CPU (Intel Skylake) | Caffe | Y | Link |
TNN | CPU (ARM optimized) / Mali / Adreno / Apple GPU | Caffe / Tensorflow / PyTorch | Y | Link |
PPLNN | Nvidia GPU optimized | onnx | Y | Link |
NCNN | CPU (ARM optimized) | Caffe / PyTorch / mxnet / onnx | Y | Link |
Paddle-Light | CPU (ARM optimized) / Mali GPU / FPGA | Paddle / Caffe / onnx | Y | Link |
MNN | CPU (ARM optimized) / Mali GPU | Caffe / Tensorflow / onnx | Y | Link |
Tensorflow Lite | CPU (ARM) | Caffe2 / Tensorflow / onnx | Y | Link |
TensorRT | GPU (Volta) | Caffe / Tensorflow / onnx | Y | Link |
Gemmlowp | CPU (ARM / x86) | GEMM Library | - | - |
SNPE | DSP (Quantized DLC) | Caffe / Caffe2 / Tensorflow / onnx | Y | Link |
MACE | CPU (ARM optimized) / Mali GPU / DSP | Caffe / Tensorflow / onnx | Y | Link |
TF2 | FPGA | Caffe / PyTorch / Tensorflow | Y | Link |
TVM | CPU (ARM optimized) / Mali GPU / FPGA | onnx | Y | Link |
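The engines in this table map FP32 tensors onto 8-bit integers so the inner loops can run on fast integer SIMD units. A minimal symmetric per-tensor scheme, shown as a sketch (each listed engine has its own calibration and per-channel variants):

```python
def quantize_int8(xs):
    """Symmetric per-tensor INT8 quantization: map max|x| to 127."""
    scale = max(abs(v) for v in xs) / 127.0
    q = [max(-127, min(127, round(v / scale))) for v in xs]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats; integer GEMM results are
    rescaled by the product of input scales in the same way."""
    return [v * scale for v in q]

x = [0.1, -0.5, 0.25, 1.27]
q, s = quantize_int8(x)
print(q)  # [10, -50, 25, 127]
# Per-element error of the round trip is bounded by scale / 2.
```

Accuracy then hinges on how the scale is calibrated (max-abs, percentile, or KL-divergence based), which is where the engines differ most.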
## TERNARY-Support
Framework | Main Platform | Model Compatibility | Detection-Support | Speed Benchmarks |
---|---|---|---|---|
Gemmbitserial | CPU (ARM / x86) | GEMM Library | - | Link |
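Ternary networks constrain weights to {-1, 0, +1}, which is why a bit-serial GEMM library like Gemmbitserial can serve them (each ternary value fits in two bits). A common thresholding scheme, sketched here using the 0.7 * mean|w| heuristic from the Ternary Weight Networks paper:

```python
def ternarize(ws):
    """Threshold weights to {-1, 0, +1}.

    delta = 0.7 * mean(|w|) is a published heuristic threshold;
    values inside (-delta, delta) become 0, the rest keep their sign.
    """
    delta = 0.7 * sum(abs(w) for w in ws) / len(ws)
    return [0 if abs(w) < delta else (1 if w > 0 else -1) for w in ws]

print(ternarize([0.9, -0.05, 0.3, -0.6]))  # [1, 0, 0, -1]
```

A per-layer float scaling factor is normally learned alongside the ternary values to recover dynamic range.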
## BINARY-Support
Framework | Main Platform | Model Compatibility | Detection-Support | Speed Benchmarks |
---|---|---|---|---|
Bolt | CPU (ARM optimized) / x86 / Mali GPU | Caffe / Tensorflow / PyTorch | Y | Link |
BMXNET | CPU (ARM / x86) / GPU | mxnet | Y | Link |
DABNN | CPU (ARM) | Caffe / Tensorflow / onnx | N | Link |
Espresso | GPU | - | N | Link |
BNN-PYNQ | FPGA (Xilinx PYNQ) | - | N | Link |
FINN | FPGA (Xilinx) | - | N | Link |
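Binary engines pack ±1 weights and activations into machine words and replace floating-point multiply-accumulate with XNOR plus popcount, which is what makes them attractive on FPGAs and small CPUs. A sketch of the core trick (helper names are illustrative):

```python
def pack_bits(signs):
    """Pack a list of +1/-1 values into an int bitmask, 1 bit each
    (+1 -> bit set, -1 -> bit clear)."""
    word = 0
    for i, s in enumerate(signs):
        if s > 0:
            word |= 1 << i
    return word

def binary_dot(a, b, n):
    """Dot product of two packed ±1 vectors of length n.

    XNOR marks positions where the signs agree; with
    matches = popcount(xnor), dot = matches - mismatches
    = 2 * matches - n.
    """
    xnor = ~(a ^ b) & ((1 << n) - 1)
    matches = bin(xnor).count("1")
    return 2 * matches - n

x, w = [1, -1, 1, 1], [1, 1, -1, 1]
print(binary_dot(pack_bits(x), pack_bits(w), 4))  # 0
```

One 64-bit XNOR plus a popcount instruction thus stands in for 64 multiply-adds, which is where the large reported speedups come from.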
## NLP-Support
Framework | Main Platform | Model Compatibility | Speed Benchmarks |
---|---|---|---|
TurboTransformers | CPU / Nvidia GPU | PyTorch | Link |
Bolt | CPU / Mali GPU | Caffe / onnx | Link |