There are no reviews yet. Be the first to send feedback to the community and the maintainers!
Repository Details
Quantization is a technique to reduce the computational and memory costs of running inference by representing the weights and activations with low-precision data types like 8-bit integer (int8) instead of the usual 32-bit floating point (float32).