AMC Compressed Models

This repo contains some of the compressed models from the paper AMC: AutoML for Model Compression and Acceleration on Mobile Devices (ECCV18).

Reference

If you find the models useful, please cite our paper:

```bibtex
@inproceedings{he2018amc,
  title={AMC: AutoML for Model Compression and Acceleration on Mobile Devices},
  author={He, Yihui and Lin, Ji and Liu, Zhijian and Wang, Hanrui and Li, Li-Jia and Han, Song},
  booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
  pages={784--800},
  year={2018}
}
```

Download the Pretrained Models

First, download the pretrained models from here and put them in ./checkpoints.
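
To sanity-check a download before running the evaluation scripts, here is a minimal sketch; the checkpoint file name is an assumption, so substitute whichever file you actually downloaded:

```python
# Minimal sketch: inspect a downloaded checkpoint from ./checkpoints.
# The file name below is hypothetical.
import torch

ckpt = torch.load('./checkpoints/mobilenet_0.5flops.pth', map_location='cpu')
state_dict = ckpt.get('state_dict', ckpt)  # some checkpoints nest the weights
for name, tensor in list(state_dict.items())[:5]:
    print(name, tuple(tensor.shape))
```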

Models

Compressed MobileNets

We provide MobileNetV1 compressed to 50% FLOPs and to 50% inference time, as well as MobileNetV2 compressed to 70% FLOPs, in PyTorch. The comparison with the vanilla models is as follows:

| Models | Top1 Acc (%) | Top5 Acc (%) | Latency (ms) | MACs (M) |
|---|---|---|---|---|
| MobileNetV1 | 70.9 | 89.5 | 123 | 569 |
| MobileNetV1-width*0.75 | 68.4 | 88.2 | 72.5 | 325 |
| MobileNetV1-50%FLOPs | 70.5 | 89.3 | 68.9 | 285 |
| MobileNetV1-50%Time | 70.2 | 89.4 | 63.2 | 272 |
| MobileNetV2-width*0.75 | 69.8 | 89.6 | - | 300 |
| MobileNetV2-70%FLOPs | 70.9 | 89.9 | - | 210 |
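
As a sanity check on the MACs column, 285M is almost exactly half of the vanilla 569M, which is what "50% FLOPs" promises. The sketch below (not the repo's profiler, just back-of-the-envelope arithmetic) shows how MACs are counted for MobileNet's depthwise-separable blocks:

```python
# Count multiply-accumulates for one depthwise-separable block:
# a k x k depthwise conv followed by a 1x1 pointwise conv.
def dw_separable_macs(h, w, c_in, c_out, k=3):
    depthwise = h * w * c_in * k * k   # one k x k filter per input channel
    pointwise = h * w * c_in * c_out   # 1x1 conv mixes channels
    return depthwise + pointwise

# Example: the 112x112, 32 -> 64 block early in MobileNetV1.
print(dw_separable_macs(112, 112, 32, 64) / 1e6, 'M MACs')  # ~29.3M
```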

To test the model, run:

```bash
python eval_mobilenet_torch.py --profile={mobilenet_0.5flops, mobilenet_0.5time, mobilenetv2_0.7flops}
```
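
For reference, a hedged sketch of what such an evaluation presumably boils down to: a standard ImageNet top-1/top-5 measurement. The dataset path and the stand-in model below are assumptions, not the repo's API:

```python
# Sketch of a standard ImageNet top-1/top-5 evaluation loop.
import torch
from torchvision import datasets, models, transforms

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
val_set = datasets.ImageFolder('/path/to/imagenet/val', preprocess)  # assumed path
loader = torch.utils.data.DataLoader(val_set, batch_size=128, num_workers=4)

model = models.mobilenet_v2()  # stand-in; swap in the compressed model with ./checkpoints weights
model.eval()

top1 = top5 = total = 0
with torch.no_grad():
    for images, targets in loader:
        _, pred = model(images).topk(5, dim=1)   # top-5 class predictions
        correct = pred.eq(targets.unsqueeze(1))
        top1 += correct[:, 0].sum().item()
        top5 += correct.any(dim=1).sum().item()
        total += targets.size(0)
print(f'top-1 {100 * top1 / total:.2f}%  top-5 {100 * top5 / total:.2f}%')
```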

Converted TensorFlow Models

We converted the 50% FLOPs and 50% time compressed MobileNetV1 models to TensorFlow. We offer both the normal checkpoint format and the TF-Lite format; we used the TF-Lite format to test the speed on mobile devices.
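
For completeness, producing a TF-Lite flatbuffer typically goes through the standard TFLiteConverter API; a sketch assuming you have a SavedModel export (the repo already ships converted files, and the paths below are hypothetical):

```python
# Sketch: convert a SavedModel export to a .tflite flatbuffer with the
# standard converter API. The SavedModel directory name is hypothetical.
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model('./checkpoints/mobilenet_0.5flops_tf')
tflite_model = converter.convert()
with open('./checkpoints/mobilenet_0.5flops.tflite', 'wb') as f:
    f.write(tflite_model)
```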

To replicate the PyTorch results, we wrote a new preprocessing function (sketched after the command below) and adapted some hyper-parameters from the original TF MobileNetV1. To verify the performance, run the following script:

```bash
python eval_mobilenet_tf.py --profile={0.5flops, 0.5time}
```
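
The repo's preprocessing function is its own; the sketch below only illustrates the idea of mirroring torchvision's defaults (resize the shorter side to 256, center-crop 224, normalize with ImageNet statistics) in plain PIL/NumPy, and may differ from the actual implementation in details:

```python
# Sketch: PyTorch-style ImageNet preprocessing in PIL/NumPy, so the TF graph
# sees the same inputs as the PyTorch one.
import numpy as np
from PIL import Image

MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def preprocess(path, crop=224, resize=256):
    img = Image.open(path).convert('RGB')
    w, h = img.size
    scale = resize / min(w, h)                     # shorter side -> 256
    img = img.resize((round(w * scale), round(h * scale)), Image.BILINEAR)
    w, h = img.size
    left, top = (w - crop) // 2, (h - crop) // 2   # center crop 224 x 224
    img = img.crop((left, top, left + crop, top + crop))
    x = np.asarray(img, dtype=np.float32) / 255.0
    return (x - MEAN) / STD                        # HWC float32
```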

Running the script produces:

| Models | Top1 Acc (%) | Top5 Acc (%) |
|---|---|---|
| 50% FLOPs | 70.424 | 89.28 |
| 50% Time | 70.214 | 89.244 |

Timing Logs

Here we provide timing logs, measured on a Google Pixel 1 using TensorFlow Lite, in the ./logs directory. We benchmarked the original MobileNetV1 (mobilenet), MobileNetV1 with a 0.75 width multiplier (0.75mobilenet), the 50% FLOPs-pruned MobileNetV1 (0.5flops), and the 50% time-pruned MobileNetV1 (0.5time). Each model is benchmarked for 200 iterations after 100 warm-up iterations, repeated for 3 runs.
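
The numbers in ./logs come from TensorFlow Lite running on the phone itself. As a rough desktop analogue of the same protocol (100 warm-up plus 200 timed invocations), one could time the TF-Lite Python interpreter as below; the .tflite file name is hypothetical:

```python
# Sketch: time a .tflite model with the TF-Lite Python interpreter, following
# the protocol above (100 warm-up iterations, then 200 timed iterations).
import time
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path='./checkpoints/mobilenet_0.5flops.tflite')
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
x = np.random.rand(*inp['shape']).astype(np.float32)

def run():
    interpreter.set_tensor(inp['index'], x)
    interpreter.invoke()

for _ in range(100):   # warm-up
    run()
start = time.time()
for _ in range(200):   # timed
    run()
print(f'{(time.time() - start) / 200 * 1000:.1f} ms / inference')
```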

AMC

You can also find our PyTorch implementation of AMC here.

Contact

To contact the authors:

Ji Lin, [email protected]

Song Han, [email protected]

More Repositories

| # | Repository | Description | Language | Stars |
|---|---|---|---|---|
| 1 | streaming-llm | [ICLR 2024] Efficient Streaming Language Models with Attention Sinks | Python | 6,323 |
| 2 | bevfusion | [ICRA'23] BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation | Python | 2,153 |
| 3 | temporal-shift-module | [ICCV 2019] TSM: Temporal Shift Module for Efficient Video Understanding | Python | 2,040 |
| 4 | once-for-all | [ICLR 2020] Once for All: Train One Network and Specialize it for Efficient Deployment | Python | 1,860 |
| 5 | llm-awq | AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration | Python | 1,687 |
| 6 | proxylessnas | [ICLR 2019] ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware | C++ | 1,415 |
| 7 | data-efficient-gans | [NeurIPS 2020] Differentiable Augmentation for Data-Efficient GAN Training | Python | 1,272 |
| 8 | torchquantum | A PyTorch-based framework for Quantum Classical Simulation, Quantum Machine Learning, Quantum Neural Networks, Parameterized Quantum Circuits with support for easy deployments on real quantum computers. | Jupyter Notebook | 1,270 |
| 9 | efficientvit | EfficientViT is a new family of vision models for efficient high-resolution vision. | Python | 1,218 |
| 10 | torchsparse | [MICRO'23, MLSys'22] TorchSparse: Efficient Training and Inference Framework for Sparse Convolution on GPUs | Cuda | 1,181 |
| 11 | smoothquant | [ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models | Python | 1,175 |
| 12 | gan-compression | [CVPR 2020] GAN Compression: Efficient Architectures for Interactive Conditional GANs | Python | 1,102 |
| 13 | anycost-gan | [CVPR 2021] Anycost GANs for Interactive Image Synthesis and Editing | Python | 778 |
| 14 | tinyml | | Python | 732 |
| 15 | tinyengine | [NeurIPS 2020] MCUNet: Tiny Deep Learning on IoT Devices; [NeurIPS 2021] MCUNetV2: Memory-Efficient Patch-based Inference for Tiny Deep Learning; [NeurIPS 2022] MCUNetV3: On-Device Training Under 256KB Memory | C | 717 |
| 16 | TinyChatEngine | TinyChatEngine: On-Device LLM Inference Library | C++ | 695 |
| 17 | fastcomposer | [IJCV] FastComposer: Tuning-Free Multi-Subject Image Generation with Localized Attention | Python | 644 |
| 18 | pvcnn | [NeurIPS 2019, Spotlight] Point-Voxel CNN for Efficient 3D Deep Learning | Python | 636 |
| 19 | lite-transformer | [ICLR 2020] Lite Transformer with Long-Short Range Attention | Python | 589 |
| 20 | spvnas | [ECCV 2020] Searching Efficient 3D Architectures with Sparse Point-Voxel Convolution | Python | 577 |
| 21 | distrifuser | [CVPR 2024 Highlight] DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models | Python | 538 |
| 22 | mcunet | [NeurIPS 2020] MCUNet: Tiny Deep Learning on IoT Devices; [NeurIPS 2021] MCUNetV2: Memory-Efficient Patch-based Inference for Tiny Deep Learning | Python | 423 |
| 23 | amc | [ECCV 2018] AMC: AutoML for Model Compression and Acceleration on Mobile Devices | Python | 422 |
| 24 | tiny-training | On-Device Training Under 256KB Memory [NeurIPS'22] | Python | 414 |
| 25 | dlg | [NeurIPS 2019] Deep Leakage From Gradients | Python | 375 |
| 26 | offsite-tuning | Offsite-Tuning: Transfer Learning without Full Model | Python | 365 |
| 27 | haq | [CVPR 2019, Oral] HAQ: Hardware-Aware Automated Quantization with Mixed Precision | Python | 362 |
| 28 | hardware-aware-transformers | [ACL'20] HAT: Hardware-Aware Transformers for Efficient Natural Language Processing | Python | 321 |
| 29 | litepose | [CVPR'22] Lite Pose: Efficient Architecture Design for 2D Human Pose Estimation | Python | 301 |
| 30 | inter-operator-scheduler | [MLSys 2021] IOS: Inter-Operator Scheduler for CNN Acceleration | C++ | 189 |
| 31 | apq | [CVPR 2020] APQ: Joint Search for Network Architecture, Pruning and Quantization Policy | Python | 156 |
| 32 | parallel-computing-tutorial | | C++ | 123 |
| 33 | flatformer | [CVPR'23] FlatFormer: Flattened Window Attention for Efficient Point Cloud Transformer | Python | 119 |
| 34 | patch_conv | Patch convolution to avoid large GPU memory usage of Conv2D | Python | 72 |
| 35 | 6s965-fall2022 | | Jupyter Notebook | 64 |
| 36 | sparsevit | [CVPR'23] SparseViT: Revisiting Activation Sparsity for Efficient High-Resolution Vision Transformer | Python | 48 |
| 37 | bnn-icestick | Binary Neural Network on IceStick FPGA | Jupyter Notebook | 47 |
| 38 | e3d | Efficient 3D Deep Learning | | 46 |
| 39 | neurips-micronet | [JMLR'20] NeurIPS 2019 MicroNet Challenge Efficient Language Modeling, Champion | Jupyter Notebook | 40 |
| 40 | spatten-llm | [HPCA'21] SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning | Scala | 32 |
| 41 | tinychat-tutorial | | C++ | 28 |
| 42 | pruning-sparsity-publications | | | 14 |
| 43 | iccad-tinyml-open | [ICCAD'22 TinyML Contest] Efficient Heart Stroke Detection on Low-cost Microcontrollers | C | 14 |
| 44 | calo-cluster | | Jupyter Notebook | 5 |
| 45 | ml-blood-pressure | | Python | 5 |
| 46 | gan-compression-dynamic | | Python | 3 |
| 47 | data-efficient-gans-dynamic | | Python | 3 |