EfficientDNNs
A collection of recent methods on DNN compression and acceleration. There are mainly five kinds of methods for efficient DNNs (a minimal pruning/quantization code sketch follows this list):
- neural architecture re-design or search (NAS)
- maintain accuracy, less cost (e.g., #Params, #FLOPs, etc.): MobileNet, ShuffleNet etc.
- maintain cost, more accuracy: Inception, ResNeXt, Xception etc.
- pruning (including structured and unstructured)
- quantization
- matrix/low-rank decomposition
- knowledge distillation (KD)
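
To make the pruning and quantization categories concrete, here is a minimal illustrative sketch, assuming PyTorch, of unstructured magnitude pruning and symmetric per-tensor int8 weight quantization. The helper names, the toy model, and the 50%-sparsity / int8 settings are assumptions for illustration only, not taken from any specific paper below.

```python
# Minimal sketch (PyTorch assumed): unstructured magnitude pruning and
# symmetric per-tensor int8 weight quantization, for illustration only.
import torch
import torch.nn as nn

def magnitude_prune_(model: nn.Module, sparsity: float = 0.5) -> None:
    """Zero out the smallest-magnitude weights of every Linear/Conv2d layer in place."""
    for module in model.modules():
        if isinstance(module, (nn.Linear, nn.Conv2d)):
            w = module.weight.data
            k = int(w.numel() * sparsity)                # number of weights to remove
            if k == 0:
                continue
            threshold = w.abs().flatten().kthvalue(k).values
            mask = (w.abs() > threshold).to(w.dtype)     # 1 = keep, 0 = prune
            module.weight.data = w * mask

def quantize_int8(w: torch.Tensor):
    """Symmetric per-tensor int8 quantization: returns (int8 weights, scale)."""
    scale = w.abs().max() / 127.0
    q = torch.clamp(torch.round(w / scale), -127, 127).to(torch.int8)
    return q, scale

# Usage on a toy model (names are placeholders, not tied to any paper):
model = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 10))
magnitude_prune_(model, sparsity=0.5)
q, s = quantize_int8(model[0].weight.data)
w_dequant = q.float() * s                                # dequantized approximation
```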
Note: this repo is mostly about pruning (with the lottery ticket hypothesis, or LTH, as a sub-topic), KD, and quantization. For other topics such as NAS, see the more comprehensive collections listed in the Related Repos and Websites section at the end of this file. You are welcome to send a pull request if you'd like to add any pertinent papers.
Other repos:
- LTH (the lottery ticket hypothesis) and its broader version, pruning at initialization (PaI), are now at the frontier of network pruning. We have singled out the PaI papers into a separate repo, Awesome-Pruning-at-Initialization. Welcome to check it out!
- Awesome-Efficient-ViT for a curated list of efficient vision transformers.
About abbreviations: in the list below, `o` stands for oral, `s` for spotlight, `b` for best paper, and `w` for workshop.
Surveys
- 1993-TNN-Pruning Algorithms -- A survey
- 2017-Proceedings of the IEEE-Efficient Processing of Deep Neural Networks: A Tutorial and Survey [2020 Book: Efficient Processing of Deep Neural Networks]
- 2017.12-A survey of FPGA-based neural network accelerator
- 2018-FITEE-Recent Advances in Efficient Computation of Deep Convolutional Neural Networks
- 2018-IEEE Signal Processing Magazine-Model compression and acceleration for deep neural networks: The principles, progress, and challenges. Arxiv extension
- 2018.8-A Survey on Methods and Theories of Quantized Neural Networks
- 2019-JMLR-Neural Architecture Search: A Survey
- 2020-MLSys-What is the state of neural network pruning
- 2019.02-The State of Sparsity in Deep Neural Networks
- 2021-TPAMI-Knowledge Distillation and Student-Teacher Learning for Visual Intelligence: A Review and New Outlooks
- 2021-IJCV-Knowledge Distillation: A Survey
- 2020-Proceedings of the IEEE-Model Compression and Hardware Acceleration for Neural Networks: A Comprehensive Survey
- 2020-Pattern Recognition-Binary neural networks: A survey
- 2021-TPDS-The Deep Learning Compiler: A Comprehensive Survey
- 2021-JMLR-Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks
- 2022-IJCAI-Recent Advances on Neural Network Pruning at Initialization
- 2021.6-Efficient Deep Learning: A Survey on Making Deep Learning Models Smaller, Faster, and Better
Papers [Pruning and Quantization]
1980s, 1990s
- 1988-NIPS-A back-propagation algorithm with optimal use of hidden units
- 1988-NIPS-Skeletonization: A Technique for Trimming the Fat from a Network via Relevance Assessment
- 1988-NIPS-What Size Net Gives Valid Generalization?
- 1989-NIPS-Dynamic Behavior of Constrained Back-Propagation Networks
- 1988-NIPS-Comparing Biases for Minimal Network Construction with Back-Propagation
- 1989-NIPS-Optimal Brain Damage
- 1990-NN-A simple procedure for pruning back-propagation trained neural networks
- 1993-ICNN-Optimal Brain Surgeon and general network pruning
2000s
- 2001-JMLR-Sparse Bayesian learning and the relevance vector machine
- 2007-Book-The minimum description length principle
2011
- 2011-JMLR-Learning with Structured Sparsity
- 2011-NIPSw-Improving the speed of neural networks on CPUs
2013
- 2013-NIPS-Predicting Parameters in Deep Learning
- 2013.08-Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation
2014
- 2014-BMVC-Speeding up convolutional neural networks with low rank expansions
- 2014-INTERSPEECH-1-Bit Stochastic Gradient Descent and its Application to Data-Parallel Distributed Training of Speech DNNs
- 2014-NIPS-Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation
- 2014-NIPS-Do deep neural nets really need to be deep
- 2014.12-Memory bounded deep convolutional networks
2015
- 2015-ICLR-Speeding-up convolutional neural networks using fine-tuned cp-decomposition
- 2015-ICML-Compressing neural networks with the hashing trick
- 2015-INTERSPEECH-A Diversity-Penalizing Ensemble Training Method for Deep Learning
- 2015-BMVC-Data-free parameter pruning for deep neural networks
- 2015-BMVC-Learning the structure of deep architectures using l1 regularization
- 2015-NIPS-Learning both Weights and Connections for Efficient Neural Network
- 2015-NIPS-Binaryconnect: Training deep neural networks with binary weights during propagations
- 2015-NIPS-Structured Transforms for Small-Footprint Deep Learning
- 2015-NIPS-Tensorizing Neural Networks
- 2015-NIPSw-Distilling Intractable Generative Models
- 2015-NIPSw-Federated Optimization: Distributed Optimization Beyond the Datacenter
- 2015-CVPR-Efficient and Accurate Approximations of Nonlinear Convolutional Networks [2016 TPAMI version: Accelerating Very Deep Convolutional Networks for Classification and Detection]
- 2015-CVPR-Sparse Convolutional Neural Networks
- 2015-ICCV-An Exploration of Parameter Redundancy in Deep Networks with Circulant Projections
- 2015.12-Exploiting Local Structures with the Kronecker Layer in Convolutional Networks
2016
- 2016-ICLR-Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding [Best paper!]
- 2016-ICLR-All you need is a good init [Code]
- 2016-ICLR-Data-dependent Initializations of Convolutional Neural Networks [Code]
- 2016-ICLR-Convolutional neural networks with low-rank regularization [Code]
- 2016-ICLR-Diversity networks
- 2016-ICLR-Neural networks with few multiplications
- 2016-ICLR-Compression of deep convolutional neural networks for fast and low power mobile applications
- 2016-ICLRw-Randomout: Using a convolutional gradient norm to win the filter lottery
- 2016-CVPR-Fast algorithms for convolutional neural networks
- 2016-CVPR-Fast ConvNets Using Group-wise Brain Damage
- 2016-BMVC-Learning neural network architectures using backpropagation
- 2016-ECCV-Less is more: Towards compact cnns
- 2016-EMNLP-Sequence-Level Knowledge Distillation
- 2016-NIPS-Learning Structured Sparsity in Deep Neural Networks [Caffe Code]
- 2016-NIPS-Dynamic Network Surgery for Efficient DNNs [Caffe Code]
- 2016-NIPS-Learning the Number of Neurons in Deep Neural Networks
- 2016-NIPS-Memory-Efficient Backpropagation Through Time
- 2016-NIPS-PerforatedCNNs: Acceleration through Elimination of Redundant Convolutions
- 2016-NIPS-LightRNN: Memory and Computation-Efficient Recurrent Neural Networks
- 2016-NIPS-CNNpack: packing convolutional neural networks in the frequency domain
- 2016-ISCA-Eyeriss: A Spatial Architecture for Energy-Efficient Dataflow for Convolutional Neural Networks
- 2016-ICASSP-Learning compact recurrent neural networks
- 2016-CoNLL-Compression of Neural Machine Translation Models via Pruning
- 2016.03-Adaptive Computation Time for Recurrent Neural Networks
- 2016.06-Structured Convolution Matrices for Energy-efficient Deep learning
- 2016.06-Deep neural networks are robust to weight binarization and other non-linear distortions
- 2016.06-Hypernetworks
- 2016.07-IHT-Training skinny deep neural networks with iterative hard thresholding methods
- 2016.08-Recurrent Neural Networks With Limited Numerical Precision
- 2016.10-Deep model compression: Distilling knowledge from noisy teachers
- 2016.10-Federated Optimization: Distributed Machine Learning for On-Device Intelligence
- 2016.11-Alternating Direction Method of Multipliers for Sparse Convolutional Neural Networks
2017
- 2017-ICLR-Pruning Filters for Efficient ConvNets [PyTorch Reimpl. #1] [PyTorch Reimpl. #2]
- 2017-ICLR-Pruning Convolutional Neural Networks for Resource Efficient Inference
- 2017-ICLR-Incremental Network Quantization: Towards Lossless CNNs with Low-Precision Weights [Code]
- 2017-ICLR-Do Deep Convolutional Nets Really Need to be Deep and Convolutional?
- 2017-ICLR-DSD: Dense-Sparse-Dense Training for Deep Neural Networks
- 2017-ICLR-Faster CNNs with Direct Sparse Convolutions and Guided Pruning
- 2017-ICLR-Towards the Limit of Network Quantization
- 2017-ICLR-Loss-aware Binarization of Deep Networks
- 2017-ICLR-Trained Ternary Quantization [Code]
- 2017-ICLR-Exploring Sparsity in Recurrent Neural Networks
- 2017-ICLR-Soft Weight-Sharing for Neural Network Compression [Reddit discussion] [Code]
- 2017-ICLR-Variable Computation in Recurrent Neural Networks
- 2017-ICLR-Training Compressed Fully-Connected Networks with a Density-Diversity Penalty
- 2017-ICML-Theoretical Properties for Neural Networks with Weight Matrices of Low Displacement Rank
- 2017-ICML-Deep Tensor Convolution on Multicores
- 2017-ICML-Delta Networks for Optimized Recurrent Network Computation
- 2017-ICML-Beyond Filters: Compact Feature Map for Portable Deep Model
- 2017-ICML-Combined Group and Exclusive Sparsity for Deep Neural Networks
- 2017-ICML-MEC: Memory-efficient Convolution for Deep Neural Network
- 2017-ICML-Deciding How to Decide: Dynamic Routing in Artificial Neural Networks
- 2017-ICML-ZipML: Training Models with End-to-End Low Precision: The Cans, the Cannots, and a Little Bit of Deep Learning
- 2017-ICML-Analytical Guarantees on Numerical Precision of Deep Neural Networks
- 2017-ICML-Adaptive Neural Networks for Efficient Inference
- 2017-ICML-SplitNet: Learning to Semantically Split Deep Networks for Parameter Reduction and Model Parallelization
- 2017-CVPR-Learning deep CNN denoiser prior for image restoration
- 2017-CVPR-Deep roots: Improving cnn efficiency with hierarchical filter groups
- 2017-CVPR-More is less: A more complicated network with less inference complexity [PyTorch Code]
- 2017-CVPR-All You Need is Beyond a Good Init: Exploring Better Solution for Training Extremely Deep Convolutional Neural Networks with Orthonormality and Modulation
- 2017-CVPR-ResNeXt-Aggregated Residual Transformations for Deep Neural Networks
- 2017-CVPR-Xception: Deep learning with depthwise separable convolutions
- 2017-CVPR-Designing Energy-Efficient CNN using Energy-aware Pruning
- 2017-CVPR-Spatially Adaptive Computation Time for Residual Networks
- 2017-CVPR-Network Sketching: Exploiting Binary Structure in Deep CNNs
- 2017-CVPR-A Compact DNN: Approaching GoogLeNet-Level Accuracy of Classification and Domain Adaptation
- 2017-ICCV-Channel pruning for accelerating very deep neural networks [Caffe Code]
- 2017-ICCV-Learning efficient convolutional networks through network slimming [PyTorch Code]
- 2017-ICCV-ThiNet: A filter level pruning method for deep neural network compression [Project] [Caffe Code] [2018 TPAMI version]
- 2017-ICCV-Interleaved group convolutions
- 2017-ICCV-Coordinating Filters for Faster Deep Neural Networks [Caffe Code]
- 2017-ICCV-Performance Guaranteed Network Acceleration via High-Order Residual Quantization
- 2017-NIPS-Net-trim: Convex pruning of deep neural networks with performance guarantee [Code] (Journal version: 2020-SIAM-Fast Convex Pruning of Deep Neural Networks)
- 2017-NIPS-Runtime neural pruning
- 2017-NIPS-Learning to Prune Deep Neural Networks via Layer-wise Optimal Brain Surgeon [Code]
- 2017-NIPS-Federated Multi-Task Learning
- 2017-NIPS-Towards Accurate Binary Convolutional Neural Network
- 2017-NIPS-Soft-to-Hard Vector Quantization for End-to-End Learning Compressible Representations
- 2017-NIPS-TernGrad: Ternary Gradients to Reduce Communication in Distributed Deep Learning
- 2017-NIPS-Flexpoint: An Adaptive Numerical Format for Efficient Training of Deep Neural Networks
- 2017-NIPS-Training Quantized Nets: A Deeper Understanding
- 2017-NIPS-The Reversible Residual Network: Backpropagation Without Storing Activations [Code]
- 2017-NIPS-Compression-aware Training of Deep Networks
- 2017-FPGA-ESE: efficient speech recognition engine with compressed LSTM on FPGA [Best paper!]
- 2017-AISTATS-Communication-Efficient Learning of Deep Networks from Decentralized Data
- 2017-ICASSP-Accelerating Deep Convolutional Networks using low-precision and sparsity
- 2017-NNs-Nonredundant sparse feature extraction using autoencoders with receptive fields clustering
- 2017.02-The Power of Sparsity in Convolutional Neural Networks
- 2017.07-Stochastic, Distributed and Federated Optimization for Machine Learning
- 2017.05-Structural Compression of Convolutional Neural Networks Based on Greedy Filter Pruning
- 2017.07-Extremely Low Bit Neural Network: Squeeze the Last Bit Out with ADMM
- 2017.11-GPU Kernels for Block-Sparse Weights [Code] (OpenAI)
- 2017.11-Block-sparse recurrent neural networks
2018
- 2018-AAAI-Auto-balanced Filter Pruning for Efficient Convolutional Neural Networks
- 2018-AAAI-Deep Neural Network Compression with Single and Multiple Level Quantization
- 2018-AAAI-Dynamic Deep Neural Networks_Optimizing Accuracy-Efficiency Trade-offs by Selective Execution
- 2018-ICLRo-Training and Inference with Integers in Deep Neural Networks
- 2018-ICLR-Rethinking the Smaller-Norm-Less-Informative Assumption in Channel Pruning of Convolution Layers
- 2018-ICLR-N2N learning: Network to Network Compression via Policy Gradient Reinforcement Learning
- 2018-ICLR-Model compression via distillation and quantization
- 2018-ICLR-Towards Image Understanding from Deep Compression Without Decoding
- 2018-ICLR-Deep Gradient Compression: Reducing the Communication Bandwidth for Distributed Training
- 2018-ICLR-Mixed Precision Training of Convolutional Neural Networks using Integer Operations
- 2018-ICLR-Mixed Precision Training
- 2018-ICLR-Apprentice: Using Knowledge Distillation Techniques To Improve Low-Precision Network Accuracy
- 2018-ICLR-Loss-aware Weight Quantization of Deep Networks
- 2018-ICLR-Alternating Multi-bit Quantization for Recurrent Neural Networks
- 2018-ICLR-Adaptive Quantization of Neural Networks
- 2018-ICLR-Variational Network Quantization
- 2018-ICLR-Espresso: Efficient Forward Propagation for Binary Deep Neural Networks
- 2018-ICLR-Learning to share: Simultaneous parameter tying and sparsification in deep learning
- 2018-ICLR-Learning Sparse Neural Networks through L0 Regularization
- 2018-ICLR-WRPN: Wide Reduced-Precision Networks
- 2018-ICLR-Deep rewiring: Training very sparse deep networks
- 2018-ICLR-Efficient sparse-winograd convolutional neural networks [Code]
- 2018-ICLR-Learning Intrinsic Sparse Structures within Long Short-term Memory
- 2018-ICLR-Multi-scale dense networks for resource efficient image classification
- 2018-ICLR-Compressing Word Embedding via Deep Compositional Code Learning
- 2018-ICLR-Learning Discrete Weights Using the Local Reparameterization Trick
- 2018-ICLR-Training wide residual networks for deployment using a single bit for each weight
- 2018-ICLR-The High-Dimensional Geometry of Binary Neural Networks
- 2018-ICLRw-To Prune, or Not to Prune: Exploring the Efficacy of Pruning for Model Compression (Similar topic: 2018-NIPSw-nip in the bud, 2018-NIPSw-rethink)
- 2018-CVPR-Context-Aware Deep Feature Compression for High-Speed Visual Tracking
- 2018-CVPR-NISP: Pruning Networks using Neuron Importance Score Propagation
- 2018-CVPR-Condensenet: An efficient densenet using learned group convolutions [Code]
- 2018-CVPR-Shift: A zero flop, zero parameter alternative to spatial convolutions
- 2018-CVPR-Explicit Loss-Error-Aware Quantization for Low-Bit Deep Neural Networks
- 2018-CVPR-Interleaved structured sparse convolutional neural networks
- 2018-CVPR-Towards Effective Low-bitwidth Convolutional Neural Networks
- 2018-CVPR-CLIP-Q: Deep Network Compression Learning by In-Parallel Pruning-Quantization
- 2018-CVPR-Blockdrop: Dynamic inference paths in residual networks
- 2018-CVPR-Nestednet: Learning nested sparse structures in deep neural networks
- 2018-CVPR-Stochastic downsampling for cost-adjustable inference and improved regularization in convolutional networks
- 2018-CVPR-Wide Compression: Tensor Ring Nets
- 2018-CVPR-Learning Compact Recurrent Neural Networks With Block-Term Tensor Decomposition
- 2018-CVPR-Learning Time/Memory-Efficient Deep Architectures With Budgeted Super Networks
- 2018-CVPR-HydraNets: Specialized Dynamic Architectures for Efficient Inference
- 2018-CVPR-SYQ: Learning Symmetric Quantization for Efficient Deep Neural Networks
- 2018-CVPR-Towards Effective Low-Bitwidth Convolutional Neural Networks
- 2018-CVPR-Two-Step Quantization for Low-Bit Neural Networks
- 2018-CVPR-Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference
- 2018-CVPR-"Learning-Compression" Algorithms for Neural Net Pruning
- 2018-CVPR-PackNet: Adding Multiple Tasks to a Single Network by Iterative Pruning [Code]
- 2018-CVPR-MorphNet: Fast & Simple Resource-Constrained Structure Learning of Deep Networks [Code]
- 2018-CVPR-ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices
- 2018-CVPRw-Squeezenext: Hardware-aware neural network design
- 2018-IJCAI-Efficient DNN Neuron Pruning by Minimizing Layer-wise Nonlinear Reconstruction Error
- 2018-IJCAI-Soft Filter Pruning for Accelerating Deep Convolutional Neural Networks [PyTorch Code]
- 2018-IJCAI-Where to Prune: Using LSTM to Guide End-to-end Pruning
- 2018-IJCAI-Accelerating Convolutional Networks via Global & Dynamic Filter Pruning
- 2018-IJCAI-Optimization based Layer-wise Magnitude-based Pruning for DNN Compression
- 2018-IJCAI-Progressive Blockwise Knowledge Distillation for Neural Network Acceleration
- 2018-IJCAI-Complementary Binary Quantization for Joint Multiple Indexing
- 2018-ICML-Compressing Neural Networks using the Variational Information Bottleneck
- 2018-ICML-DCFNet: Deep Neural Network with Decomposed Convolutional Filters
- 2018-ICML-Deep k-Means Re-Training and Parameter Sharing with Harder Cluster Assignments for Compressing Deep Convolutions
- 2018-ICML-Error Compensated Quantized SGD and its Applications to Large-scale Distributed Optimization
- 2018-ICML-High Performance Zero-Memory Overhead Direct Convolutions
- 2018-ICML-Kronecker Recurrent Units
- 2018-ICML-Weightless: Lossy weight encoding for deep neural network compression
- 2018-ICML-StrassenNets: Deep learning with a multiplication budget
- 2018-ICML-Learning Compact Neural Networks with Regularization
- 2018-ICML-WSNet: Compact and Efficient Networks Through Weight Sampling
- 2018-ICML-Gradually Updated Neural Networks for Large-Scale Image Recognition [Code]
- 2018-ICML-On the Optimization of Deep Networks: Implicit Acceleration by Overparameterization
- 2018-ICML-Understanding and simplifying one-shot architecture search
- 2018-ECCV-A Systematic DNN Weight Pruning Framework using Alternating Direction Method of Multipliers [Code]
- 2018-ECCV-Coreset-Based Neural Network Compression
- 2018-ECCV-Data-Driven Sparse Structure Selection for Deep Neural Networks [MXNet Code]
- 2018-ECCV-Training Binary Weight Networks via Semi-Binary Decomposition
- 2018-ECCV-Learning Compression from Limited Unlabeled Data
- 2018-ECCV-Constraint-Aware Deep Neural Network Compression
- 2018-ECCV-Sparsely Aggregated Convolutional Networks
- 2018-ECCV-Deep Expander Networks: Efficient Deep Networks from Graph Theory [Code]
- 2018-ECCV-SparseNet-Sparsely Aggregated Convolutional Networks [Code]
- 2018-ECCV-Ask, acquire, and attack: Data-free uap generation using class impressions
- 2018-ECCV-Netadapt: Platform-aware neural network adaptation for mobile applications
- 2018-ECCV-Clustering Convolutional Kernels to Compress Deep Neural Networks
- 2018-ECCV-Bi-Real Net: Enhancing the Performance of 1-bit CNNs With Improved Representational Capability and Advanced Training Algorithm
- 2018-ECCV-Extreme Network Compression via Filter Group Approximation
- 2018-ECCV-Convolutional Networks with Adaptive Inference Graphs
- 2018-ECCV-SkipNet: Learning Dynamic Routing in Convolutional Networks [Code]
- 2018-ECCV-Value-aware Quantization for Training and Inference of Neural Networks
- 2018-ECCV-LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks
- 2018-ECCV-AMC: AutoML for Model Compression and Acceleration on Mobile Devices
- 2018-ECCV-Piggyback: Adapting a single network to multiple tasks by learning to mask weights
- 2018-BMVCo-Structured Probabilistic Pruning for Convolutional Neural Network Acceleration
- 2018-BMVC-Efficient Progressive Neural Architecture Search
- 2018-BMVC-Igcv3: Interleaved lowrank group convolutions for efficient deep neural networks
- 2018-NIPS-Discrimination-aware Channel Pruning for Deep Neural Networks
- 2018-NIPS-Frequency-Domain Dynamic Pruning for Convolutional Neural Networks
- 2018-NIPS-ChannelNets: Compact and Efficient Convolutional Neural Networks via Channel-Wise Convolutions
- 2018-NIPS-DropBlock: A regularization method for convolutional networks
- 2018-NIPS-Constructing fast network through deconstruction of convolution
- 2018-NIPS-Learning Versatile Filters for Efficient Convolutional Neural Networks [Code]
- 2018-NIPS-Moonshine: Distilling with cheap convolutions
- 2018-NIPS-HitNet: hybrid ternary recurrent neural network
- 2018-NIPS-FastGRNN: A Fast, Accurate, Stable and Tiny Kilobyte Sized Gated Recurrent Neural Network
- 2018-NIPS-Training DNNs with Hybrid Block Floating Point
- 2018-NIPS-Reversible Recurrent Neural Networks
- 2018-NIPS-Synaptic Strength For Convolutional Neural Network
- 2018-NIPS-Learning sparse neural networks via sensitivity-driven regularization
- 2018-NIPS-Multi-Task Zipping via Layer-wise Neuron Sharing
- 2018-NIPS-A Linear Speedup Analysis of Distributed Deep Learning with Sparse and Quantized Communication
- 2018-NIPS-Gradient Sparsification for Communication-Efficient Distributed Optimization
- 2018-NIPS-GradiVeQ: Vector Quantization for Bandwidth-Efficient Gradient Aggregation in Distributed CNN Training
- 2018-NIPS-ATOMO: Communication-efficient Learning via Atomic Sparsification
- 2018-NIPS-Norm matters: efficient and accurate normalization schemes in deep networks
- 2018-NIPS-Sparsified SGD with memory
- 2018-NIPS-Pelee: A Real-Time Object Detection System on Mobile Devices
- 2018-NIPS-Scalable methods for 8-bit training of neural networks
- 2018-NIPS-TETRIS: TilE-matching the TRemendous Irregular Sparsity
- 2018-NIPS-Training deep neural networks with 8-bit floating point numbers
- 2018-NIPS-Multiple instance learning for efficient sequential data classification on resource-constrained devices
- 2018-NIPS-Sparse dnns with improved adversarial robustness
- 2018-NIPSw-Pruning neural networks: is it time to nip it in the bud?
- 2018-NIPSw-Rethinking the Value of Network Pruning [2019 ICLR version] [PyTorch Code]
- 2018-NIPSw-Structured Pruning for Efficient ConvNets via Incremental Regularization [2019 IJCNN version] [Caffe Code]
- 2018-NIPSw-Adaptive Mixture of Low-Rank Factorizations for Compact Neural Modeling
- 2018-NIPSw-Learning Sparse Networks Using Targeted Dropout [OpenReview] [Code]
- 2018-WACV-Recovering from Random Pruning: On the Plasticity of Deep Convolutional Neural Networks
- 2018.05-Compression of Deep Convolutional Neural Networks under Joint Sparsity Constraints
- 2018.05-AutoPruner: An End-to-End Trainable Filter Pruning Method for Efficient Deep Model Inference
- 2018.10-A Closer Look at Structured Pruning for Neural Network Compression [Code]
- 2018.11-Second-order Optimization Method for Large Mini-batch: Training ResNet-50 on ImageNet in 35 Epochs
- 2018.11-PydMobileNet: Improved Version of MobileNets with Pyramid Depthwise Separable Convolution
2019
- 2019-MLSys-Towards Federated Learning at Scale: System Design
- 2019-MLSys-To compress or not to compress: Understanding the Interactions between Adversarial Attacks and Neural Network Compression
- 2019-ICLR-Slimmable Neural Networks [Code]
- 2019-ICLR-Defensive Quantization: When Efficiency Meets Robustness
- 2019-ICLR-Minimal Random Code Learning: Getting Bits Back from Compressed Model Parameters [Code]
- 2019-ICLR-ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware [Code]
- 2019-ICLR-SNIP: Single-shot Network Pruning based on Connection Sensitivity
- 2019-ICLR-Non-vacuous Generalization Bounds at the ImageNet Scale: a PAC-Bayesian Compression Approach
- 2019-ICLR-Dynamic Channel Pruning: Feature Boosting and Suppression
- 2019-ICLR-Energy-Constrained Compression for Deep Neural Networks via Weighted Sparse Projection and Layer Input Masking
- 2019-ICLR-RotDCF: Decomposition of Convolutional Filters for Rotation-Equivariant Deep Networks
- 2019-ICLR-Dynamic Sparse Graph for Efficient Deep Learning
- 2019-ICLR-Big-Little Net: An Efficient Multi-Scale Feature Representation for Visual and Speech Recognition
- 2019-ICLR-Data-Dependent Coresets for Compressing Neural Networks with Applications to Generalization Bounds
- 2019-ICLR-Learning Recurrent Binary/Ternary Weights
- 2019-ICLR-Double Viterbi: Weight Encoding for High Compression Ratio and Fast On-Chip Reconstruction for Deep Neural Network
- 2019-ICLR-Relaxed Quantization for Discretized Neural Networks
- 2019-ICLR-Integer Networks for Data Compression with Latent-Variable Models
- 2019-ICLR-Minimal Random Code Learning: Getting Bits Back from Compressed Model Parameters
- 2019-ICLR-Analysis of Quantized Models
- 2019-ICLR-DARTS: Differentiable Architecture Search [Code]
- 2019-ICLR-Graph HyperNetworks for Neural Architecture Search
- 2019-ICLR-Learnable Embedding Space for Efficient Neural Architecture Compression [Code]
- 2019-ICLR-Efficient Multi-Objective Neural Architecture Search via Lamarckian Evolution
- 2019-ICLR-SNAS: stochastic neural architecture search
- 2019-AAAIo-A layer decomposition-recomposition framework for neuron pruning towards accurate lightweight networks
- 2019-AAAI-Balanced Sparsity for Efficient DNN Inference on GPU [Code]
- 2019-AAAI-CircConv: A Structured Convolution with Low Complexity
- 2019-AAAI-Regularized Evolution for Image Classifier Architecture Search
- 2019-AAAI-Universal Approximation Property and Equivalence of Stochastic Computing-Based Neural Networks and Binary Neural Networks
- 2019-WACV-DAC: Data-free Automatic Acceleration of Convolutional Networks
- 2019-ASPLOS-Packing Sparse Convolutional Neural Networks for Efficient Systolic Array Implementations: Column Combining Under Joint Optimization
- 2019-CVPRo-HAQ: hardware-aware automated quantization
- 2019-CVPRo-Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration [Code]
- 2019-CVPR-All You Need is a Few Shifts: Designing Efficient Convolutional Neural Networks for Image Classification
- 2019-CVPR-Importance Estimation for Neural Network Pruning [Code]
- 2019-CVPR-HetConv Heterogeneous Kernel-Based Convolutions for Deep CNNs
- 2019-CVPR-Fully Learnable Group Convolution for Acceleration of Deep Neural Networks
- 2019-CVPR-Towards Optimal Structured CNN Pruning via Generative Adversarial Learning
- 2019-CVPR-ChamNet: Towards Efficient Network Design through Platform-Aware Model Adaptation
- 2019-CVPR-Partial Order Pruning: for Best Speed/Accuracy Trade-off in Neural Architecture Search [Code]
- 2019-CVPR-Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation [Code]
- 2019-CVPR-MnasNet: Platform-Aware Neural Architecture Search for Mobile [Code]
- 2019-CVPR-MFAS: Multimodal Fusion Architecture Search
- 2019-CVPR-A Neurobiological Evaluation Metric for Neural Network Model Search
- 2019-CVPR-Fast Neural Architecture Search of Compact Semantic Segmentation Models via Auxiliary Cells
- 2019-CVPR-Efficient Neural Network Compression [Code]
- 2019-CVPR-T-Net: Parametrizing Fully Convolutional Nets with a Single High-Order Tensor
- 2019-CVPR-Centripetal SGD for Pruning Very Deep Convolutional Networks with Complicated Structure [Code]
- 2019-CVPR-DSC: Dense-Sparse Convolution for Vectorized Inference of Convolutional Neural Networks
- 2019-CVPR-DupNet: Towards Very Tiny Quantized CNN With Improved Accuracy for Face Detection
- 2019-CVPR-ECC: Platform-Independent Energy-Constrained Deep Neural Network Compression via a Bilinear Regression Model
- 2019-CVPR-Variational Convolutional Neural Network Pruning
- 2019-CVPR-Accelerating Convolutional Neural Networks via Activation Map Compression
- 2019-CVPR-Compressing Convolutional Neural Networks via Factorized Convolutional Filters
- 2019-CVPR-Deep Virtual Networks for Memory Efficient Inference of Multiple Tasks
- 2019-CVPR-Exploiting Kernel Sparsity and Entropy for Interpretable CNN Compression
- 2019-CVPR-MBS: Macroblock Scaling for CNN Model Reduction
- 2019-CVPR-On Implicit Filter Level Sparsity in Convolutional Neural Networks
- 2019-CVPR-Structured Pruning of Neural Networks With Budget-Aware Regularization
- 2019-CVPRo-Neural Rejuvenation: Improving Deep Network Training by Enhancing Computational Resource Utilization [Code]
- 2019-ICML-Approximated Oracle Filter Pruning for Destructive CNN Width Optimization [Code]
- 2019-ICML-EigenDamage: Structured Pruning in the Kronecker-Factored Eigenbasis [PyTorch Code]
- 2019-ICML-Zero-Shot Knowledge Distillation in Deep Networks [Code]
- 2019-ICML-LegoNet: Efficient Convolutional Neural Networks with Lego Filters [Code]
- 2019-ICML-EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks [Code]
- 2019-ICML-Collaborative Channel Pruning for Deep Networks
- 2019-ICML-Training CNNs with Selective Allocation of Channels
- 2019-ICML-NAS-Bench-101: Towards Reproducible Neural Architecture Search [Code]
- 2019-ICML-Learning fast algorithms for linear transforms using butterfly factorizations
- 2019-ICMLw-Towards Learning of Filter-Level Heterogeneous Compression of Convolutional Neural Networks [Code] (AutoML workshop)
- 2019-IJCAI-Play and Prune: Adaptive Filter Pruning for Deep Model Compression
- 2019-BigComp-Towards Robust Compressed Convolutional Neural Networks
- 2019-ICCV-Rethinking ImageNet Pre-training
- 2019-ICCV-Universally Slimmable Networks and Improved Training Techniques
- 2019-ICCV-MetaPruning: Meta Learning for Automatic Neural Network Channel Pruning [Code]
- 2019-ICCV-Progressive Differentiable Architecture Search: Bridging the Depth Gap between Search and Evaluation [Code]
- 2019-ICCV-Data-Free Quantization through Weight Equalization and Bias Correction
- 2019-ICCV-ACNet: Strengthening the Kernel Skeletons for Powerful CNN via Asymmetric Convolution Blocks
- 2019-ICCV-Adversarial Robustness vs. Model Compression, or Both? [PyTorch Code]
- 2019-NIPS-Global Sparse Momentum SGD for Pruning Very Deep Neural Networks
- 2019-NIPS-Model Compression with Adversarial Robustness: A Unified Optimization Framework
- 2019-NIPS-AutoPrune: Automatic Network Pruning by Regularizing Auxiliary Parameters
- 2019-NIPS-Double Quantization for Communication-Efficient Distributed Optimization
- 2019-NIPS-Focused Quantization for Sparse CNNs
- 2019-NIPS-E2-Train: Training State-of-the-art CNNs with Over 80% Energy Savings
- 2019-NIPS-MetaQuant: Learning to Quantize by Learning to Penetrate Non-differentiable Quantization
- 2019-NIPS-Random Projections with Asymmetric Quantization
- 2019-NIPS-Network Pruning via Transformable Architecture Search [Code]
- 2019-NIPS-Point-Voxel CNN for Efficient 3D Deep Learning [Code]
- 2019-NIPS-Gate Decorator: Global Filter Pruning Method for Accelerating Deep Convolutional Neural Networks [PyTorch Code]
- 2019-NIPS-A Mean Field Theory of Quantized Deep Networks: The Quantization-Depth Trade-Off
- 2019-NIPS-Qsparse-local-SGD: Distributed SGD with Quantization, Sparsification and Local Computations
- 2019-NIPS-Post training 4-bit quantization of convolutional networks for rapid-deployment
- 2019-PR-Filter-in-Filter: Improve CNNs in a Low-cost Way by Sharing Parameters among the Sub-filters of a Filter
- 2019-PRL-BDNN: Binary Convolution Neural Networks for Fast Object Detection
- 2019-TNNLS-Towards Compact ConvNets via Structure-Sparsity Regularized Filter Pruning [Code]
- 2019.03-Network Slimming by Slimmable Networks: Towards One-Shot Architecture Search for Channel Numbers [Code]
- 2019.03-Single Path One-Shot Neural Architecture Search with Uniform Sampling
- 2019.04-Resource Efficient 3D Convolutional Neural Networks
- 2019.04-Meta Filter Pruning to Accelerate Deep Convolutional Neural Networks
- 2019.04-Knowledge Squeezed Adversarial Network Compression
- 2019.05-Dynamic Neural Network Channel Execution for Efficient Training
- 2019.06-AutoGrow: Automatic Layer Growing in Deep Convolutional Networks
- 2019.06-BasisConv: A method for compressed representation and learning in CNNs
- 2019.06-BlockSwap: Fisher-guided Block Substitution for Network Compression
- 2019.06-Separable Layers Enable Structured Efficient Linear Substitutions [Code]
- 2019.06-Butterfly Transform: An Efficient FFT Based Neural Architecture Design
- 2019.06-A Taxonomy of Channel Pruning Signals in CNNs
- 2019.08-Adversarial Neural Pruning with Latent Vulnerability Suppression
- 2019.09-Training convolutional neural networks with cheap convolutions and online distillation
- 2019.09-Pruning from Scratch
- 2019.11-Adversarial Interpolation Training: A Simple Approach for Improving Model Robustness
- 2019.11-A Programmable Approach to Model Compression [Code]
2020
- 2020-AAAI-Pconv: The missing but desirable sparsity in dnn weight pruning for real-time execution on mobile devices
- 2020-AAAI-Channel Pruning Guided by Classification Loss and Feature Importance
- 2020-AAAI-Pruning from Scratch
- 2020-AAAI-Harmonious Coexistence of Structured Weight Pruning and Ternarization for Deep Neural Networks
- 2020-AAAI-AutoCompress: An Automatic DNN Structured Pruning Framework for Ultra-High Compression Rates
- 2020-AAAI-DARB: A Density-Adaptive Regular-Block Pruning for Deep Neural Networks
- 2020-AAAI-Real-Time Object Tracking via Meta-Learning: Efficient Model Adaptation and One-Shot Channel Pruning
- 2020-AAAI-Dynamic Network Pruning with Interpretable Layerwise Channel Selection
- 2020-AAAI-Reborn Filters: Pruning Convolutional Neural Networks with Limited Data
- 2020-AAAI-Layerwise Sparse Coding for Pruned Deep Neural Networks with Extreme Compression Ratio
- 2020-AAAI-Sparsity-inducing Binarized Neural Networks
- 2020-AAAI-Structured Sparsification of Gated Recurrent Neural Networks
- 2020-AAAI-Hierarchical Knowledge Squeezed Adversarial Network Compression
- 2020-AAAI-Embedding Compression with Isotropic Iterative Quantization
- 2020-ICLR-Comparing Rewinding and Fine-tuning in Neural Network Pruning [Code]
- 2020-ICLR-Lookahead: A Far-sighted Alternative of Magnitude-based Pruning [Code]
- 2020-ICLR-Dynamic Model Pruning with Feedback
- 2020-ICLR-Provable Filter Pruning for Efficient Neural Networks
- 2020-ICLR-Data-Independent Neural Pruning via Coresets
- 2020-ICLR-FSNet: Compression of Deep Convolutional Neural Networks by Filter Summary
- 2020-ICLR-Probabilistic Connection Importance Inference and Lossless Compression of Deep Neural Networks
- 2020-ICLR-Neural Epitome Search for Architecture-Agnostic Network Compression
- 2020-ICLR-One-Shot Pruning of Recurrent Neural Networks by Jacobian Spectrum Evaluation
- 2020-ICLR-DeepHoyer: Learning Sparser Neural Network with Differentiable Scale-Invariant Sparsity Measures [Code]
- 2020-ICLR-Dynamic Sparse Training: Find Efficient Sparse Network From Scratch With Trainable Masked Layers
- 2020-ICLR-Scalable Model Compression by Entropy Penalized Reparameterization
- 2020-ICLR-A Signal Propagation Perspective for Pruning Neural Networks at Initialization
- 2020-CVPR-GhostNet: More Features from Cheap Operations [Code]
- 2020-CVPR-Filter Grafting for Deep Neural Networks
- 2020-CVPR-Low-rank Compression of Neural Nets: Learning the Rank of Each Layer
- 2020-CVPR-Structured Compression by Weight Encryption for Unstructured Pruning and Quantization
- 2020-CVPR-Learning Filter Pruning Criteria for Deep Convolutional Neural Networks Acceleration
- 2020-CVPR-APQ: Joint Search for Network Architecture, Pruning and Quantization Policy
- 2020-CVPR-Group Sparsity: The Hinge Between Filter Pruning and Decomposition for Network Compression [Code]
- 2020-CVPR-Neural Network Pruning With Residual-Connections and Limited-Data
- 2020-CVPR-Multi-Dimensional Pruning: A Unified Framework for Model Compression
- 2020-CVPR-Discrete Model Compression With Resource Constraint for Deep Neural Networks
- 2020-CVPR-Automatic Neural Network Compression by Sparsity-Quantization Joint Learning: A Constrained Optimization-Based Approach
- 2020-CVPR-Low-Rank Compression of Neural Nets: Learning the Rank of Each Layer
- 2020-CVPR-The Knowledge Within: Methods for Data-Free Model Compression
- 2020-CVPR-GAN Compression: Efficient Architectures for Interactive Conditional GANs [Code]
- 2020-CVPR-Few Sample Knowledge Distillation for Efficient Network Compression
- 2020-CVPR-Fast sparse convnets
- 2020-CVPR-Structured Multi-Hashing for Model Compression
- 2020-CVPRo-AdderNet: Do We Really Need Multiplications in Deep Learning? [Code]
- 2020-CVPRo-Towards Efficient Model Compression via Learned Global Ranking [Code]
- 2020-CVPRo-HRank: Filter Pruning Using High-Rank Feature Map [Code]
- 2020-CVPRo-DaST: Data-free Substitute Training for Adversarial Attacks [Code]
- 2020-ICML-PENNI: Pruned Kernel Sharing for Efficient CNN Inference [Code]
- 2020-ICML-Operation-Aware Soft Channel Pruning using Differentiable Masks
- 2020-ICML-DropNet: Reducing Neural Network Complexity via Iterative Pruning
- 2020-ICML-Network Pruning by Greedy Subnetwork Selection
- 2020-ICML-AutoGAN-Distiller: Searching to Compress Generative Adversarial Networks
- 2020-ICML-Soft Threshold Weight Reparameterization for Learnable Sparsity [PyTorch Code]
- 2020-ICML-Activation sparsity: Inducing and exploiting activation sparsity for fast inference on deep neural networks
- 2020-EMNLP-Structured Pruning of Large Language Models [Code]
- 2020-NIPS-Pruning neural networks without any data by iteratively conserving synaptic flow
- 2020-NIPS-Neuron-level Structured Pruning using Polarization Regularizer
- 2020-NIPS-SCOP: Scientific Control for Reliable Neural Network Pruning
- 2020-NIPS-Directional Pruning of Deep Neural Networks
- 2020-NIPS-Storage Efficient and Dynamic Flexible Runtime Channel Pruning via Deep Reinforcement Learning
- 2020-NIPS-Pruning Filter in Filter
- 2020-NIPS-HYDRA: Pruning Adversarially Robust Neural Networks
- 2020-NIPS-Movement Pruning: Adaptive Sparsity by Fine-Tuning
- 2020-NIPS-Sanity-Checking Pruning Methods: Random Tickets can Win the Jackpot
- 2020-NIPS-Position-based Scaled Gradient for Model Quantization and Pruning
- 2020-NIPS-The Generalization-Stability Tradeoff In Neural Network Pruning
- 2020-NIPS-FleXOR: Trainable Fractional Quantization
- 2020-NIPS-Adaptive Gradient Quantization for Data-Parallel SGD
- 2020-NIPS-Robust Quantization: One Model to Rule Them All
- 2020-NIPS-HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks
- 2020-NIPS-Efficient Exact Verification of Binarized Neural Networks
- 2020-NIPS-Ultra-Low Precision 4-bit Training of Deep Neural Networks
- 2020-NIPS-Path Sample-Analytic Gradient Estimators for Stochastic Binary Networks
- 2020-NIPS-Fast fourier convolution
2021
- 2021-WACV-CAP: Context-Aware Pruning for Semantic Segmentation [Code]
- 2021-AAAI-Few Shot Network Compression via Cross Distillation
- 2021-AAAI-Conditional Channel Pruning for Automated Model Compression [Code]
- 2021-ICLR-Neural Pruning via Growing Regularization [PyTorch Code]
- 2021-ICLR-Network Pruning That Matters: A Case Study on Retraining Variants
- 2021-ICLR-ChipNet: Budget-Aware Pruning with Heaviside Continuous Approximations
- 2021-ICLR-A Gradient Flow Framework For Analyzing Network Pruning (Spotlight)
- 2021-CVPR-Towards Compact CNNs via Collaborative Compression
- 2021-CVPR-Manifold Regularized Dynamic Network Pruning
- 2021-CVPR-Learnable Companding Quantization for Accurate Low-bit Neural Networks
- 2021-CVPR-Diversifying Sample Generation for Accurate Data-Free Quantization
- 2021-CVPR-Zero-shot Adversarial Quantization [Oral] [Code]
- 2021-CVPR-Network Quantization with Element-wise Gradient Scaling [Project]
- 2021-ICML-Group Fisher Pruning for Practical Network Compression [Code]
- 2021-ICML-Accelerate CNNs from Three Dimensions: A Comprehensive Pruning Framework
- 2021-ICML-A Probabilistic Approach to Neural Network Pruning
- 2021-ICML-On the Predictability of Pruning Across Scales
- 2021-ICML-Sparsifying Networks via Subdifferential Inclusion
- 2021-ICML-Selfish Sparse RNN Training [Code]
- 2021-ICML-Do We Actually Need Dense Over-Parameterization? In-Time Over-Parameterization in Sparse Training [Code]
- 2021-ICML-Training Adversarially Robust Sparse Networks via Bayesian Connectivity Sampling
- 2021-ICML-ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training
- 2021-ICML-Leveraging Sparse Linear Layers for Debuggable Deep Networks
- 2021-ICML-PHEW: Constructing Sparse Networks that Learn Fast and Generalize Well without Training Data
- 2021-ICML-BASE Layers: Simplifying Training of Large, Sparse Models [Code]
- 2021-ICML-Dense for the Price of Sparse: Improved Performance of Sparsely Initialized Networks via a Subspace Offset
- 2021-ICML-I-BERT: Integer-only BERT Quantization
- 2021-ICML-Training Quantized Neural Networks to Global Optimality via Semidefinite Programming
- 2021-ICML-Differentiable Dynamic Quantization with Mixed Precision and Adaptive Resolution
- 2021-ICML-Communication-Efficient Distributed Optimization with Quantized Preconditioners
- 2021-NIPS-Aligned Structured Sparsity Learning for Efficient Image Super-Resolution [Code] (Spotlight!)
- 2021-NIPS-Scatterbrain: Unifying Sparse and Low-rank Attention [Code]
- 2021-NIPS-Only Train Once: A One-Shot Neural Network Training And Pruning Framework [Code]
- 2021-NIPS-CHIP: CHannel Independence-based Pruning for Compact Neural Networks [Code]
- 2021.5-Dynamical Isometry: The Missing Ingredient for Neural Network Pruning
2022
- 2022-AAAI-Federated Dynamic Sparse Training: Computing Less, Communicating Less, Yet Learning Better [Code]
- 2022-ICLR-Pixelated Butterfly: Simple and Efficient Sparse training for Neural Network Models
- 2022-NIPS-Pruning has a disparate impact on model accuracy
- 2022-NIPS-Pruning Neural Networks via Coresets and Convex Geometry: Towards No Assumptions
- 2022-NIPS-Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning
- 2022-NIPS-Data-Efficient Structured Pruning via Submodular Optimization
- 2022-NIPS-A Fast Post-Training Pruning Framework for Transformers
- 2022-NIPS-SAViT: Structure-Aware Vision Transformer Pruning via Collaborative Optimization
- 2022-NIPS-Recall Distortion in Neural Network Pruning and the Undecayed Pruning Algorithm
- 2022-NIPS-Structural Pruning via Latency-Saliency Knapsack
- 2022-NIPS-Sparse Probabilistic Circuits via Pruning and Growing
- 2022-NIPS-Back Razor: Memory-Efficient Transfer Learning by Self-Sparsified Backpropagation
- 2022-NIPS-SInGE: Sparsity via Integrated Gradients Estimation of Neuron Relevance
- 2022-NIPS-VTC-LFC: Vision Transformer Compression with Low-Frequency Components
- 2022-NIPS-Weighted Mutual Learning with Diversity-Driven Model Compression
- 2022-NIPS-Resource-Adaptive Federated Learning with All-In-One Neural Composition
- 2022-NIPS-Controlled Sparsity via Constrained Optimization or: How I Learned to Stop Tuning Penalties and Love Constraints
- 2022-NIPS-On Measuring Excess Capacity in Neural Networks
- 2022-NIPS-Prune and distill: similar reformatting of image information along rat visual cortex and deep neural networks
- 2022-NIPS-Deep Compression of Pre-trained Transformer Models
- 2022-NIPS-Sparsity in Continuous-Depth Neural Networks
- 2022-NIPS-Spartan: Differentiable Sparsity via Regularized Transportation
- 2022-NIPS-Accelerated Projected Gradient Algorithms for Sparsity Constrained Optimization Problems
- 2022-NIPS-Feature Learning in L2-regularized DNNs: Attraction/Repulsion and Sparsity
- 2022-NIPS-Learning Best Combination for Efficient N:M Sparsity
- 2022-NIPS-Accelerating Sparse Convolution with Column Vector-Wise Sparsity
- 2022-NIPS-Differentially Private Model Compression
- 2022-NIPS-Transformers meet Stochastic Block Models: Attention with Data-Adaptive Sparsity and Cost
- 2022-NIPS-A Win-win Deal: Towards Sparse and Robust Pre-trained Language Models
- 2022-NIPS-Make Sharpness-Aware Minimization Stronger: A Sparsified Perturbation Approach
- 2022-NIPS-Learning sparse features can lead to overfitting in neural networks
- 2022-NIPS-Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis
- 2022-NIPS-EfficientFormer: Vision Transformers at MobileNet Speed
- 2022-NIPS-Revisiting Sparse Convolutional Model for Visual Recognition
- 2022-NIPS-Efficient Spatially Sparse Inference for Conditional GANs and Diffusion Models
- 2022-NIPS-A Theoretical View on Sparsely Activated Networks
- 2022-NIPS-Dynamic Sparse Network for Time Series Classification: Learning What to “See”
- 2022-NIPS-Spatial Pruned Sparse Convolution for Efficient 3D Object Detection
- 2022-NIPS-Sparse Structure Search for Delta Tuning
- 2022-NIPS-Beyond L1: Faster and Better Sparse Models with skglm
- 2022-NIPS-On the Representation Collapse of Sparse Mixture of Experts
- 2022-NIPS-M³ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task Learning with Model-Accelerator Co-design
- 2022-NIPS-On-Device Training Under 256KB Memory
- 2022-NIPS-Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning
2023
- 2023-ICLR-Trainability Preserving Neural Pruning [Code]
- 2023-ICLR-NTK-SAP: Improving neural network pruning by aligning training dynamics [Code]
- 2023-CVPR-DepGraph: Towards Any Structural Pruning [Code]
- 2023-CVPR-CP3: Channel Pruning Plug-in for Point-based Networks
Papers [Actual Acceleration via Sparsity]
- 2018-ICML-Efficient Neural Audio Synthesis
- 2018-NIPS-Tetris: Tile-matching the tremendous irregular sparsity
- 2021.4-Accelerating Sparse Deep Neural Networks (white paper from NVIDIA; a minimal 2:4 masking sketch follows this list)
- 2021-ICLR-Learning N:M Fine-grained Structured Sparse Neural Networks From Scratch [Code] [Slides]
- 2021-NIPS-Channel Permutations for N:M Sparsity [Code: NVIDIA ASP]
- 2021-NIPS-Accelerated Sparse Neural Training: A Provable and Efficient Method to Find N:M Transposable Masks
- 2022-NIPS-UDC: Unified DNAS for Compressible TinyML Models for Neural Processing Units
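
For reference, N:M sparsity (e.g., the 2:4 pattern accelerated by NVIDIA Sparse Tensor Cores) keeps at most N nonzero weights in every group of M consecutive weights. Below is a minimal illustrative sketch, assuming PyTorch, of projecting a weight matrix onto a 2:4 pattern by magnitude; the function name and shapes are placeholders, and real deployments would rely on tooling such as NVIDIA ASP and the papers listed above.

```python
# Minimal sketch: project a weight matrix onto a 2:4 (N:M) sparsity pattern by
# keeping the 2 largest-magnitude weights in every group of 4 along the input
# dimension. Illustrative only, not production tooling.
import torch

def prune_n_m(weight: torch.Tensor, n: int = 2, m: int = 4) -> torch.Tensor:
    out_features, in_features = weight.shape
    assert in_features % m == 0, "input dim must be divisible by M"
    groups = weight.reshape(out_features, in_features // m, m)   # (O, G, M)
    idx = groups.abs().topk(n, dim=-1).indices                   # top-N per group
    mask = torch.zeros_like(groups).scatter_(-1, idx, 1.0)       # keep-mask
    return (groups * mask).reshape(out_features, in_features)

w = torch.randn(8, 16)
w_24 = prune_n_m(w)   # every consecutive group of 4 now has at most 2 nonzeros
```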
Papers [Lottery Ticket Hypothesis (LTH)]
For LTH and other Pruning at Initialization papers, please refer to Awesome-Pruning-at-Initialization.
Papers [Bayesian Compression]
- 1995-Neural Computation-Bayesian Regularisation and Pruning using a Laplace Prior
- 1997-Neural Networks-Regularization with a Pruning Prior
- 2015-NIPS-Bayesian dark knowledge
- 2017-NIPS-Bayesian Compression for Deep Learning [Code]
- 2017-ICML-Variational dropout sparsifies deep neural networks
- 2017-NIPSo-Structured Bayesian Pruning via Log-Normal Multiplicative Noise
- 2017-ICMLw-Bayesian Sparsification of Recurrent Neural Networks
- 2020-NIPS-Bayesian Bits: Unifying Quantization and Pruning
Papers [Knowledge Distillation (KD)]
Before 2014
- 1996-Born again trees (proposed compressing neural networks and multiple-tree predictors by approximating them with a single tree)
- 2006-SIGKDD-Model compression
- 2010-ML-A theory of learning from different domains
2014
- 2014-NIPS-Do deep nets really need to be deep?
- 2014-NIPSw-Distilling the Knowledge in a Neural Network [Code] (a minimal soft-target loss sketch follows this block)
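
As a quick reference for the soft-target loss popularized by the 2014 entry above, here is a minimal sketch, assuming PyTorch, that combines temperature-softened KL against the teacher with hard-label cross-entropy; the temperature T and weight alpha are illustrative hyperparameters, not values prescribed by the paper.

```python
# Minimal sketch of the soft-target KD loss: hard-label cross-entropy plus
# temperature-softened KL divergence against the teacher's logits.
# T and alpha are illustrative hyperparameters.
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T: float = 4.0, alpha: float = 0.9):
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)                               # T^2 keeps the gradient scale comparable
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Usage with dummy tensors:
s = torch.randn(32, 10, requires_grad=True)   # student logits
t = torch.randn(32, 10)                       # teacher logits (detached in practice)
y = torch.randint(0, 10, (32,))
loss = kd_loss(s, t, y)
loss.backward()
```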
2016
- 2016-ICLR-Net2net: Accelerating learning via knowledge transfer
- 2016-ECCV-Accelerating convolutional neural networks with dominant convolutional kernel and knowledge pre-regression
2017
- 2017-ICLR-Paying more attention to attention: Improving the performance of convolutional neural networks via attention transfer
- 2017-ICLR-Do deep convolutional nets really need to be deep and convolutional?
- 2017-CVPR-A gift from knowledge distillation: Fast optimization, network minimization and transfer learning
- 2017-BMVC-Adapting models to signal degradation using distillation
- 2017-NIPS-Sobolev training for neural networks
- 2017-NIPS-Learning efficient object detection models with knowledge distillation
- 2017-NIPSw-Data-Free Knowledge Distillation for Deep Neural Networks [Code]
- 2017.07-Like What You Like: Knowledge Distill via Neuron Selectivity Transfer
- 2017.10-Knowledge Projection for Deep Neural Networks
- 2017.11-Distilling a Neural Network Into a Soft Decision Tree
- 2017.12-Data Distillation: Towards Omni-Supervised Learning
2018
- 2018-AAAI-DarkRank: Accelerating Deep Metric Learning via Cross Sample Similarities Transfer
- 2018-AAAI-Dynamic deep neural networks: Optimizing accuracy-efficiency trade-offs by selective execution
- 2018-AAAI-Rocket Launching: A Universal and Efficient Framework for Training Well-performing Light Net
- 2018-AAAI-Adversarial Learning of Portable Student Networks
- 2018-AAAI-Knowledge Distillation in Generations: More Tolerant Teachers Educate Better Students
- 2018-ICLR-Large scale distributed neural network training through online distillation
- 2018-CVPR-Deep mutual learning
- 2018-ICML-Born-Again Neural Networks
- 2018-IJCAI-Better and Faster: Knowledge Transfer from Multiple Self-supervised Learning Tasks via Graph Distillation for Video Classification
- 2018-ECCV-Learning deep representations with probabilistic knowledge transfer [Code]
- 2018-ECCV-Graph adaptive knowledge transfer for unsupervised domain adaptation
- 2018-SIGKDD-Towards Evolutionary Compression
- 2018-NIPS-KDGAN: knowledge distillation with generative adversarial networks [2019 TPAMI version]
- 2018-NIPS-Knowledge Distillation by On-the-Fly Native Ensemble
- 2018-NIPS-Paraphrasing Complex Network: Network Compression via Factor Transfer
- 2018-NIPSw-Variational Mutual Information Distillation for Transfer Learning (Continual Learning workshop)
- 2018-NIPSw-Transparent Model Distillation
- 2018.03-Interpreting Deep Classifier by Visual Distillation of Dark Knowledge
- 2018.11-Dataset Distillation [Code]
- 2018.12-Learning Student Networks via Feature Embedding
- 2018.12-Few Sample Knowledge Distillation for Efficient Network Compression
2019
- 2019-AAAI-Knowledge Distillation with Adversarial Samples Supporting Decision Boundary
- 2019-AAAI-Knowledge Transfer via Distillation of Activation Boundaries Formed by Hidden Neurons [Code]
- 2019-AAAI-Learning to Steer by Mimicking Features from Heterogeneous Auxiliary Networks [Code]
- 2019-CVPR-Knowledge Representing: Efficient, Sparse Representation of Prior Knowledge for Knowledge Distillation
- 2019-CVPR-Knowledge Distillation via Instance Relationship Graph
- 2019-CVPR-Variational Information Distillation for Knowledge Transfer
- 2019-CVPR-Learning Metrics from Teachers Compact Networks for Image Embedding [Code]
- 2019-ICCV-A Comprehensive Overhaul of Feature Distillation
- 2019-ICCV-Similarity-Preserving Knowledge Distillation
- 2019-ICCV-Correlation Congruence for Knowledge Distillation
- 2019-ICCV-Data-Free Learning of Student Networks
- 2019-ICCV-Learning Lightweight Lane Detection CNNs by Self Attention Distillation [Code]
- 2019-ICCV-Attention bridging network for knowledge transfer
- 2019-NIPS-Zero-shot Knowledge Transfer via Adversarial Belief Matching [Code] (spotlight)
- 2019.05-DistillHash: Unsupervised Deep Hashing by Distilling Data Pairs
2020
- 2020-ICLR-Contrastive Representation Distillation [Code]
- 2020-AAAI-A Knowledge Transfer Framework for Differentially Private Sparse Learning
- 2020-AAAI-Uncertainty-aware Multi-shot Knowledge Distillation for Image-based Object Re-identification
- 2020-AAAI-Improved Knowledge Distillation via Teacher Assistant
- 2020-AAAI-Knowledge Distillation from Internal Representations
- 2020-AAAI-Distilling Knowledge from Well-informed Soft Labels for Neural Relation Extraction
- 2020-AAAI-Online Knowledge Distillation with Diverse Peers
- 2020-AAAI-Ultrafast Video Attention Prediction with Coupled Knowledge Distillation
- 2020-AAAI-Graph Few-shot Learning via Knowledge Transfer
- 2020-AAAI-Diversity Transfer Network for Few-Shot Learning
- 2020-AAAI-Few Shot Network Compression via Cross Distillation
- 2020-ICLR-Knowledge Consistency between Neural Networks and Beyond
- 2020-ICLR-BlockSwap: Fisher-guided Block Substitution for Network Compression on a Budget
- 2020-ICLR-Ensemble Distribution Distillation
- 2020-CVPR-Collaborative Distillation for Ultra-Resolution Universal Style Transfer [Code]
- 2020-CVPR-Explaining Knowledge Distillation by Quantifying the Knowledge
- 2020-CVPR-Self-training with Noisy Student improves ImageNet classification [Code]
- 2020-CVPR-Neural Networks Are More Productive Teachers Than Human Raters: Active Mixup for Data-Efficient Knowledge Distillation From a Blackbox Model
- 2020-CVPR-Heterogeneous Knowledge Distillation Using Information Flow Modeling
- 2020-CVPR-Creating Something From Nothing: Unsupervised Knowledge Distillation for Cross-Modal Hashing
- 2020-CVPR-Revisiting Knowledge Distillation via Label Smoothing Regularization
- 2020-CVPR-Distilling Knowledge From Graph Convolutional Networks
- 2020-CVPR-MineGAN: Effective Knowledge Transfer From GANs to Target Domains With Few Images [Code]
- 2020-CVPRo-Dreaming to Distill: Data-Free Knowledge Transfer via DeepInversion [Code]
- 2020-CVPR-Online Knowledge Distillation via Collaborative Learning
- 2020-CVPR-Distilling Cross-Task Knowledge via Relationship Matching
- 2020-CVPR-Data-Free Knowledge Amalgamation via Group-Stack Dual-GAN
- 2020-CVPR-Regularizing Class-Wise Predictions via Self-Knowledge Distillation
- 2020-ICML-Feature-map-level Online Adversarial Knowledge Distillation
- 2020-NIPS-Self-Distillation as Instance-Specific Label Smoothing
- 2020-NIPS-Ensemble Distillation for Robust Model Fusion in Federated Learning
- 2020-NIPS-Self-Distillation Amplifies Regularization in Hilbert Space
- 2020-NIPS-MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers
- 2020-NIPS-Residual Distillation: Towards Portable Deep Neural Networks without Shortcuts
- 2020-NIPS-Kernel Based Progressive Distillation for Adder Neural Networks
- 2020-NIPS-Agree to Disagree: Adaptive Ensemble Knowledge Distillation in Gradient Space
- 2020-NIPS-Task-Oriented Feature Distillation
- 2020-NIPS-Comprehensive Attention Self-Distillation for Weakly-Supervised Object Detection
- 2020-NIPS-Distributed Distillation for On-Device Learning
- 2020-NIPS-Knowledge Distillation in Wide Neural Networks: Risk Bound, Data Efficiency and Imperfect Teacher
- 2020.12-Knowledge Distillation Thrives on Data Augmentation
- 2020.12-Multi-head Knowledge Distillation for Model Compression
2021
- 2021-AAAI-Cross-Layer Distillation with Semantic Calibration [Code]
- 2021-ICLR-Distilling Knowledge from Reader to Retriever for Question Answering
- 2021-ICLR-Improve Object Detection with Feature-based Knowledge Distillation: Towards Accurate and Efficient Detectors
- 2021-ICLR-Knowledge distillation via softmax regression representation learning [Code]
- 2021-ICLR-Knowledge Distillation as Semiparametric Inference
- 2021-ICLR-Is Label Smoothing Truly Incompatible with Knowledge Distillation: An Empirical Study
- 2021-ICLR-Rethinking Soft Labels for Knowledge Distillation: A Bias–Variance Tradeoff Perspective
- 2021-CVPR-Refine Myself by Teaching Myself: Feature Refinement via Self-Knowledge Distillation [PyTorch Code]
- 2021-CVPR-Complementary Relation Contrastive Distillation
- 2021-CVPR-Distilling Knowledge via Knowledge Review [Code]
- 2021-ICML-KD3A: Unsupervised Multi-Source Decentralized Domain Adaptation via Knowledge Distillation
- 2021-ICML-A statistical perspective on distillation
- 2021-ICML-Training data-efficient image transformers & distillation through attention
- 2021-ICML-Zero-Shot Knowledge Distillation from a Decision-Based Black-Box Model
- 2021-ICML-Data-Free Knowledge Distillation for Heterogeneous Federated Learning
- 2021-ICML-Simultaneous Similarity-based Self-Distillation for Deep Metric Learning
- 2021-NIPS-Slow Learning and Fast Inference: Efficient Graph Similarity Computation via Knowledge Distillation [Code]
2022
- 2022-ECCV-R2L: Distilling Neural Radiance Field to Neural Light Field for Efficient Novel View Synthesis [Code]
- 2022-NIPS-An Analytical Theory of Curriculum Learning in Teacher-Student Networks
Papers [AutoML (NAS etc.)]
- 2016.11-Neural architecture search with reinforcement learning
- 2019-CVPR-Searching for A Robust Neural Architecture in Four GPU Hours [Code]
- 2019-CVPR-FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search
- 2019-CVPR-RENAS: Reinforced Evolutionary Neural Architecture Search
- 2019-NIPS-Meta Architecture Search
- 2019-NIPS-SpArSe: Sparse Architecture Search for CNNs on Resource-Constrained Microcontrollers
- 2020-NIPS-Fast, Accurate, and Simple Models for Tabular Data via Augmented Distillation
- 2020-NIPS-Cream of the Crop: Distilling Prioritized Paths For One-Shot Neural Architecture Search
- 2020-NIPS-Theory-Inspired Path-Regularized Differential Network Architecture Search
- 2020-NIPS-ISTA-NAS: Efficient and Consistent Neural Architecture Search by Sparse Coding
- 2020-NIPS-Semi-Supervised Neural Architecture Search
- 2020-NIPS-Bridging the Gap between Sample-based and One-shot Neural Architecture Search with BONAS
- 2020-NIPS-Does Unsupervised Architecture Representation Learning Help Neural Architecture Search?
- 2020-NIPS-Differentiable Neural Architecture Search in Equivalent Space with Exploration Enhancement
- 2020-NIPS-CLEARER: Multi-Scale Neural Architecture Search for Image Restoration
- 2020-NIPS-A Study on Encodings for Neural Architecture Search
- 2020-NIPS-Auto-Panoptic: Cooperative Multi-Component Architecture Search for Panoptic Segmentation
- 2020-NIPS-Hierarchical Neural Architecture Search for Deep Stereo Matching
Papers [Interpretability]
- 2010-JMLR-How to explain individual classification decisions
- 2015-PLOS ONE-On Pixel-Wise Explanations for Non-Linear Classifier Decisions by Layer-Wise Relevance Propagation
- 2015-CVPR-Learning to generate chairs with convolutional neural networks
- 2015-CVPR-Understanding deep image representations by inverting them [2016 IJCV version: Visualizing deep convolutional neural networks using natural pre-images]
- 2016-CVPR-Inverting Visual Representations with Convolutional Networks
- 2016-KDD-"Why Should I Trust You?": Explaining the Predictions of Any Classifier
- 2016-ICMLw-The Mythos of Model Interpretability
- 2017-NIPSw-The (Un)reliability of saliency methods
- 2017-DSP-Methods for interpreting and understanding deep neural networks
- 2018-ICML-Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors
- 2018-CVPR-Deep Image Prior [Code]
- 2018-NIPSs-Sanity Checks for Saliency Maps
- 2018-NIPSs-Human-in-the-Loop Interpretability Prior
- 2018-NIPS-To Trust Or Not To Trust A Classifier [Code]
- 2019-AISTATS-Interpreting Black Box Predictions using Fisher Kernels
- 2019.05-Luck Matters: Understanding Training Dynamics of Deep ReLU Networks
- 2019.05-Adversarial Examples Are Not Bugs, They Are Features
- 2019.06-The Generalization-Stability Tradeoff in Neural Network Pruning
- 2019.06-One ticket to win them all: generalizing lottery ticket initializations across datasets and optimizers
- 2019-Book-Interpretable Machine Learning
Workshops
- 2017-ICML Tutorial: interpretable machine learning
- 2018-ICML Workshop: Efficient Credit Assignment in Deep Learning and Reinforcement Learning
- CDNNRIA Workshop (Compact Deep Neural Network Representation with Industrial Applications): 1st-2018-NIPSw, 2nd-2019-ICMLw
- LLD Workshop (Learning with Limited Data): 1st-2017-NIPSw, 2nd-2019-ICLRw
- WHI (Workshop on Human Interpretability in Machine Learning): 1st-2016-ICMLw, 2nd-2017-ICMLw, 3rd-2018-ICMLw
- NIPS-18 Workshop on Systems for ML and Open Source Software
- MLPCD Workshop (Machine Learning on the Phone and other Consumer Devices): 2nd-2018-NIPSw
- Workshop on Bayesian Deep Learning
- 2020 CVPR Workshop on NAS
Books & Courses
- TinyML and Efficient Deep Learning @MIT by Prof. Song Han
Lightweight DNN Engines/APIs
- NNPACK
- DMLC: Tensor Virtual Machine (TVM): Open Deep Learning Compiler Stack
- Tencent: NCNN
- Xiaomi: MACE, Mobile AI Benchmark
- Alibaba: MNN blog (in Chinese)
- Baidu: Paddle-Slim, Paddle-Mobile, Anakin
- Microsoft: ELL, AutoML tool NNI
- Facebook: Caffe2/PyTorch
- Apple: CoreML (iOS 11+)
- Google: ML-Kit, NNAPI (Android 8.1+), TF-Lite
- Qualcomm: Snapdragon Neural Processing Engine (SNPE), Adreno GPU SDK
- Huawei: HiAI
- ARM: Tengine
- Related: DAWNBench: An End-to-End Deep Learning Benchmark and Competition
Related Repos and Websites
- Awesome-NAS
- Awesome-Pruning
- Awesome-Knowledge-Distillation
- MS AI-System open course
- caffe-int8-convert-tools
- Neural-Networks-on-Silicon
- Embedded-Neural-Network
- model_compression
- model-compression (in Chinese)
- Efficient-Segmentation-Networks
- AutoML NAS Literature
- Papers with code
- ImageNet Benchmark
- Self-supervised ImageNet Benchmark
- NVIDIA Blog with Sparsity Tag