gptq
Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".

sparsegpt
Code for the ICML 2023 paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot".

marlin
FP16xINT4 LLM inference kernel that achieves near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.

qmoe
Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models".

PanzaMail

QUIK
Repository for the QUIK project, enabling the use of 4-bit kernels for generative inference.

OBC
Code for the NeurIPS 2022 paper "Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning".

WoodFisher
Code accompanying the NeurIPS 2020 paper WoodFisher (Singh & Alistarh, 2020).

Sparse-Marlin
Boosting 4-bit inference kernels with 2:4 sparsity.

SparseFinetuning
Repository for sparse finetuning of LLMs via a modified version of the MosaicML llmfoundry.

RoSA

QIGen
Repository for CPU kernel generation for LLM inference.

ACDC
Code for reproducing "AC/DC: Alternating Compressed/DeCompressed Training of Deep Neural Networks" (NeurIPS 2021).

spdy
Code for the ICML 2022 paper "SPDY: Accurate Pruning with Speedup Guarantees".

M-FAC
Efficient reference implementations of the static and dynamic M-FAC algorithms (for pruning and optimization).

torch_cgx
PyTorch distributed backend extension with compression support.

sparseprop

peft-rosa
A fork of the PEFT library, supporting Robust Adaptation (RoSA).

MicroAdam
Code for the MicroAdam paper.

CrAM
Code for reproducing the results from "CrAM: A Compression-Aware Minimizer", accepted at ICLR 2023.

spops

ISTA-DASLab-Optimizers

EFCP
Code to reproduce the experiments from the paper "Error Feedback Can Accurately Compress Preconditioners".

pruned-vision-model-bias
Code for reproducing the paper "Bias in Pruned Vision Models: In-Depth Analysis and Countermeasures".

Mathador-LM
Code for the paper "Mathador-LM: A Dynamic Benchmark for Mathematical Reasoning on LLMs".

CAP
Repository for Correlation-Aware Pruning (NeurIPS 2023) source and experimental code.

evolution-strategies

TACO4NLP
Task-aware compression for various NLP tasks.

smart-quantizer
Repository for Vitaly's implementation of the distribution-adaptive quantizer.

ZipLM
Code for the NeurIPS 2023 paper "ZipLM: Inference-Aware Structured Pruning of Language Models".

QRGD
Repository for the implementation of "Distributed Principal Component Analysis with Limited Communication" (Alimisis et al., NeurIPS 2021). Parts of this code were originally based on code from "Communication-Efficient Distributed PCA by Riemannian Optimization" (Huang and Pan, ICML 2020).

KDVR
Code for the experiments in "Knowledge Distillation Performs Partial Variance Reduction" (NeurIPS 2023).

GridSearcher
GridSearcher simplifies running grid searches for machine learning projects in Python, emphasizing parallel execution and GPU scheduling without dependencies on SLURM or other workload managers.
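Several of the repositories above (gptq, marlin, QUIK, Sparse-Marlin) revolve around 4-bit weight quantization for LLM inference. As a generic illustration of what "INT4 weights" means, here is a minimal round-to-nearest (RTN) quantization sketch in NumPy. This is a baseline only and an assumption for illustration, not the GPTQ algorithm (which minimizes layer-wise reconstruction error) and not the Marlin kernel (which runs fused FP16xINT4 matmuls on GPU):

```python
import numpy as np

def quantize_rtn(w, bits=4, group_size=32):
    # Symmetric per-group round-to-nearest quantization (generic baseline,
    # NOT the GPTQ method): each group of `group_size` weights shares one
    # FP scale; weights become small signed integers.
    qmax = 2 ** (bits - 1) - 1                 # 7 for 4-bit symmetric
    groups = w.reshape(-1, group_size)
    scale = np.abs(groups).max(axis=1, keepdims=True) / qmax
    scale[scale == 0] = 1.0                    # guard against all-zero groups
    q = np.clip(np.round(groups / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale, shape):
    # Recover an FP32 approximation of the original weights.
    return (q.astype(np.float32) * scale).reshape(shape)

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64)).astype(np.float32)
q, s = quantize_rtn(w)
w_hat = dequantize(q, s, w.shape)
err = float(np.abs(w - w_hat).max())
print(f"max abs reconstruction error: {err:.4f}")
```

RTN is exactly the baseline that methods like GPTQ and OBC improve on: rounding each weight independently ignores the layer's input statistics, whereas those methods pick quantized values that minimize the layer's output error.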