• Stars
    109
  • Rank 319,077 (Top 7 %)
  • Language
    Python
  • License
    Apache License 2.0
  • Created over 1 year ago
  • Updated 10 months ago

Repository Details

Offline quantization tools for deployment.

More Repositories

1

lightllm

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
Python
2,304
star
2

MQBench

Model Quantization Benchmark
Shell
742
star
3

United-Perception

United Perception
Python
427
star
4

llmc

This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit".
Python
184
star
5

awesome-lm-system

Summary of system papers/frameworks/codes/tools on training or serving large model
56
star
6

TFMQ-DM

[CVPR 2024 Highlight] This is the official PyTorch implementation of "TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models".
Jupyter Notebook
50
star
7

mqbench-paper

Python
44
star
8

rank_dataset

PyTorch Dataset Rank Dataset
Python
37
star
9

NART

NART = NART is not A RunTime, a deep learning inference framework.
Python
37
star
10

EasyLLM

Built upon Megatron-Deepspeed and HuggingFace Trainer, EasyLLM has reorganized the code logic with a focus on usability. While enhancing usability, it also ensures training efficiency.
Python
35
star
11

Outlier_Suppression_Plus

Official implementation of the EMNLP23 paper: Outlier Suppression+: Accurate quantization of large language models by equivalent and optimal shifting and scaling
Python
35
star
12

NNLQP

Python
33
star
13

QLLM

[ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models"
Python
33
star
14

LPCV2021_Winner_Solution

Python
29
star
15

pyvlova

Yet another Polyhedra Compiler for DeepLearning
Python
19
star
16

LPCV_2023_solution

Python
18
star
17

AAAI2023_EAMPD

AAAI2023 Efficient and Accurate Models towards Practical Deep Learning Baseline
13
star
18

Prototype

Python
12
star
19

L2_Compression

Python
11
star
20

OmniBal

Python
9
star
21

msbench

A tool for model sparsification based on torch.fx
Python
7
star
22

Imagenet-S

Robustness for real-world system noise
Python
4
star
23

mtc-token-healing

Token healing implementation in Rust
Rust
3
star
24

FCPTS

Python
2
star
25

general-sam

A general suffix automaton implementation in Rust with Python bindings
Rust
2
star
26

statecs

Rust
1
star
27

general-sam-py

Python bindings for general-sam and some utilities
Python
1
star
28

pyrotom

Python Code Hotfix and Refactor on the fly
Python
1
star