  • Stars: 571
  • Rank 78,127 (Top 2 %)
  • Language
  • Created over 1 year ago
  • Updated over 1 year ago


Repository Details

LLM papers I'm reading, mostly on inference and model compression

Just helping myself keep track of LLM papers that I'm reading, with an emphasis on inference and model compression.

Transformer Architectures

Foundation Models

Position Encoding

KV Cache

Activation

Pruning

Quantization

Normalization

Sparsity and rank compression

Fine-tuning

Sampling

Scaling

Mixture of Experts

Watermarking

More