  • Stars: 2
  • Language: Python
  • Created about 2 months ago
  • Updated about 2 months ago

Repository Details

An auxiliary project that analyzes the characteristics of KV in DiT attention.

More Repositories

1

LLMSpeculativeSampling

Fast inference from large language models via speculative decoding
Python
305
star
2

SWCaffe

A Deep Learning Framework customized for Sunway TaihuLight
C++
39
star
3

Distributed-ResNet-Tensorflow

A distributed ResNet across multiple machines, each with one GPU card.
Python
20
star
4

swGEMM

A highly efficient library for GEMM operations on Sunway TaihuLight
C
14
star
5

swDNN

A highly efficient library for deep neural networks on the Sunway TaihuLight supercomputer.
Roff
14
star
6

PSTensor

PSTensor provides a way to hack the memory management of tensors in TensorFlow and PyTorch by defining your own C++ Tensor Class.
C++
9
star
7

PyTorchMemTracer

Depict the GPU memory footprint during DNN training in PyTorch
Python
9
star
8

ChituAttention

Quantized Attention on GPU
Python
8
star
9

ColoBloom

Python
5
star
10

intel-baidu-allreduce

C++
5
star
11

DeepSpeedZeRO3Benchmark

Finetuned benchmark scripts for the DeepSpeed ZeRO-3 stage
Python
5
star
12

swDNNv1.0

A Deep Learning Library for Sunway TaihuLight
C
4
star
13

crack_leetcode

Five days of practice problems, three days of mock tests! Quickly master LeetCode problem-solving patterns!
C++
4
star
14

ssh-passwd-free

A method for setting up password-free SSH login across a set of IPs
Shell
3
star
15

TensorrtBenchmark

Benchmark BERT using TensorRT
C++
3
star
16

SMO-SVM

A Python implementation of libsvm
Perl
3
star
17

cudaMemHook

C++
3
star
18

horovod-resnet

Python
3
star
19

Communication-Efficient-DNN

Python
3
star
20

89757

Python
2
star
21

DeepGlobe

Python
2
star
22

DTensor

Study PyTorch DTensor
Python
2
star
23

MoE-Megatron-LM

Python
2
star
24

large-scale-tensorflow-benchmark

Benchmark TensorFlow for supercomputers
Jupyter Notebook
2
star
25

ProjectRun

1
star
26

CommTest

Test for PyTorch Async Collective Communication
Python
1
star
27

ColossalAI_bert_inference

Python
1
star
28

ckp_training

Python
1
star
29

ADMM-NeuralNetwork

ADMM-NeuralNetwork was implemented by a potato
MATLAB
1
star