• Stars
    9
  • Rank 1,939,727 (Top 39 %)
  • Language
    Python
  • Created almost 3 years ago
  • Updated about 2 years ago

Repository Details

Depicts the GPU memory footprint during DNN training in PyTorch

More Repositories

1

LLMSpeculativeSampling

Fast inference from large language models via speculative decoding
Python
305
star
2

SWCaffe

A Deep Learning Framework customized for Sunway TaihuLight
C++
39
star
3

Distributed-ResNet-Tensorflow

A distributed ResNet across multiple machines, each with one GPU card.
Python
20
star
4

swGEMM

A highly efficient library for GEMM operations on Sunway TaihuLight
C
14
star
5

swDNN

A highly efficient library for deep neural networks on the Sunway TaihuLight supercomputer.
Roff
14
star
6

PSTensor

PSTensor provides a way to hack the memory management of tensors in TensorFlow and PyTorch by defining your own C++ Tensor Class.
C++
9
star
7

ChituAttention

Quantized Attention on GPU
Python
8
star
8

ColoBloom

Python
5
star
9

intel-baidu-allreduce

C++
5
star
10

DeepSpeedZeRO3Benchmark

Fine-tuned benchmark scripts for DeepSpeed ZeRO stage 3
Python
5
star
11

swDNNv1.0

A Deep Learning Library for Sunway TaihuLight
C
4
star
12

crack_leetcode

Five days of practice problems, three days of mock tests! Quickly master LeetCode problem-solving patterns!
C++
4
star
13

ssh-passwd-free

A method to set up password-free SSH login for a set of IPs
Shell
3
star
14

TensorrtBenchmark

Benchmark BERT using TensorRT
C++
3
star
15

SMO-SVM

A Python implementation of libsvm
Perl
3
star
16

cudaMemHook

C++
3
star
17

horovod-resnet

Python
3
star
18

Communication-Efficient-DNN

Python
3
star
19

DiTKVAnalysis

An auxiliary project analyzing the characteristics of KV in DiT attention.
Python
2
star
20

89757

Python
2
star
21

DeepGlobe

Python
2
star
22

DTensor

Study PyTorch DTensor
Python
2
star
23

MoE-Megatron-LM

Python
2
star
24

large-scale-tensorflow-benchmark

Benchmark TensorFlow for supercomputers
Jupyter Notebook
2
star
25

ProjectRun

1
star
26

CommTest

Test for PyTorch Async Collective Communication
Python
1
star
27

ColossalAI_bert_inference

Python
1
star
28

ckp_training

Python
1
star
29

ADMM-NeuralNetwork

ADMM-NeuralNetwork was implemented by a potato
MATLAB
1
star