  • Stars
    9
  • Rank 1,939,727 (Top 39 %)
  • Language
    C++
  • License
    Other
  • Created about 3 years ago
  • Updated almost 3 years ago

Repository Details

PSTensor provides a way to hack the memory management of tensors in TensorFlow and PyTorch by defining your own C++ tensor class.

More Repositories

1

LLMSpeculativeSampling

Fast inference from large language models via speculative decoding
Python
305
star
2

SWCaffe

A Deep Learning Framework customized for Sunway TaihuLight
C++
39
star
3

Distributed-ResNet-Tensorflow

A distributed ResNet across multiple machines, each with one GPU card.
Python
20
star
4

swGEMM

A highly efficient library for GEMM operations on Sunway TaihuLight
C
14
star
5

swDNN

A highly efficient library for deep neural networks on the Sunway TaihuLight supercomputer.
Roff
14
star
6

PyTorchMemTracer

Depicts the GPU memory footprint during DNN training in PyTorch
Python
9
star
7

ChituAttention

Quantized Attention on GPU
Python
8
star
8

ColoBloom

Python
5
star
9

intel-baidu-allreduce

C++
5
star
10

DeepSpeedZeRO3Benchmark

Fine-tuned benchmark scripts for DeepSpeed ZeRO stage 3
Python
5
star
11

swDNNv1.0

A Deep Learning Library for Sunway TaihuLight
C
4
star
12

crack_leetcode

Five days of practice problems, three days of mock tests! Quickly master LeetCode problem-solving patterns!
C++
4
star
13

ssh-passwd-free

A method for setting up password-free SSH login across a set of IPs
Shell
3
star
14

TensorrtBenchmark

Benchmark BERT using TensorRT
C++
3
star
15

SMO-SVM

A Python implementation of libsvm
Perl
3
star
16

cudaMemHook

C++
3
star
17

horovod-resnet

Python
3
star
18

Communication-Efficient-DNN

Python
3
star
19

DiTKVAnalysis

An auxiliary project analyzing the characteristics of KV in DiT attention.
Python
2
star
20

89757

Python
2
star
21

DeepGlobe

Python
2
star
22

DTensor

Study PyTorch DTensor
Python
2
star
23

MoE-Megatron-LM

Python
2
star
24

large-scale-tensorflow-benchmark

Benchmark TensorFlow for supercomputers
Jupyter Notebook
2
star
25

ProjectRun

1
star
26

CommTest

Test for PyTorch Async Collective Communication
Python
1
star
27

ColossalAI_bert_inference

Python
1
star
28

ckp_training

Python
1
star
29

ADMM-NeuralNetwork

ADMM-NeuralNetwork was implemented by a potato
MATLAB
1
star