LLMSpeculativeSampling
Fast inference from large language models via speculative decoding

SWCaffe
A deep learning framework customized for Sunway TaihuLight

Distributed-ResNet-Tensorflow
A distributed ResNet across multiple machines, each with one GPU card.

swGEMM
A highly efficient library for GEMM operations on Sunway TaihuLight

swDNN
A highly efficient library for deep neural networks on the Sunway TaihuLight supercomputer.

PSTensor
PSTensor provides a way to hack the memory management of tensors in TensorFlow and PyTorch by defining your own C++ Tensor class.

PyTorchMemTracer
Depicts the GPU memory footprint during DNN training in PyTorch

ChituAttention
Quantized attention on GPU

ColoBloom

intel-baidu-allreduce

DeepSpeedZeRO3Benchmark
Fine-tuned benchmark scripts for DeepSpeed ZeRO stage 3

swDNNv1.0
A deep learning library for Sunway TaihuLight

ssh-passwd-free
A method to set up password-free SSH for a set of IPs

TensorrtBenchmark
Benchmark BERT using TensorRT

SMO-SVM
A Python implementation of libsvm

cudaMemHook

horovod-resnet

Communication-Efficient-DNN

DiTKVAnalysis
An auxiliary project analyzing the characteristics of KV in DiT attention.

89757

DeepGlobe

DTensor
Study of PyTorch DTensor

MoE-Megatron-LM

large-scale-tensorflow-benchmark
Benchmark TensorFlow for supercomputers

ProjectRun

CommTest
Test for PyTorch async collective communication

ColossalAI_bert_inference

ckp_training

ADMM-NeuralNetwork
ADMM-NeuralNetwork was implemented by a potato