nvidia-docker
Build and run Docker containers leveraging NVIDIA GPUs
open-gpu-kernel-modules
NVIDIA Linux open GPU kernel module source
DeepLearningExamples
State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.
NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
FastPhotoStyle
Style transfer, deep learning, feature transform
TensorRT
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
Megatron-LM
Ongoing research training transformer models at scale
vid2vid
PyTorch implementation of our method for high-resolution (e.g. 2048x1024) photorealistic video-to-video translation.
apex
A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch
TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
pix2pixHD
Synthesizing and manipulating 2048x1024 images with conditional GANs
cuda-samples
Samples for CUDA Developers which demonstrate features in CUDA Toolkit
FasterTransformer
Transformer related optimization, including BERT, GPT
cutlass
CUDA Templates for Linear Algebra Subroutines
DALI
A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
thrust
[ARCHIVED] The C++ parallel algorithms library. See https://github.com/NVIDIA/cccl
tacotron2
Tacotron 2 - PyTorch implementation with faster-than-realtime inference
DIGITS
Deep Learning GPU Training System
NeMo-Guardrails
NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems.
nccl
Optimized primitives for collective multi-GPU communication
flownet2-pytorch
PyTorch implementation of FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks
k8s-device-plugin
NVIDIA device plugin for Kubernetes
ChatRTX
A developer reference project for creating Retrieval Augmented Generation (RAG) chatbots on Windows using TensorRT-LLM
libcudacxx
[ARCHIVED] The C++ Standard Library for your entire system. See https://github.com/NVIDIA/cccl
waveglow
A Flow-based Generative Network for Speech Synthesis
MinkowskiEngine
Minkowski Engine is an auto-diff neural network library for high-dimensional sparse tensors
Stable-Diffusion-WebUI-TensorRT
TensorRT Extension for Stable Diffusion Web UI
nvidia-container-toolkit
Build and run containers leveraging NVIDIA GPUs
GenerativeAIExamples
Generative AI reference workflows optimized for accelerated infrastructure and microservice architecture.
semantic-segmentation
Nvidia Semantic Segmentation monorepo
warp
A Python framework for high performance GPU simulation and graphics
DeepRecommender
Deep learning for recommender systems
cub
[ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl
TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.
OpenSeq2Seq
Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP
stdexec
`std::execution`, the proposed C++ framework for asynchronous and parallel programming.
CUDALibrarySamples
CUDA Library Samples
VideoProcessingFramework
Set of Python bindings to C++ libraries which provides full HW acceleration for video decoding, encoding and GPU-accelerated color space and pixel format conversions
gpu-operator
NVIDIA GPU Operator creates/configures/manages GPUs atop Kubernetes
deepops
Tools for building GPU clusters
open-gpu-doc
Documentation of NVIDIA chip/hardware interfaces
trt-samples-for-hackathon-cn
Simple samples for TensorRT programming
Q2RTX
NVIDIA’s implementation of RTX ray-tracing in Quake II
aistore
AIStore: scalable storage for AI applications
partialconv
A New Padding Scheme: Partial Convolution based Padding
MatX
An efficient C++17 GPU numerical computing library with Python-like syntax
sentiment-discovery
Unsupervised Language Modeling at scale for robust sentiment classification
nvidia-container-runtime
NVIDIA container runtime
cccl
CUDA Core Compute Libraries
gpu-monitoring-tools
Tools for monitoring NVIDIA GPUs on Linux
retinanet-examples
Fast and accurate object detection with end-to-end GPU optimization
jetson-gpio
A Python library that enables the use of Jetson's GPIOs
flowtron
Flowtron is an auto-regressive flow-based generative network for text to speech synthesis with control over speech variation and style transfer
mellotron
Mellotron: a multispeaker voice synthesis model based on Tacotron 2 GST that can make a voice emote and sing without emotive or singing training data
modulus
Open-source deep-learning framework for building, training, and fine-tuning deep learning models using state-of-the-art Physics-ML methods
gdrcopy
A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology
spark-rapids
Spark RAPIDS plugin - accelerate Apache Spark with GPUs
libnvidia-container
NVIDIA container runtime library
nccl-tests
NCCL Tests
dcgm-exporter
NVIDIA GPU metrics exporter for Prometheus leveraging DCGM
nv-wavenet
Reference implementation of real-time autoregressive wavenet inference
tensorflow
An Open Source Machine Learning Framework for Everyone
cuda-python
CUDA Python Low-level Bindings
MAXINE-AR-SDK
NVIDIA AR SDK - API headers and sample applications
DLSS
NVIDIA DLSS is a new and improved deep learning neural network that boosts frame rates and generates beautiful, sharp images for your games
nvvl
A library that uses hardware acceleration to load sequences of video frames to facilitate machine learning training
gvdb-voxels
Sparse volume compute and rendering on NVIDIA GPUs
BigVGAN
Official PyTorch implementation of BigVGAN (ICLR 2023)
runx
Deep Learning Experiment Management
NVFlare
NVIDIA Federated Learning Application Runtime Environment
nvcomp
Repository for nvCOMP docs and examples. nvCOMP is a library for fast lossless compression/decompression on the GPU that can be downloaded from https://developer.nvidia.com/nvcomp.
Dataset_Synthesizer
NVIDIA Deep learning Dataset Synthesizer (NDDS)
jitify
A single-header C++ library for simplifying the use of CUDA Runtime Compilation (NVRTC).
multi-gpu-programming-models
Examples demonstrating available options to program multiple GPUs in a single node or a cluster
libglvnd
The GL Vendor-Neutral Dispatch library
enroot
A simple yet powerful tool to turn traditional container/OS images into unprivileged sandboxes.
AMGX
Distributed multigrid linear solver library on GPU
nvbench
CUDA Kernel Benchmarking Library
cuCollections
NeMo-Aligner
Scalable toolkit for efficient model alignment
MDL-SDK
NVIDIA Material Definition Language SDK
PyProf
A GPU performance profiling tool for PyTorch models
cuda-quantum
C++ and Python support for the CUDA Quantum programming model for heterogeneous quantum-classical workflows
NeMo-Framework-Launcher
Provides end-to-end model development pipelines for LLMs and Multimodal models that can be launched on-prem or cloud-native.
gpu-rest-engine
A REST API for Caffe using Docker and Go
framework-reproducibility
Providing reproducibility in deep learning frameworks
hpc-container-maker
HPC Container Maker
NvPipe
NVIDIA-accelerated zero latency video compression library for interactive remoting applications
DCGM
NVIDIA Data Center GPU Manager (DCGM) is a project for gathering telemetry and measuring the health of NVIDIA GPUs
NeMo-Curator
Scalable toolkit for data curation
cuQuantum
Home for cuQuantum Python & NVIDIA cuQuantum SDK C++ samples
data-science-stack
NVIDIA Data Science stack tools
ai-assisted-annotation-client
Client side integration example source code and libraries for AI-Assisted Annotation SDK
video-sdk-samples
Samples demonstrating how to use various APIs of NVIDIA Video Codec SDK
torch-harmonics
Differentiable spherical harmonic transforms and spherical convolutions in PyTorch
nvidia-settings
NVIDIA driver control panel
gpu-feature-discovery
GPU plugin to the node feature discovery for Kubernetes
cnmem
A simple memory manager for CUDA designed to help Deep Learning frameworks manage memory
egl-wayland
The EGLStream-based Wayland external platform
radtts
Provides training, inference and voice conversion recipes for RADTTS and RADTTS++: Flow-based TTS models with Robust Alignment Learning, Diverse Synthesis, and Generative Modeling and Fine-Grained Control over Low Dimensional (F0 and Energy) Speech Attributes.