  • Stars: 154
  • Rank 241,288 (Top 5 %)
  • Language: C
  • License: BSD 3-Clause "New...
  • Created about 5 years ago
  • Updated 19 days ago

Repository Details

RDMA and SHARP plugins for the NCCL library

nccl-rdma-sharp-plugins

The nccl-rdma-sharp plugin enables RDMA and switch-based collectives (SHARP) with NVIDIA's NCCL library.

Overview

Requirements

  • MOFED
  • CUDA
  • SHARP
  • NCCL
  • GPUDirectRDMA plugin

Build Instructions

Build system requirements

  • CUDA
  • SHARP
  • MOFED

The plugin uses GNU Autotools as its build system. You can build it as follows:

$ ./autogen.sh
$ ./configure
$ make
$ make install

The following configure flags let you build against dependencies installed in non-standard locations:

  --with-verbs=PATH       Path to non-standard libibverbs installation
  --with-sharp=PATH       Path to non-standard SHARP installation
  --with-cuda=PATH        Path to non-standard CUDA installation
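
As an illustration of how the three flags above combine, here is a minimal sketch that assembles a configure command line from environment variables. The variable names (VERBS_DIR, SHARP_DIR, CUDA_DIR), the helper function configure_args, and the default paths are assumptions made for this example and are not defined by the plugin; substitute the locations used on your system.

```shell
#!/bin/sh
# Sketch only: VERBS_DIR, SHARP_DIR, CUDA_DIR and their default paths are
# hypothetical placeholders, not settings defined by the plugin.

# Print the configure arguments, one flag per dependency.
configure_args() {
    printf '%s %s %s\n' \
        "--with-verbs=${VERBS_DIR:-/usr}" \
        "--with-sharp=${SHARP_DIR:-/opt/mellanox/sharp}" \
        "--with-cuda=${CUDA_DIR:-/usr/local/cuda}"
}

# Show the resulting command so it can be inspected before running it.
echo "./configure $(configure_args)"
```

Run the printed command from the source tree after ./autogen.sh, then continue with make and make install as shown earlier. Printing the command first makes it easy to check the paths before any build step runs.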

More Repositories

  1. sockperf (C++, 567 stars): Network Benchmarking Utility
  2. libvma (C++, 559 stars): Linux user space library for network socket acceleration based on RDMA compatible network adaptors
  3. SparkRDMA (Java, 240 stars): This is archive of SparkRDMA project. The new repository with RDMA shuffle acceleration for Apache Spark is here: https://github.com/Nvidia/sparkucx
  4. nv_peer_memory (C, 234 stars)
  5. network-operator (Go, 202 stars): Mellanox Network Operator
  6. k8s-rdma-shared-dev-plugin (Go, 188 stars)
  7. mlxsw (C, 167 stars)
  8. mstflint (C, 114 stars): Mstflint - an open source version of MFT (Mellanox Firmware Tools)
  9. k8s-rdma-sriov-dev-plugin (Go, 109 stars): Kubernetes Rdma SRIOV device plugin
  10. gpu_direct_rdma_access (C, 99 stars): example code for using DC QP for providing RDMA READ and WRITE operations to remote GPU memory
  11. mlnx-tools (Python, 95 stars): Mellanox userland tools and scripts
  12. docker-sriov-plugin (Go, 76 stars): Docker networking plugin for SRIOV and passthrough interfaces
  13. rdmamap (Go, 58 stars): RDMA library for mapping associate netdevice and character devices
  14. ib-kubernetes (Go, 53 stars)
  15. ofed-docker (Shell, 41 stars)
  16. scalablefunctions (39 stars): All about Scalable functions
  17. SAI-Implementation (C, 37 stars): This repository contains SAI implementation for Mellanox hardware.
  18. SAI-P4-BM (C++, 36 stars)
  19. linux-sysinfo-snapshot (Python, 34 stars): Linux Sysinfo Snapshot
  20. SwitchRouterSDK-interfaces (C, 32 stars)
  21. libxlio (C++, 31 stars)
  22. mkt (Python, 25 stars): Mellanox Kernel developers Toolset (MKT)
  23. mlx_steering_dump (Python, 24 stars): Mellanox Steering Dump Tool for SWS and HWS acceleration
  24. ovs-tests (Shell, 23 stars): A collection of tests for the Open vSwitch HW offload.
  25. R4H (Java, 23 stars): RDMA for HDFS
  26. bfb-build (Shell, 22 stars): BFB (BlueField boot stream and OS installer) build environment
  27. ibdump (C, 19 stars)
  28. ufm_sdk_3.0 (Python, 18 stars)
  29. DCTrafficGen (C++, 17 stars): Data Center Traffic Generator Library
  30. scapy-ui (Python, 17 stars): Scapy UI - Web based scapy tools
  31. rshim-user-space (C, 17 stars): Linux based user-space RSHIM driver for the Mellanox BlueField SoC
  32. hw_offload_api_examples (C, 14 stars): Examples of usage for Mellanox HW offloads
  33. vnf_acceleration_example (C, 14 stars)
  34. nvidia-k8s-ipam (Go, 13 stars): IPAM plugin for kubernetes
  35. pcx (C++, 13 stars): Persistent Collectives X- A collective communication library for high performance, low cost persistent collectives over RDMA devices.
  36. hw-mgmt (Shell, 13 stars)
  37. rshim (C, 12 stars): BlueField RSHIM driver
  38. rdma_fc (C, 12 stars): Demonstration of flow control over RDMA fabric
  39. ngc_multinode_perf (Shell, 11 stars): Performance tests for multinode NGC.Ready certification
  40. ipoib-cni (Go, 11 stars): IP Over Infiniband (IPoIB) CNI Plugin
  41. pka (C, 11 stars): Mellanox BlueField PKA support
  42. UDA (C++, 10 stars): Unstructured Data Accelerator (RDMA) for Hadoop MapReduce
  43. mlxdevm-go (Go, 10 stars): mlxdevm library for for device management in go language
  44. bfscripts (Shell, 10 stars): Collection of scripts used for BlueField SoC system management.
  45. k8s-images (Dockerfile, 10 stars)
  46. MT.ComB (C, 8 stars): Multi-Threaded (MT) Communication Benchmark
  47. container_tools (Go, 8 stars): Few useful container orchestration, deployment tools when using RDMA
  48. kubernetes-ci (Shell, 8 stars): CI for Kubernetes with Mellanox features
  49. devx (Objective-C, 8 stars)
  50. libpsample (C, 8 stars)
  51. config-tools (Shell, 7 stars): Mellanox Configuration tool for Linux Host
  52. tls-af_ktls_tool (C, 7 stars)
  53. tls-offload (C, 7 stars)
  54. OVS (C, 7 stars)
  55. EC (C, 6 stars): !!! NOTICE: DEPRECATED !!! Java Erasure Coding NIC Offload library. For the C level EC offloads, use MLNX_OFED libraries and documentation.
  56. TFDeploy (Python, 6 stars): TensorFlow deploy script to easily run on multiple servers
  57. NVMEoF-P2P (C, 6 stars): A fork of the Linux kernel for NVMEoF target driver using PCI P2P capabilities for full I/O path offloading.
  58. napalm (Python, 5 stars): Network Automation and Programmability Abstraction Layer with Multivendor support
  59. containerized-ovs-forwarder (Python, 5 stars)
  60. kmtracker (Go, 5 stars): Linux Kernel memory tracker
  61. bluefield-linux (C, 5 stars): Linux kernel to support Mellanox BlueField SoCs
  62. bf-release (Python, 5 stars): BlueField release files, configuration files and post-installation steps
  63. mofed_dockerfiles (Roff, 5 stars): MOFED Docker files
  64. docker-nmos-cpp (Shell, 4 stars)
  65. wjh-linux (Python, 4 stars)
  66. ALVS (C, 4 stars)
  67. Switch-SDK-drivers (C, 4 stars): Switch SDK Driver
  68. container_scripts (Shell, 4 stars): Some container scripts
  69. ipmb-host (C, 4 stars): IPMB driver to send requests from the BlueField to the BMC on CentOS
  70. mlnx_lib (C, 4 stars)
  71. nic-configuration-operator (Go, 4 stars): Nvidia Networking NIC Configuration Operator For Kubernetes
  72. mellanox-netdev-stdlib-mlnxos (Ruby, 4 stars): MLNX_OS specific Provider code for "netdev-stdlib". Netdev provides a set of network resource abstractions for automating network device configuration using Puppet
  73. libmlxdevm (C, 3 stars): Mellanox device management C library
  74. virtio-emulation (C, 3 stars)
  75. dpdk-mlx4 (Objective-C, 3 stars): DPDK.org tree with enhanced librte_pmd_mlx4
  76. sai_p4_compiler (C++, 3 stars)
  77. mlnx-project-config (Python, 3 stars)
  78. ATC (C, 3 stars)
  79. regex (C, 3 stars)
  80. network-operator-docs (PowerShell, 3 stars): NVIDIA Network Operator documentation sources
  81. nic-kernel (C, 3 stars): Nvidia NBU integration kernel
  82. DPDK-18.11-for-Ubuntu-18.04 (C, 3 stars)
  83. nagios4mlnxos (Perl, 3 stars): Nagios Plugin for Mellanox's Switches
  84. meta-bluefield (Shell, 3 stars)
  85. NNT-Linux-driver (C, 3 stars): NNT Linux driver for MFT & MSTFLINT packages
  86. OpenAI.recipe (2 stars): Recommended configuration for large-scale setup - OpenAI
  87. ci-demo (Groovy, 2 stars)
  88. Kubespray-role-for-RDMA-shared-DP (2 stars)
  89. libdpcp (C++, 2 stars)
  90. doca-driver-build (Shell, 2 stars)
  91. ceilometer_sriov_counters (Python, 2 stars): Plugin for Ceilometer SRIOV traffic counters
  92. mlnx-openstack (Puppet, 2 stars): Puppet manifests for deploying Mellanox OpenStack plugins
  93. QAT_Engine (C, 2 stars)
  94. iproute2 (2 stars)
  95. nic-feature-discovery (Go, 2 stars): NVIDIA NIC feature discovery
  96. eswitchd (Python, 2 stars)
  97. mlx-strongswan (C, 2 stars): Mellanox version of strongswan cloned from strongswan-5.9.0.tar.gz
  98. nginx_automation (Python, 2 stars): This is simple Python automation for Nginx - VMA related activity
  99. ipsec-offload (2 stars)
  100. dpdk-utest (Rust, 2 stars)