k8s-rdma-shared-dev-plugin

(https://hub.docker.com/r/mellanox/k8s-rdma-shared-dev-plugin)

This is a simple RDMA device plugin that supports IB and RoCE HCAs. The plugin runs as a DaemonSet. Its container image is available at mellanox/k8s-rdma-shared-dev-plugin.

How to use device plugin

1. Use a CNI plugin such as Contiv, Calico, Cluster

Make sure to configure ib0, or the appropriate IPoIB netdevice, as the parent netdevice for creating overlay/virtual netdevices.

2. Create ConfigMap and deploy Device Plugin

Deploy the device plugin and create a ConfigMap that sets the mode to "hca". This is a per-node configuration.

cd deployment/k8s/base
kubectl apply -k .

3. Create Test pod

Create a test pod that requests 1 vhca resource.

kubectl create -f example/test-hca-pod.yaml
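The contents of example/test-hca-pod.yaml are not reproduced in this document. As a rough illustration of what a pod requesting a shared RDMA resource might look like, the Python sketch below builds a manifest and prints it as JSON, which kubectl also accepts. The pod name, container name, image, and command are placeholders, and the resource name is an assumption that must match whatever your ConfigMap defines. Adding the IPC_LOCK capability is also an assumption, based on RDMA applications commonly needing to lock memory.

```python
import json


def build_test_pod(resource_name, count=1):
    """Return a Pod manifest that requests `count` units of an extended resource."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "rdma-test-pod"},  # hypothetical pod name
        "spec": {
            "restartPolicy": "OnFailure",
            "containers": [
                {
                    "name": "rdma-test-ctr",  # hypothetical container name
                    "image": "example.com/rdma-test:latest",  # placeholder image
                    # Assumption: the RDMA workload needs to lock memory.
                    "securityContext": {"capabilities": {"add": ["IPC_LOCK"]}},
                    # Extended resources are requested under "limits".
                    "resources": {"limits": {resource_name: str(count)}},
                    "command": ["sleep", "infinity"],
                }
            ],
        },
    }


if __name__ == "__main__":
    # Assumed resource name: default "rdma" prefix plus a name from the ConfigMap.
    print(json.dumps(build_test_pod("rdma/hca_shared_devices_a"), indent=2))
```

If the script is saved under a name such as make_pod.py (hypothetical), its output can be written to a file and passed to kubectl create -f in the same way as the YAML example.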

Deploy the device plugin with CDI support

To use the device plugin with CDI support, do the following:

cd deployment/k8s/base/overlay
kubectl apply -k .

How to use device plugin for RDMA

The device plugin can be used with macvlan for RDMA. To do so, follow these steps:

1. Use the macvlan CNI

# cat > /etc/cni/net.d/00-macvlan.conf <<EOF
{
    "cniVersion": "0.3.1",
    "name": "mynet",
    "type": "macvlan",
    "master": "enp0s0f0",
    "ipam": {
        "type": "host-local",
        "subnet": "10.56.217.0/24",
        "rangeStart": "10.56.217.171",
        "rangeEnd": "10.56.217.181",
        "routes": [
            { "dst": "0.0.0.0/0" }
        ],
        "gateway": "10.56.217.1"
    }
}
EOF

2. Follow the steps in the previous section to deploy the device plugin

3. Deploy RDMA pod application

kubectl create -f <rdma-app.yaml>
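When adapting the macvlan configuration from step 1 to another network, it is easy to leave an address outside the subnet. The following Python sketch is a sanity check written for this document, not part of the CNI or the device plugin. It uses the standard ipaddress module to confirm that rangeStart, rangeEnd, and gateway all fall inside subnet. The function name check_host_local_ipam is hypothetical.

```python
import ipaddress


def check_host_local_ipam(ipam):
    """Return a list of problems found in a host-local IPAM block (empty if none)."""
    problems = []
    subnet = ipaddress.ip_network(ipam["subnet"])
    # Every address-valued key that is present must lie inside the subnet.
    for key in ("rangeStart", "rangeEnd", "gateway"):
        if key in ipam and ipaddress.ip_address(ipam[key]) not in subnet:
            problems.append(f"{key} {ipam[key]} is outside {subnet}")
    # The allocation range must not be inverted.
    if "rangeStart" in ipam and "rangeEnd" in ipam:
        start = ipaddress.ip_address(ipam["rangeStart"])
        end = ipaddress.ip_address(ipam["rangeEnd"])
        if start > end:
            problems.append("rangeStart is greater than rangeEnd")
    return problems


if __name__ == "__main__":
    ipam = {
        "type": "host-local",
        "subnet": "10.56.217.0/24",
        "rangeStart": "10.56.217.171",
        "rangeEnd": "10.56.217.181",
        "gateway": "10.56.217.1",
    }
    print(check_host_local_ipam(ipam) or "ipam block looks consistent")
```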

RDMA Shared Device Plugin Configurations

The plugin has several configuration fields. This section explains how each field is used.

{
  "periodicUpdateInterval": 300,
  "configList": [{
      "resourceName": "hca_shared_devices_a",
      "resourcePrefix": "example_prefix",
      "rdmaHcaMax": 1000,
      "devices": ["ib0", "ib1"]
    },
    {
      "resourceName": "hca_shared_devices_b",
      "rdmaHcaMax": 500,
      "selectors": {
        "vendors": ["15b3"],
        "deviceIDs": ["1017"],
        "ifNames": ["ib3", "ib4"]
      }
    }
  ]
}

periodicUpdateInterval is the interval, in seconds, at which the plugin updates its resources to reflect changes in host devices. Notes:

  • If periodicUpdateInterval is 0, periodic updates for host devices are disabled.
  • If periodicUpdateInterval is not set, the default periodic update interval of 60 seconds is used.
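The two notes above can be summarized in a few lines of Python. This is only a sketch of the documented behavior, not the plugin's actual implementation. The function name effective_update_interval and the use of None to mean "disabled" are assumptions made for illustration.

```python
import json

DEFAULT_INTERVAL_SECONDS = 60  # default stated in the notes above


def effective_update_interval(config_text):
    """Return the update interval in seconds, or None when updates are disabled."""
    config = json.loads(config_text)
    # A missing key falls back to the documented default.
    interval = config.get("periodicUpdateInterval", DEFAULT_INTERVAL_SECONDS)
    # An explicit 0 disables periodic updates.
    if interval == 0:
        return None
    return interval


if __name__ == "__main__":
    print(effective_update_interval('{"periodicUpdateInterval": 300}'))
    print(effective_update_interval('{"configList": []}'))
    print(effective_update_interval('{"periodicUpdateInterval": 0}'))
```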

"configList" should contain a list of config objects. Each config object may consist of the following fields:

| Field | Required | Description | Type | Default value | Example |
|---|---|---|---|---|---|
| "resourceName" | Y | Endpoint resource name. Should not contain special characters and must be unique within the scope of the resource prefix | string | - | "hca_shared_devices_a" |
| "resourcePrefix" | N | Endpoint resource prefix. Should not contain special characters | string | "rdma" | "example_prefix" |
| "rdmaHcaMax" | Y | Maximum number of RDMA resources that can be provided by the device plugin resource | integer | - | 1000 |
| "selectors" | N | A map of device selectors for filtering the devices. Refer to the Device Selectors section for more information | JSON object | - | "selectors": {"vendors": ["15b3"], "deviceIDs": ["1017"]} |
| "devices" | N | A list of device names to be selected, same as the "ifNames" selector | string list | - | ["ib0", "ib1"] |

Note: Either "selectors" or "devices" must be specified for a given resource. "selectors" is recommended.
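As an illustration of the rules in the table and the note above, the Python sketch below checks a configList before it is deployed. It is not the plugin's own validation code, and the function names are hypothetical. Joining the prefix and name with "/" to form the full resource name is an assumption, based on how Kubernetes extended resources are usually named. The "no special characters" rule is not checked because this document does not define it precisely.

```python
DEFAULT_PREFIX = "rdma"  # default "resourcePrefix" from the table above


def validate_config_entry(entry):
    """Check one configList entry and return its assumed full resource name."""
    for field in ("resourceName", "rdmaHcaMax"):
        if field not in entry:
            raise ValueError(f"missing required field: {field}")
    # Either "selectors" or "devices" must be given.
    if not entry.get("selectors") and not entry.get("devices"):
        raise ValueError("either selectors or devices must be specified")
    prefix = entry.get("resourcePrefix", DEFAULT_PREFIX)
    # Assumption: the full name has the form <prefix>/<resourceName>.
    return f"{prefix}/{entry['resourceName']}"


def validate_config_list(entries):
    """Validate every entry and reject duplicate names within a prefix."""
    names = [validate_config_entry(entry) for entry in entries]
    duplicates = sorted({name for name in names if names.count(name) > 1})
    if duplicates:
        raise ValueError(f"duplicate resource names: {duplicates}")
    return names


if __name__ == "__main__":
    config_list = [
        {"resourceName": "hca_shared_devices_a", "resourcePrefix": "example_prefix",
         "rdmaHcaMax": 1000, "devices": ["ib0", "ib1"]},
        {"resourceName": "hca_shared_devices_b", "rdmaHcaMax": 500,
         "selectors": {"vendors": ["15b3"], "deviceIDs": ["1017"]}},
    ]
    print(validate_config_list(config_list))
```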

Device Selectors

The following selectors are used for filtering the desired devices.

| Field | Description | Type | Example |
|---|---|---|---|
| "vendors" | Target device's vendor hex code as a string | string list | "vendors": ["15b3"] |
| "deviceIDs" | Target device's device hex code as a string | string list | "deviceIDs": ["1017"] |
| "drivers" | Target device driver names as strings | string list | "drivers": ["mlx5_core"] |
| "ifNames" | Target device name | string list | "ifNames": ["enp2s2f0"] |
| "linkTypes" | The link type of the net device associated with the PCI device | string list | "linkTypes": ["ether"] |

Selectors Matching Process

The device plugin filters the host devices based on the provided selectors. Any selector that is not specified is ignored. The plugin applies a logical OR between the elements of a single selector and a logical AND between selectors.
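The matching rule can be sketched in Python as follows. This is an illustration of the described semantics, not the plugin's implementation. The device attribute names (vendor, device_id, and so on) and the function name device_matches are hypothetical, and treating an empty list like a missing selector is an assumption.

```python
# Hypothetical mapping from selector name to a device attribute.
SELECTOR_TO_ATTR = {
    "vendors": "vendor",
    "deviceIDs": "device_id",
    "drivers": "driver",
    "ifNames": "if_name",
    "linkTypes": "link_type",
}


def device_matches(device, selectors):
    """Return True if the device satisfies every selector that is provided."""
    for selector, attr in SELECTOR_TO_ATTR.items():
        allowed = selectors.get(selector)
        # Missing selectors are ignored (assumption: so are empty lists).
        if not allowed:
            continue
        # OR within a selector: any listed value is acceptable.
        # AND between selectors: the first failing selector rejects the device.
        if device.get(attr) not in allowed:
            return False
    return True


if __name__ == "__main__":
    selectors = {"vendors": ["15b3"], "deviceIDs": ["1017"], "ifNames": ["ib3", "ib4"]}
    devices = [
        {"vendor": "15b3", "device_id": "1017", "if_name": "ib3"},
        {"vendor": "15b3", "device_id": "1017", "if_name": "ib5"},
    ]
    print([d["if_name"] for d in devices if device_matches(d, selectors)])
```

With the "hca_shared_devices_b" selectors from the configuration example, a device named ib3 or ib4 is selected only if its vendor and device ID also match.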

RDMA shared device plugin deployment with node labels

The RDMA shared device plugin should be deployed on nodes that:

  1. Have RDMA-capable hardware
  2. Have the RDMA kernel stack loaded

To allow proper node selection, Node Feature Discovery (NFD) can be used to discover the node capabilities and expose them as node labels.

  1. Deploy NFD, release v0.6.0 or newer

# export NFD_VERSION=v0.6.0
# kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/node-feature-discovery/$NFD_VERSION/nfd-master.yaml.template
# kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/node-feature-discovery/$NFD_VERSION/nfd-worker-daemonset.yaml.template

  2. Check the new labels added to the node

# kubectl get nodes --show-labels

The RDMA device plugin can then be deployed on nodes labeled feature.node.kubernetes.io/custom-rdma.available=true, which indicates that the node is RDMA capable and that the RDMA modules are loaded.

Docker image

The RDMA shared device plugin uses an alpine base image by default. To build it with another base image, pass the BASE_IMAGE build argument:

docker build -t k8s-rdma-shared-dev-plugin \
--build-arg BASE_IMAGE=registry.access.redhat.com/ubi8/ubi-minimal:latest \
.

Note: Building the image with alpine v3.14.x requires Docker 20.10.0 or newer. For more information, refer to the Alpine 3.14.0 Release Notes.
