  • Stars: 210
  • Rank: 187,632 (Top 4%)
  • Language: Python
  • License: Apache License 2.0
  • Created: over 3 years ago
  • Updated: 12 months ago


Repository Details

MLCommons Algorithmic Efficiency is a benchmark and competition measuring neural network training speedups due to algorithmic improvements in both training algorithms and models.

MLCommons™ Algorithmic Efficiency


Paper (arXiv) • Installation • Rules • Contributing • License

MLCommons Algorithmic Efficiency is a benchmark and competition measuring neural network training speedups due to algorithmic improvements in both training algorithms and models. This repository holds the competition rules and the benchmark code to run it. For a detailed description of the benchmark design, see our paper.

Table of Contents

  • Installation
  • Getting Started
  • Running a workload
  • Rules
  • Contributing
  • Note on shared data pipelines between JAX and PyTorch

Installation

You can install this package and its dependencies in a Python virtual environment or use a Docker container (recommended).

TL;DR to install the JAX version for GPU, run:

pip3 install -e '.[pytorch_cpu]'
pip3 install -e '.[jax_gpu]' -f 'https://storage.googleapis.com/jax-releases/jax_cuda_releases.html'
pip3 install -e '.[full]'
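
As a quick sanity check (a suggestion, not part of the repository's documentation), you can confirm that the installed JAX build actually sees the GPU:

    # Sanity check: a GPU install should list CUDA devices, not only the CPU.
    import jax
    print(jax.devices())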

TL;DR to install the PyTorch version for GPU, run:

pip3 install -e '.[jax_cpu]'
pip3 install -e '.[pytorch_gpu]' -f 'https://download.pytorch.org/whl/torch_stable.html'
pip3 install -e '.[full]'
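
Likewise (again only a suggested sanity check, not part of the official instructions), you can confirm that PyTorch detects the GPUs:

    # Sanity check: a GPU install should report True and a nonzero device count.
    import torch
    print(torch.cuda.is_available(), torch.cuda.device_count())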

Virtual environment

Note: Python >= 3.8 is required.

To set up a virtual environment and install this repository:

  1. Create a new environment, e.g. via conda or virtualenv

     sudo apt-get install python3-venv
     python3 -m venv env
     source env/bin/activate
  2. Clone this repository

    git clone https://github.com/mlcommons/algorithmic-efficiency.git
    cd algorithmic-efficiency
  3. Run the pip3 install commands above to install algorithmic_efficiency.

Additional details: You can also install the requirements for individual workloads, e.g. via
pip3 install -e '.[librispeech]'

or all workloads at once via

pip3 install -e '.[full]'

Docker

We recommend using a Docker container to ensure a similar environment to our scoring and testing environments.

Prerequisites for NVIDIA GPU setup: You may have to install the NVIDIA Container Toolkit so that the containers can locate the NVIDIA drivers and GPUs. See instructions here.

Building Docker Image

  1. Clone this repository

    cd ~ && git clone https://github.com/mlcommons/algorithmic-efficiency.git
  2. Build Docker Image

    cd algorithmic-efficiency/docker
    docker build -t <docker_image_name> . --build-arg framework=<framework>

    The framework flag can be either pytorch, jax or both. The docker_image_name is arbitrary.

Running Docker Container (Interactive)

  1. Run a detached Docker container
    docker run -t -d \
       -v $HOME/data/:/data/ \
       -v $HOME/experiment_runs/:/experiment_runs \
       -v $HOME/experiment_runs/logs:/logs \
       -v $HOME/algorithmic-efficiency:/algorithmic-efficiency \
       --gpus all \
       --ipc=host \
       <docker_image_name> 
    This will print out a container id.
  2. Open a bash terminal
    docker exec -it <container_id> /bin/bash

Running Docker Container (End-to-end)

To run a submission end-to-end in a container, see the Getting Started document.

Getting Started

For instructions on developing and scoring your own algorithm in the benchmark, see the Getting Started document.
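
To give a feel for what a submission provides, below is a rough sketch of a submission module. The function names roughly follow the development examples under reference_algorithms/development_algorithms/, but the exact signatures are defined by the rules document and the reference submissions, so treat everything here (including the assumed hyperparameters.learning_rate attribute) as illustrative only.

    # Rough, hypothetical sketch of a PyTorch submission module; signatures are
    # illustrative and should be checked against the rules and reference submissions.
    import torch

    def get_batch_size(workload_name):
        # The submission chooses a per-workload batch size.
        del workload_name
        return 1024

    def init_optimizer_state(workload, model_params, model_state, hyperparameters, rng):
        # Build whatever optimizer state the submission needs, e.g. plain SGD.
        # Assumes model_params behaves like a torch.nn.Module and hyperparameters
        # exposes a learning_rate attribute.
        del workload, model_state, rng
        optimizer = torch.optim.SGD(model_params.parameters(),
                                    lr=hyperparameters.learning_rate)
        return {'optimizer': optimizer}

    def data_selection(workload, input_queue, optimizer_state, current_param_container,
                       model_state, hyperparameters, global_step, rng):
        # Simplest possible data selection: take the next batch from the input queue.
        return next(input_queue)

The remaining piece, an update_params function that performs one optimization step per call, is omitted here; see the MNIST reference submissions for the complete, working interface.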

Running a workload

To run a submission directly in a Docker container, see the Getting Started document.

Alternatively, from your virtual environment or an interactively running Docker container, run submission_runner.py:

JAX

python3 submission_runner.py \
    --framework=jax \
    --workload=mnist \
    --experiment_dir=$HOME/experiments \
    --experiment_name=my_first_experiment \
    --submission_path=reference_algorithms/development_algorithms/mnist/mnist_jax/submission.py \
    --tuning_search_space=reference_algorithms/development_algorithms/mnist/tuning_search_space.json

PyTorch

python3 submission_runner.py \
    --framework=pytorch \
    --workload=mnist \
    --experiment_dir=$HOME/experiments \
    --experiment_name=my_first_experiment \
    --submission_path=reference_algorithms/development_algorithms/mnist/mnist_pytorch/submission.py \
    --tuning_search_space=reference_algorithms/development_algorithms/mnist/tuning_search_space.json

Using PyTorch DDP (Recommended)

When using multiple GPUs on a single node, it is recommended to use PyTorch's DistributedDataParallel (DDP). To do so, replace python3 with

torchrun --standalone --nnodes=1 --nproc_per_node=N_GPUS

where N_GPUS is the number of available GPUs on the node. To see output only from the first process, you can run the following to redirect the output from processes 1-7 to a log file:

torchrun --redirects 1:0,2:0,3:0,4:0,5:0,6:0,7:0 --standalone --nnodes=1 --nproc_per_node=8

So the complete command is, for example:

torchrun --redirects 1:0,2:0,3:0,4:0,5:0,6:0,7:0 --standalone --nnodes=1 --nproc_per_node=8 \
submission_runner.py \
    --framework=pytorch \
    --workload=mnist \
    --experiment_dir=/home/znado \
    --experiment_name=baseline \
    --submission_path=reference_algorithms/development_algorithms/mnist/mnist_pytorch/submission.py \
    --tuning_search_space=reference_algorithms/development_algorithms/mnist/tuning_search_space.json

Rules

The rules for the MLCommons Algorithmic Efficiency benchmark can be found in the separate rules document. Suggestions, clarifications, and questions can be raised via pull requests.

Contributing

If you are interested in contributing to the work of the working group, feel free to join the weekly meetings or open issues. See our CONTRIBUTING.md for MLCommons contributing guidelines and setup and workflow instructions.

Note on shared data pipelines between JAX and PyTorch

The JAX and PyTorch versions of the Criteo, FastMRI, Librispeech, OGBG, and WMT workloads use the same TensorFlow input pipelines. Due to differences in how JAX and PyTorch distribute computations across devices, the PyTorch versions of these workloads incur additional overhead.

Since we use PyTorch's DistributedDataParallel implementation, there is one Python process for each device. Depending on the hardware and the settings of the cluster, running a TensorFlow input pipeline in each Python process can lead to errors, since too many threads are created in each process. See this PR thread for more details. While this issue might not affect all setups, we currently implement a different strategy: we only run the TensorFlow input pipeline in one Python process (with rank == 0), and broadcast the batches to all other devices. This introduces an additional communication overhead for each batch. See the implementation for the WMT workload as an example.
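
As a rough illustration of that strategy (this is not the repository's actual implementation; the iterator, batch shape, and dtype handling are assumptions), only the process with rank 0 reads from the input pipeline and then broadcasts each batch:

    # Hypothetical sketch: rank 0 pulls a batch from the (assumed NumPy-yielding)
    # input pipeline and broadcasts it to all other DDP processes.
    import torch
    import torch.distributed as dist

    def next_broadcast_batch(input_iterator, batch_shape, device):
        if dist.get_rank() == 0:
            # Only rank 0 runs the TensorFlow input pipeline.
            batch = torch.as_tensor(next(input_iterator), device=device)
        else:
            # Other ranks allocate a buffer of the agreed-upon shape to receive into.
            batch = torch.empty(batch_shape, device=device)
        # Broadcast the batch from rank 0 to every other process.
        dist.broadcast(batch, src=0)
        return batch

This per-batch broadcast is the additional communication overhead mentioned above.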

More Repositories

  1. training (Python, 1,495 stars): Reference implementations of MLPerf™ training benchmarks
  2. inference (Python, 966 stars): Reference implementations of MLPerf™ inference benchmarks
  3. ck (Python, 605 stars): Collective Knowledge (CK) is an educational community project to learn how to run AI, ML and other emerging workloads in the most efficient and cost-effective way across diverse models, data sets, software and hardware using MLCommons CM (Collective Mind workflow automation framework)
  4. tiny (C++, 293 stars): MLPerf™ Tiny is an ML benchmark suite for extremely low-power systems such as microcontrollers
  5. GaNDLF (Python, 154 stars): A generalizable application framework for segmentation, regression, and classification using PyTorch
  6. mlcube (Python, 149 stars): MLCube® is a project that reduces friction for machine learning by ensuring that models are easily portable and reproducible.
  7. medperf (Python, 144 stars): An open benchmarking platform for medical artificial intelligence using Federated Evaluation.
  8. peoples-speech (Jupyter Notebook, 96 stars): The People’s Speech Dataset
  9. training_policies (Python, 91 stars): Issues related to MLPerf™ training policies, including rules and suggested changes
  10. training_results_v0.7 (Python, 58 stars): This repository contains the results and code for the MLPerf™ Training v0.7 benchmark.
  11. inference_results_v0.5 (C++, 56 stars): This repository contains the results and code for the MLPerf™ Inference v0.5 benchmark.
  12. modelbench (Python, 53 stars): Run safety benchmarks against AI models and view detailed reports showing how well they performed.
  13. inference_policies (50 stars): Issues related to MLPerf™ Inference policies, including rules and suggested changes
  14. training_results_v0.6 (Python, 42 stars): This repository contains the results and code for the MLPerf™ Training v0.6 benchmark.
  15. croissant (Jupyter Notebook, 42 stars): Croissant is a high-level format for machine learning datasets that brings together four rich layers.
  16. training_results_v0.5 (Python, 36 stars): This repository contains the results and code for the MLPerf™ Training v0.5 benchmark.
  17. training_results_v1.0 (Python, 36 stars): This repository contains the results and code for the MLPerf™ Training v1.0 benchmark.
  18. hpc (Jupyter Notebook, 33 stars): Reference implementations of MLPerf™ HPC training benchmarks
  19. storage (Shell, 33 stars): MLPerf™ Storage Benchmark Suite
  20. inference_results_v1.0 (C++, 31 stars): This repository contains the results and code for the MLPerf™ Inference v1.0 benchmark.
  21. mlcube_examples (Python, 30 stars): MLCube® examples
  22. chakra (Python, 30 stars): Repository for MLCommons Chakra schema and tools
  23. mobile_app_open (C++, 30 stars): Mobile App Open
  24. training_results_v2.0 (C++, 27 stars): This repository contains the results and code for the MLPerf™ Training v2.0 benchmark.
  25. modelgauge (Python, 25 stars): Make it easy to automatically and uniformly measure the behavior of many AI Systems.
  26. policies (Python, 24 stars): General policies for MLPerf™ including submission rules, coding standards, etc.
  27. training_results_v1.1 (Python, 23 stars): This repository contains the results and code for the MLPerf™ Training v1.1 benchmark.
  28. mobile_models (22 stars): MLPerf™ Mobile models
  29. logging (Python, 20 stars): MLPerf™ logging library
  30. inference_results_v2.1 (19 stars): This repository contains the results and code for the MLPerf™ Inference v2.1 benchmark.
  31. ck-mlops (Python, 17 stars): A collection of portable workflows, automation recipes and components for MLOps in a unified CK format. Note that this repository is outdated - please check the 2nd generation of the CK workflow automation meta-framework with portable MLOps and DevOps components here:
  32. inference_results_v0.7 (C++, 17 stars): This repository contains the results and code for the MLPerf™ Inference v0.7 benchmark.
  33. inference_results_v3.0 (16 stars): This repository contains the results and code for the MLPerf™ Inference v3.0 benchmark.
  34. training_results_v2.1 (C++, 15 stars): This repository contains the results and code for the MLPerf™ Training v2.1 benchmark.
  35. power-dev (Python, 14 stars): Dev repo for power measurement for the MLPerf™ benchmarks
  36. medical (Python, 13 stars): Medical ML Benchmark
  37. dynabench (Python, 12 stars)
  38. training_results_v3.0 (Python, 11 stars): This repository contains the results and code for the MLPerf™ Training v3.0 benchmark.
  39. tiny_results_v0.7 (C, 11 stars): This repository contains the results and code for the MLPerf™ Tiny Inference v0.7 benchmark.
  40. inference_results_v1.1 (Python, 11 stars): This repository contains the results and code for the MLPerf™ Inference v1.1 benchmark.
  41. inference_results_v4.0 (9 stars): This repository contains the results and code for the MLPerf™ Inference v4.0 benchmark.
  42. dataperf (8 stars): Data Benchmarking
  43. inference_results_v2.0 (Python, 8 stars): This repository contains the results and code for the MLPerf™ Inference v2.0 benchmark.
  44. mobile_open (Python, 7 stars): MLPerf Mobile benchmarks
  45. science (Jupyter Notebook, 7 stars): https://mlcommons.org/en/groups/research-science/
  46. tiny_results_v0.5 (C++, 5 stars): This repository contains the results and code for the MLPerf™ Tiny Inference v0.5 benchmark.
  47. inference_results_v3.1 (5 stars): This repository contains the results and code for the MLPerf™ Inference v3.1 benchmark.
  48. tiny_results_v1.0 (C, 4 stars): This repository contains the results and code for the MLPerf™ Tiny Inference v1.0 benchmark.
  49. hpc_results_v0.7 (Python, 3 stars): This repository contains the results and code for the MLPerf™ HPC Training v0.7 benchmark.
  50. hpc_results_v2.0 (Python, 3 stars): This repository contains the results and code for the MLPerf™ HPC Training v2.0 benchmark.
  51. hpc_results_v1.0 (Python, 3 stars): This repository contains the results and code for the MLPerf™ HPC Training v1.0 benchmark.
  52. ck-venv (Python, 2 stars): CK automation for virtual environments
  53. cm-mlops (Python, 2 stars)
  54. datasets_infra (2 stars)
  55. training_results_v3.1 (Python, 1 star): This repository contains the results and code for the MLPerf™ Training v3.1 benchmark.
  56. research (1 star)
  57. tiny_results_v1.1 (C, 1 star): This repository contains the results and code for the MLPerf™ Tiny Inference v1.1 benchmark.
  58. medperf-website (JavaScript, 1 star)
  59. mobile_results_v2.1 (1 star): This repository contains the results and code for the MLPerf™ Mobile Inference v2.1 benchmark.
  60. hpc_results_v3.0 (Python, 1 star): This repository contains the results and code for the MLPerf™ HPC Training v3.0 benchmark.
  61. ck_mlperf_results (Python, 1 star): Aggregated benchmarking results from MLPerf Inference, Tiny and Training in the MLCommons CM format for the Collective Knowledge Playground. Our goal is to make it easier for the community to visualize, compare and reproduce MLPerf results and add derived metrics such as Performance/Watt or Performance/$