Facebook AI Performance Evaluation Platform

Facebook AI Performance Evaluation Platform is a framework and backend agnostic benchmarking platform to compare machine learning inferencing runtime metrics on a set of models and on a variety of backends. It also provides a means to check performance regressions on each commit. It is licensed under Apache License 2.0. Please refer to the LICENSE file for details.

Currently, the following performance metrics are collected:

  • Delay : the latency of running the entire network and/or the delay of running each individual operator.
  • Error : the error between the values of the outputs running a model and the golden outputs.
  • Energy/Power : the energy per inference and average power of running the ML model on a phone without a battery.
  • Other User Provided Metrics : the harness can accept any metric that the user binary generates.

Framework and backend agnostic benchmarking platforms

Machine learning is a rapidly evolving area with many moving parts: new and existing framework enhancements, new hardware solutions, new software backends, and new models. With so many moving parts, it is very difficult to quickly evaluate the performance of a machine learning model. However, such evaluation is vastly important in guiding resource allocation in:

  • the development of the frameworks
  • the optimization of the software backends
  • the selection of the hardware solutions
  • the iteration of the machine learning models

This project aims to achieve the following two goals:

  • Easily evaluate the runtime performance of a model selected to be benchmarked on all existing backends.
  • Easily evaluate the runtime performance of a backend selected to be benchmarked on all existing models.

The flow of benchmarking is illustrated in the following figure:

[Figure: Benchmarking flow]

The flow is composed of three parts:

  • A centralized model/benchmark specification
    • A fair input to the comparison
  • A centralized benchmark driver with distributed benchmark execution
    • The same code base for all backends to reduce variation
    • Distributed execution due to the unique build/run environment for each backend
  • A centralized data consumption
    • One stop to compare the performance

The currently supported frameworks are: Caffe2, TFLite

The currently supported model formats are: Caffe2, TFLite

The currently supported backends are: CPU, GPU, DSP, Android, iOS, Linux based systems

The currently supported libraries are: Eigen, MKL, NNPACK, OpenGL, CUDA

Performance regression detection

The benchmark platform also provides a means to compare performance between commits and detect regressions. It uses an A/B testing methodology that compares the runtime difference between a newer commit (treatment) and an older commit (control). The metric of interest is the relative performance difference between the commits, as the backend platform's condition may be different at different times. Running the same tests on two different commit points at the same time removes most of the variations of the backend. This method has been shown to improve the precision of detecting performance regressions.
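
The relative comparison described above can be illustrated with a small sketch. It is not the platform's actual implementation: the function names, the use of the median to summarize each sample, and the 5% threshold are all assumptions chosen for illustration.

```python
from statistics import median


def relative_difference(control, treatment):
    """Return (treatment - control) / control using the median of each sample.

    Both arguments are lists of latencies measured back to back on the
    same device. Summarizing with the median is an assumption of this sketch.
    """
    control_median = median(control)
    treatment_median = median(treatment)
    return (treatment_median - control_median) / control_median


def is_regression(control, treatment, threshold=0.05):
    """Flag a regression when the treatment is slower by more than threshold.

    The 5% default is a hypothetical value, not one taken from the platform.
    """
    return relative_difference(control, treatment) > threshold


# Hypothetical latencies in milliseconds for the two commits.
control_ms = [10.0, 10.2, 9.9, 10.1, 10.0]
treatment_ms = [11.0, 11.3, 10.9, 11.1, 11.2]
print(is_regression(control_ms, treatment_ms))
```

Because both samples are taken on the same device at nearly the same time, slow drifts in the device's condition affect control and treatment alike and largely cancel out in the ratio.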

Directory structure

The benchmarking codebase resides in the benchmarking directory. Inside, the frameworks directory contains all supported ML frameworks. A new framework can be added by creating a new directory, deriving from framework_base.py, and implementing all its methods. The platforms directory contains all supported ML backend platforms. A new backend can be added by creating a new directory, deriving from platform_base.py, and implementing all its methods.

The model specifications reside in the specifications directory. Inside, the models directory contains all model and benchmarking specifications organized by model format. The benchmarks directory contains a sequence of benchmarks organized by model format. The frameworks directory contains custom build scripts for each framework.

Model/Benchmark specification

The models and benchmarks are specified in JSON format. The file /specifications/models/caffe2/squeezenet/squeezenet.json is a good example for understanding what data is specified.

A few key items in the specifications:

  • The models are hosted in third party storage. The download links and their MD5 hashes are specified. The benchmarking tool automatically downloads the model if it is not found in the local model cache. The MD5 hash of the cached model is computed and compared with the specified one. If they do not match, the model is downloaded again and the MD5 hash is recomputed. This way, if the model changes, you only need to update the specification and the new model is downloaded automatically.
  • In the inputs field of tests, one may specify multiple shapes. This is shorthand for benchmarking the tests of all shapes in sequence.
  • In some fields, such as identifier, you may find a string like {ID}. This is a placeholder that the benchmarking tool replaces to differentiate the multiple test runs specified in one test specification, as in the item above.
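
The cache check and the {ID} placeholder can be sketched roughly as follows. This is an illustration only, not the tool's own code: the function names are hypothetical, and numbering the test runs from zero is an assumption.

```python
import hashlib
import os


def file_md5(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_download(cached_path, expected_md5):
    """Return True when the cached model is missing or its hash differs."""
    if not os.path.isfile(cached_path):
        return True
    return file_md5(cached_path) != expected_md5


def expand_identifier(template, count):
    """Replace the {ID} placeholder once per test run.

    Numbering the runs 0..count-1 is an assumption of this sketch.
    """
    return [template.replace("{ID}", str(i)) for i in range(count)]
```

With this shape, changing a model only requires updating the hash in the specification: the next run sees a mismatch against the cached file and fetches the new model.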

Run benchmark

To run the benchmark, run run_bench.py and pass it either a model metadata file or a benchmark metadata file. An example command is the following (when running from the FAI-PEP directory):

benchmarking/run_bench.py -b specifications/models/caffe2/shufflenet/shufflenet.json

When you run the command for the first time, you are asked several questions. The answers to those questions, together with other sensible defaults, are saved in a config file: ~/.aibench/git/config.txt. You can edit the file to update your default arguments.

The arguments to the driver are as follows. The driver also accepts the arguments described in the following sections and passes them to those scripts.

usage: run_bench.py [-h] [--reset_options]

Perform one benchmark run

optional arguments:
  -h, --help       show this help message and exit
  --reset_options  Reset all the options that is saved by default.

run_bench.py can be the single point of entry for both interactive and regression benchmark runs.

Stand alone benchmark run

harness.py is the entry point for one benchmark run. It collects the runtime for an entire net and/or individual operators, and saves the data locally or pushes it to a remote server. The usage of the script is as follows:

usage: harness.py [-h] [--android_dir ANDROID_DIR] [--ios_dir IOS_DIR]
                  [--backend BACKEND] -b BENCHMARK_FILE
                  [--command_args COMMAND_ARGS] [--cooldown COOLDOWN]
                  [--device DEVICE] [-d DEVICES]
                  [--excluded_devices EXCLUDED_DEVICES] --framework
                  {caffe2,generic,oculus,tflite} --info INFO
                  [--local_reporter LOCAL_REPORTER]
                  [--monsoon_map MONSOON_MAP]
                  [--simple_local_reporter SIMPLE_LOCAL_REPORTER]
                  --model_cache MODEL_CACHE -p PLATFORM
                  [--platform_sig PLATFORM_SIG] [--program PROGRAM] [--reboot]
                  [--regressed_types REGRESSED_TYPES]
                  [--remote_reporter REMOTE_REPORTER]
                  [--remote_access_token REMOTE_ACCESS_TOKEN]
                  [--root_model_dir ROOT_MODEL_DIR]
                  [--run_type {benchmark,verify,regress}] [--screen_reporter]
                  [--simple_screen_reporter] [--set_freq SET_FREQ]
                  [--shared_libs SHARED_LIBS] [--string_map STRING_MAP]
                  [--timeout TIMEOUT] [--user_identifier USER_IDENTIFIER]
                  [--wipe_cache WIPE_CACHE]
                  [--hash_platform_mapping HASH_PLATFORM_MAPPING]
                  [--user_string USER_STRING]

Perform one benchmark run

optional arguments:
  -h, --help            show this help message and exit
  --android_dir ANDROID_DIR
                        The directory in the android device all files are
                        pushed to.
  --ios_dir IOS_DIR     The directory in the ios device all files are pushed
                        to.
  --backend BACKEND     Specify the backend the test runs on.
  -b BENCHMARK_FILE, --benchmark_file BENCHMARK_FILE
                        Specify the json file for the benchmark or a number of
                        benchmarks
  --command_args COMMAND_ARGS
                        Specify optional command arguments that would go with
                        the main benchmark command
  --cooldown COOLDOWN   Specify the time interval between two test runs.
  --device DEVICE       The single device to run this benchmark on
  -d DEVICES, --devices DEVICES
                        Specify the devices to run the benchmark, in a comma
                        separated list. The value is the device or device_hash
                        field of the meta info.
  --excluded_devices EXCLUDED_DEVICES
                        Specify the devices that skip the benchmark, in a
                        comma separated list. The value is the device or
                        device_hash field of the meta info.
  --framework {caffe2,generic,oculus,tflite}
                        Specify the framework to benchmark on.
  --info INFO           The json serialized options describing the control and
                        treatment.
  --local_reporter LOCAL_REPORTER
                        Save the result to a directory specified by this
                        argument.
  --monsoon_map MONSOON_MAP
                        Map the phone hash to the monsoon serial number.
  --simple_local_reporter SIMPLE_LOCAL_REPORTER
                        Same as local reporter, but the directory hierarchy is
                        reduced.
  --model_cache MODEL_CACHE
                        The local directory containing the cached models. It
                        should not be part of a git directory.
  -p PLATFORM, --platform PLATFORM
                        Specify the platform to benchmark on. Use this flag if
                        the framework needs special compilation scripts. The
                        scripts are called build.sh saved in
                        specifications/frameworks/<framework>/<platform>
                        directory
  --platform_sig PLATFORM_SIG
                        Specify the platform signature
  --program PROGRAM     The program to run on the platform.
  --reboot              Tries to reboot the devices before launching
                        benchmarks for one commit.
  --regressed_types REGRESSED_TYPES
                        A json string that encodes the types of the regressed
                        tests.
  --remote_reporter REMOTE_REPORTER
                        Save the result to a remote server. The style is
                        <domain_name>/<endpoint>|<category>
  --remote_access_token REMOTE_ACCESS_TOKEN
                        The access token to access the remote server
  --root_model_dir ROOT_MODEL_DIR
                        The root model directory if the meta data of the model
                        uses relative directory, i.e. the location field
                        starts with //
  --run_type {benchmark,verify,regress}
                        The type of the current run. The allowed values are:
                        benchmark, the normal benchmark run.verify, the
                        benchmark is re-run to confirm a suspicious
                        regression.regress, the regression is confirmed.
  --screen_reporter     Display the summary of the benchmark result on screen.
  --simple_screen_reporter
                        Display the result on screen with no post processing.
  --set_freq SET_FREQ   On rooted android phones, set the frequency of the
                        cores. The supported values are: max: set all cores to
                        the maximum frquency. min: set all cores to the
                        minimum frequency. mid: set all cores to the median
                        frequency.
  --shared_libs SHARED_LIBS
                        Pass the shared libs that the framework depends on, in
                        a comma separated list.
  --string_map STRING_MAP
                        A json string mapping tokens to replacement strings.
                        The tokens, surrounded by \{\}, when appearing in the
                        test fields of the json file, are to be replaced with
                        the mapped values.
  --timeout TIMEOUT     Specify a timeout running the test on the platforms.
                        The timeout value needs to be large enough so that the
                        low end devices can safely finish the execution in
                        normal conditions. Note, in A/B testing mode, the test
                        runs twice.
  --user_identifier USER_IDENTIFIER
                        User can specify an identifier and that will be passed
                        to the output so that the result can be easily
                        identified.
  --wipe_cache WIPE_CACHE
                        Specify whether to evict cache or not before running
  --hash_platform_mapping HASH_PLATFORM_MAPPING
                        Specify the devices hash platform mapping json file.
  --user_string USER_STRING
                        Specify the user running the test (to be passed to the
                        remote reporter).
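
As a rough illustration of how the required options fit together, the sketch below assembles an argument list for harness.py. The helper name, the script path, the platform name, the cache directory, and the contents of the --info payload are all assumptions made for this example; only the flag names come from the usage text above.

```python
import json
import shlex


def build_harness_command(benchmark_file, framework, platform, model_cache, info):
    """Assemble an argument list for harness.py from its required options."""
    return [
        "benchmarking/harness.py",  # assumed location of the script
        "-b", benchmark_file,
        "--framework", framework,
        "-p", platform,
        "--model_cache", model_cache,
        "--info", json.dumps(info),  # --info expects a JSON serialized string
    ]


command = build_harness_command(
    benchmark_file="specifications/models/caffe2/squeezenet/squeezenet.json",
    framework="caffe2",
    platform="android",  # assumed platform name
    model_cache="/tmp/model_cache",  # hypothetical; keep it outside any git directory
    info={"treatment": {"commit": "abc123"}},  # hypothetical structure
)
# shlex.join quotes the JSON payload so the line can be pasted into a shell.
print(shlex.join(command))
```

Building the arguments as a list rather than a single string avoids quoting problems with the JSON payload when the command is handed to subprocess.run.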

Continuous benchmark run

repo_driver.py is the entry point to run the benchmark continuously. It repeatedly pulls the framework from GitHub, builds the framework, and launches harness.py with the built benchmarking binaries.

The accepted arguments are as follows:

usage: repo_driver.py [-h] [--ab_testing] [--base_commit BASE_COMMIT]
                      [--branch BRANCH] [--commit COMMIT]
                      [--commit_file COMMIT_FILE] --exec_dir EXEC_DIR
                      --framework {caffe2,oculus,generic,tflite}
                      [--frameworks_dir FRAMEWORKS_DIR] [--interval INTERVAL]
                      --platforms PLATFORMS [--regression]
                      [--remote_repository REMOTE_REPOSITORY]
                      [--repo {git,hg}] --repo_dir REPO_DIR [--same_host]
                      [--status_file STATUS_FILE] [--step STEP]

Perform one benchmark run

optional arguments:
  -h, --help            show this help message and exit
  --ab_testing          Enable A/B testing in benchmark.
  --base_commit BASE_COMMIT
                        In A/B testing, this is the control commit that is
                        used to compare against. If not specified, the default
                        is the first commit in the week in UTC timezone. Even
                        if specified, the control is the later of the
                        specified commit and the commit at the start of the
                        week.
  --branch BRANCH       The remote repository branch. Defaults to master
  --commit COMMIT       The commit this benchmark runs on. It can be a branch.
                        Defaults to master. If it is a commit hash, and
                        program runs on continuous mode, it is the starting
                        commit hash the regression runs on. The regression
                        runs on all commits starting from the specified
                        commit.
  --commit_file COMMIT_FILE
                        The file saves the last commit hash that the
                        regression has finished. If this argument is specified
                        and is valid, the --commit has no use.
  --exec_dir EXEC_DIR   The executable is saved in the specified directory. If
                        an executable is found for a commit, no re-compilation
                        is performed. Instead, the previous compiled
                        executable is reused.
  --framework {caffe2,oculus,generic,tflite}
                        Specify the framework to benchmark on.
  --frameworks_dir FRAMEWORKS_DIR
                        Required. The root directory that all frameworks
                        resides. Usually it is the
                        specifications/frameworksdirectory.
  --interval INTERVAL   The minimum time interval in seconds between two
                        benchmark runs.
  --platforms PLATFORMS
                        Specify the platforms to benchmark on, in comma
                        separated list.Use this flag if the framework needs
                        special compilation scripts. The scripts are called
                        build.sh saved in
                        specifications/frameworks/<framework>/<platforms>
                        directory
  --regression          Indicate whether this run detects regression.
  --remote_repository REMOTE_REPOSITORY
                        The remote repository. Defaults to origin
  --repo {git,hg}       Specify the source control repo of the framework.
  --repo_dir REPO_DIR   Required. The base framework repo directory used for
                        benchmark.
  --same_host           Specify whether the build and benchmark run are on the
                        same host. If so, the build cannot be done in parallel
                        with the benchmark run.
  --status_file STATUS_FILE
                        A file to inform the driver stops running when the
                        content of the file is 0.
  --step STEP           Specify the number of commits we want to run the
                        benchmark once under continuous mode.

repo_driver.py also accepts the arguments recognized by harness.py and passes them through to it.
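
The --status_file behavior described in the help text can be sketched as a small check that a long-running loop consults between runs. This is not the driver's actual code: the function name is hypothetical, and treating a missing or unspecified file as "keep running" is an assumption.

```python
import os
import tempfile


def should_keep_running(status_file):
    """Return False once the status file contains 0, True otherwise.

    A missing or unspecified file is treated as "keep running", which is
    an assumption of this sketch.
    """
    if not status_file or not os.path.isfile(status_file):
        return True
    with open(status_file) as f:
        return f.read().strip() != "0"


# Demonstrate the check with a temporary status file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "status")
    with open(path, "w") as f:
        f.write("1")
    print(should_keep_running(path))  # True
    with open(path, "w") as f:
        f.write("0")
    print(should_keep_running(path))  # False
```

Writing 0 to the file from another shell then lets an operator stop a continuous run cleanly between benchmarks instead of killing the process mid-run.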
