  • Stars
    2,539
  • Rank 17,375 (Top 0.4 %)
  • Language
    C++
  • License
    MIT License
  • Created about 7 years ago
  • Updated 8 months ago


Repository Details

The Compute Library is a set of computer vision and machine learning functions optimised for both Arm CPUs and GPUs using SIMD technologies.

⚠ Deprecation Notice (24.01 announcement): NCHW data format specific optimizations will gradually be removed from the code base in future releases. As a result, users are expected to convert NCHW models to NHWC in order to benefit from the optimizations.
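To make the NCHW to NHWC conversion mentioned above concrete, here is a minimal sketch in plain C++ that reorders a dense FP32 tensor. It is independent of the Compute Library API: the function name `nchw_to_nhwc` and the use of `std::vector<float>` are assumptions made for illustration only, and a real model conversion would normally be done by the framework or converter tooling.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the Compute Library):
// copies a dense NCHW tensor into a new buffer laid out as NHWC.
std::vector<float> nchw_to_nhwc(const std::vector<float>& src,
                                std::size_t n, std::size_t c,
                                std::size_t h, std::size_t w)
{
    // The input must hold exactly n * c * h * w elements.
    assert(src.size() == n * c * h * w);

    std::vector<float> dst(src.size());
    for (std::size_t in = 0; in < n; ++in)
        for (std::size_t ic = 0; ic < c; ++ic)
            for (std::size_t ih = 0; ih < h; ++ih)
                for (std::size_t iw = 0; iw < w; ++iw)
                {
                    // NCHW: width is the fastest-moving index.
                    const std::size_t src_idx = ((in * c + ic) * h + ih) * w + iw;
                    // NHWC: channel is the fastest-moving index.
                    const std::size_t dst_idx = ((in * h + ih) * w + iw) * c + ic;
                    dst[dst_idx] = src[src_idx];
                }
    return dst;
}
```

For a 1x2x1x3 tensor, the two channel rows are interleaved so that the values of each pixel sit next to each other in memory.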




Compute Library

The Compute Library is a collection of low-level machine learning functions optimized for Arm® Cortex®-A and Arm® Neoverse® CPU architectures and Arm® Mali™ GPU architectures.

The library provides superior performance to other open source alternatives and immediate support for new Arm® technologies such as SVE2.

Key Features:

  • Open source software available under a permissive MIT license
  • Over 100 machine learning functions for CPU and GPU
  • Multiple convolution algorithms (GeMM, Winograd, FFT, Direct and indirect-GeMM)
  • Support for multiple data types: FP32, FP16, INT8, UINT8, BFLOAT16
  • Micro-architecture optimization for key ML primitives
  • Highly configurable build options enabling lightweight binaries
  • Advanced optimization techniques such as kernel fusion, Fast math enablement and texture utilization
  • Device and workload specific tuning using OpenCL tuner and GeMM optimized heuristics
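As an illustration of one of the data types listed above, BFLOAT16 keeps the sign bit, the 8 exponent bits and the top 7 mantissa bits of an IEEE 754 FP32 value. The sketch below converts by simple truncation. The function names are hypothetical and are not taken from the library, which may use a different rounding mode.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: FP32 -> BFLOAT16 by dropping the low 16 bits.
// Truncation only; no rounding to nearest is attempted.
std::uint16_t float_to_bf16_truncate(float value)
{
    std::uint32_t bits = 0;
    std::memcpy(&bits, &value, sizeof(bits)); // well-defined bit copy
    return static_cast<std::uint16_t>(bits >> 16);
}

// Hypothetical helper: BFLOAT16 -> FP32 by zero-filling the low 16 bits.
float bf16_to_float(std::uint16_t value)
{
    const std::uint32_t bits = static_cast<std::uint32_t>(value) << 16;
    float result = 0.0f;
    std::memcpy(&result, &bits, sizeof(result));
    return result;
}
```

Values whose mantissa fits in 7 bits, such as 1.0 or -2.5, survive the round trip unchanged. Other values lose precision but keep the full FP32 exponent range.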

Repository Link
Release https://github.com/arm-software/ComputeLibrary
Development https://review.mlplatform.org/#/admin/projects/ml/ComputeLibrary

Documentation

Note: The documentation includes the reference API, changelogs, build guide, contribution guide, errata, etc.


Pre-built binaries

All the binaries can be downloaded from here or from the tables below.


Platform         Operating System    Release archive (Download)
Raspberry Pi 4   Linux® 32bit
Raspberry Pi 4   Linux® 64bit
Odroid N2        Linux® 64bit
HiKey960         Linux® 64bit

Architecture     Operating System    Release archive (Download)
armv7            Linux®
arm64-v8a        Android™
arm64-v8a        Linux®
arm64-v8.2-a     Android™
arm64-v8.2-a     Linux®

Please refer to the following link for more pre-built binaries:

Pre-built binaries are generated with the following flags, which relate to security and good coding practices:

-Wall, -Wextra, -Wformat=2, -Winit-self, -Wstrict-overflow=2, -Wswitch-default, -Woverloaded-virtual, -Wformat-security, -Wctor-dtor-privacy, -Wsign-promo, -Weffc++, -pedantic, -fstack-protector-strong

Supported Architectures/Technologies

  • Arm® CPUs:

    • Arm® Cortex®-A processor family using Arm® Neon™ technology
    • Arm® Neoverse® processor family
    • Arm® Cortex®-R processor family with Armv8-R AArch64 architecture using Arm® Neon™ technology
    • Arm® Cortex®-X1 processor using Arm® Neon™ technology
  • Arm® Mali™ GPUs:

    • Arm® Mali™-G processor family
    • Arm® Mali™-T processor family
  • x86


Supported Systems

  • Android™
  • Bare Metal
  • Linux®
  • OpenBSD®
  • macOS®
  • Tizen™

Resources


Experimental builds

⚠ Important: Bazel and CMake builds are experimental, CPU-only builds. Please see the documentation for more details.


How to contribute

Contributions to the Compute Library are more than welcome. If you are interested in contributing, please have a look at our how to contribute guidelines.

Developer Certificate of Origin (DCO)

Before the Compute Library accepts your contribution, you need to certify its origin and give us your permission. To manage this process we use the Developer Certificate of Origin (DCO) V1.1 (https://developercertificate.org/)

To indicate that you agree to the terms of the DCO, you "sign off" your contribution by adding a line with your name and e-mail address to every git commit message:

Signed-off-by: John Doe <[email protected]>

You must use your real name; pseudonyms and anonymous contributions are not accepted.
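In practice, `git commit -s` appends this line for you. As a rough illustration of the format, the sketch below checks whether a commit message contains a sign-off line of the shape shown above. The function `has_sign_off` is a hypothetical helper written for this example, not part of the project's tooling, and it only checks the shape of the line, not whether the name is real.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: returns true if any line of the commit message
// looks like "Signed-off-by: <name> <<address>>".
bool has_sign_off(const std::string& commit_message)
{
    const std::string prefix = "Signed-off-by: ";
    std::istringstream lines(commit_message);
    std::string line;
    while (std::getline(lines, line))
    {
        // The line must start with the exact prefix.
        if (line.compare(0, prefix.size(), prefix) != 0)
            continue;

        const std::size_t open  = line.find('<', prefix.size());
        const std::size_t close = line.rfind('>');
        // Require some text before '<' (the name) and between '<' and '>' (the address).
        if (open != std::string::npos && close != std::string::npos &&
            open > prefix.size() && close > open + 1)
            return true;
    }
    return false;
}
```

A check like this could run in a local commit hook to catch a missing sign-off before a patch is pushed for review.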

Public mailing list

For technical discussion, the ComputeLibrary project has a public mailing list: [email protected]. The list is open to anyone inside or outside of Arm to self-subscribe. To subscribe, please visit the following website: https://lists.linaro.org/mailman3/lists/acl-dev.lists.linaro.org/


License and Contributions

The software is provided under the MIT license. Contributions to this project are accepted under the same license.

Other Projects

This project contains code from other projects as listed below. The original license text is included in those source files.

  • The OpenCL header library is licensed under Apache License, Version 2.0, which is a permissive license compatible with MIT license.

  • The half library is licensed under MIT license.

  • The libnpy library is licensed under MIT license.

  • The stb image library is either licensed under MIT license or is in Public Domain. It is used by this project under the terms of MIT license.


Trademarks and Copyrights

Android is a trademark of Google LLC.

Arm, Cortex, Mali and Neon are registered trademarks or trademarks of Arm Limited (or its subsidiaries) in the US and/or elsewhere.

Bazel is a trademark of Google LLC., registered in the U.S. and other countries.

CMake is a trademark of Kitware, Inc., registered in the U.S. and other countries.

Linux® is the registered trademark of Linus Torvalds in the U.S. and other countries.

Mac and macOS are trademarks of Apple Inc., registered in the U.S. and other countries.

Tizen is a registered trademark of The Linux Foundation.

Windows® is a trademark of the Microsoft group of companies.

More Repositories

1

arm-trusted-firmware

Read-only mirror of Trusted Firmware-A
C
1,690
star
2

CMSIS_5

CMSIS Version 5 Development Repository
C
1,233
star
3

armnn

Arm NN ML Software. The code here is a read-only mirror of https://review.mlplatform.org/admin/repos/ml/armnn
C++
1,104
star
4

ML-KWS-for-MCU

Keyword spotting on Arm Cortex-M Microcontrollers
C
1,040
star
5

astc-encoder

The Arm ASTC Encoder, a compressor for the Adaptive Scalable Texture Compression data format.
C
880
star
6

abi-aa

Application Binary Interface for the Arm® Architecture
HTML
673
star
7

vulkan_best_practice_for_mobile_developers

Vulkan best practice for mobile developers
C++
564
star
8

optimized-routines

Optimized implementations of various library functions for ARM architecture processors
C
486
star
9

CMSIS-FreeRTOS

FreeRTOS adaptation for CMSIS-RTOS Version 2
C
460
star
10

CMSIS_4

Cortex Microcontroller Software Interface Standard (V4 no longer maintained)
C
437
star
11

ML-examples

Arm Machine Learning tutorials and examples
C++
371
star
12

LLVM-embedded-toolchain-for-Arm

A project dedicated to building LLVM toolchain for 32-bit Arm embedded targets.
CMake
331
star
13

opengl-es-sdk-for-android

OpenGL ES SDK for Android
CSS
325
star
14

mango

Parallel Hyperparameter Tuning in Python
Jupyter Notebook
311
star
15

SCALE-Sim

Python
296
star
16

CMSIS-DSP

CMSIS-DSP embedded compute library for Cortex-M and Cortex-A
C
277
star
17

Arm-2D

2D Graphic Library optimized for Cortex-M processors
C
232
star
18

Tool-Solutions

Tutorials & examples for Arm software development tools.
C
217
star
19

EndpointAI

C++
216
star
20

SCP-firmware

Read-only mirror of System Control Processor (SCP) firmware
C
205
star
21

vulkan-sdk

Github repository for the Vulkan SDK
C
199
star
22

lisa

Linux Integrated System Analysis
Jupyter Notebook
192
star
23

HWCPipe

Hardware counters interface
C++
188
star
24

u-boot

Clone of upstream U-Boot repo with patches for Arm development boards
C
169
star
25

ML-zoo

Python
149
star
26

android-nn-driver

C++
149
star
27

CMSIS-NN

CMSIS-NN Library
C
148
star
28

CMSIS-Driver

Repository of microcontroller peripheral driver implementing the CMSIS-Driver API specification
C
144
star
29

workload-automation

A framework for automating workload execution and measurement collection on ARM devices.
Python
138
star
30

gator

Sources for Arm Streamline's gator daemon
C++
121
star
31

perfdoc

A cross-platform Vulkan layer which checks Vulkan applications for best practices on Arm Mali devices.
C++
112
star
32

ebbr

Embedded Base Boot Requirements Specification
PostScript
108
star
33

keyword-transformer

Official implementation of the Keyword Transformer: https://arxiv.org/abs/2104.00769
Jupyter Notebook
108
star
34

linux

C
95
star
35

asl-interpreter

Example implementation of Arm's Architecture Specification Language (ASL)
OCaml
94
star
36

mobile-studio-integration-for-unity

Mobile Studio tool integration with C# scripting for the Unity game engine.
C
86
star
37

sbsa-acs

ARM Enterprise: SBSA Architecture Compliance Suite
C
84
star
38

CMSIS_6

CMSIS version 6 (successor of CMSIS_5)
C
83
star
39

sesr

Super-Efficient Super Resolution
Python
80
star
40

CSAL

Coresight Access Library
C
78
star
41

progress64

PROGRESS64 is a C library of scalable functions for concurrent programs, primarily focused on networking applications.
C
65
star
42

Cloud-IoT-Core-Kit-Examples

Example projects and code are supplied to support the Arm-based IoT Kit for Cloud IoT Core
Python
61
star
43

trappy

This repository has moved to https://gitlab.arm.com/tooling/trappy
Python
61
star
44

psa-arch-tests

Tests for verifying implementations of TBSA-v8M and the PSA Certified APIs
C
61
star
45

cmsis-pack-eclipse

CMSIS-Pack Eclipse Plug-ins
Java
60
star
46

developer

GTM related documentation
C++
59
star
47

AVH-GetStarted

C
56
star
48

ethos-n-driver-stack

Driver stack (including user space libraries, kernel module and firmware) for the Arm® Ethos™-N NPU
C++
53
star
49

acle

Arm C Language Extensions (ACLE)
Python
52
star
50

patrace

C++
52
star
51

tarmac-trace-utilities

Tools for analyzing and browsing Tarmac instruction traces.
C++
47
star
52

devlib

Library for interaction with and instrumentation of remote devices.
Python
44
star
53

speculation-barrier

This project provides a header file which contains wrapper macros for the __builtin_load_no_speculate builtin function defined at https://www.arm.com/security-update This builtin function defines a speculation barrier, which can be used to limit the conditions under which a value which has been loaded can be used under speculative execution.
Objective-C
44
star
54

arm-systemready

Arm SystemReady
Shell
43
star
55

arm-enterprise-acs

ARM Enterprise ACS
C
39
star
56

DeepFreeze

SystemVerilog
38
star
57

tf-issues

Issue tracking for the ARM Trusted Firmware project
36
star
58

scalpel

This is a PyTorch implementation of the Scalpel. Node pruning for five benchmark networks and SIMD-aware weight pruning for LeNet-300-100 and LeNet-5 is included.
Python
35
star
59

psa-api

Documentation source and development of the PSA Certified API
C
34
star
60

perf-libs-tools

C
31
star
61

bob-build

Meta-build system using Blueprint and ninja
Go
30
star
62

CMSIS-RTX

RTX5 real time kernel for Arm Cortex-based embedded systems (spin-off from CMSIS_5)
C
29
star
63

TZ-TRNG

TrustZone True Number Generator
C
29
star
64

mram_simulation_framework

MRAM magnetization simulation framework. s-LLGS python and verilog-a solvers for transients simulation and Fokker-planck equation solver for stochastic analysis
Python
28
star
65

AVH

AVH FVPs: Arm Virtual Hardware with Fixed Virtual Platforms
C
27
star
66

bento-linker

A light-weight alternative to processes for microcontrollers.
C
27
star
67

data

Machine-readable data describing Arm architecture and implementations. Includes JSON descriptions of implemented PMU events.
26
star
68

synchronization-benchmarks

Collection of synchronization micro-benchmarks and traces from infrastructure applications
C
26
star
69

libGPUInfo

A utility library for application developers to query the configuration of the Arm Immortalis GPU or Arm Mali GPU present in their system.
C++
24
star
70

toolchain-gnu-bare-metal

A toolchain sub-project dedicated to build GNU toolchain for 32-bit bare-metal targets
Shell
24
star
71

NXP_LPC

CMSIS Driver Implementations for the NXP LPC Microcontroller Series
C
23
star
72

golang-utils

Helpers and utilities for Golang in order to do actions not available in the standard library.
Go
23
star
73

libddssec

DDS Security library - Project moved to https://gitlab.arm.com/libraries/libddssec
C
23
star
74

AArch64cryptolib

AArch64cryptolib is a from scratch implementation of cryptographic primitives aiming for optimal performance on Arm A-class cores
C
23
star
75

cryptocell-312-runtime

CryptoCell 312 runtime code
C
22
star
76

Shackleton-Framework

A generic genetic programming framework that aims to make genetic programming easier for a myriad of uses. Currently, the main target is to use the framework for code optimization in tandem with the LLVM framework.
C
22
star
77

CMSIS-View

Repository of CMSIS Software Pack for software event generation and input/output handling.
Go
22
star
78

AVH-TFLmicrospeech

Example: Micro speech for TensorFlow Lite
C
21
star
79

PAF

PAF (the Physical Attack Framework) is a framework for analyzing physical attacks: fault injection and side channels
C++
20
star
80

bart

Behavioural Analysis and Regression Toolkit
Python
20
star
81

HPCG_for_Arm

C++
20
star
82

armnn-mlperf

Arm mlperf.org benchmark port
C++
20
star
83

coresight-wire-protocol

Coresight Wire Protocol (CSWP) Server/Client and streaming trace examples.
HTML
18
star
84

ATP-Engine

C++
18
star
85

ATS-Keyword

Smart Home Total Solution - Keyword Recognition
C
17
star
86

vscode-cmsis-csolution

Extension support for VS Code CMSIS Project Extension
17
star
87

vscode-keil-studio-pack

Extension pack for all VS Code extensions
16
star
88

vr-sdk-for-android

VR SDK for Android
CSS
16
star
89

meabo

Multi-purpose multi-phase micro-benchmark
C
15
star
90

avhclient

Arm Virtual Hardware Client
Python
15
star
91

CMSIS-Compiler

CMSIS Compiler support for Arm Compiler
C
15
star
92

Methodology_for_ArmIE_SVE

C++
15
star
93

vktrace-arm

Vktrace arm fork
C++
14
star
94

bsa-acs

Arm SystemReady : BSA Architecture Compliance Suite
C
14
star
95

open-iot-sdk

Open-IoT-SDK - Home of the Total Solution applications.
C
14
star
96

CMSIS-RTOS2_Validation

Validation test suite for CMSIS-RTOS2 API implementations using Arm Virtual Hardware (AVH).
C
14
star
97

CMSIS-Driver_Validation

Test suite for verifying CMSIS-Driver implementations.
C
13
star
98

drm-hwcomposer

Clone of the drm-hwcomposer from freedesktop.org plus patches from Mali DP team
C++
12
star
99

nomali-model

A simple Mali 6xx/7xx register interface model that doesn't do any rendering.
C++
11
star
100

Cortex_DFP

CMSIS generic Arm Cortex-M device family pack
C
11
star