  • Stars
    148
  • Rank 242,182 (Top 5 %)
  • Language
    C
  • License
    Apache License 2.0
  • Created almost 2 years ago
  • Updated 19 days ago

Repository Details

CMSIS-NN Library

CMSIS NN

CMSIS NN software library is a collection of efficient neural network kernels developed to maximize the performance and minimize the memory footprint of neural networks on Arm Cortex-M processors.

Supported Framework

The library follows the int8 and int16 quantization specification of TensorFlow Lite for Microcontrollers.

Branches and Tags

There is a single branch called 'main'. Tags are created during a release, and two releases are planned per year. The releases can be found here.

Current Operator Support

In general, optimizations are written for a specific architecture feature, and each one falls into one of the following categories. The right implementation is picked based on the processor or architecture feature flags provided to the compiler.

Pure C

There is always a pure C implementation for an operator. This is used for processors like Arm Cortex-M0 or Cortex-M3.

DSP Extension

Processors with the DSP extension use Single Instruction Multiple Data (SIMD) instructions for optimization. Examples are Cortex-M4, or a Cortex-M33 configured with the optional DSP extension.

MVE Extension

Processors with Arm Helium Technology use the Arm M-profile Vector Extension (MVE) instructions for optimization. Examples are Cortex-M55 or Cortex-M85 configured with MVE.
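The compile-time selection described above can be sketched as follows. This is a minimal illustration, not the library's actual dispatch code: it assumes the compiler exposes the ACLE feature macros __ARM_FEATURE_MVE and __ARM_FEATURE_DSP for the selected target, and the names example_backend_name and example_dot_s8 are hypothetical. Only the pure C path is written out, so the sketch also builds on a host compiler.

```c
#include <stdint.h>

/* Pick a label for the kernel variant from the feature macros the compiler
 * defines for the selected target (assumption: ACLE macros are available). */
#if defined(__ARM_FEATURE_MVE) && (__ARM_FEATURE_MVE & 1)
#define EXAMPLE_BACKEND "MVE" /* integer MVE (Helium) available */
#elif defined(__ARM_FEATURE_DSP) && (__ARM_FEATURE_DSP == 1)
#define EXAMPLE_BACKEND "DSP" /* DSP extension available */
#else
#define EXAMPLE_BACKEND "C" /* fall back to the pure C implementation */
#endif

/* Hypothetical helper: report which variant was selected at compile time. */
const char *example_backend_name(void)
{
    return EXAMPLE_BACKEND;
}

/* Hypothetical pure C kernel: int8 dot product with a 32-bit accumulator.
 * An optimized build would provide DSP or MVE bodies behind the same name. */
int32_t example_dot_s8(const int8_t *lhs, const int8_t *rhs, int32_t length)
{
    int32_t acc = 0;
    for (int32_t i = 0; i < length; i++)
    {
        acc += (int32_t)lhs[i] * (int32_t)rhs[i];
    }
    return acc;
}
```

Because the choice is made by the preprocessor, the same source produces a different variant for each target CPU, and the caller's code does not change.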

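Operator        | C int8 | C int16 | C int4* | DSP int8 | DSP int16 | DSP int4* | MVE int8 | MVE int16
----------------+--------+---------+---------+----------+-----------+-----------+----------+----------
Conv2D          | Yes    | Yes     | Yes     | Yes      | Yes       | Yes       | Yes      | Yes
DepthwiseConv2D | Yes    | Yes     | Yes     | Yes      | Yes       | Yes       | Yes      | Yes
TransposeConv2D | Yes    | No      | No      | Yes      | No        | No        | Yes      | No
Fully Connected | Yes    | Yes     | Yes     | Yes      | Yes       | Yes       | Yes      | Yes
Add             | Yes    | Yes     | N/A     | Yes      | Yes       | N/A       | Yes      | Yes
Mul             | Yes    | Yes     | N/A     | Yes      | Yes       | N/A       | Yes      | Yes
MaxPooling      | Yes    | Yes     | N/A     | Yes      | Yes       | N/A       | Yes      | Yes
AvgPooling      | Yes    | Yes     | N/A     | Yes      | Yes       | N/A       | Yes      | Yes
Softmax         | Yes    | Yes     | N/A     | Yes      | Yes       | N/A       | Yes      | No
LSTM            | Yes    | N/A     | No      | Yes      | N/A       | No        | Yes      | N/A
SVDF            | Yes    | No      | No      | Yes      | No        | No        | Yes      | No

* int4 weights + int8 activations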

Contribution Guideline

First, thank you for contributing. Here are some guidelines and good-to-know information to get started.

Coding Guideline

By default, follow the style used in the file. You'll soon start noticing patterns such as:

  • Variable and function names are lower case with an underscore separator.
  • Hungarian notation is not used. Well, almost.
  • If the variable names don't convey the action, then add comments.

New Files

One function per file is followed in most places. In those cases, the file name must match the function name. Connect the function to an appropriate Doxygen group as well.

Doxygen

Function prototypes must have a detailed comment header in Doxygen format. You can execute the doxygen document generation script in the Documentation/Doxygen folder to check that no errors are introduced.
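As a rough illustration of a Doxygen comment header attached to a group, consider the sketch below. The group name ExampleGroup and the function example_saturate_to_s8 are hypothetical and do not exist in the library; the exact set of tags the maintainers expect is best taken from the existing files.

```c
#include <stdint.h>

/**
 * @defgroup ExampleGroup Example helper functions
 *
 * Hypothetical group used only for this illustration.
 */

/**
 * @ingroup ExampleGroup
 * @brief      Saturate a 32-bit accumulator value to the int8 range.
 * @param[in]  value  Accumulator value to saturate.
 * @return     The input clamped to the range [-128, 127].
 *
 * @details    Values inside the int8 range are returned unchanged.
 */
int8_t example_saturate_to_s8(int32_t value);

int8_t example_saturate_to_s8(int32_t value)
{
    /* Clamp to the upper bound first, then to the lower bound. */
    if (value > INT8_MAX)
    {
        return INT8_MAX;
    }
    if (value < INT8_MIN)
    {
        return INT8_MIN;
    }
    return (int8_t)value;
}
```

The names also follow the coding guideline above: lower case with an underscore separator, with a comment where the name alone does not convey the action.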

Unit Tests

For any new features and bug fixes, new unit tests are needed. Improvements have to be verified by unit tests. If you do not have the means to execute the tests, you can still make the PR and comment that you need help in completing/executing the unit tests.

Version & Date

Each file has a version number and a date field that must be updated whenever the file is changed. The versioning follows the Semantic Versioning 2.0.0 format. For details, see https://semver.org/
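To make the Semantic Versioning rule concrete, here is a small sketch of how a MAJOR.MINOR.PATCH number changes for each kind of change. The types and the function example_bump_version are hypothetical and only model the rules from semver.org; in practice the version and date fields in a file header are edited by hand.

```c
#include <stdint.h>

/* Hypothetical representation of a MAJOR.MINOR.PATCH version number. */
typedef struct
{
    uint32_t major_ver;
    uint32_t minor_ver;
    uint32_t patch_ver;
} example_version;

/* Kind of change made to a file. */
typedef enum
{
    EXAMPLE_CHANGE_FIX,      /* backward compatible bug fix */
    EXAMPLE_CHANGE_FEATURE,  /* backward compatible new functionality */
    EXAMPLE_CHANGE_BREAKING  /* incompatible API change */
} example_change;

/* Return the next version according to Semantic Versioning 2.0.0:
 * lower-order fields are reset to zero when a higher-order field increments. */
example_version example_bump_version(example_version current, example_change change)
{
    example_version next = current;
    switch (change)
    {
    case EXAMPLE_CHANGE_BREAKING:
        next.major_ver++;
        next.minor_ver = 0;
        next.patch_ver = 0;
        break;
    case EXAMPLE_CHANGE_FEATURE:
        next.minor_ver++;
        next.patch_ver = 0;
        break;
    case EXAMPLE_CHANGE_FIX:
    default:
        next.patch_ver++;
        break;
    }
    return next;
}
```

For example, a bug fix to a file at version 1.4.2 gives 1.4.3, a backward compatible feature gives 1.5.0, and an incompatible change gives 2.0.0.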

Building CMSIS-NN as a library

It is recommended to use the toolchain files from the Arm Ethos-U Core Platform project. These support TARGET_CPU, which is a required argument. Note that if TARGET_CPU is not specified, these toolchain files set a default of their own. The format must be TARGET_CPU=cortex-mXX; see the examples below.

Here is an example:

cd </path/to/CMSIS_NN>
mkdir build
cd build
cmake .. -DCMAKE_TOOLCHAIN_FILE=</path/to/ethos-u-core-platform>/cmake/toolchain/arm-none-eabi-gcc.cmake -DTARGET_CPU=cortex-m55
make

Some more examples:

cmake .. -DCMAKE_TOOLCHAIN_FILE=</path/to/ethos-u-core-platform>/cmake/toolchain/armclang.cmake -DTARGET_CPU=cortex-m55
cmake .. -DCMAKE_TOOLCHAIN_FILE=</path/to/ethos-u-core-platform>/cmake/toolchain/arm-none-eabi-gcc.cmake -DTARGET_CPU=cortex-m7
cmake .. -DCMAKE_TOOLCHAIN_FILE=</path/to/ethos-u-core-platform>/cmake/toolchain/armclang.cmake -DTARGET_CPU=cortex-m3

Compiler Options

The default optimization level is -Ofast. It can be overridden on the CMake command line with "-DCMSIS_OPTIMIZATION_LEVEL". Change it according to project needs, bearing in mind that this can impact performance. When only optimization level -O0 is used, ARM_MATH_AUTOVECTORIZE needs to be defined for processors with Helium Technology.

The compiler option '-fomit-frame-pointer' is enabled by default at -O and higher. When no optimization level is specified, you may need to specify '-fomit-frame-pointer'.

The compiler option '-fno-builtin' prevents the compiler from using optimized implementations of functions such as memcpy and memset, which CMSIS-NN uses heavily. This can significantly degrade performance, so the option should be avoided. The compiler option '-ffreestanding' should also be avoided, as it implicitly enables '-fno-builtin'.
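For context, the sketch below shows the kind of memcpy and memset usage this advice is about. The helper example_copy_row_zero_padded is hypothetical and not part of the library. With builtins enabled, a compiler is generally free to expand or specialize such calls; with '-fno-builtin' it treats them as ordinary function calls, which is presumably where the performance loss comes from.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy a row of int8 data into a destination row and
 * zero-fill whatever is left over. Lengths are assumed to be non-negative. */
void example_copy_row_zero_padded(const int8_t *src, int32_t src_len, int8_t *dst, int32_t dst_len)
{
    /* Never copy more than the destination can hold. */
    const int32_t copy_len = (src_len < dst_len) ? src_len : dst_len;

    /* Clear the whole destination row, then overwrite the leading part. */
    memset(dst, 0, (size_t)dst_len);
    memcpy(dst, src, (size_t)copy_len);
}
```

Writing such copies with the standard functions, rather than hand-rolled byte loops, leaves the choice of the fastest implementation to the toolchain.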

Supported Compilers

  • CMSIS-NN is tested on Arm Compiler 6 and on Arm GNU Toolchain.
  • IAR compiler is not tested and there can be compilation and/or performance issues.
  • Compilation for Host is not supported out of the box. It should be possible to use the C implementation and compile for host with minor stubbing effort.

Inclusive Language

This product conforms to Arm’s inclusive language policy and, to the best of our knowledge, does not contain any non-inclusive language. If you find something that concerns you, email [email protected].

Support / Contact

For any questions or to reach the CMSIS-NN team, please create a new issue in https://github.com/ARM-software/CMSIS-NN/issues

More Repositories

1

ComputeLibrary

The Compute Library is a set of computer vision and machine learning functions optimised for both Arm CPUs and GPUs using SIMD technologies.
C++
2,539
star
2

arm-trusted-firmware

Read-only mirror of Trusted Firmware-A
C
1,690
star
3

CMSIS_5

CMSIS Version 5 Development Repository
C
1,251
star
4

armnn

Arm NN ML Software. The code here is a read-only mirror of https://review.mlplatform.org/admin/repos/ml/armnn
C++
1,104
star
5

ML-KWS-for-MCU

Keyword spotting on Arm Cortex-M Microcontrollers
C
1,040
star
6

astc-encoder

The Arm ASTC Encoder, a compressor for the Adaptive Scalable Texture Compression data format.
C
880
star
7

abi-aa

Application Binary Interface for the Arm® Architecture
HTML
673
star
8

vulkan_best_practice_for_mobile_developers

Vulkan best practice for mobile developers
C++
564
star
9

optimized-routines

Optimized implementations of various library functions for ARM architecture processors
C
486
star
10

CMSIS-FreeRTOS

FreeRTOS adaptation for CMSIS-RTOS Version 2
C
460
star
11

CMSIS_4

Cortex Microcontroller Software Interface Standard (V4 no longer maintained)
C
437
star
12

ML-examples

Arm Machine Learning tutorials and examples
C++
371
star
13

LLVM-embedded-toolchain-for-Arm

A project dedicated to building LLVM toolchain for 32-bit Arm embedded targets.
CMake
331
star
14

opengl-es-sdk-for-android

OpenGL ES SDK for Android
CSS
325
star
15

mango

Parallel Hyperparameter Tuning in Python
Jupyter Notebook
311
star
16

SCALE-Sim

Python
296
star
17

CMSIS-DSP

CMSIS-DSP embedded compute library for Cortex-M and Cortex-A
C
277
star
18

Arm-2D

2D Graphic Library optimized for Cortex-M processors
C
232
star
19

Tool-Solutions

Tutorials & examples for Arm software development tools.
C
217
star
20

EndpointAI

C++
216
star
21

SCP-firmware

Read-only mirror of System Control Processor (SCP) firmware
C
205
star
22

vulkan-sdk

Github repository for the Vulkan SDK
C
199
star
23

lisa

Linux Integrated System Analysis
Jupyter Notebook
192
star
24

HWCPipe

Hardware counters interface
C++
188
star
25

u-boot

Clone of upstream U-Boot repo with patches for Arm development boards
C
175
star
26

CMSIS-Driver

Repository of microcontroller peripheral driver implementing the CMSIS-Driver API specification
C
150
star
27

ML-zoo

Python
149
star
28

android-nn-driver

C++
149
star
29

workload-automation

A framework for automating workload execution and measurement collection on ARM devices.
Python
138
star
30

gator

Sources for Arm Streamline's gator daemon
C++
121
star
31

keyword-transformer

Official implementation of the Keyword Transformer: https://arxiv.org/abs/2104.00769
Jupyter Notebook
116
star
32

perfdoc

A cross-platform Vulkan layer which checks Vulkan applications for best practices on Arm Mali devices.
C++
112
star
33

ebbr

Embedded Base Boot Requirements Specification
PostScript
111
star
34

CMSIS_6

CMSIS version 6 (successor of CMSIS_5)
C
106
star
35

linux

C
95
star
36

asl-interpreter

Example implementation of Arm's Architecture Specification Language (ASL)
OCaml
94
star
37

mobile-studio-integration-for-unity

Mobile Studio tool integration with C# scripting for the Unity game engine.
C
86
star
38

sbsa-acs

ARM Enterprise: SBSA Architecture Compliance Suite
C
84
star
39

sesr

Super-Efficient Super Resolution
Python
80
star
40

CSAL

Coresight Access Library
C
78
star
41

progress64

PROGRESS64 is a C library of scalable functions for concurrent programs, primarily focused on networking applications.
C
65
star
42

developer

GTM related documentation
C++
61
star
43

Cloud-IoT-Core-Kit-Examples

Example projects and code are supplied to support the Arm-based IoT Kit for Cloud IoT Core
Python
61
star
44

trappy

This repository has moved to https://gitlab.arm.com/tooling/trappy
Python
61
star
45

psa-arch-tests

Tests for verifying implementations of TBSA-v8M and the PSA Certified APIs
C
61
star
46

cmsis-pack-eclipse

CMSIS-Pack Eclipse Plug-ins
Java
60
star
47

AVH-GetStarted

C
56
star
48

ethos-n-driver-stack

Driver stack (including user space libraries, kernel module and firmware) for the Arm® Ethos™-N NPU
C++
55
star
49

acle

Arm C Language Extensions (ACLE)
Python
52
star
50

patrace

C++
52
star
51

tarmac-trace-utilities

Tools for analyzing and browsing Tarmac instruction traces.
C++
47
star
52

devlib

Library for interaction with and instrumentation of remote devices.
Python
44
star
53

speculation-barrier

This project provides a header file which contains wrapper macros for the __builtin_load_no_speculate builtin function defined at https://www.arm.com/security-update This builtin function defines a speculation barrier, which can be used to limit the conditions under which a value which has been loaded can be used under speculative execution.
Objective-C
44
star
54

arm-systemready

Arm SystemReady
Shell
43
star
55

arm-enterprise-acs

ARM Enterprise ACS
C
39
star
56

DeepFreeze

SystemVerilog
38
star
57

tf-issues

Issue tracking for the ARM Trusted Firmware project
36
star
58

scalpel

This is a PyTorch implementation of the Scalpel. Node pruning for five benchmark networks and SIMD-aware weight pruning for LeNet-300-100 and LeNet-5 is included.
Python
35
star
59

psa-api

Documentation source and development of the PSA Certified API
C
34
star
60

CMSIS-RTX

RTX5 real time kernel for Arm Cortex-based embedded systems (spin-off from CMSIS_5)
C
33
star
61

perf-libs-tools

C
31
star
62

bob-build

Meta-build system using Blueprint and ninja
Go
30
star
63

TZ-TRNG

TrustZone True Number Generator
C
29
star
64

mram_simulation_framework

MRAM magnetization simulation framework. s-LLGS python and verilog-a solvers for transients simulation and Fokker-planck equation solver for stochastic analysis
Python
28
star
65

AVH

AVH FVPs: Arm Virtual Hardware with Fixed Virtual Platforms
C
27
star
66

bento-linker

A light-weight alternative to processes for microcontrollers.
C
27
star
67

data

Machine-readable data describing Arm architecture and implementations. Includes JSON descriptions of implemented PMU events.
26
star
68

synchronization-benchmarks

Collection of synchronization micro-benchmarks and traces from infrastructure applications
C
26
star
69

libGPUInfo

A utility library for application developers to query the configuration of the Arm Immortalis GPU or Arm Mali GPU present in their system.
C++
24
star
70

toolchain-gnu-bare-metal

A toolchain sub-project dedicated to build GNU toolchain for 32-bit bare-metal targets
Shell
24
star
71

NXP_LPC

CMSIS Driver Implementations for the NXP LPC Microcontroller Series
C
23
star
72

golang-utils

Helpers and utilities for Golang in order to do actions not available in the standard library.
Go
23
star
73

libddssec

DDS Security library - Project moved to https://gitlab.arm.com/libraries/libddssec
C
23
star
74

AArch64cryptolib

AArch64cryptolib is a from scratch implementation of cryptographic primitives aiming for optimal performance on Arm A-class cores
C
23
star
75

cryptocell-312-runtime

CryptoCell 312 runtime code
C
22
star
76

Shackleton-Framework

A generic genetic programming framework that aims to make genetic programming easier for a myriad of uses. Currently, the main target is to use the framework for code optimization in tandem with the LLVM framework.
C
22
star
77

CMSIS-View

Repository of CMSIS Software Pack for software event generation and input/output handling.
Go
22
star
78

AVH-TFLmicrospeech

Example: Micro speech for TensorFlow Lite
C
21
star
79

PAF

PAF (the Physical Attack Framework) is a framework for analyzing physical attacks: fault injection and side channels
C++
20
star
80

bart

Behavioural Analysis and Regression Toolkit
Python
20
star
81

HPCG_for_Arm

C++
20
star
82

armnn-mlperf

Arm mlperf.org benchmark port
C++
20
star
83

coresight-wire-protocol

Coresight Wire Protocol (CSWP) Server/Client and streaming trace examples.
HTML
18
star
84

ATP-Engine

C++
18
star
85

CMSIS-DAP

CoreSight Debug Access Port (DAP) debug probe protocol reference implementation (spin-off from CMSIS_5)
C
18
star
86

ATS-Keyword

Smart Home Total Solution - Keyword Recognition
C
17
star
87

vscode-cmsis-csolution

Extension support for VS Code CMSIS Project Extension
17
star
88

vscode-keil-studio-pack

Extension pack for all VS Code extensions
16
star
89

vr-sdk-for-android

VR SDK for Android
CSS
16
star
90

meabo

Multi-purpose multi-phase micro-benchmark
C
15
star
91

avhclient

Arm Virtual Hardware Client
Python
15
star
92

CMSIS-Compiler

CMSIS Compiler support for Arm Compiler
C
15
star
93

Methodology_for_ArmIE_SVE

C++
15
star
94

vktrace-arm

Vktrace arm fork
C++
14
star
95

bsa-acs

Arm SystemReady : BSA Architecture Compliance Suite
C
14
star
96

open-iot-sdk

Open-IoT-SDK - Home of the Total Solution applications.
C
14
star
97

CMSIS-RTOS2_Validation

Validation test suite for CMSIS-RTOS2 API implementations using Arm Virtual Hardware (AVH).
C
14
star
98

CMSIS-Stream

CMSIS-Stream software component
Python
14
star
99

CMSIS-Driver_Validation

Test suite for verifying CMSIS-Driver implementations.
C
13
star
100

Cortex_DFP

CMSIS generic Arm Cortex-M device family pack
C
12
star