A 256-RISC-V-core system with low-latency access into shared L1 memory.

MemPool

MemPool is a many-core system targeting image processing applications. It implements 256 RISC-V cores that can access a large, shared L1 memory in at most five cycles.

This repository contains the software and hardware of MemPool, as well as infrastructure for compilation and simulation.

Structure

The repository is structured as follows:

  • config contains the global configurations that are used by software as well as hardware.
  • hardware is where the RTL code and simulation scripts are.
  • scripts contains useful scripts such as linting or formatting scripts.
  • software provides example applications and MemPool's runtime.
  • toolchain holds third-party packages:
    • halide is the compiler infrastructure for the Halide language.
    • llvm-project provides the LLVM compiler infrastructure.
    • riscv-gnu-toolchain contains the RISC-V GCC compiler.
    • riscv-isa-sim is an extended version of Spike and is used as the golden model and to parse simulation traces.
    • riscv-opcodes is an extended version of riscv-opcodes that contains our custom image processing extension.
    • verilator provides the open-source RTL simulator Verilator.

Get Started

Make sure you clone this repository recursively to get all the necessary submodules:

git submodule update --init --recursive

If the repository path of any submodule changes, run the following command to change your submodule's pointer to the remote repository:

git submodule sync --recursive

MemPool requires patching a few hardware dependencies. To update the dependencies and apply the patches, run the following command in the project's root directory after checking out:

make update-deps
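
The three setup commands above can be wrapped into a single shell function. This is only a minimal sketch: the `setup_mempool` name is illustrative and not part of the repository, and the function assumes it is called from the project's root directory of an existing checkout.

```shell
#!/bin/sh
# Hypothetical wrapper around the setup commands described above.
# Each step runs only if the previous one succeeded.
setup_mempool() {
  # Re-point submodules in case a remote path changed
  git submodule sync --recursive &&
  # Fetch and check out all submodules
  git submodule update --init --recursive &&
  # Update the hardware dependencies and apply the patches
  make update-deps
}
```

Calling `setup_mempool` after checking out a new revision runs the steps in order and stops at the first failure.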

Build dependencies

Compiler

MemPool requires at least the RISC-V GCC toolchain to compile applications. It also supports LLVM, which depends on GCC. To implement image processing kernels, MemPool also supports Halide, a domain-specific language built on top of C++. Its compilation process is based on LLVM.

To build these toolchains, run the following commands in the project's root directory.

# Build both compilers (GCC and LLVM)
make toolchain
# Build only GCC
make tc-riscv-gcc
# Build only LLVM
make tc-llvm
# Build Halide
make halide

RTL Simulation

We use Bender to generate our simulation scripts. Make sure you have Bender installed, or install it in the MemPool repository with:

# Install Bender
make bender

The RTL simulation, or more specifically, the tracing in the simulation, relies on the Spike simulator. To build it, run the following command in the project's root directory:

# Build Spike
make riscv-isa-sim

MemPool supports ModelSim and the open-source Verilator for RTL simulation. Use the following command to build and install Verilator:

# Build Verilator
make verilator

Building Verilator requires an LLVM installation.

Software

Build Applications

The software/apps folder contains example applications that work on MemPool. MemPool also contains some Halide example applications in the software/halide folder and OpenMP applications in the software/omp folder. Run the following commands to build an application, e.g., hello_world:

# Bare-metal applications
cd software/apps
make hello_world
# Halide applications
cd software/halide
make matmul
# OpenMP applications
cd software/omp
make omp_parallel

You can also select the compiler used to build the application with the COMPILER option. The possible values are gcc and llvm; gcc is the default.

# Compile with LLVM instead of GCC
make COMPILER=llvm hello_world

To run applications designed for the Xpulpimg extension, be sure to select the gcc compiler option. If all the Xpulpimg instructions implemented in Snitch at compilation time are supported by the Xpulpimg subset of the GCC compiler, you can build your application with the option XPULPIMG set to 1:

# Compile with GCC supporting Xpulpimg instruction set
make COMPILER=gcc XPULPIMG=1 hello_world

Otherwise, if new Xpulpimg instructions have been implemented in Snitch but the Xpulpimg extension in the compiler does not support them yet, use those instructions only inside an asm volatile construct within your C/C++ application, and set XPULPIMG=0. This will work as long as Xpulpimg is a subset of Xpulpv2.

If XPULPIMG is not set explicitly when launching make, it defaults to the xpulpimg value configured in config/config.mk. Note that this parameter in the configuration file also defines whether the Xpulpimg extension is enabled in the RTL of the Snitch core, and whether the Xpulpimg functionality is tested by the riscv-tests unit tests.
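
The fallback described above can be sketched in shell to check which value a build would pick up. This is an illustration, not how the Makefiles implement it: the `resolve_xpulpimg` helper is hypothetical, and it assumes that config/config.mk contains a line of the form `xpulpimg ?= 1` (or `= 0`).

```shell
#!/bin/sh
# Hypothetical helper: print the XPULPIMG value a build would use.
# Assumption: the config file holds a line such as "xpulpimg ?= 1".
resolve_xpulpimg() {
  config_file="$1"
  if [ -n "${XPULPIMG:-}" ]; then
    # Value was set explicitly by the caller
    echo "$XPULPIMG"
  else
    # Fall back to the first xpulpimg assignment in the config file
    sed -n 's/^xpulpimg[[:space:]]*[?:]\{0,1\}=[[:space:]]*\([01]\).*/\1/p' \
      "$config_file" | head -n 1
  fi
}
```

For example, `resolve_xpulpimg config/config.mk` prints the configured default, while `XPULPIMG=0 resolve_xpulpimg config/config.mk` prints the overriding value.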

Unit tests

The system comes with an automatic unit test suite for verification purposes. The tests are located in riscv-tests/isa and can be launched from the top-level directory with:

make test

The unit tests will be compiled, simulated in Spike, and run in RTL simulation of MemPool. The compilation and simulation (for both the Spike simulator and the MemPool RTL) of the unit tests also depend on the xpulpimg parameter in config/config.mk: the test cases dedicated to the Xpulpimg instructions will be compiled and simulated only if xpulpimg=1. To add more tests, you must add your own to the riscv-isa infrastructure; more information can be found in software/riscv-tests/README.md.

The unit tests are included in the software package and can be compiled for MemPool by running the following command in the software directory:

make COMPILER=gcc test

Note that the unit tests need to be compiled with gcc. The XPULPIMG parameter applies to the tests in the same way as to normal applications.

Writing Applications

MemPool follows LLVM's coding style guidelines for C and C++ code. We use clang-format to format all C code. Run make format in the project's root directory before committing software changes so that they conform to our style guide.

RTL Simulation

To simulate the MemPool system with ModelSim, go to the hardware folder, which contains all the SystemVerilog files. Use the following commands to run your simulation:

# Go to the hardware folder
cd hardware
# Only compile the hardware without running the simulation.
make compile
# Run the simulation with the *hello_world* binary loaded
app=hello_world make sim
# For Halide applications use the `halide-` prefix: E.g., to run `matmul`:
app=halide-matmul make sim
# Run the simulation with the *some_binary* binary. This allows specifying the full path to the binary
preload=/some_path/some_binary make sim
# Run the simulation without starting the gui
app=hello_world make simc
# Generate the human-readable traces after simulation is completed
make trace
# Generate a visualization of the traces
app=hello_world make tracevis
# Automatically run the benchmark (headless), extract the traces, and log the results
app=hello_world make benchmark
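
To benchmark several applications in a row, the `app=... make benchmark` invocation above can be wrapped in a loop. The following is a minimal sketch: the `benchmark_all` helper is hypothetical, and it assumes it is called from the hardware folder with the applications already built.

```shell
#!/bin/sh
# Hypothetical helper: run the headless benchmark once per application name.
# Stops at the first failing run and returns its exit status.
benchmark_all() {
  for name in "$@"; do
    echo "== benchmarking $name =="
    # Pass the application name to make through the app variable
    app="$name" make benchmark || return $?
  done
}
```

For example, `benchmark_all hello_world halide-matmul` benchmarks the bare-metal hello_world application followed by the Halide matmul application.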

You can set up the configuration of the system in the file config/config.mk, which controls the total number of cores, the number of cores per tile, and whether the Xpulpimg extension is enabled in the Snitch core. The xpulpimg parameter also controls the default core architecture considered when compiling applications for MemPool.

To simulate the MemPool system with Verilator, use the same format, but with the target

make verilate

If, during the Verilator model compilation, you run out of space on your disk, use

export OBJCACHE=''

to disable the use of ccache. Keep in mind that this will make the following compilations slower since compiled object files will no longer be cached.

If the tracer is enabled, its output traces are found under hardware/build, for both ModelSim and Verilator simulations.

Tracing can be controlled per core with a custom trace CSR register. The CSR is of type WARL and can only be set to zero or one. For debugging, tracing can be enabled persistently with the snitch_trace environment variable.

To get a visualization of the traces, check out the scripts/tracevis.py script. It creates a JSON file that can be viewed with Trace-Viewer or in Google Chrome by navigating to about:tracing.

We also provide Synopsys Spyglass linting scripts in the hardware/spyglass folder. Run make lint in the hardware folder, with a specific MemPool configuration, to run the tests associated with the lint_rtl target.

License

MemPool is released under permissive open source licenses. Most of MemPool's source code is released under the Apache License 2.0 (Apache-2.0); see LICENSE. The code in hardware is released under Solderpad v0.51 (SHL-0.51); see hardware/LICENSE.

Note, MemPool includes several third-party packages with their own licenses:

Software

  • software/runtime/printf.{c,h} is licensed under the MIT license.
  • software/runtime/omp/libgomp.h is licensed under the GPL license.
  • software/riscv-tests is an extended version of RISC-V's riscv-tests repository licensed under a BSD license. See software/riscv-tests/LICENSE for details.

Hardware

The hardware folder is licensed under Solderpad v0.51; see hardware/LICENSE. We use the following exceptions:

  • hardware/tb/dpi/elfloader.cpp is licensed under a BSD license.
  • hardware/tb/verilator/* is licensed under Apache License 2.0; see LICENSE.
  • hardware/tb/verilator/lowrisc_* contain modified versions of lowRISC's helper libraries. They are licensed under Apache License 2.0.

Scripts

  • scripts/run_clang_format.py is licensed under the MIT license.

Toolchains

The following compilers can be used to build applications for MemPool:

  • toolchain/halide is licensed under the MIT license. See Halide's license for details.
  • toolchain/llvm-project is licensed under the Apache License v2.0 with LLVM Exceptions. See LLVM's DeveloperPolicy for more details.
  • toolchain/riscv-gnu-toolchain's licensing information is available here

We use the following RISC-V tools to parse simulation traces and keep opcodes consistent throughout the project.

The open-source simulator Verilator can be used for RTL simulation.

Publication

If you want to use MemPool, you can cite us:

@InProceedings{MemPool2021,
    author    = {Matheus Cavalcante and Samuel Riedel and Antonio Pullini and Luca Benini},
    title     = {{MemPool}: A Shared-{L1} Memory Many-Core Cluster with a Low-Latency Interconnect},
    booktitle = {2021 Design, Automation, and Test in Europe Conference and Exhibition (DATE)},
    year      = 2021,
    month     = mar,
    address   = {Grenoble, FR},
    pages     = {701-706},
    doi       = {10.23919/DATE51398.2021.9474087}
}

This paper is also available at arXiv, at the following link: arXiv:2012.02973 [cs.AR].
