  • Stars: 250
  • Rank: 162,397 (Top 4 %)
  • Language: Jupyter Notebook
  • License: GNU Affero Genera...
  • Created over 6 years ago
  • Updated almost 5 years ago

Repository Details

FPGA-based neural network inference project with an end-to-end approach (from training to implementation to deployment)

spooNN

This repository contains an FPGA-based neural network inference project that delivered the highest FPS in the international object detection contest held as part of the Design Automation Conference in 2018 and 2019 (https://www.dac.com/content/2018-system-design-contest). The contents of spooNN provide an end-to-end flow for inference on FPGAs, from training scripts using TensorFlow to deployment on hardware. The target hardware platforms are PYNQ (http://www.pynq.io/) and ULTRA96 (https://www.96boards.org/product/ultra96/).

2018: The final rankings are published at http://www.cse.cuhk.edu.hk/~byu/2018-DAC-SDC/index.html

2019: The final rankings are published at http://www.cse.cuhk.edu.hk/~byu/2019-DAC-SDC/index.html

Repo organization

  • hls-nn-lib: A neural network inference library implemented in C for Vivado High Level Synthesis (HLS).
  • mnist-cnn: A hello-world project showing an end-to-end flow (training, implementation, FPGA deployment) for MNIST handwritten digit classification with a convolutional neural network.
  • halfsqueezenet (targets PYNQ): The object detection network that ranked second in the DAC 2018 contest, delivering the highest FPS at the lowest power consumption for object detection.
  • recthalfsqznet (targets ULTRA96): The object detection network that ranked second in the DAC 2019 contest, delivering the highest FPS at the lowest power consumption for object detection.

More Repositories

  1. fpga-network-stack (C++, 733 stars): Scalable Network Stack for FPGAs (TCP/IP, RoCEv2)
  2. Coyote (SystemVerilog, 205 stars): Framework providing operating system abstractions and a range of shared networking (RDMA, TCP/IP) and memory services to common modern heterogeneous platforms.
  3. Vitis_with_100Gbps_TCP-IP (C++, 152 stars): 100 Gbps TCP/IP stack for Vitis shells
  4. caribou (Verilog, 62 stars): Caribou: Distributed Smart Storage built with FPGAs
  5. davos (SystemVerilog, 59 stars): Distributed Accelerator OS
  6. ZipML-PYNQ (Jupyter Notebook, 53 stars): Linear model training using stochastic gradient descent (SGD) on PYNQ with full to low precision.
  7. doppiodb (C, 45 stars): doppioDB - A hardware accelerated database
  8. ZipML-XeonFPGA (VHDL, 31 stars): FPGA-based stochastic gradient descent (powered by ZipML - Low-precision machine learning on reconfigurable hardware)
  9. hacc (26 stars): ETHZ Heterogeneous Accelerated Compute Cluster.
  10. Centaur (Verilog, 24 stars): Centaur, a framework for hybrid CPU-FPGA databases
  11. groundhog (Verilog, 17 stars): Groundhog - Serial ATA Host Bus Adapter
  12. dma-driver (SystemVerilog, 15 stars)
  13. GPU-FPGA-Recommendation-System (C++, 14 stars): FleetRec: Large-Scale Recommendation Inference on Hybrid GPU-FPGA Clusters
  14. DecisionTrees (SystemVerilog, 14 stars): Decision Trees Inference
  15. erbium (VHDL, 13 stars): Business Rule Engine Hardware Accelerator
  16. FPGA-Recommendation-Accelerator (C++, 13 stars): MLSys 2021 paper: MicroRec: efficient recommendation inference by hardware and data structure solutions
  17. fpga-hyperloglog (C++, 12 stars): FPGA-based HyperLogLog Accelerator
  18. parti-fpga (VHDL, 8 stars): FPGA-based data partitioning
  19. strega (C++, 8 stars): An HTTP Server for FPGAs
  20. MLWeaving (C++, 8 stars): Accelerating Generalized Linear Models with MLWeaving: A One-Size-Fits-All System for Any-Precision Learning
  21. Distributed-DecisionTrees (SystemVerilog, 8 stars)
  22. ColumnML (VHDL, 6 stars): Generalized linear model training on column-stores with FPGA-enhanced data transformation
  23. PipeArch (VHDL, 6 stars)
  24. SKT (C++, 5 stars): A One-Pass Multi-Sketch Data Analytics Accelerator
  25. Distributed_Recommendation_Inference_on_FPGA_Clusters (C++, 4 stars)
  26. Coyote-CIRCT (SystemVerilog, 4 stars): Deploy CIRCT generated circuits with a streaming abstraction (circt-stream) effortlessly through Coyote.
  27. hashing-XeonFPGA (Verilog, 3 stars)
  28. gcow (C++, 2 stars): 🐄 Gradient compression on the wire
  29. bit_serial_kmeans (SystemVerilog, 2 stars)
  30. loadbalancer (SystemVerilog, 1 star)
  31. hw-acceleration-of-compression-and-crypto (C++, 1 star)
  32. hacc-platform (Shell, 1 star): Hardware ACCeleration Platform.