Repository Details

FPGA Haskell machine with game-changing performance. Reduceron is Matthew Naylor, Colin Runciman and Jason Reich's high-performance FPGA softcore for running lazy functional programs, including hardware garbage collection. Reduceron has been implemented on various FPGAs with clock frequency ranging from 60 to 150 MHz depending on the FPGA. A high degree of parallelism allows Reduceron to implement graph evaluation very efficiently. This fork aims to continue development on this, with a view to practical applications. Comments, questions, etc. are welcome.

Reduceron, an efficient processor for functional programs

WHAT IS REDUCERON?

Reduceron is a high performance FPGA softcore for running lazy functional programs, complete with hardware garbage collection. Reduceron has been implemented on various FPGAs with clock frequency ranging from 60 to 150 MHz depending on the FPGA. A high degree of parallelism allows Reduceron to implement graph evaluation very efficiently.

Reduceron is the work of Matthew Naylor, Colin Runciman and Jason Reich, who have kindly made their work available for others to use. Please see http://www.cs.york.ac.uk/fp/reduceron for supporting articles, memos, and original distribution.

OK, WHAT'S THIS THEN?

This repository is a fork of the original distribution. It aims to take the (York) Reduceron from a research prototype to the point where it can be useful for embedded projects and more.

The York Reduceron needs the following enhancements to meet our needs:

  1. The heap and program must (for the most part) be kept in external memory, with FPGA block memory used for the stacks and for heap and program caches.

    This allows smaller and less expensive FPGAs to be used, and at the same time allows for a much larger heap and larger programs.

  2. Access to memory mapped IO devices (and optionally, RAM).

  3. Richer set of primitives, including multiplication, shifts, logical and, or, ...

  4. Support for 32-bit integers - this greatly simplifies interfacing to existing IO devices and simplifies various numerical computations.

  5. Stack, update stack, [and case table stack?] should overflow into/underflow from external memory, allowing for orders of magnitude larger structures.
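
As a rough illustration of item 5, the following Haskell sketch models a small on-chip stack that spills its oldest entries to a larger external store when full and falls back to that store when the on-chip part is empty. It is only a behavioural model under stated assumptions, not the Reduceron design: the names `SpillStack`, `emptyStack`, `push`, `pop` and `drain` and the `capacity` parameter are hypothetical, and real hardware would likely move entries in bursts rather than one at a time.

```haskell
module Main where

-- Hypothetical model of a stack split between fast on-chip memory
-- and a larger external memory. Not taken from the Reduceron sources.
data SpillStack a = SpillStack
  { capacity :: Int  -- maximum number of entries kept on chip (assumed >= 1)
  , onChip   :: [a]  -- newest entry first
  , external :: [a]  -- spilled entries, most recently spilled first
  } deriving Show

-- Create an empty stack; the on-chip capacity is clamped to at least 1.
emptyStack :: Int -> SpillStack a
emptyStack cap = SpillStack (max 1 cap) [] []

-- Push an entry. When the on-chip part is full, the oldest on-chip
-- entry overflows into external memory to make room.
push :: a -> SpillStack a -> SpillStack a
push x s
  | length (onChip s) < capacity s = s { onChip = x : onChip s }
  | otherwise =
      let kept    = init (onChip s)
          spilled = last (onChip s)
      in s { onChip = x : kept, external = spilled : external s }

-- Pop an entry. When the on-chip part is empty, underflow reads the
-- most recently spilled entry back from external memory.
pop :: SpillStack a -> Maybe (a, SpillStack a)
pop s = case onChip s of
  x : rest -> Just (x, s { onChip = rest })
  []       -> case external s of
    y : ys -> Just (y, s { external = ys })
    []     -> Nothing

-- Pop everything, returning the entries in pop order.
drain :: SpillStack a -> [a]
drain s = case pop s of
  Nothing      -> []
  Just (x, s') -> x : drain s'

main :: IO ()
main = do
  let s = foldl (flip push) (emptyStack 2) [1 .. 5 :: Int]
  print (length (external s))  -- entries that overflowed: 3
  print (drain s)              -- LIFO order is preserved: [5,4,3,2,1]
```

The point of the model is that the caller sees one ordinary stack; only the cost of an access changes depending on whether it hits the on-chip part.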

While Reduceron technically refers to the FPGA implementation, it is supported by

  • Flite: the F-lite to Red translator.
  • A Red emulator in C
  • Red Lava: Reduceron is a Red Lava program, which generates Verilog
  • Support for Verilog simulation and synthesis for various FPGA boards

As much of the history as was available has been gathered and Reduceron, Lava, and the Flite distribution have been merged into one repository.

HOW DO I USE IT?

This was last tested with Glasgow Haskell Compiler, Version 8.4.4 on macOS 10.14.3 and Linux, 64-bit.

Optionally: just run make in the toplevel directory and a large regression run will start. The Verilog simulation part will take weeks to finish.

To build:

make

Or run a specific test suite:

make -C programs $X

where $X is one of regress-emu, regress-flite-sim, regress-flite-comp, or regress-red-verilog-sim.

Note: the code generated by the C backend for Flite (used in the regress-flite-comp) depends on GCC features, such as nested functions. To build on macOS, install real gcc (say via Mac Homebrew) and invoke make as make CC=gcc-7 (assuming you installed version 7 of gcc).

To build a hardware version of a given test

cd fpga; make && flite -r ../programs/$P | ./Red -v

where $P is one of the programs (.hs). Next, build a Reduceron system for an FPGA board, for example the BeMicroCV A9:

make -C Reduceron/BeMicroCV-A9

Unfortunately programs can't currently be loaded dynamically but are baked into the FPGA image. It's a high priority goal to change that.

WHERE IS THIS GOING?

Plan

  1. Port to Verilog and remove Xilinx-isms. DONE!

  2. Shrink to fit mid-sized FPGA kits (e.g. DE2-115 and BeMicroCV-A9). DONE!

  3. Rework Lava and the Reduceron implementation to be more composable and elastic; this means fewer or no global assumptions about timing. ONGOING!

  4. Support load/store to an external bus (the key difficulty is stalling while waiting on the bus).

  5. Use the program memory as a cache, making programs dynamically loadable and dramatically raising the size limits.

Eventual Plan

  • Move the heap [and tospace] to external memory
  • Add a heap cache/newspace memory
  • Implement the emu-32.c representation for the external heap
  • Much richer primitives
  • Haskell front-end

Long Term Plan

  • Research the design space; explore parallelism

OPEN QUESTIONS, with answers from Matthew:

Q1: Currently there doesn't seem to be an efficient way to handle top-level variable bindings (CAFs). What did the York team have in mind there, or does it require an extension? (Obviously one can treat them like all other function arguments, but that would mean a lot of parameters to pass around.)

A1: "Some mechanism would be needed to construct graphs at a specified location on the heap at the beginning of program execution. The initial (unevaluated) graphs have constant size so can be linked to at compile time."
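
To make the terminology in Q1 concrete, here is a small sketch in ordinary Haskell (not F-lite, and not taken from the Reduceron sources) that contrasts a CAF with the workaround of passing the value as an argument. The names `primes`, `sieve`, `nthPrime` and `nthPrimeArg` are made up for illustration.

```haskell
module Main where

-- A plain function, used by both variants below.
sieve :: [Int] -> [Int]
sieve []       = []
sieve (p : xs) = p : sieve [x | x <- xs, x `mod` p /= 0]

-- A CAF (constant applicative form): a top-level binding that takes
-- no arguments. It is evaluated at most once and then shared by all users.
primes :: [Int]
primes = sieve [2 ..]

-- Refers to the CAF directly.
nthPrime :: Int -> Int
nthPrime n = primes !! n

-- The workaround mentioned in Q1: the shared value is an explicit
-- parameter, so every caller has to pass it along.
nthPrimeArg :: [Int] -> Int -> Int
nthPrimeArg ps n = ps !! n

main :: IO ()
main = do
  print (nthPrime 5)        -- 13
  let ps = sieve [2 ..]     -- built once here, then threaded through
  print (nthPrimeArg ps 5)  -- 13
```

With a single shared value the extra parameter is cheap, but with many top-level constants every function that needs them, directly or indirectly, grows its argument list, which is the cost Q1 refers to.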

Q2: Why does Flite default to 0 for the MAXREGS parameter? E.g., why is

  redDefaults = CompileToRed 6 4 2 1 0

A2: (Historical reasons it would appear).

Q3: What happened to Memo 24?

A3: "I'd like to say it was our best kept secret, but in reality it probably got trashed :)"
