  • Stars 229
  • Rank 174,666 (Top 4 %)
  • Language Verilog
  • Created about 6 years ago
  • Updated about 3 years ago

Repository Details

It's a core. Made on Twitch.

twitchcore

A RISC-V core, first in Python, then in Verilog, then on FPGA.

Getting Started

Prerequisites

  • icarus-verilog
  • riscv-gnu-toolchain
brew install icarus-verilog riscv-gnu-toolchain

Installation

  1. Clone this repo
git clone https://github.com/geohot/twitchcore
cd twitchcore
  2. Clone and build riscv-tests
git clone https://github.com/riscv/riscv-tests
cd riscv-tests
git submodule update --init --recursive
autoconf
./configure
make
make install
cd ..
  3. Create a virtual environment (optional)
python3 -m venv env
source env/bin/activate
  4. Install Python packages
pip install -r requirements.txt

TODO

  • Fix unaligned loads/stores (I think this is good now, at least acceptable)
  • Make pipelining work
  • Add M instructions for fast multiply and divide
  • Add better introspection
  • Switch to 64-bit
  • Add "RISK" ML accelerator ("K" Standard Extension)

TODO (later)

  • Many ROBs like M1 go very fast

Notes on Memory system

8 million elements (20MB) = 23-bit address path
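The sizes quoted in these notes are consistent with 19-bit elements: the 32x32 matrix register mentioned in the next paragraph is 2432 bytes, and 2432 × 8 / 1024 = 19. That element width is an inference from the numbers, not something the notes state. A quick check of the arithmetic under that assumption (the helper names are made up for illustration):

```python
# Sanity check of the memory sizing. ELEMENT_BITS = 19 is inferred from
# "32x32 matrix register (2432 bytes)"; the notes never state it directly.
ELEMENT_BITS = 19

def address_bits(num_elements: int) -> int:
    """Number of address bits needed to index num_elements elements."""
    return (num_elements - 1).bit_length()

def storage_bytes(num_elements: int, element_bits: int = ELEMENT_BITS) -> int:
    """Bytes needed to hold num_elements packed elements, rounded up."""
    return (num_elements * element_bits + 7) // 8

main_elements = 1 << 23  # "8 million elements"
print(address_bits(main_elements))         # 23
print(storage_bytes(main_elements) / 1e6)  # 19.922944, i.e. roughly 20MB
print(storage_bytes(32 * 32))              # 2432
```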

We want to support a load/store instruction into a 32x32 matrix register (2432 bytes) like this:

  • Would be R-Type with rs1 and rs2 (64-bit)
  • rs1 contains the 23-bit base address, plus two masks in the upper bytes (0 is no mask)
  • rs2 contains two 24-bit strides for x and y. Several of these bits aren't connected
  • "rd" is the extension register to load into / store from
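One way the operands above could be encoded and turned into element addresses is sketched below. The notes give field widths but not bit positions, so every shift here is an assumption: the base address in the low 23 bits of rs1, one mask in each of the top two bytes, and the x and y strides in bits 0..23 and 24..47 of rs2. The notes do not say what the masks select, so the sketch carries them but does not interpret them.

```python
# Hypothetical bit layout for the matrix load/store operands. The notes give
# the field widths but not their positions, so every shift below is assumed.
ADDR_BITS = 23
STRIDE_BITS = 24
N = 32  # the matrix register is N x N

def pack_rs1(base: int, mask_x: int = 0, mask_y: int = 0) -> int:
    """23-bit base in the low bits, one mask in each of the top two bytes."""
    assert 0 <= base < (1 << ADDR_BITS)
    assert 0 <= mask_x < 256 and 0 <= mask_y < 256
    return base | (mask_x << 48) | (mask_y << 56)

def pack_rs2(stride_x: int, stride_y: int) -> int:
    """Two 24-bit strides, x in bits 0..23 and y in bits 24..47."""
    assert 0 <= stride_x < (1 << STRIDE_BITS)
    assert 0 <= stride_y < (1 << STRIDE_BITS)
    return stride_x | (stride_y << STRIDE_BITS)

def element_addresses(rs1: int, rs2: int) -> list[list[int]]:
    """Element address for each (y, x) slot, wrapped to the 23-bit space."""
    wrap = (1 << ADDR_BITS) - 1
    base = rs1 & wrap
    sx = rs2 & ((1 << STRIDE_BITS) - 1)
    sy = (rs2 >> STRIDE_BITS) & ((1 << STRIDE_BITS) - 1)
    return [[(base + x * sx + y * sy) & wrap for x in range(N)] for y in range(N)]

# A dense row-major 32x32 tile starting at element 1000.
addrs = element_addresses(pack_rs1(1000), pack_rs2(1, 32))
print(addrs[0][:3], addrs[1][0])  # [1000, 1001, 1002] 1032
```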

Use some hash function on the addresses to avoid "bank conflicts", which lets us upper-bound the fetch time.
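To illustrate the idea, the sketch below compares a naive low-bits bank mapping with a simple XOR-fold hash. Both the bank count (32) and the hash itself are assumptions picked for the example; the notes do not choose either.

```python
# Illustrative only: NUM_BANKS and the XOR-fold hash are assumptions, not
# choices made in the notes.
NUM_BANKS = 32
BANK_BITS = NUM_BANKS.bit_length() - 1  # 5 bits select one of 32 banks

def bank_modulo(addr: int) -> int:
    """Naive mapping: the low address bits pick the bank."""
    return addr % NUM_BANKS

def bank_xor_fold(addr: int) -> int:
    """XOR together successive BANK_BITS-wide chunks of the address."""
    bank = 0
    while addr:
        bank ^= addr & (NUM_BANKS - 1)
        addr >>= BANK_BITS
    return bank

def banks_touched(bank_fn, stride: int, count: int = 32) -> int:
    """How many distinct banks a strided run of `count` accesses lands in."""
    return len({bank_fn(i * stride) for i in range(count)})

print(banks_touched(bank_modulo, 32))    # 1  (every access hits bank 0)
print(banks_touched(bank_xor_fold, 32))  # 32 (one access per bank)
print(banks_touched(bank_xor_fold, 33))  # 1  (this hash has its own bad stride)
```

The last line shows that this particular hash only moves the conflicts to a different stride, so a real design would have to pick the hash against the strides it actually expects.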

Notes on ALU

matmul/mulacc are the big ones, at 65536 FLOPs and 2048 FLOPs per instruction respectively
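Those two counts follow from the 32x32 register size if each multiply-add is counted as 2 floating-point operations (that counting convention is an assumption here):

```python
N = 32  # matrix registers are N x N

# matmul: N*N outputs, each a dot product of N multiply-adds (2 FLOPs each).
matmul_flops = 2 * N * N * N
# mulacc: one multiply-add per register element.
mulacc_flops = 2 * N * N

print(matmul_flops, mulacc_flops)  # 65536 2048
```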

Have to think this through more with the reduce instructions too.

It's okay if the matmul takes multiple cycles I think, but it would be nice for the mulacc to take one.

matmul

  • load with stride 0 in X
  • mul
  • reduce
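One reading of the three steps above is sketched in plain Python below: loading a row of A with stride 0 in X broadcasts it across the register, an elementwise multiply against B follows, and a reduction over Y yields one row of the result. This is an interpretation, not the author's stated design; the function names, the 4x4 size, and the flat `mem` list are stand-ins for illustration.

```python
N = 4  # small stand-in for the 32x32 register

def load(mem, base, stride_x, stride_y):
    """Fill an N x N register: reg[y][x] = mem[base + x*stride_x + y*stride_y]."""
    return [[mem[base + x * stride_x + y * stride_y] for x in range(N)]
            for y in range(N)]

def mul(a, b):
    """Elementwise product of two registers."""
    return [[a[y][x] * b[y][x] for x in range(N)] for y in range(N)]

def reduce_y(reg):
    """Sum each column, collapsing the Y axis to a single row."""
    return [sum(reg[y][x] for y in range(N)) for x in range(N)]

def matmul(mem, a_base, b_base):
    """C = A @ B for row-major N x N matrices stored in mem."""
    b = load(mem, b_base, 1, N)
    out = []
    for i in range(N):
        # Stride 0 in X: row i of A becomes a column broadcast across X.
        a_bcast = load(mem, a_base + i * N, 0, 1)
        out.append(reduce_y(mul(a_bcast, b)))
    return out

a = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
identity = [[int(x == y) for x in range(N)] for y in range(N)]
mem = [v for row in a for v in row] + [v for row in identity for v in row]
print(matmul(mem, 0, N * N) == a)  # True
```

In this reading a full matmul takes N load/mul/reduce rounds, which fits the remark that the matmul may take multiple cycles.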

Notes on mini edition in 100T

16x16 registers (608 bytes), 256 FMACs (does it fit?)

  • 128k elements = 17-bit address path
  • rs1 = 2x4-bit masks + 17-bit address
  • rs2 = 2x16-bit strides
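A quick check of the mini-edition numbers, again assuming 19-bit elements (inferred from 608 bytes × 8 / 256 elements, not stated in the notes):

```python
ELEMENT_BITS = 19  # inferred: 608 bytes * 8 / 256 elements

elements = 128 * 1024                       # "128k elements"
reg_bytes = 16 * 16 * ELEMENT_BITS // 8     # one 16x16 register
rs1_bits = 2 * 4 + 17                       # two 4-bit masks + 17-bit address
rs2_bits = 2 * 16                           # two 16-bit strides

print((elements - 1).bit_length())  # 17
print(reg_bytes)                    # 608
print(rs1_bits, rs2_bits)           # 25 32
```

Both operands come out at 32 bits or fewer, so under these field widths they would fit in 32-bit registers.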

How to run a RISC-V test

You can run a RISC-V test (source code is available in firmwares) with:

./simulate.sh firmwares/add.hex
