• Stars
    star
    657
  • Rank 68,589 (Top 2 %)
  • Language
    Python
  • License
    BSD 3-Clause "New...
  • Created over 13 years ago
  • Updated 4 months ago

Repository Details

RISC-V Opcodes

riscv-opcodes

This repo enumerates standard RISC-V instruction opcodes and control and status registers. It also contains a script to convert them into several formats (C, Scala, LaTeX).

Artifacts (encoding.h, latex-tables, etc) from this repo are used in other tools and projects like Spike, PK, RISC-V Manual, etc.

Project Structure

├── constants.py    # contains variables, constants and data-structures used in parse.py
├── encoding.h      # the template encoding.h file
├── LICENSE         # license file
├── Makefile        # makefile to generate artifacts
├── parse.py        # python file to perform checks on the instructions and generate artifacts
├── README.md       # this file
├── rv*             # instruction opcode files
└── unratified      # contains unratified instruction opcode files

File Naming Policy

This project follows a very specific file structure to define the instruction encodings. All files containing instruction encodings start with the prefix rv. These files can be present either in the root directory (if the instructions have been ratified) or in the unratified directory. The exact file-naming policy and location are as described below:

  1. rv_x - contains instructions common within the 32-bit and 64-bit modes of extension X.
  2. rv32_x - contains instructions present in rv32x only (absent in rv64x, e.g. brev8)
  3. rv64_x - contains instructions present in rv64x only (absent in rv32x, e.g. addw)
  4. rv_x_y - contains instructions when both extension X and Y are available/enabled. It is recommended to follow canonical ordering for such file names as specified by the spec.
  5. unratified - this directory will also contain files similar to the above policies, but will correspond to instructions which have not yet been ratified.

When an instruction is present in multiple extensions and the spec is vague in defining the extension which owns the instruction, the instruction encoding must be placed in the first canonically ordered extension and should be imported (via the $import keyword) in the remaining extensions.
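The naming policy above can be checked mechanically. The following is a minimal sketch, not code from the repo: the helper name `split_opcode_file_name` and the regular expression are assumptions made for illustration, and the pattern only covers the `rv`, `rv32` and `rv64` prefixes listed above.

```python
import re

# Hypothetical pattern for the policy above: a base prefix (rv, rv32 or
# rv64) followed by one or more "_<extension>" parts.
FILE_NAME_RE = re.compile(r"^(rv|rv32|rv64)((?:_[a-z0-9]+)+)$")


def split_opcode_file_name(name):
    """Return (base, [extensions]) or raise ValueError for a bad name."""
    match = FILE_NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a valid opcode file name: {name!r}")
    base = match.group(1)
    extensions = match.group(2).strip("_").split("_")
    return base, extensions


print(split_opcode_file_name("rv_i"))       # ('rv', ['i'])
print(split_opcode_file_name("rv64_zba"))   # ('rv64', ['zba'])
print(split_opcode_file_name("rv_d_zfh"))   # ('rv', ['d', 'zfh'])
```

The sketch does not check canonical ordering of the extensions in an `rv_x_y` name; that ordering is defined by the spec and is left out here.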

Encoding Syntax

The encoding syntax uses $ to indicate keywords. As of now, two keywords have been identified: $import and $pseudo_op (described below). The syntax also uses :: to define the relationship between an extension and an instruction, and .. to define bit ranges. We use # to define comments in the files. All comments must be on a separate line; in-line comments are not supported.

Instruction syntaxes used in this project fall into three broad categories:

  • regular instructions :- these are instructions which hold a unique opcode in the encoding space. A very generic syntax guideline for these instructions is as follows:

    <instruction name> <instruction args> <bit-encodings>
    

    Examples:

    lui     rd imm20 6..2=0x0D 1..0=3
    beq     bimm12hi rs1 rs2 bimm12lo 14..12=0 6..2=0x18 1..0=3
    

    The bit encodings are usually of 2 types:

    • single bit assignment : here the value of a single bit is assigned using the syntax <bit-position>=<value>. For example, 6=1 means bit 6 should be 1. Here the value must be 1 or 0.
    • range assignment : here a range of bits is assigned a value using the syntax <msb>..<lsb>=<val>. For example, 31..24=0xab. The value here can be an unsigned integer, hex (0x) or binary (0b).
  • pseudo_instructions (a.k.a pseudo_ops) - These are instructions which are aliases of regular instructions. Their encodings force certain restrictions over the regular instruction. The syntax for such instructions uses the $pseudo_op keyword as follows:

    $pseudo_op <extension>::<base-instruction> <instruction name> <instruction args> <bit-encodings>
    

    Here the <extension> specifies the extension which contains the base instruction. <base-instruction> indicates the name of the instruction this pseudo-instruction is an alias of. The remaining fields are the same as the regular instruction syntax, where all the args and the fields of the pseudo instruction are specified.

    Example:

    $pseudo_op rv_zicsr::csrrs frflags rd 19..15=0 31..20=0x001 14..12=2 6..2=0x1C 1..0=3
    

    If a ratified instruction is a pseudo_op of a regular unratified instruction, it is recommended to maintain this pseudo_op relationship i.e. define the new instruction as a pseudo_op of the unratified regular instruction, as this avoids existence of overlapping opcodes for users who are experimenting with unratified extensions as well.

  • imported_instructions - these are instructions which are borrowed from an extension into a new/different extension/sub-extension. Only regular instructions can be imported. Pseudo-op or already imported instructions cannot be imported. Example:

    $import rv32_zkne::aes32esmi
    

RESTRICTIONS

Keep the following restrictions in mind while defining $pseudo_op and $import instructions:

  • Pseudo-op or already imported instructions cannot be imported again in another file. One should always import base-instructions only.
  • While defining a $pseudo_op, the base-instruction itself cannot be a $pseudo_op
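The three line forms described under Encoding Syntax can be told apart by their first token. The sketch below is an illustration under that assumption only; `classify_line` is a hypothetical helper, not a function taken from parse.py, and it does not enforce the restrictions listed above.

```python
def classify_line(line):
    """Classify one line of an opcode file (illustrative only)."""
    stripped = line.strip()
    # Blank lines and comment lines carry no instruction.
    if not stripped or stripped.startswith("#"):
        return ("skip", None)
    tokens = stripped.split()
    if tokens[0] == "$import":
        extension, name = tokens[1].split("::")
        return ("import", {"extension": extension, "name": name})
    if tokens[0] == "$pseudo_op":
        extension, base = tokens[1].split("::")
        return ("pseudo_op", {
            "extension": extension,
            "base": base,
            "name": tokens[2],
            "fields": tokens[3:],
        })
    return ("regular", {"name": tokens[0], "fields": tokens[1:]})


print(classify_line("$import rv32_zkne::aes32esmi"))
print(classify_line("lui     rd imm20 6..2=0x0D 1..0=3"))
```

Splitting on whitespace is enough here because in-line comments are not supported, so every token after the name is either an argument or a bit encoding.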

Flow for parse.py

The parse.py python file is used to perform checks on the current set of instruction encodings and also generates multiple artifacts : latex tables, encoding.h header file, etc. This section will provide a brief overview of the flow within the python file.

To start with, parse.py creates a list of all rv* files currently checked into the repo (including those inside the unratified directory as well). It then starts parsing each file line by line. In the first pass, we only capture regular instructions and ignore the imported or pseudo instructions. For each regular instruction, the following checks are performed :

  • for range-assignment syntax, the msb position must be higher than the lsb position
  • for range-assignment syntax, the value of the range must be representable in the space identified by msb and lsb
  • values for the same bit positions should not be defined multiple times.
  • All bit positions must be accounted for (either as args or constant value fields)
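The four checks above can be sketched as follows. This is not the code from parse.py: `check_fields` and `ARG_POSITIONS` are hypothetical names, and the argument table is a small hand-written subset covering only the lui and beq examples from this README.

```python
import re

# Hand-written subset of argument bit positions as (msb, lsb). The name
# ARG_POSITIONS is hypothetical and only covers the examples used here.
ARG_POSITIONS = {
    "rd": (11, 7), "rs1": (19, 15), "rs2": (24, 20),
    "imm20": (31, 12), "bimm12hi": (31, 25), "bimm12lo": (11, 7),
}

RANGE_RE = re.compile(r"^(\d+)\.\.(\d+)=(\S+)$")
SINGLE_RE = re.compile(r"^(\d+)=(\S+)$")


def check_fields(fields):
    """Return 32 symbols indexed by bit position: '0', '1' or '-' (arg)."""
    bits = [None] * 32

    def claim(msb, lsb, symbols):
        for offset, pos in enumerate(range(lsb, msb + 1)):
            if bits[pos] is not None:
                raise ValueError(f"bit {pos} is defined more than once")
            bits[pos] = symbols[offset]

    for field in fields:
        ranged = RANGE_RE.match(field)
        single = SINGLE_RE.match(field)
        if ranged:
            msb, lsb = int(ranged.group(1)), int(ranged.group(2))
            value = int(ranged.group(3), 0)  # base 0 accepts 12, 0x.., 0b..
            if msb <= lsb:
                raise ValueError(f"{field}: msb must be higher than lsb")
            width = msb - lsb + 1
            if value >= (1 << width):
                raise ValueError(f"{field}: value does not fit in {width} bits")
            claim(msb, lsb, [str((value >> i) & 1) for i in range(width)])
        elif single:
            pos, value = int(single.group(1)), int(single.group(2), 0)
            if value not in (0, 1):
                raise ValueError(f"{field}: a single bit must be 0 or 1")
            claim(pos, pos, [str(value)])
        elif field in ARG_POSITIONS:
            msb, lsb = ARG_POSITIONS[field]
            claim(msb, lsb, ["-"] * (msb - lsb + 1))
        else:
            raise ValueError(f"unknown field: {field}")

    missing = [pos for pos, symbol in enumerate(bits) if symbol is None]
    if missing:
        raise ValueError(f"bits not accounted for: {missing}")
    return bits


bits = check_fields("rd imm20 6..2=0x0D 1..0=3".split())
print("".join(reversed(bits)))  # -------------------------0110111
```

Reversing the list prints the encoding with bit 31 first, which matches the way encodings are usually written.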

Once the above checks are passed for a regular instruction, we then create a dictionary for this instruction which contains the following fields:

  • encoding : contains a 32-bit string defining the encoding of the instruction. Here - is used to represent instruction argument fields
  • extension : string indicating which extension/filename this instruction was picked from
  • mask : a 32-bit hex value indicating the bits of the encodings that must be checked for legality of that instruction
  • match : a 32-bit hex value indicating the values the encoding must take for the bits which are set as 1 in the mask above
  • variable_fields : This is a list of args required by the instruction
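To make the relationship between encoding, mask and match concrete, here is a minimal sketch that derives the two hex values from a 32-character encoding string. The helper name `build_entry` is an assumption for illustration; only the field names come from the list above.

```python
def build_entry(encoding, extension, variable_fields):
    """Build one dictionary entry from a 32-character encoding string.

    The string is written with bit 31 first and uses '-' for bits that
    belong to instruction arguments.
    """
    if len(encoding) != 32:
        raise ValueError("encoding must describe exactly 32 bits")
    # mask: 1 wherever the bit is a fixed constant, 0 for argument bits.
    mask = int("".join("0" if c == "-" else "1" for c in encoding), 2)
    # match: the constant bits themselves, with argument bits forced to 0.
    match = int("".join("0" if c == "-" else c for c in encoding), 2)
    return {
        "encoding": encoding,
        "extension": extension,
        "mask": hex(mask),
        "match": hex(match),
        "variable_fields": variable_fields,
    }


lui = build_entry("-" * 25 + "0110111", "rv_i", ["rd", "imm20"])
print(lui["mask"], lui["match"])  # 0x7f 0x37
```

A 32-bit word `w` then decodes as this instruction when `(w & mask) == match`.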

The above dictionary elements are added to a main instr_dict dictionary under the instruction node. This process continues until all regular instructions have been processed. In the second pass, we process the $pseudo_op instructions. Here, we first check whether the base-instruction of the pseudo instruction exists in the relevant extension/filename. If it is present, the remaining part of the syntax undergoes the same checks as above. Once the checks pass, and if the base-instruction has not already been added to the main instr_dict, the pseudo-instruction is added to the list. In the third and final pass, we process the imported instructions.

The base-instruction of a pseudo-instruction may be absent from the main instr_dict after the first pass when only a subset of extensions is being processed and that subset does not include the base-instruction.
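The three passes can be sketched on in-memory data as shown below. This is a simplified illustration, not the parse.py implementation: `run_passes` is a hypothetical name, the per-field checks are omitted, and the behaviour of the import pass (copying the regular definition from the named file) is an assumption. The csrrs line and the placement of frflags in a file called rv_f are also illustrative.

```python
def run_passes(all_files, selected):
    """Build an instr_dict-like mapping from opcode files held in memory.

    all_files maps a file name to its lines; selected lists the file
    names being processed (possibly only a subset of all_files).
    """
    def entries(name):
        for line in all_files[name]:
            line = line.strip()
            if line and not line.startswith("#"):
                yield line.split()

    def regular(name):
        # Regular instructions defined in one file: {instruction: fields}
        return {t[0]: t[1:] for t in entries(name) if not t[0].startswith("$")}

    instr_dict = {}

    # Pass 1: regular instructions only.
    for name in selected:
        for instr, fields in regular(name).items():
            instr_dict[instr] = {"extension": name, "fields": fields}

    # Pass 2: pseudo-ops. The base must exist in the file it names; the
    # pseudo-op is added only when that base is not already in instr_dict.
    for name in selected:
        for t in entries(name):
            if t[0] == "$pseudo_op":
                ext, base = t[1].split("::")
                if base not in regular(ext):
                    raise ValueError(f"{base} not found in {ext}")
                if base not in instr_dict:
                    instr_dict[t[2]] = {"extension": name, "fields": t[3:]}

    # Pass 3: imports. Assumption: an import copies the regular
    # definition from the file it names.
    for name in selected:
        for t in entries(name):
            if t[0] == "$import":
                ext, instr = t[1].split("::")
                source = regular(ext)
                if instr not in source:
                    raise ValueError(f"{instr} not found in {ext}")
                instr_dict.setdefault(
                    instr, {"extension": name, "fields": source[instr]})
    return instr_dict


files = {
    "rv_zicsr": ["csrrs rd rs1 imm12 14..12=2 6..2=0x1C 1..0=3"],
    "rv_f": ["$pseudo_op rv_zicsr::csrrs frflags rd 19..15=0 "
             "31..20=0x001 14..12=2 6..2=0x1C 1..0=3"],
}
print(sorted(run_passes(files, ["rv_zicsr", "rv_f"])))  # ['csrrs']
print(sorted(run_passes(files, ["rv_f"])))              # ['frflags']
```

The second call shows the subset case: csrrs is not in the selected files, so frflags is added in its place.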

Artifact Generation and Usage

The following artifacts can be generated using parse.py:

  • instr_dict.yaml : This file is always generated by parse.py and contains the entire main dictionary instr_dict in YAML format. Note that in this YAML the dots in an instruction name are replaced with underscores
  • encoding.out.h : this is the header file that is used by tools like spike, pk, etc
  • instr-table.tex : the latex table of instructions used in the riscv-unpriv spec
  • priv-instr-table.tex : the latex table of instructions used in the riscv-priv spec
  • inst.chisel : chisel code to decode instructions
  • inst.sverilog : system verilog code to decode instructions
  • inst.rs : rust code containing mask and match variables for all instructions
  • inst.spinalhdl : spinalhdl code to decode instructions
  • inst.go : go code to decode instructions

Make sure the required python pre-requisites are installed by executing the following commands:

sudo apt-get install python3-pip
pip3 install -r requirements.txt

To generate all the above artifacts for all instructions currently checked in, simply run make from the root-directory. This should print the following log on the command-line:

Running with args : ['./parse.py', '-c', '-go', '-chisel', '-sverilog', '-rust', '-latex', '-spinalhdl', 'rv*', 'unratified/rv*']
Extensions selected : ['rv*', 'unratified/rv*']
INFO:: encoding.out.h generated successfully
INFO:: inst.chisel generated successfully
INFO:: inst.spinalhdl generated successfully
INFO:: inst.sverilog generated successfully
INFO:: inst.rs generated successfully
INFO:: inst.go generated successfully
INFO:: instr-table.tex generated successfully
INFO:: priv-instr-table.tex generated successfully

By default all extensions are enabled. To select only a subset of extensions you can change the EXTENSIONS variable of the makefile to contain only the file names of interest. For example, if you want only the I and M extensions you can do the following:

make EXTENSIONS='rv*_i rv*_m' 

Which will print the following log:

Running with args : ['./parse.py', '-c', '-chisel', '-sverilog', '-rust', '-latex', 'rv32_i', 'rv64_i', 'rv_i', 'rv64_m', 'rv_m']
Extensions selected : ['rv32_i', 'rv64_i', 'rv_i', 'rv64_m', 'rv_m']
INFO:: encoding.out.h generated successfully
INFO:: inst.chisel generated successfully
INFO:: inst.sverilog generated successfully
INFO:: inst.rs generated successfully
INFO:: instr-table.tex generated successfully
INFO:: priv-instr-table.tex generated successfully
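The way the patterns in EXTENSIONS expand to the file names shown in the log can be mimicked with shell-style matching. This is only an illustration: the real expansion happens outside Python, and the `available` list and `select` helper below are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical list of opcode files present in the root directory.
available = ["rv32_i", "rv64_i", "rv_i", "rv64_m", "rv_m", "rv_zicsr"]


def select(patterns, names):
    """Return the names matched by each shell-style pattern, in order."""
    return [n for p in patterns for n in names if fnmatchcase(n, p)]


print(select("rv*_i rv*_m".split(), available))
# ['rv32_i', 'rv64_i', 'rv_i', 'rv64_m', 'rv_m']
```

Note that rv_zicsr is not selected, because the pattern rv*_i requires the name to end in _i.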

If you only want a specific artifact you can use one or more of the following targets : c, rust, chisel, sverilog, latex

You can use the clean target to remove all artifacts.

Adding a new extension

To add a new extension of instructions, create an appropriate rv* file based on the policy defined in File Naming Policy. Run make from the root directory to ensure that all checks pass and all artifacts are created correctly. A successful run should print the following log on the terminal:

Running with args : ['./parse.py', '-c', '-chisel', '-sverilog', '-rust', '-latex', 'rv*', 'unratified/rv*']
Extensions selected : ['rv*', 'unratified/rv*']
INFO:: encoding.out.h generated successfully
INFO:: inst.chisel generated successfully
INFO:: inst.sverilog generated successfully
INFO:: inst.rs generated successfully
INFO:: instr-table.tex generated successfully
INFO:: priv-instr-table.tex generated successfully

Create a PR for review.

Enabling Debug logs in parse.py

To enable debug logs in parse.py change level=logging.INFO to level=logging.DEBUG and run the python command. You will now see debug statements on the terminal like below:

DEBUG:: Collecting standard instructions first
DEBUG:: Parsing File: ./rv_i
DEBUG::      Processing line: lui     rd imm20 6..2=0x0D 1..0=3
DEBUG::      Processing line: auipc   rd imm20 6..2=0x05 1..0=3
DEBUG::      Processing line: jal     rd jimm20                          6..2=0x1b 1..0=3
DEBUG::      Processing line: jalr    rd rs1 imm12              14..12=0 6..2=0x19 1..0=3
DEBUG::      Processing line: beq     bimm12hi rs1 rs2 bimm12lo 14..12=0 6..2=0x18 1..0=3
DEBUG::      Processing line: bne     bimm12hi rs1 rs2 bimm12lo 14..12=1 6..2=0x18 1..0=3
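For reference, the level named above is the standard `level` argument of Python's `logging.basicConfig`. The snippet below is a standalone sketch rather than the exact call in parse.py; the format string is an assumption chosen to resemble the log lines shown, and `force=True` (Python 3.8+) is used only so the snippet takes effect even if logging was already configured.

```python
import logging

# Assumed format string; it imitates the "DEBUG:: ..." lines above.
logging.basicConfig(
    level=logging.DEBUG,              # was logging.INFO
    format="%(levelname)s:: %(message)s",
    force=True,
)

logging.debug("Parsing File: ./rv_i")   # shown only at DEBUG level
logging.info("encoding.out.h generated successfully")
```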

How do I find where an instruction is defined?

You can use grep "^\s*<instr-name>" rv* unratified/rv* OR run make, open instr_dict.yaml and search for the instruction you are looking for. Within that instruction, the extension field will indicate which file the instruction was picked from.
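The same search can be done from Python. The sketch below assumes the directory layout described in Project Structure; `find_instruction` is a hypothetical helper and is not part of the repo. Unlike the grep pattern, it requires whitespace after the name, so searching for add does not also report addi.

```python
import re
from pathlib import Path


def find_instruction(name, root="."):
    """Return (file, line number) pairs where `name` is defined."""
    pattern = re.compile(r"^\s*" + re.escape(name) + r"\s")
    root = Path(root)
    candidates = list(root.glob("rv*")) + list(root.glob("unratified/rv*"))
    hits = []
    for path in sorted(candidates):
        if not path.is_file():
            continue
        lines = path.read_text().splitlines()
        for number, line in enumerate(lines, start=1):
            if pattern.match(line):
                hits.append((path.relative_to(root).as_posix(), number))
    return hits


if __name__ == "__main__":
    # Run from the root directory of the repo.
    print(find_instruction("lui"))
```

Like the grep command, this only finds regular definitions; lines starting with $pseudo_op or $import are not matched.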

More Repositories

1

riscv-isa-manual

RISC-V Instruction Set Manual
TeX
3,546
star
2

riscv-v-spec

Working draft of the proposed RISC-V V vector extension
Assembly
961
star
3

learn

Tracking RISC-V Actions on Education, Training, Courses, Monitorships, etc.
462
star
4

sail-riscv

Sail RISC-V model
Coq
455
star
5

riscv-debug-spec

Working Draft of the RISC-V Debug Specification Standard
Python
454
star
6

riscv-crypto

RISC-V cryptography extensions standardisation work.
C
358
star
7

meta-riscv

OpenEmbedded/Yocto layer for RISC-V Architecture
BitBake
350
star
8

riscv-fast-interrupt

Proposal for a RISC-V Core-Local Interrupt Controller (CLIC)
Makefile
236
star
9

riscv-bitmanip

Working draft of the proposed RISC-V Bitmanipulation extension
Makefile
205
star
10

riscv-j-extension

Working Draft of the RISC-V J Extension Specification
Makefile
160
star
11

riscv-p-spec

RISC-V Packed SIMD Extension
138
star
12

riscv-plic-spec

PLIC Specification
125
star
13

riscv-profiles

RISC-V Architecture Profiles
Makefile
104
star
14

riscv-cfi

This repo holds the work area and revisions of the RISC-V CFI (Shadow Stack and Landing Pads) specifications. CFI defines the privileged and unprivileged ISA extensions that can be used by privileged and unprivileged programs to protect the integrity of their control-flow.
Makefile
83
star
15

docs-dev-guide

Documentation developer guide
TeX
79
star
16

riscv-CMOs

HTML
77
star
17

riscv-aia

77
star
18

riscv-cheri

This repository contains the CHERI extension specification, adding hardware capabilities to RISC-V ISA to enable fine-grained memory protection and scalable compartmentalization.
Python
50
star
19

riscv-test-env

C
40
star
20

riscv-aclint

Makefile
39
star
21

virtual-memory

35
star
22

configuration-structure

RISC-V Configuration Structure
Python
35
star
23

riscv-smmtt

This specification will define the RISC-V privilege ISA extensions required to support Supervisor Domain isolation for multi-tenant security use cases e.g. confidential-computing, trusted platform services, fault isolation and so on.
Makefile
26
star
24

riscv-bfloat16

Makefile
26
star
25

docs-resources

19
star
26

docs-spec-template

Makefile
18
star
27

riscv-control-transfer-records

This repo contains a RISC-V ISA extension (proposal) to allow recording of control transfer history to on-chip registers, to support usages associated with profiling and debug.
Makefile
14
star
28

riscv-smbios

RISC-V SMBIOS Type 44 Spec
TeX
13
star
29

riscv-double-trap

RISC-V Double Trap Fast-Track Extension
Makefile
12
star
30

riscv-spmp

The repo contains the SPMP architectural specification, which includes capabilities like access control of read/write/execute requests by an hart, address matching, encoding of permissions, exceptions for access violation, and support for virtualization.
TeX
11
star
31

riscv-docs-base-container-image

A base container image populated with the dependencies to build the RISC-V Documentation.
9
star
32

riscv-zabha

The Zabha extension provides support for byte and halfword atomic memory operations.
Makefile
8
star
33

riscv-attached-matrix-facility

Attached Matrix Facility Specification
Makefile
6
star
34

riscv-glossary

Makefile
6
star
35

riscv-zilsd

Zilsd (Load/Store Pair for RV32) Fast-Track Extension
Makefile
6
star
36

riscv-software-ecosystem

A curated list of the status of different softwares on RISC-V
5
star
37

riscv-performance-events

RISC-V Performance Events Specification
Makefile
4
star
38

riscv-b

"B" extension - that represents the collection of the Zba, Zbb, and Zbs extensions
Makefile
4
star
39

riscv-zaamo-zalrsc

Zaamo / Zalrsc: A extension components
Makefile
4
star
40

riscv-ssqosid

This repo will hold the specification for the proposed QoS ID extension being pursued on the fast-track process.
Makefile
2
star
41

.github

2
star
42

riscv-svvptc

Obviating Memory-Management Instructions after Marking PTEs Valid (Svvptc)
Makefile
2
star
43

riscv-memory-tagging

Memory Tagging ISA extension that can be used by software to enforce memory tag checks on memory loads and stores
Makefile
2
star
44

riscv-dot-product

Dot-Product Extension
Makefile
2
star
45

riscv-smcdeleg-ssccfg

Supervisor Counter Delegation Architecture Extension
Makefile
2
star
46

composable-custom-extensions

This task group will propose ISA extension(s) and non-ISA hardware and software interop interfaces to enable routine reuse and composition of a subcategory of custom extensions called composable extensions.
Makefile
1
star
47

riscv-ssrastraps

The RAS exception and interrupts extension (Ssrastraps) defines standard local interrupt numbers and exception-cause codes for reporting errors detected by RAS functions in the system.
TeX
1
star
48

riscv-zalasr

The ISA specification for the Zalasr extension.
Makefile
1
star
49

riscv-pqc

Post Quantum Cryptography
Makefile
1
star
50

riscv-ssdtso

The Ssdtso is a fast-track extension adding a 'dynamic-RVTSO' mode of operation and on-demand per-hart switching between the memory models.
Makefile
1
star
51

lightweight-isolation

The Lightweight Isolation Specification
Makefile
1
star
52

riscv-hac

High Assurance Cryptography
Makefile
1
star
53

riscv-ras-eri

The (RAS Error-record Register Interface) RERI provides a specification to augment RAS features in RISC-V SOC hardware to standardize reporting and logging of errors by means of a memory-mapped register interface to enable error detection, provide the facility to log the detected errors (including their severity, nature, and location), and configuring means to report the error to a handler component.
TeX
1
star