Learn to write Rust procedural macros [Rust Latam conference, Montevideo, Uruguay, March 2019]

Rust Latam: procedural macros workshop

This repo contains a selection of projects designed to teach you how to write Rust procedural macros: Rust code that generates Rust code.

Each of these projects is drawn closely from a compelling real use case. Out of the 5 projects here, 3 are macros that I have personally implemented in industrial codebases for work, and the other 2 exist as libraries on crates.io by other authors.


Suggested prerequisites

This workshop covers attribute macros, derive macros, and function-like procedural macros.

Be aware that the content of the workshop and the explanations in this repo will assume a working understanding of structs, enums, traits, trait impls, generic parameters, and trait bounds. You are welcome to dive into the workshop with any level of experience with Rust, but you may find that these basics are far easier to learn for the first time outside of the context of macros.


Projects

Here is an introduction to each of the projects. At the bottom, I give recommendations for what order to tackle them based on your interests. Note that each of these projects goes into more depth than what is described in the introduction here.

Derive macro: derive(Builder)

This macro generates the boilerplate code involved in implementing the builder pattern in Rust. Builders are a mechanism for instantiating structs, especially structs with many fields, and especially if many of those fields are optional or the set of fields may need to grow backward compatibly over time.

There are a few different possibilities for expressing builders in Rust. Unless you have a strong pre-existing preference, to keep things simple for this project I would recommend following the example of the standard library's std::process::Command builder in which the setter methods each receive and return &mut self to allow chained method calls.
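For reference, the calling style of the standard library builder looks roughly like the sketch below. The helper name `describe` is made up for illustration, and nothing is actually spawned; the point is only that each setter takes `&mut self` and returns `&mut Command`, so calls can be chained.

```rust
use std::process::Command;

// Hypothetical helper: configures a std::process::Command with chained
// setter calls and reports back the program and arguments it recorded.
pub fn describe() -> (String, Vec<String>) {
    let mut command = Command::new("cargo");

    // Every setter receives `&mut self` and returns `&mut Command`.
    command.arg("build").arg("--release").current_dir("..");

    let program = command.get_program().to_string_lossy().into_owned();
    let args = command
        .get_args()
        .map(|arg| arg.to_string_lossy().into_owned())
        .collect();
    (program, args)
}
```

Because the setters borrow rather than consume the builder, the same builder value stays usable after the chain ends, which is the behavior I would suggest mirroring in your generated code.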

Callers will invoke the macro as follows.

use derive_builder::Builder;

#[derive(Builder)]
pub struct Command {
    executable: String,
    #[builder(each = "arg")]
    args: Vec<String>,
    current_dir: Option<String>,
}

fn main() {
    let command = Command::builder()
        .executable("cargo".to_owned())
        .arg("build".to_owned())
        .arg("--release".to_owned())
        .build()
        .unwrap();

    assert_eq!(command.executable, "cargo");
}
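
To make the target concrete, here is a hand-written sketch of roughly what the derive could expand to for the struct above. This is one possible shape, not the required one: the `CommandBuilder` name, the error type, and the error message are all assumptions chosen for illustration.

```rust
use std::error::Error;

pub struct Command {
    executable: String,
    args: Vec<String>,
    current_dir: Option<String>,
}

// Hypothetical generated type. Required fields are wrapped in Option so the
// builder can tell whether they have been set yet.
pub struct CommandBuilder {
    executable: Option<String>,
    args: Vec<String>,
    current_dir: Option<String>,
}

impl Command {
    pub fn builder() -> CommandBuilder {
        CommandBuilder {
            executable: None,
            args: Vec::new(),
            current_dir: None,
        }
    }
}

impl CommandBuilder {
    pub fn executable(&mut self, executable: String) -> &mut Self {
        self.executable = Some(executable);
        self
    }

    // One-at-a-time setter, as requested by #[builder(each = "arg")].
    pub fn arg(&mut self, arg: String) -> &mut Self {
        self.args.push(arg);
        self
    }

    pub fn current_dir(&mut self, current_dir: String) -> &mut Self {
        self.current_dir = Some(current_dir);
        self
    }

    // Fails if a required field was never set; optional fields stay None.
    pub fn build(&mut self) -> Result<Command, Box<dyn Error>> {
        Ok(Command {
            executable: self.executable.clone().ok_or("executable is not set")?,
            args: self.args.clone(),
            current_dir: self.current_dir.clone(),
        })
    }
}
```

Writing the expansion by hand like this before writing the macro is a reasonable way to decide what tokens the macro needs to produce.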

This project covers:

  • traversing syntax trees;
  • constructing output source code;
  • processing helper attributes to customize the generated code.

Project skeleton is located under the builder directory.

Derive macro: derive(CustomDebug)

This macro implements a derive for the standard library std::fmt::Debug trait that is more customizable than the similar Debug derive macro exposed by the standard library.

In particular, we'd like to be able to select the formatting used for individual struct fields by providing a format string in the style expected by Rust string formatting macros like format! and println!.

use derive_debug::CustomDebug;

#[derive(CustomDebug)]
pub struct Field {
    name: String,
    #[debug = "0b{:08b}"]
    bitmask: u8,
}

Here, one possible instance of the struct above might be printed by its generated Debug impl like this:

Field { name: "st0", bitmask: 0b00011100 }
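
A hand-written impl with the same behavior might look like the sketch below. It assumes the macro builds on the standard `Formatter::debug_struct` helper, which is only one of several ways to get this output; the generated code in your implementation may well differ.

```rust
use std::fmt;

pub struct Field {
    name: String,
    bitmask: u8,
}

// Sketch of what derive(CustomDebug) could emit for Field.
impl fmt::Debug for Field {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("Field")
            // No attribute: fall back to the field's own Debug impl.
            .field("name", &self.name)
            // #[debug = "0b{:08b}"]: format_args! applies the custom format
            // string, and its Debug impl prints the formatted text verbatim.
            .field("bitmask", &format_args!("0b{:08b}", self.bitmask))
            .finish()
    }
}
```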

This project covers:

  • traversing syntax trees;
  • constructing output source code;
  • processing helper attributes;
  • dealing with lifetime parameters and type parameters;
  • inferring trait bounds on generic parameters of trait impls;
  • limitations of derive's ability to emit universally correct trait bounds.

Project skeleton is located under the debug directory.

Function-like macro: seq!

This macro provides a syntax for stamping out sequentially indexed copies of an arbitrary chunk of code.

For example, our application may require an enum with sequentially numbered variants like Cpu0, Cpu1, Cpu2, ..., Cpu511. But note that the same seq! macro should work for any sort of compile-time loop; there is nothing specific to emitting enum variants. A different caller might use it for generating an expression like tuple.0 + tuple.1 + ... + tuple.511.

use seq::seq;

seq!(N in 0..512 {
    #[derive(Copy, Clone, PartialEq, Debug)]
    pub enum Processor {
        #(
            Cpu~N,
        )*
    }
});

fn main() {
    let cpu = Processor::Cpu8;

    assert_eq!(cpu as u8, 8);
    assert_eq!(cpu, Processor::Cpu8);
}
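
To picture the output, here is what the expansion would amount to for a much smaller range, written out by hand. The range 0..4 and the function name `sum` are assumptions made to keep the sketch short; they are not part of the project.

```rust
// Equivalent of seq!(N in 0..4 { ... }) around the enum above: the
// repeated section #( Cpu~N, )* is stamped out once per value of N.
#[derive(Copy, Clone, PartialEq, Debug)]
pub enum Processor {
    Cpu0,
    Cpu1,
    Cpu2,
    Cpu3,
}

// Equivalent of a repeated expression over tuple.N for N in 0..4.
pub fn sum(tuple: (u32, u32, u32, u32)) -> u32 {
    tuple.0 + tuple.1 + tuple.2 + tuple.3
}
```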

This project covers:

  • parsing custom syntax;
  • low-level representation of token streams;
  • constructing output source code.

Project skeleton is located under the seq directory.

Attribute macro: #[sorted]

A macro for when your coworkers (or you yourself) cannot seem to keep enum variants in sorted order when adding variants or refactoring. The macro will detect unsorted variants at compile time and emit an error pointing out which variants are out of order.

#[sorted]
#[derive(Debug)]
pub enum Error {
    BlockSignal(signal::Error),
    CreateCrasClient(libcras::Error),
    CreateEventFd(sys_util::Error),
    CreateSignalFd(sys_util::SignalFdError),
    CreateSocket(io::Error),
    DetectImageType(qcow::Error),
    DeviceJail(io_jail::Error),
    NetDeviceNew(virtio::NetError),
    SpawnVcpu(io::Error),
}
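
The core check can be sketched in plain Rust, independent of any macro machinery. The function names and the wording of the message below are assumptions for illustration; in the real macro the names would come from the parsed enum and the message would be reported as a compile error.

```rust
// Returns the first name that is out of order, paired with the earlier name
// it should have been placed before. Returns None if the slice is sorted.
pub fn first_out_of_order<'a>(names: &[&'a str]) -> Option<(&'a str, &'a str)> {
    for i in 1..names.len() {
        if names[i] < names[i - 1] {
            // The earliest preceding name that compares greater is where
            // the misplaced name belongs.
            let successor = names[..i].iter().copied().find(|prev| *prev > names[i])?;
            return Some((names[i], successor));
        }
    }
    None
}

// Turns the result into a diagnostic-style message (hypothetical wording).
pub fn check_sorted(names: &[&str]) -> Result<(), String> {
    match first_out_of_order(names) {
        Some((misplaced, successor)) => {
            Err(format!("{} should sort before {}", misplaced, successor))
        }
        None => Ok(()),
    }
}
```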

This project covers:

  • compile-time error reporting;
  • application of visitor pattern to traverse a syntax tree;
  • limitations of the currently stable macro API and some ways to work around them.

Project skeleton is located under the sorted directory.

Attribute macro: #[bitfield]

This macro provides a mechanism for defining structs in a packed binary representation with access to ranges of bits, similar to the language-level support for bit fields in C.

The macro will conceptualize one of these structs as a sequence of bits 0..N. The bits are grouped into fields in the order specified by a struct written by the caller. The #[bitfield] attribute rewrites the caller's struct into a private byte array representation with public getter and setter methods for each field.

The total number of bits N is required to be a multiple of 8 (this will be checked at compile time).
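
As a simplified illustration of a compile-time check, the sketch below uses a const assertion over hypothetical field widths of 1, 3, 4, and 24 bits. Note that this is a swapped-in technique: it works here only because the widths are written as literals, whereas the macro sees field types rather than numbers and so the project steers you toward an approach based on the trait system.

```rust
// Hypothetical field widths, written out as literals for this sketch.
const TOTAL_BITS: usize = 1 + 3 + 4 + 24;

// Evaluated during compilation: a total that is not a multiple of 8 would
// make the build fail with this message instead of failing at runtime.
const _: () = assert!(TOTAL_BITS % 8 == 0, "total bits must be a multiple of 8");

// Size of the backing byte array.
pub const TOTAL_BYTES: usize = TOTAL_BITS / 8;
```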

For example, the following invocation builds a struct with a total size of 32 bits or 4 bytes. It places field a in the least significant bit of the first byte, field b in the next three least significant bits, field c in the remaining four most significant bits of the first byte, and field d spanning the next three bytes.

use bitfield::*;

#[bitfield]
pub struct MyFourBytes {
    a: B1,
    b: B3,
    c: B4,
    d: B24,
}
                               least significant bit of third byte
                                 ┊           most significant
                                 ┊             ┊
                                 ┊             ┊
║  first byte   ║  second byte  ║  third byte   ║  fourth byte  ║
╟───────────────╫───────────────╫───────────────╫───────────────╢
║▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒║
╟─╫─────╫───────╫───────────────────────────────────────────────╢
║a║  b  ║   c   ║                       d                       ║
                 ┊                                             ┊
                 ┊                                             ┊
               least significant bit of d         most significant

The code emitted by the #[bitfield] macro for this struct would be as follows. Note that the field getters and setters use whichever of u8, u16, u32, u64 is the smallest while being at least as large as the number of bits in the field.

impl MyFourBytes {
    // Initializes all fields to 0.
    pub fn new() -> Self;

    // Field getters and setters:
    pub fn get_a(&self) -> u8;
    pub fn set_a(&mut self, val: u8);
    pub fn get_b(&self) -> u8;
    pub fn set_b(&mut self, val: u8);
    pub fn get_c(&self) -> u8;
    pub fn set_c(&mut self, val: u8);
    pub fn get_d(&self) -> u32;
    pub fn set_d(&mut self, val: u32);
}
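
One way to fill in those signatures is sketched below. The private helpers `get_bits` and `set_bits`, the bit-at-a-time loops, and the choice to silently truncate oversized values are all assumptions made for clarity; a real implementation would likely be more efficient and might handle out-of-range values differently.

```rust
pub struct MyFourBytes {
    data: [u8; 4],
}

impl MyFourBytes {
    // Initializes all fields to 0.
    pub fn new() -> Self {
        MyFourBytes { data: [0; 4] }
    }

    // Reads `width` bits starting at absolute bit `offset`, where bit 0 is
    // the least significant bit of the first byte.
    fn get_bits(&self, offset: usize, width: usize) -> u64 {
        let mut value = 0u64;
        for i in 0..width {
            let bit = offset + i;
            let is_set = (self.data[bit / 8] >> (bit % 8)) & 1;
            value |= u64::from(is_set) << i;
        }
        value
    }

    // Writes the low `width` bits of `val` starting at bit `offset`.
    // Higher bits of `val` are ignored (an assumption of this sketch).
    fn set_bits(&mut self, offset: usize, width: usize, val: u64) {
        for i in 0..width {
            let bit = offset + i;
            let mask = 1u8 << (bit % 8);
            if (val >> i) & 1 == 1 {
                self.data[bit / 8] |= mask;
            } else {
                self.data[bit / 8] &= !mask;
            }
        }
    }

    pub fn get_a(&self) -> u8 { self.get_bits(0, 1) as u8 }
    pub fn set_a(&mut self, val: u8) { self.set_bits(0, 1, u64::from(val)) }
    pub fn get_b(&self) -> u8 { self.get_bits(1, 3) as u8 }
    pub fn set_b(&mut self, val: u8) { self.set_bits(1, 3, u64::from(val)) }
    pub fn get_c(&self) -> u8 { self.get_bits(4, 4) as u8 }
    pub fn set_c(&mut self, val: u8) { self.set_bits(4, 4, u64::from(val)) }
    pub fn get_d(&self) -> u32 { self.get_bits(8, 24) as u32 }
    pub fn set_d(&mut self, val: u32) { self.set_bits(8, 24, u64::from(val)) }
}
```

The offsets 0, 1, 4, and 8 are the running sums of the preceding field widths, which is the part the macro would need to compute from the caller's struct.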

This project covers:

  • traversing syntax trees;
  • processing helper attributes;
  • constructing output source code;
  • interacting with traits and structs other than from the standard library;
  • techniques for compile-time assertions that require type information, by leveraging the trait system in interesting ways from generated code;
  • tricky code.

Project skeleton is located under the bitfield directory.

Project recommendations

If this is your first time working with procedural macros, I would recommend starting with the derive(Builder) project. This will get you comfortable with traversing syntax trees and constructing output source code. These are the two fundamental components of a procedural macro.

After that, it would be equally reasonable to jump to any of derive(CustomDebug), seq!, or #[sorted].

  • Go for derive(CustomDebug) if you are interested in exploring how macros manipulate trait bounds, which is one of the most complicated aspects of code generation in Rust involving generic code like Serde. This project provides an approachable introduction to trait bounds and digs into many of the challenging aspects.

  • Go for seq! if you are interested in parsing a custom input syntax yourself. The other projects will all mostly rely on parsers that have already been written and distributed as a library, since their input is ordinary Rust syntax.

  • Go for #[sorted] if you are interested in generating diagnostics (custom errors) via a macro. Part of this project also covers a different way of processing input syntax trees; the other projects will do most things through if let. The visitor approach is better suited to certain types of macros involving statements or expressions, as we'll see here when checking that match arms are sorted.

I would recommend starting on #[bitfield] only after you feel you have a strong grasp on at least two of the other projects. Note that completing the full intended design will involve writing at least one macro of each of the three types of procedural macros, and substantially more code than the other projects.


Test harness

Testing macros thoroughly tends to be tricky. Rust and Cargo have a built-in testing framework via cargo test, which can work for testing the success cases. But we also really care that our macros produce good error messages when they detect a problem at compile time. Cargo isn't able to treat a failure to compile as a success, and isn't able to check that the error message produced by the compiler is exactly what we expect.

The project skeletons in this repository use an alternative test harness called trybuild.

The test harness is geared toward iterating on the implementation of a procedural macro, observing the errors emitted by failed executions of the macro, and testing that those errors are as expected.


Workflow

Every project has a test suite already written under its tests directory. (But feel free to add more tests, remove tests for functionality you don't want to implement, or modify tests as you see fit to align with your implementation.)

Run cargo test inside any of the 5 top-level project directories to run the test suite for that project.

Initially every project starts with all of its tests disabled. Open up the project's tests/progress.rs file and enable tests one at a time as you work through the implementation. The test files (for example tests/01-parse.rs) each contain a comment explaining what functionality is tested and giving some tips for how to implement it. I recommend working through tests in numbered order, each time enabling one more test and getting it passing before moving on.

Tests come in two flavors: tests that should compile+run successfully, and tests that should fail to compile with a specific error message.

If a test should compile and run successfully, but fails, the test runner will surface the compiler error or runtime error output.

For tests that should fail to compile, we compare the compilation output against a file of expected errors for that test. If those errors match, the test is considered to pass. If they do not match, the test runner will surface the expected and actual output.

Expected output goes in a file with the same name as the test except with an extension of *.stderr instead of *.rs.

If there is no *.stderr file for a test that is supposed to fail to compile, the test runner will save the compiler's output into a directory called wip adjacent to the tests directory. So the way to update the "expected" output is to delete the existing *.stderr file, run the tests again so that the output is written to wip, and then move the new output from wip to tests.


Debugging tips

To look at what code a macro is expanding into, install the cargo expand Cargo subcommand and then run cargo expand in the repository root (outside of any of the project directories) to expand the main.rs file in that directory. You can copy any of the test cases into this main.rs and tweak it as you iterate on the macro.

If a macro is emitting syntactically invalid code (not just code that fails type-checking) then cargo expand will not be able to show it. Instead have the macro print its generated TokenStream to stderr before returning the tokens.

eprintln!("TOKENS: {}", tokens);

Then a cargo check in the repository root (if you are iterating using main.rs) or cargo test in the corresponding project directory will display this output during macro expansion.

Stderr is also a helpful way to see the structure of the syntax tree that gets parsed from the input of the macro.

eprintln!("INPUT: {:#?}", syntax_tree);

Note that in order for Syn's syntax tree types to provide Debug impls, you will need to set features = ["extra-traits"] on the dependency on Syn. This is because adding hundreds of Debug impls adds an appreciable amount of compile time to Syn, and we really only need this enabled while doing development on a macro rather than when the finished macro is published to users.


License

Licensed under either of Apache License, Version 2.0 or MIT license at your option.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in this codebase by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.
