Compile-time reflection API for developing robust procedural macros (proof of concept)

I thought Rust doesn't have reflection...?

This crate explores what it could look like to tackle the 80% use case of custom derive macros through a programming model that resembles compile-time reflection.

Motivation

My existing syn and quote libraries approach the problem space of procedural macros in a super general way and are a good fit for maybe 95% of use cases. However, the generality comes with the cost of operating at a relatively low level of abstraction. The macro author is responsible for the placement of every single angle bracket, lifetime, type parameter, trait bound, and phantom data. There is a large amount of domain knowledge involved and very few people can reliably produce robust macros with this approach.

The design explored here focuses on what it would take to make all the edge cases disappear -- such that if your macro works for the most basic case, then it also works in every tricky case under the sun.

Programming model

The idea is that we expose what looks like a boring straightforward runtime reflection API such as you might recognize if you have used reflection in Java or reflection in Go.

The macro author expresses the logic of their macro in terms of this API, using types like reflect::Value to retrieve function arguments and access fields of data structures and invoke functions and so forth. Importantly, there is no such thing as a generic type or phantom data in this model. Everything is just a reflect::Value with a type that is conceptually its monomorphized type at runtime.

Meanwhile the library is tracking the control flow and function invocations to build up a fully general and robust procedural implementation of the author's macro. The resulting code will have all the angle brackets and lifetimes and bounds and phantom types in the right places without the macro author thinking about any of that.
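To make the idea of tracking invocations concrete, here is a minimal toy sketch of recording calls on opaque value handles and then printing them as straight-line source code. Everything in it (Value, Tracer, invoke, emit, and the function names double and add) is hypothetical and invented for illustration; it is not the reflect crate's API or implementation, and it ignores types, references, and control flow entirely.

```rust
// Toy model only: these types are hypothetical, not the reflect crate's API.

/// Opaque handle to a value that will exist in the generated code.
#[derive(Clone, Copy)]
struct Value(usize);

/// One recorded call: a function path plus the handles passed as arguments.
struct Invoke {
    path: &'static str,
    args: Vec<Value>,
}

/// Records calls while the macro author's logic runs.
struct Tracer {
    num_args: usize,
    calls: Vec<Invoke>,
}

impl Tracer {
    fn with_args(num_args: usize) -> Self {
        Tracer { num_args, calls: Vec::new() }
    }

    /// Handle for the index-th argument of the function being generated.
    fn arg(&self, index: usize) -> Value {
        assert!(index < self.num_args);
        Value(index)
    }

    /// Record a call and return a handle to its (future) result.
    fn invoke(&mut self, path: &'static str, args: &[Value]) -> Value {
        self.calls.push(Invoke { path, args: args.to_vec() });
        Value(self.num_args + self.calls.len() - 1)
    }

    /// Name of a handle in the emitted source.
    fn name(&self, value: Value) -> String {
        if value.0 < self.num_args {
            format!("_arg{}", value.0)
        } else {
            format!("_v{}", value.0 - self.num_args)
        }
    }

    /// Print the recorded calls as source text, ending with the result expression.
    fn emit(&self, result: Value) -> String {
        let mut out = String::new();
        for (i, call) in self.calls.iter().enumerate() {
            let args: Vec<String> = call.args.iter().map(|a| self.name(*a)).collect();
            out.push_str(&format!("let _v{} = {}({});\n", i, call.path, args.join(", ")));
        }
        out.push_str(&self.name(result));
        out.push('\n');
        out
    }
}

/// The "macro author" side: ordinary-looking code that only manipulates handles.
fn demo() -> String {
    let mut tracer = Tracer::with_args(1);
    let input = tracer.arg(0);
    let doubled = tracer.invoke("double", &[input]);
    let sum = tracer.invoke("add", &[doubled, input]);
    tracer.emit(sum)
}

fn main() {
    print!("{}", demo());
}
```

The author-facing code in demo never sees generated identifiers such as _v0; those are chosen by the recorder when it emits the source, which is the division of labor described above.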

The reflection API is just a means for defining a procedural macro. The library boils it all away and emits clean Rust source code free of any actual runtime reflection. Note that this is not a statement about compiler optimizations -- we are not relying on the Rust compiler to do heroic optimizations on shitty generated code. Literally the source code authored through the reflection API will be what a seasoned macro author would have produced simply using syn and quote.

From the perspective of the person who ends up calling the macro, everything about how it is called is the same as if the macro were written the old-fashioned way without reflection, and their code compiles exactly as fast and performs exactly as fast. The advantage is to the macro author, for whom developing and maintaining a robust macro is greatly simplified.

Demo

This project contains a proof of concept of a compile-time reflection API for defining custom derives.

The tests/debug/ directory demonstrates a working compilable implementation of #[derive(Debug)] for structs with named fields. The corresponding test case shows what code we emit when deriving Debug for a struct Point with two fields; it is equivalent to the code that a handwritten derive(Debug) macro without reflection would emit for the same data structure.

The macro implementation begins with a DSL declaration of the types and functions that will be required at runtime:

reflect::library! {
    extern crate std {
        mod fmt {
            type Formatter;
            type Result;
            type DebugStruct;

            trait Debug {
                fn fmt(&self, &mut Formatter) -> Result;
            }

            impl Formatter {
                fn debug_struct(&mut self, &str) -> DebugStruct;
            }

            impl DebugStruct {
                fn field(&mut self, &str, &Debug) -> &mut DebugStruct;
                fn finish(&mut self) -> Result;
            }
        }
    }
}

There may be additional extern crate blocks here if we need to use types from outside the standard library. For example Serde's #[derive(Serialize)] macro would want to list the serde crate, the Serialize and Serializer types, and whichever of their methods will possibly be invoked at runtime.

Throughout the rest of the macro implementation, all type information is statically inferred based on the signatures given in this library declaration.
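As a rough illustration of inferring types from declared signatures, the sketch below keeps a table of signatures and looks up the result type of a call after checking the argument types. It is a toy under stated assumptions: Signature, signatures, and infer_call are hypothetical names, types are modeled as plain strings, and nothing here reflects how the reflect crate represents signatures internally.

```rust
use std::collections::HashMap;

// Toy model only: hypothetical names, with types represented as strings.
struct Signature {
    params: Vec<&'static str>,
    ret: &'static str,
}

/// A few of the signatures from the library declaration, keyed by path.
fn signatures() -> HashMap<&'static str, Signature> {
    let mut table = HashMap::new();
    table.insert(
        "Formatter::debug_struct",
        Signature { params: vec!["&mut Formatter", "&str"], ret: "DebugStruct" },
    );
    table.insert(
        "DebugStruct::finish",
        Signature { params: vec!["&mut DebugStruct"], ret: "Result" },
    );
    table
}

/// Look up the result type of a call, rejecting unknown paths and
/// argument lists that do not match the declared parameter types.
fn infer_call(
    table: &HashMap<&'static str, Signature>,
    path: &str,
    arg_types: &[&str],
) -> Result<&'static str, String> {
    let sig = table
        .get(path)
        .ok_or_else(|| format!("no declared signature for {}", path))?;
    if sig.params.as_slice() == arg_types {
        Ok(sig.ret)
    } else {
        Err(format!("argument types do not match the signature of {}", path))
    }
}

fn main() {
    let table = signatures();
    let ty = infer_call(&table, "Formatter::debug_struct", &["&mut Formatter", "&str"]);
    println!("{:?}", ty);
}
```

Because every call goes through a declared signature, the type of each intermediate value is known without the macro author writing any annotations.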

Next, the macro entry point is an ordinary proc_macro_derive function just as it would be for a derive macro defined any other way.

Once again the reflection API is just a means for defining a procedural macro. Despite what it may look like below, everything written here executes at compile time. The reflect library spits out generated code in an output TokenStream that is compiled into the macro user's crate. This token stream contains no vestiges of runtime reflection.

use proc_macro::TokenStream;

// Macro that is called when someone writes derive(MyDebug) on a data structure.
// It returns a fragment of Rust source code (TokenStream) containing an
// implementation of Debug for the input data structure. The macro uses
// compile-time reflection internally, but the generated Debug impl is exactly
// as if this macro were handwritten without reflection.
#[proc_macro_derive(MyDebug)]
pub fn derive(input: TokenStream) -> TokenStream {
    // Feed the tokens describing the data structure into the reflection library
    // for parsing and analysis. We provide a callback that describes what trait
    // impl(s) the reflection library will need to generate code for.
    reflect::derive(input, |ex| {
        // Instruct the library to generate an impl of Debug for the derive
        // macro's target type / Self type.
        ex.make_trait_impl(RUNTIME::std::fmt::Debug, ex.target_type(), |block| {
            // Instruct the library to compile debug_fmt (a function shown
            // below) into the source code for the impl's Debug::fmt method.
            block.make_function(RUNTIME::std::fmt::Debug::fmt, debug_fmt);
        });
    })
}

The following looks like a function that does runtime reflection. It receives function arguments which have the type reflect::Value and can pass them around, pull out their fields, inspect attributes, invoke methods, and so forth.

use reflect::*;

// This function will get compiled into Debug::fmt, which has this signature:
//
//     fn fmt(&self, formatter: &mut fmt::Formatter) -> fmt::Result
//
fn debug_fmt(f: MakeFunction) -> Value {
    let receiver: reflect::Value = f.arg(0);  // this is `self`
    let formatter: reflect::Value = f.arg(1);

    // The input value may be any of unit struct, tuple struct, ordinary braced
    // struct, or enum.
    match receiver.data() {
        Data::Struct(receiver) => match receiver {
            Struct::Unit(receiver) => unimplemented!(),
            Struct::Tuple(receiver) => unimplemented!(),
            Struct::Struct(receiver) => {
                /* implemented below */
            }
        },
        // For an enum, the active variant of the enum may be any of unit
        // variant, tuple variant, or struct variant.
        Data::Enum(receiver) => receiver.match_variant(|variant| match variant {
            Variant::Unit(variant) => unimplemented!(),
            Variant::Tuple(variant) => unimplemented!(),
            Variant::Struct(variant) => unimplemented!(),
        }),
    }
}

In the case of a struct with named fields we use reflection to loop over fields of the struct and invoke methods of the standard library Formatter API to append each field value into the debug output.

Refer to the DebugStruct example code in the standard library API documentation for what this is supposed to do at runtime.
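For reference, this is roughly what that runtime behavior looks like when written by hand with the standard library's builder. The Point struct here is only an illustration; the calls to debug_struct, field, and finish are the standard library methods declared in the library! block above.

```rust
use std::fmt;

struct Point {
    x: i32,
    y: i32,
}

impl fmt::Debug for Point {
    fn fmt(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
        // Start a struct-style debug output, append each field, then finish.
        formatter
            .debug_struct("Point")
            .field("x", &self.x)
            .field("y", &self.y)
            .finish()
    }
}

fn main() {
    let point = Point { x: 1, y: 2 };
    println!("{:?}", point); // prints: Point { x: 1, y: 2 }
}
```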

Paths beginning with RUNTIME:: refer to library signatures declared by the library! { ... } snippet above.

let builder = RUNTIME::std::fmt::Formatter::debug_struct
    .INVOKE(formatter, type_name)
    .reference_mut();

for field in receiver.fields() {
    RUNTIME::std::fmt::DebugStruct::field.INVOKE(
        builder,
        field.get_name(),
        field.get_value(),
    );
}

RUNTIME::std::fmt::DebugStruct::finish.INVOKE(builder)

The reflection library is able to track how reflect::Value objects flow from one INVOKE to another, and contains a compiler that can compile this data flow into strongly typed Rust source code in a robust way. In the case of the Debug derive macro from this demo, when invoked on a braced struct with two fields,

#[derive(MyDebug)]
struct Point {
    x: i32,
    y: i32,
}

the reflection library would emit a trait impl that looks like this:

// expands to:
impl ::std::fmt::Debug for Point {
    fn fmt(&self, _arg1: &mut ::std::fmt::Formatter) -> ::std::fmt::Result {
        match *self {
            Point { x: ref _v0, y: ref _v1 } => {
                let mut _v2 = ::std::fmt::Formatter::debug_struct(_arg1, "Point");
                let _ = ::std::fmt::DebugStruct::field(&mut _v2, "x", _v0);
                let _ = ::std::fmt::DebugStruct::field(&mut _v2, "y", _v1);
                let _v3 = ::std::fmt::DebugStruct::finish(&mut _v2);
                _v3
            }
        }
    }
}

This generated code is what ends up running at runtime. Notice that there is no reflection. In fact this is pretty much identical to what the standard library's built-in derive(Debug) macro produces for the same data structure.
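One way to check that claim is to place the expansion shown above next to the built-in derive and compare their output. In the sketch below the module names derived and expanded and the helper outputs_match are invented for the comparison; the impl body is copied from the expansion above.

```rust
mod derived {
    // The standard library's built-in derive.
    #[derive(Debug)]
    pub struct Point {
        pub x: i32,
        pub y: i32,
    }
}

mod expanded {
    pub struct Point {
        pub x: i32,
        pub y: i32,
    }

    // Same impl as the expansion shown above.
    impl ::std::fmt::Debug for Point {
        fn fmt(&self, _arg1: &mut ::std::fmt::Formatter) -> ::std::fmt::Result {
            match *self {
                Point { x: ref _v0, y: ref _v1 } => {
                    let mut _v2 = ::std::fmt::Formatter::debug_struct(_arg1, "Point");
                    let _ = ::std::fmt::DebugStruct::field(&mut _v2, "x", _v0);
                    let _ = ::std::fmt::DebugStruct::field(&mut _v2, "y", _v1);
                    let _v3 = ::std::fmt::DebugStruct::finish(&mut _v2);
                    _v3
                }
            }
        }
    }
}

/// Compare both the compact and the pretty-printed debug output.
fn outputs_match() -> bool {
    let a = derived::Point { x: 3, y: -4 };
    let b = expanded::Point { x: 3, y: -4 };
    format!("{:?}", a) == format!("{:?}", b) && format!("{:#?}", a) == format!("{:#?}", b)
}

fn main() {
    assert!(outputs_match());
    println!("{:?}", expanded::Point { x: 3, y: -4 });
}
```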

Robustness and how things go wrong

As mentioned above, implementing robust macros using only syn and quote is quite challenging.

The example I like to use is taking a single struct field and temporarily wrapping it in a new struct. This is a real life use case drawn from how serde_derive handles serialize_with attributes. Conceptually:

let input: DeriveInput = syn::parse(...).unwrap();

// Pull out one of the field types.
let type_of_field_x: syn::Type = /* ... */;

quote! {
    // Very not robust.
    struct Wrapper<'a> {
        x: &'a #type_of_field_x,
    }

    Wrapper { x: &self.x }
}

Making the quote! part of this simply generate compilable code for all possible values of type_of_field_x is extremely involved. The macro author needs to consider and handle all of the following in order to make this work reliably:

  • Lifetime parameters used by type_of_field_x,
  • Type parameters used by type_of_field_x,
  • Associated types used by type_of_field_x,
  • Where-clauses on input that constrain any of the above,
  • Similarly, trait bounds on type parameters of input,
  • Where-clauses or bounds affecting any other fields of input,
  • Type parameter defaults on input that need to be stripped.
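To give a feel for a few of these items, here is a hand-written sketch of what a correct expansion has to spell out for one particular input. The Input type, the render and render_x methods, and the choice of Display as the bound are all assumptions made up for this example; it covers only a lifetime, a bounded type parameter, and a stripped default, not the full list above.

```rust
use std::fmt::Display;

// Hypothetical macro input: a lifetime, a bounded type parameter, and a default.
struct Input<'a, T: Display = String> {
    x: &'a [T],
}

// The wrapper is a separate item, so it cannot borrow the generics of Input.
// Every lifetime, type parameter, and bound used by the field type has to be
// declared again, plus a fresh lifetime for the temporary borrow.
struct Wrapper<'w, 'a, T: Display> {
    x: &'w &'a [T],
}

// The default `= String` has to be stripped here: type parameter defaults
// are not permitted on impl generics.
impl<'w, 'a, T: Display> Wrapper<'w, 'a, T> {
    fn render(&self) -> String {
        self.x
            .iter()
            .map(|item| item.to_string())
            .collect::<Vec<_>>()
            .join(",")
    }
}

impl<'a, T: Display> Input<'a, T> {
    fn render_x(&self) -> String {
        // Corresponds to `Wrapper { x: &self.x }` in the quote! snippet.
        Wrapper { x: &self.x }.render()
    }
}

fn main() {
    let data = [1, 2, 3];
    let input = Input { x: &data[..] };
    println!("{}", input.render_x()); // prints: 1,2,3
}
```

A macro built on syn and quote has to derive each of those generic lists from the input for every possible field type, which is where the edge cases come from.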

In contrast, the reflect library will be able to get it right every single time with much less thought from the macro author. Possibly as trivial as:

let wrapper: reflect::Type = reflect::new_struct_type();

wrapper.instantiate(vec![input.get_field("x").reference()])

Remaining work

In its current state the proof of concept generates just barely working code for our simple Debug derive. The reflect library needs more work to produce robust code in the presence of lifetimes and generic parameters, and for library signatures involving more complicated types.

Crucially all remaining work should happen without touching the code of our Debug derive. The promise of reflect is that if the macro works for the most basic cases (which the code above already does) then it also works in all the edge cases. From here it is reflect's responsibility to compile the dead simple reflection-like reflect::Value object manipulations into a fully general and robust procedural macro.


License

Licensed under either of Apache License, Version 2.0 or MIT license at your option.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in this crate by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.
