Repository Details

Heinous hackery to concatenate constant strings.

Const string concatenation

Rust has some great little magic built-in macros that you can use. A particularly helpful one for building up paths and other text at compile-time is concat!. It takes any number of comma-separated literals and yields their concatenation as a &'static str:

const HELLO_WORLD: &str = concat!("Hello", ", ", "world!");

assert_eq!(HELLO_WORLD, "Hello, world!");
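As a small aside, concat! is not limited to string literals. The sketch below (the constant names MIXED and LOCATION are made up for illustration) joins literals of several types and nests two other built-in macros, which the compiler expands before concat! sees them:

```rust
// Integer, char and bool literals are stringified and joined at compile time.
pub const MIXED: &str = concat!("test", 10, 'b', true);

// file!() and line!() expand to literals first, so they can be nested.
// The exact value depends on where this line lives in your source tree.
pub const LOCATION: &str = concat!(file!(), ":", line!());
```

MIXED is the string "test10btrue"; LOCATION has the shape "path/to/file.rs:NN".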

This is nice, but it falls apart pretty quickly. You can use concat! on the strings returned from magic macros like env! and include_str! but you can't use it on constants:

const GREETING: &str = "Hello";
const PLACE: &str = "world";
const HELLO_WORLD: &str = concat!(GREETING, ", ", PLACE, "!");

This produces the error:

error: expected a literal
 --> src/main.rs:3:35
  |
3 | const HELLO_WORLD: &str = concat!(GREETING, ", ", PLACE, "!");
  |                                   ^^^^^^^^

error: expected a literal
 --> src/main.rs:3:51
  |
3 | const HELLO_WORLD: &str = concat!(GREETING, ", ", PLACE, "!");
  |                                                   ^^^^^

Well with const_concat! you can! It works just like the concat! macro:

#[macro_use]
extern crate const_concat;

const GREETING: &str = "Hello";
const PLACE: &str = "world";
const HELLO_WORLD: &str = const_concat!(GREETING, ", ", PLACE, "!");

assert_eq!(HELLO_WORLD, "Hello, world!");

All this, and it's implemented entirely without hooking into the compiler. So how does it work? Through dark, evil magicks.

Firstly, why can't this just work the same as runtime string concatenation? Well, runtime string concatenation allocates a new String, but allocation isn't possible at compile-time - we have to do everything on the stack. Also, at the time of writing we can't do iteration at compile-time, so there's no way to copy the characters from the source strings to the destination string.

Let's look at the implementation. The "workhorse" of this macro is the concat function:

pub const unsafe fn concat<First, Second, Out>(a: &[u8], b: &[u8]) -> Out
where
    First: Copy,
    Second: Copy,
    Out: Copy,
{
    #[repr(C)]
    #[derive(Copy, Clone)]
    struct Both<A, B>(A, B);

    let arr: Both<First, Second> =
        Both(*transmute::<_, &First>(a), *transmute::<_, &Second>(b));

    transmute(arr)
}

So what we do is convert both the (arbitrarily-sized) input arrays to pointers to constant-size arrays (well, actually to pointer-to-First and pointer-to-Second, but the intent is that First and Second are fixed-size arrays). Then, we dereference them. This is wildly unsafe - there's nothing saying that a.len() is the same as the length of the First type parameter.

We put them next to one another in a #[repr(C)] tuple struct - this essentially concatenates them together in memory. Finally, we transmute it to the Out type parameter. If First is [u8; N0] and Second is [u8; N1] then Out should be [u8; N0 + N1].

Why not just use a trait with associated constants? Well, here's an example of what that would look like:

trait ConcatHack {
    const A_LEN: usize;
    const B_LEN: usize;
}

pub const unsafe fn concat<C>(
    a: &[u8],
    b: &[u8],
) -> [u8; C::A_LEN + C::B_LEN]
where
    C: ConcatHack,
{
    #[repr(C)]
    #[derive(Copy, Clone)]
    struct Both<A, B>(A, B);

    let arr: Both<[u8; C::A_LEN], [u8; C::B_LEN]> =
        Both(*transmute::<_, &[u8; C::A_LEN]>(a), *transmute::<_, &[u8; C::B_LEN]>(b));

    transmute(arr)
}

This doesn't work though, because an array length expression can't refer to a type parameter or its associated constants, so a type like [u8; C::A_LEN] is rejected. So instead we use an individual type parameter for each constant-size array.
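For comparison, here is a minimal sketch of the same "one parameter per array" idea on a newer stable compiler than this crate targets, assuming const generics and while loops in const fn are available. The function concat_bytes and the constant HELLO_WORLD_BYTES are hypothetical names and are not part of const_concat. The output length N is still a separate parameter because, as far as I know, stable Rust does not accept [u8; A + B] as a return type either:

```rust
/// Copies `a` followed by `b` into a new `[u8; N]`.
/// The caller supplies N; a mismatch is a compile error in const context.
pub const fn concat_bytes<const A: usize, const B: usize, const N: usize>(
    a: &[u8; A],
    b: &[u8; B],
) -> [u8; N] {
    assert!(A + B == N, "output length must equal the sum of the input lengths");

    let mut out = [0u8; N];

    // `for` loops are not usable in const fn, so index manually.
    let mut i = 0;
    while i < A {
        out[i] = a[i];
        i += 1;
    }

    let mut j = 0;
    while j < B {
        out[A + j] = b[j];
        j += 1;
    }

    out
}

// A and B are inferred from the arguments, N from the annotated type.
pub const HELLO_WORLD_BYTES: [u8; 12] = concat_bytes(b"Hello, ", b"world");
```

This needs no unsafe code, at the cost of requiring a compiler that postdates the nightly features listed in this README.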

Wait, though, if you look at the documentation for std::mem::transmute at the time of writing it's not a const fn. What's going on here then? Well, I wrote my own transmute:

#[allow(unions_with_drop_fields)]
pub const unsafe fn transmute<From, To>(from: From) -> To {
    union Transmute<From, To> {
        from: From,
        to: To,
    }

    Transmute { from }.to
}

This is allowed in a const fn where std::mem::transmute is not. Finally, let's look at the macro itself:

#[macro_export]
macro_rules! const_concat {
    ($a:expr, $b:expr) => {{
        let bytes: &'static [u8] = unsafe {
            &$crate::concat::<
                [u8; $a.len()],
                [u8; $b.len()],
                [u8; $a.len() + $b.len()],
            >($a.as_bytes(), $b.as_bytes())
        };

        unsafe { $crate::transmute::<_, &'static str>(bytes) }
    }};
    ($a:expr, $($rest:expr),*) => {{
        const TAIL: &str = const_concat!($($rest),*);
        const_concat!($a, TAIL)
    }};
}

So first we create a &'static [u8] and then we transmute it to &'static str. This works for now because &[u8] and &str have the same layout, but it's not guaranteed to work forever. The cast to &'static [u8] works even though the right-hand side of that assignment is local to this scope because of something called "rvalue static promotion".
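Both points can be illustrated in isolation. The sketch below uses made-up names (promoted, HELLO) and assumes a compiler recent enough that std::str::from_utf8 is usable in constants, which is not the case for the nightly this crate was written against. It shows rvalue static promotion on its own, and a checked alternative to transmuting bytes into a &str:

```rust
/// The array literal is a constant expression, so the compiler promotes it
/// to static storage and the borrow can live for `'static`.
pub fn promoted() -> &'static [u8; 5] {
    &[b'H', b'e', b'l', b'l', b'o']
}

/// Checked bytes-to-str conversion, evaluated at compile time.
/// Invalid UTF-8 would turn into a compile error via the panic.
pub const HELLO: &str = match std::str::from_utf8(b"Hello") {
    Ok(s) => s,
    Err(_) => panic!("input was not valid UTF-8"),
};
```

The checked conversion does not depend on &[u8] and &str sharing a layout.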

The eagle-eyed among you may have also noticed that &[u8; N] and &[u8] have different sizes, since the latter is a fat pointer. Well my constant transmute doesn't check size (union fields can have different sizes) and for now the layout of both of these types puts the pointer first. There's no way to fix that on the current version of the compiler, since &slice[..] isn't implemented for constant expressions.
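A rough sketch of both observations, written for a current stable compiler rather than the nightly this crate targets. The names THIN, FAT, union_transmute and ONE are invented for the example, and the Copy bounds are an assumption added so that the union is accepted without the unions_with_drop_fields allowance:

```rust
use std::mem::size_of;

// A reference to a fixed-size array is a single pointer...
pub const THIN: usize = size_of::<&[u8; 4]>();
// ...while a slice reference also carries the length.
pub const FAT: usize = size_of::<&[u8]>();

/// Union-based transmute. Unlike `std::mem::transmute`, nothing here checks
/// that `Src` and `Dst` have the same size, so misuse is undefined behaviour.
pub const unsafe fn union_transmute<Src: Copy, Dst: Copy>(src: Src) -> Dst {
    #[repr(C)]
    union Transmute<Src: Copy, Dst: Copy> {
        src: Src,
        dst: Dst,
    }

    unsafe { Transmute { src }.dst }
}

// 0x3f80_0000 is the IEEE 754 bit pattern of 1.0f32.
pub const ONE: f32 = unsafe { union_transmute::<u32, f32>(0x3f80_0000) };
```

On typical targets FAT is twice THIN, which is exactly the size mismatch the union quietly permits.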

const_concat! currently doesn't work in trait associated constants. I do have a way to support trait associated constants but again, you can't access type parameters in array lengths so that unfortunately doesn't work. Finally, it requires quite a few nightly features:

#![feature(const_fn, const_str_as_bytes, const_str_len, const_let, untagged_unions)]

UPDATE

I fixed the issue where the transmute relies on the pointer in &[u8] being first by instead transmuting a pointer to the first element of the array. The code now looks like so:

pub const unsafe fn concat<First, Second, Out>(a: &[u8], b: &[u8]) -> Out
where
    First: Copy,
    Second: Copy,
    Out: Copy,
{
    #[repr(C)]
    #[derive(Copy, Clone)]
    struct Both<A, B>(A, B);

    let arr: Both<First, Second> = Both(
        *transmute::<_, *const First>(a.as_ptr()),
        *transmute::<_, *const Second>(b.as_ptr()),
    );

    transmute(arr)
}
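Both versions of concat lean on the same layout fact: two byte arrays placed side by side in a #[repr(C)] struct occupy exactly the bytes of the concatenated array. A minimal way to check that on a current stable compiler, assuming std::mem::transmute is usable in constants there (PAIR_SIZE and JOINED are illustrative names):

```rust
use std::mem::{size_of, transmute};

#[repr(C)]
#[derive(Copy, Clone)]
pub struct Both<A, B>(pub A, pub B);

// Byte arrays have alignment 1, so no padding is inserted: 5 + 7 = 12.
pub const PAIR_SIZE: usize = size_of::<Both<[u8; 5], [u8; 7]>>();

// Reinterpret the pair as one array; the compiler checks the sizes match.
pub const JOINED: [u8; 12] = unsafe { transmute(Both(*b"Hello", *b", world")) };
```

Here the concrete types are known, so the size check that the union-based transmute skips is performed by the compiler.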
