  • Stars: 156
  • Rank: 239,589 (Top 5 %)
  • Language: Rust
  • License: Other
  • Created over 3 years ago
  • Updated 7 months ago

Repository Details

Fantastic serialization library

Alkahest - Fantastic serialization library.

Alkahest is a blazing-fast, zero-deps, zero-overhead, zero-unsafe, schema-based serialization library. It is suitable for a broad range of use cases, but is tailored for custom high-performance network protocols.

Benchmarks

This benchmark mimics a game networking protocol.

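|           | alkahest            | bincode                    | rkyv                       | speedy                     |
|-----------|---------------------|----------------------------|----------------------------|----------------------------|
| serialize | 10.69 us (✅ 1.00x) | 11.08 us (✅ 1.04x slower) | 12.43 us (❌ 1.16x slower) | 11.13 us (✅ 1.04x slower) |
| read      | 1.19 us (✅ 1.00x)  | 9.19 us (❌ 7.74x slower)  | 2.10 us (❌ 1.77x slower)  | 1.54 us (❌ 1.30x slower)  |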

Made with criterion-table

See also benchmark results from https://github.com/djkoloski/rust_serialization_benchmark (in draft until 0.2 release).

Features

  • Schema-based serialization. Alkahest uses data schemas called Formulas to serialize and deserialize data. This lets the data layout be controlled independently of the data types that are serialized or deserialized.

  • Support for a wide variety of formulas. Integers, floats, booleans, tuples, arrays, slices, strings, and user-defined formulas with custom data layout, using a derive macro that works for structs and enums of any complexity and supports generics.

  • Zero-overhead serialization of sequences. Alkahest supports serializing iterators directly into slice formulas. No more allocating a Vec just to serialize it and drop it immediately.

  • Lazy deserialization. Alkahest provides the Lazy<F> type to deserialize any formula F lazily. A Lazy value can be used later to perform the actual deserialization.
    Lazy<[F]> can also produce an iterator that deserializes elements on demand.
    Laziness is controlled at the type level and can be applied to any element of a larger formula.

  • Infallible serialization. Given a large enough or growing buffer, any value that implements Serialize can be serialized without error. No more unnecessary unwraps or puzzling over "what to do if serialization fails?". The only error condition for serialization is "data doesn't fit".
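The sequence and "data doesn't fit" points above can be illustrated with a std-only sketch. It does not use alkahest; the helper name write_u32_iter and the little-endian, 4-bytes-per-element layout are assumptions made for this example only.

```rust
/// Writes every `u32` yielded by `iter` into `out` as little-endian bytes,
/// without collecting the items into a `Vec` first.
/// Returns `Ok(bytes_written)`, or `Err(bytes_required)` if `out` is too small.
/// Hypothetical helper for illustration; not part of alkahest.
fn write_u32_iter<I>(iter: I, out: &mut [u8]) -> Result<usize, usize>
where
    I: IntoIterator<Item = u32>,
{
    let mut required = 0usize;
    for value in iter {
        let end = required + 4;
        // Keep counting after the buffer is full so the caller learns the needed size.
        if let Some(slot) = out.get_mut(required..end) {
            slot.copy_from_slice(&value.to_le_bytes());
        }
        required = end;
    }
    if required <= out.len() {
        Ok(required)
    } else {
        Err(required)
    }
}
```

The only failure mode is a buffer that is too small, and the error carries the size that would have been enough.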

Planned features

  • Serializable formula descriptors
  • Compatibility rules
  • External tool for code-generation for formula descriptors for C and Rust.

How it works, in more detail

Alkahest separates the data schema definition (the Formula) from serialization and deserialization code. This lets the library provide better guarantees when the serialized data type and the deserialized data type differ. It also supports serializing from iterators instead of collections, and deserializing into lazy wrappers that defer the costly work and may skip it entirely if the value is never accessed. The user controls laziness at the type level by choosing appropriate Deserialize impls. For instance, deserializing into Vec<T> is eager, because the Vec<T> is constructed with all T instances and memory is allocated for them. In contrast, alkahest::SliceIter implements Iterator and deserializes elements in Iterator::next and other methods, and it provides constant-time random access to any element.
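To make the eager versus lazy distinction concrete, here is a minimal std-only sketch. LazyStr and eager_string are hypothetical names invented for this example and are not alkahest types; UTF-8 validation stands in for the "costly process" that a lazy wrapper defers.

```rust
use std::str::Utf8Error;

/// Remembers where the serialized bytes are and decodes only on demand.
/// Hypothetical stand-in for a lazy wrapper; not alkahest's `Lazy<F>`.
struct LazyStr<'de> {
    bytes: &'de [u8],
}

impl<'de> LazyStr<'de> {
    /// Cheap: no validation and no allocation happen here.
    fn new(bytes: &'de [u8]) -> Self {
        LazyStr { bytes }
    }

    /// The costly step (UTF-8 validation) runs only if the value is accessed.
    fn get(&self) -> Result<&'de str, Utf8Error> {
        std::str::from_utf8(self.bytes)
    }
}

/// Eager counterpart: validates and allocates a `String` immediately.
fn eager_string(bytes: &[u8]) -> Result<String, Utf8Error> {
    std::str::from_utf8(bytes).map(str::to_owned)
}
```

A caller that never calls get pays nothing beyond storing the slice.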

This flexibility comes at the cost of using only byte slices for serialization and deserialization, and of a larger footprint of serialized data than some other binary formats.

The question of support for dense data packing is open. It may be desirable to control it at the type level.

Errors and panics

The API is designed with the following principles: any value can be serialized successfully given a large enough buffer; data can't cause a panic, but an incorrect implementation of a trait can.
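One way to picture the first principle is a writer that reports the size it needs, paired with a wrapper that grows a Vec and retries. This is a std-only sketch under assumptions: write_str, write_str_growing and the 8-byte length prefix are made up for illustration and are not alkahest's functions or layout.

```rust
/// Writes `text` as an 8-byte little-endian length followed by its UTF-8 bytes.
/// Returns `Ok(bytes_written)`, or `Err(bytes_required)` if `out` is too small.
/// Hypothetical helper; not part of alkahest.
fn write_str(text: &str, out: &mut [u8]) -> Result<usize, usize> {
    let required = 8 + text.len();
    if out.len() < required {
        return Err(required);
    }
    out[..8].copy_from_slice(&(text.len() as u64).to_le_bytes());
    out[8..required].copy_from_slice(text.as_bytes());
    Ok(required)
}

/// Same as `write_str`, but grows the buffer when needed, so it cannot fail.
fn write_str_growing(text: &str, out: &mut Vec<u8>) -> usize {
    match write_str(text, out) {
        Ok(written) => written,
        Err(required) => {
            out.resize(required, 0);
            // The buffer now holds exactly `required` bytes, so the retry fits.
            write_str(text, out).unwrap_or(required)
        }
    }
}
```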

There is zero unsafe code in the library or in any code it generates. No UB is possible, provided that std is sound.

Forward and backward compatibility

No data schema stays the same. New fields and variants are added, others are deprecated and removed.

There is a set of rules that ensures forward compatibility between formulas, and another set of rules for backward compatibility.

Verification of compatibility is not implemented yet.

Forward compatibility

Forward compatibility is an ability to deserialize data that was serialized with newer formulas.

TODO: List all rules

Backward compatibility

Backward compatibility is an ability to deserialize data that was serialized with older formulas.

TODO: List all rules

Formula, Serialize and Deserialize traits.

The crate is built on three fundamental traits: Formula, Serialize and Deserialize. There is also a supporting trait, BareFormula.

Alkahest provides the proc-macro alkahest for deriving Formula, Serialize and Deserialize.

Formula

The Formula trait allows types to serve as data schemas. Any value serialized with a given formula should be deserializable with the same formula. Sharing only the Formula type lets modules and crates communicate easily. A Formula dictates the binary data layout, and that layout must be platform-independent.
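The idea of a schema type that is separate from the data types can be sketched with plain traits. The traits Schema, WriteAs and ReadAs and the schema U32Le below are hypothetical and much simpler than alkahest's Formula, Serialize and Deserialize; they only show how one schema can serve different value types on each side.

```rust
/// Marker for a data schema. Hypothetical; not alkahest's `Formula`.
trait Schema {}

/// A value that can be written following schema `S`.
trait WriteAs<S: Schema> {
    fn write(&self, out: &mut Vec<u8>);
}

/// A value that can be read from bytes laid out by schema `S`.
trait ReadAs<S: Schema>: Sized {
    fn read(input: &[u8]) -> Option<Self>;
}

/// Schema: a single `u32` in little-endian byte order.
struct U32Le;
impl Schema for U32Le {}

impl WriteAs<U32Le> for u32 {
    fn write(&self, out: &mut Vec<u8>) {
        out.extend_from_slice(&self.to_le_bytes());
    }
}

/// A narrower type can follow the same schema by widening first.
impl WriteAs<U32Le> for u8 {
    fn write(&self, out: &mut Vec<u8>) {
        out.extend_from_slice(&u32::from(*self).to_le_bytes());
    }
}

/// A wider type can be read from the same layout.
impl ReadAs<U32Le> for u64 {
    fn read(input: &[u8]) -> Option<Self> {
        let mut raw = [0u8; 4];
        raw.copy_from_slice(input.get(..4)?);
        Some(u64::from(u32::from_le_bytes(raw)))
    }
}
```

Here a u8 is written and a u64 is read back, and the two sides only need to agree on U32Le.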

Potentially Formula types can be generated from separate files, opening possibility for cross-language communication.

Formula is implemented for a number of types out of the box. Primitive types like bool, integers and floating-point types all implement Formula. Caveat: the serialized size of isize and usize is controlled by a feature flag. Sizes and addresses are serialized as usize, and a usize value that is too large for the configured size is truncated. This may produce broken data and a panic in debug builds. Formula is also implemented for tuples, arrays and slices, Option and Vec (the latter requires the "alloc" feature).
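The usize caveat comes from storing a platform-dependent type in a fixed number of bytes. The sketch below assumes a 4-byte width purely for illustration and uses a checked conversion instead of truncation; encode_size and decode_size are hypothetical helpers, not alkahest functions, and the width that alkahest's feature flag selects is not stated here.

```rust
use std::convert::TryFrom;

/// Encodes a size as 4 little-endian bytes.
/// Returns `None` instead of truncating when the value does not fit.
/// Hypothetical helper; the 4-byte width is an assumption for this sketch.
fn encode_size(value: usize) -> Option<[u8; 4]> {
    u32::try_from(value).ok().map(u32::to_le_bytes)
}

/// Decodes a size written by `encode_size`.
fn decode_size(bytes: [u8; 4]) -> Option<usize> {
    usize::try_from(u32::from_le_bytes(bytes)).ok()
}
```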

The easiest way to define a new formula is to derive the Formula trait for a struct or an enum. Generics are supported, but may require complex bounds specified in attributes for the Serialize and Deserialize derive macros. The only constraint is that all fields must implement Formula.

Serialize

The Serialize<Formula> trait is used to implement serialization according to a specific formula. Serialization writes to a mutable byte slice and should not perform dynamic allocations. The binary result of any type serialized with a formula must follow that formula. In the end, if the stream of serialized primitives is the same, the binary result should be the same. A type may be serializable with different formulas, producing different binary results.

Serialize is implemented for many types. Most notably, there are implementations T: Serialize<T> and &T: Serialize<T> for all primitives T (except usize and isize). Another important implementation is Serialize<F> for I where I: IntoIterator, I::Item: Serialize<F>, which allows serializing into a slice directly from both iterators and collections. Serialization with formula Ref<F> uses serialization with formula F and then stores the relative address and size. No dynamic allocation is required.
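The "store an address and a size" idea behind a reference formula can be sketched as follows. The layout is an assumption chosen for simplicity (payload first, then an 8-byte reference at the end of the buffer, with the address measured from the start of the buffer); it is not claimed to match alkahest's actual byte layout, and write_with_ref and read_via_ref are hypothetical names.

```rust
use std::convert::TryFrom;

/// Appends `payload` to `out`, followed by a fixed 8-byte reference to it:
/// the payload's start offset and its length, both as little-endian `u32`.
/// Returns `None` if the offset or the length does not fit in `u32`.
fn write_with_ref(payload: &[u8], out: &mut Vec<u8>) -> Option<()> {
    let address = u32::try_from(out.len()).ok()?;
    let size = u32::try_from(payload.len()).ok()?;
    out.extend_from_slice(payload);
    out.extend_from_slice(&address.to_le_bytes());
    out.extend_from_slice(&size.to_le_bytes());
    Some(())
}

/// Reads a little-endian `u32` from the first 4 bytes of `bytes`.
fn read_u32(bytes: &[u8]) -> Option<u32> {
    let mut raw = [0u8; 4];
    raw.copy_from_slice(bytes.get(..4)?);
    Some(u32::from_le_bytes(raw))
}

/// Follows the reference stored in the last 8 bytes of `input`
/// and returns the payload it points to, without copying.
fn read_via_ref(input: &[u8]) -> Option<&[u8]> {
    let split = input.len().checked_sub(8)?;
    let (body, reference) = input.split_at(split);
    let address = read_u32(&reference[..4])? as usize;
    let size = read_u32(&reference[4..])? as usize;
    body.get(address..address.checked_add(size)?)
}
```

The reference itself has a fixed size, so a variable-length payload can sit behind it without affecting the layout of its neighbours.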

Deriving Serialize for a type generates a Serialize implementation. The formula is specified in the attribute #[alkahest(FormulaRef)] or #[alkahest(serialize(FormulaRef))]. FormulaRef is typically a type. When generics are used, it also contains generic parameters and bounds. If the formula is not specified, Self is assumed, and Formula should be derived for the type as well. It is ill-advised to derive Serialize for formulas with a manual Formula implementation, because the Serialize derive macro generates code that uses non-public items generated by the Formula derive macro. So either both should be implemented manually or both should be derived.

For structures, the Serialize derive macro requires that all fields are present on both the Serialize and the Formula structure and have the same order (trivially so if this is the same structure).

For enums, the Serialize derive macro checks that for each variant there exists a variant on the Formula enum. Variant contents are compared in the same way as for structs. Serialization inserts the variant ID and serializes the variant as a struct. The size of variants may vary. Padding is inserted by the outer value's serialization if necessary.
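The "variant ID, then the variant's fields" scheme can be written out by hand in a small std-only sketch. The Command enum, the u32 width of the ID and the field encoding are assumptions for illustration, not the code or layout the derive macro actually produces.

```rust
#[derive(Debug, PartialEq)]
enum Command {
    Ping,
    Move { dx: i16, dy: i16 },
}

/// Writes the variant ID first, then the variant's fields in declaration order.
fn write_command(cmd: &Command, out: &mut Vec<u8>) {
    match *cmd {
        Command::Ping => out.extend_from_slice(&0u32.to_le_bytes()),
        Command::Move { dx, dy } => {
            out.extend_from_slice(&1u32.to_le_bytes());
            out.extend_from_slice(&dx.to_le_bytes());
            out.extend_from_slice(&dy.to_le_bytes());
        }
    }
}

/// Reads the variant ID and then decodes the matching variant's fields.
/// Unknown IDs and truncated input yield `None` rather than a panic.
fn read_command(input: &[u8]) -> Option<Command> {
    let mut id = [0u8; 4];
    id.copy_from_slice(input.get(..4)?);
    match u32::from_le_bytes(id) {
        0 => Some(Command::Ping),
        1 => {
            let fields = input.get(4..8)?;
            Some(Command::Move {
                dx: i16::from_le_bytes([fields[0], fields[1]]),
                dy: i16::from_le_bytes([fields[2], fields[3]]),
            })
        }
        _ => None,
    }
}
```

Note that Ping takes 4 bytes and Move takes 8, which is the "size of variants may vary" case mentioned above.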

Serialize can be derived for a structure whose Formula is an enum. In this case the variant should be specified using #[alkahest(@variant_ident)] or #[alkahest(serialize(@variant_ident))]. The Serialize derive macro will then produce serialization code that works as if this variant were a struct Formula, except that the variant's ID will be serialized before the fields.

Serialize can be derived for an enum only if the Formula is an enum as well. A serializable enum may omit some (or all) variants from the Formula. It may not have variants that are missing in the Formula. Each variant then follows the rules for structures.

For convenience Infallible implements Serialize for enum formulas.

Deserialize

The Deserialize<'de, Formula> trait is used to implement deserialization according to a specific formula. Deserialization reads from a byte slice and constructs the deserialized value. Deserialization should not perform dynamic allocations except those that are required to construct and initialize the deserialized value. E.g. it is allowed to allocate when Vec<T> is produced if a non-zero number of T values are deserialized. It should not over-allocate.

As with Serialize, alkahest provides a number of out-of-the-box implementations of the Deserialize trait. Types that implement From<T> can be deserialized with the primitive formula T.

Values that can be deserialized with formula F can also be deserialized with Ref<F>: deserialization reads the address and length and proceeds with formula F.

Vec<T> may be deserialized with a slice formula. Deserialize<'de, [F]> is implemented for the alkahest::SliceIter<'de, T> type, which implements Iterator and lazily deserializes elements of type T: Deserialize<'de, F>. SliceIter is cloneable, can be iterated from both ends, and skips elements in constant time. For convenience, SliceIter also deserializes with an array formula.
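The behaviour described for SliceIter (lazy decoding, iteration from both ends, constant-time skipping) can be mimicked for fixed-size elements in a few lines. U32Iter below is a hypothetical std-only stand-in, not alkahest's SliceIter, and it assumes elements are 4-byte little-endian u32 values stored back to back.

```rust
/// Iterates over little-endian `u32` values stored back to back in `bytes`,
/// decoding each element only when it is requested.
#[derive(Clone)]
struct U32Iter<'de> {
    bytes: &'de [u8], // length is always a multiple of 4
}

impl<'de> U32Iter<'de> {
    fn new(bytes: &'de [u8]) -> Option<Self> {
        if bytes.len() % 4 == 0 {
            Some(U32Iter { bytes })
        } else {
            None
        }
    }
}

/// Decodes one element from a 4-byte chunk.
fn decode(chunk: &[u8]) -> u32 {
    u32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]])
}

impl<'de> Iterator for U32Iter<'de> {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        self.nth(0)
    }

    /// Skips `n` elements in constant time by moving the start of the slice.
    fn nth(&mut self, n: usize) -> Option<u32> {
        let count = self.bytes.len() / 4;
        if n >= count {
            self.bytes = &[];
            return None;
        }
        let start = n * 4; // cannot overflow because n < count
        let value = decode(&self.bytes[start..start + 4]);
        self.bytes = &self.bytes[start + 4..];
        Some(value)
    }
}

impl<'de> DoubleEndedIterator for U32Iter<'de> {
    fn next_back(&mut self) -> Option<u32> {
        let split = self.bytes.len().checked_sub(4)?;
        let (rest, last) = self.bytes.split_at(split);
        self.bytes = rest;
        Some(decode(last))
    }
}
```

Because the iterator only holds a borrowed slice, cloning it is cheap and no element is decoded until it is asked for.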

Deriving Deserialize for a type generates a Deserialize implementation. The formula is specified in the attribute #[alkahest(FormulaRef)] or #[alkahest(deserialize(FormulaRef))]. FormulaRef is typically a type. When generics are used, it also contains generic parameters and bounds. If the formula is not specified, Self is assumed, and Formula should be derived for the type as well. It is ill-advised to derive Deserialize for formulas with a manual Formula implementation, because the Deserialize derive macro generates code that uses non-public items generated by the Formula derive macro. So either both should be implemented manually or both should be derived.

Interoperability with serde

Alkahest is cool, but serde is almost universally used, and for good reasons. While designing a Formula it may be desirable to include an existing type that supports serialization with serde, especially if it comes from another crate. This crate provides the Bincode and Bincoded<T> formulas to cover this. Anything with a serde::Serialize implementation can be serialized with the Bincode formula; naturally, it will be serialized using the bincode crate. Bincoded<T> is a restricted version of Bincode that works only for T.

Usage example

// This requires two default features - "alloc" and "derive".
#[cfg(all(feature = "derive", feature = "alloc"))]
fn main() {
  use alkahest::{alkahest, serialize_to_vec, deserialize};

  // Define simple formula. Make it self-serializable.
  #[derive(Clone, Debug, PartialEq, Eq)]
  #[alkahest(Formula, SerializeRef, Deserialize)]
  struct MyDataType {
    a: u32,
    b: Vec<u8>,
  }

  // Prepare data to serialize.
  let value = MyDataType {
    a: 1,
    b: vec![2, 3],
  };

  // Use infallible serialization to `Vec`.
  let mut data = Vec::new();

  // Note that this value can be serialized by reference.
  // This is enabled by the `SerializeRef` derive above.
  // Some types require ownership transfer for serialization.
  // A notable example is iterators.
  let (size, _) = serialize_to_vec::<MyDataType, _>(&value, &mut data);

  let de = deserialize::<MyDataType, MyDataType>(&data[..size]).unwrap();
  assert_eq!(de, value);
}

#[cfg(not(all(feature = "derive", feature = "alloc")))]
fn main() {}

Benchmarking

Alkahest comes with a benchmark to test against other popular serialization crates. Simply run cargo bench --all-features to see results.

License

Licensed under either of the Apache License, Version 2.0 or the MIT license, at your option.

Contributions

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.
