pb-jelly
pb-jelly is a protobuf code generation framework for the Rust language developed at Dropbox.
History
This implementation was initially written in 2016 to satisfy the need to shuffle large amounts of bytes in Dropbox's Storage System (Magic Pocket).
Previously, we were using rust-protobuf (and therefore the generated APIs are exactly the same, to make migration easy). However, serializing Rust structs to proto messages and then serializing them again in our RPC layer meant multiple copies (and the same thing in reverse on the parsing stack). Taking control of this implementation and integrating it into our RPC stack end-to-end helped avoid these extra copies.
Over the years, the implementation has grown and matured and is currently used in several parts of Dropbox, including our Sync Engine, and the aforementioned Magic Pocket.
Other implementations exist in the Rust ecosystem (e.g. prost and rust-protobuf); we wanted to share ours as well.
Features
- Functional "Rust-minded" proto extensions, e.g. [(rust.box_it)=true]
- Scalable - Generates separate crates per module, with option for crate-per-directory
- Autogenerates Cargo.toml, or optionally Spec.toml / bazel BUILD files
- Support for Serde
- Zero-copy deserialization with Bytes via a proto extension [(rust.zero_copy)=true]
- Automatically boxes messages if it finds a recursive message definition
- Retains comments on proto fields
- Supports proto2 and proto3
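To illustrate why a recursive message definition needs boxing, here is a minimal hand-written sketch. It is not pb-jelly output: the `TreeNode` type, its fields, and the `depth` helper are hypothetical names chosen for this example, and the real generated code may look different.

```rust
// Hypothetical hand-written type, shaped like a recursive proto message such as
//   message TreeNode { int32 value = 1; TreeNode child = 2; }
// This is an illustration only, not code emitted by pb-jelly.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct TreeNode {
    pub value: i32,
    // Without the `Box`, `TreeNode` would contain itself by value and have
    // infinite size, which rustc rejects (error E0072).
    pub child: Option<Box<TreeNode>>,
}

/// Counts the nodes in the chain by following the boxed `child` pointers.
pub fn depth(node: &TreeNode) -> usize {
    let mut count = 1;
    let mut current = node;
    while let Some(next) = current.child.as_deref() {
        count += 1;
        current = next;
    }
    count
}
```

The `Box` adds one heap allocation per nested message, which is the price of giving the struct a finite size.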
Extensions
Extension | Description | Type | Example |
---|---|---|---|
(rust.zero_copy)=true | Generates field type of Lazy<bytes::Bytes> for proto bytes fields to support zero-copy deserialization | Field | zero_copy |
(rust.box_it)=true | Generates a Box<Message> field type | Field | box_it |
(rust.type)="type" | Generates a custom field type | Field | custom_type |
(rust.preserve_unrecognized)=true | Preserves unrecognized proto fields into an _unrecognized struct field | Field | TODO |
(rust.nullable_field)=false | Generates non-nullable field types | Field | TODO |
(rust.nullable)=false | Generates oneofs as non-nullable (fail on deserialization) | Oneof | non_optional |
(rust.err_if_default_or_unknown)=true | Generates enums as non-zeroable (fail on deserialization) | Enum | non_optional |
(rust.closed_enum)=true | Generates only a "closed" enum which will fail deserialization for unknown values, but is easier to work with in Rust | Enum | TODO |
(rust.serde_derive)=true | Generates serde serializable/deserializable messages | File | serde |
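The trade-off behind a "closed" enum can be sketched in plain Rust. The block below is hand-written for illustration and is not what pb-jelly generates: the `Color` enum, its variants, and the `name` helper are hypothetical, and using `TryFrom<i32>` for the conversion is an assumption made for this example.

```rust
use std::convert::TryFrom;

// Hypothetical closed enum, hand-written for illustration. It is not the
// code pb-jelly generates; it only shows the trade-off described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Color {
    Red = 0,
    Green = 1,
    Blue = 2,
}

impl TryFrom<i32> for Color {
    type Error = i32;

    // Unknown wire values are rejected, which is what makes the enum "closed".
    fn try_from(raw: i32) -> Result<Self, Self::Error> {
        match raw {
            0 => Ok(Color::Red),
            1 => Ok(Color::Green),
            2 => Ok(Color::Blue),
            other => Err(other),
        }
    }
}

// Because every value is a known variant, callers can match exhaustively
// with no catch-all arm.
pub fn name(color: Color) -> &'static str {
    match color {
        Color::Red => "red",
        Color::Green => "green",
        Color::Blue => "blue",
    }
}
```

The cost is that a message carrying a value added by a newer schema fails to decode, so this style suits enums whose producers and consumers are updated together.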
Using pb-jelly in your project
Multiple crates, multiple languages, my oh my!
Essential Crates
There are only two crates you'll need if you want to use this with your project: pb-jelly and pb-jelly-gen.
pb-jelly
Contains all of the important traits and structs that power our generated code, e.g. Message and Lazy. Include this as a dependency, e.g.
[dependencies]
pb-jelly = "0.0.12"
pb-jelly-gen
A framework for generating Rust structs and implementations for proto2 and proto3 files.
In order to use pb-jelly, you need to add pb-jelly-gen/codegen/codegen.py as a plugin to your protoc invocation.
We added some code here to handle the protoc invocation if you choose to use it.
You'll need to add a generation crate (see examples_gen for an example).
Include pb-jelly-gen as a dependency of your generation crate, and cargo run to invoke protoc for you.
[dependencies]
pb-jelly-gen = "0.0.12"
Eventually, we hope to eliminate the need for a generation crate, and simply have generation occur inside a build.rs with pb-jelly-gen as a build dependency. However, rust-lang/cargo#8709 must be resolved first.
Note that you can always invoke protoc on your own (for example, if you are already doing so to generate for multiple languages) with --rust_out=codegen.py as a plugin for Rust.
Generating Rust Code
- Install
  - protoc - The protobuf compiler, which can be downloaded or built from source (protobuf) or installed (mac) via brew install protobuf.
  - python3 - The codegen plugin used with protoc is written in Python3.
To generate with pb-jelly-gen
- Create an inner (build-step) crate which depends on pb-jelly-gen. Example
- cargo run in the directory of the inner generation crate
To generate manually with protoc
- Create venv [optional]: python3 -m venv .pb_jelly_venv ; source .pb_jelly_venv/bin/activate
- [Recommended] python3 -m pip install protobuf==[same_version_as_your protoc]
- Install: python3 -m pip install -e pb-jelly-gen/codegen (installs protoc-gen-rust into the venv)
- protoc --rust_out=generated/ input.proto
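If you want to drive the manual protoc step from Rust instead of a shell, the final command above could be assembled roughly as follows. This is a sketch using only the standard library, not part of pb-jelly: the function names `protoc_command` and `generate` are hypothetical, and it assumes both protoc and the protoc-gen-rust plugin are already on your PATH.

```rust
use std::path::Path;
use std::process::Command;

/// Builds (but does not run) a `protoc` invocation equivalent to
/// `protoc --rust_out=<out_dir> <protos...>`.
/// Assumes `protoc` and the `protoc-gen-rust` plugin are both on `PATH`.
pub fn protoc_command(out_dir: &Path, protos: &[&Path]) -> Command {
    let mut cmd = Command::new("protoc");
    cmd.arg(format!("--rust_out={}", out_dir.display()));
    cmd.args(protos);
    cmd
}

/// Runs the command and reports whether `protoc` exited successfully.
/// Returns an I/O error if `protoc` could not be started at all.
pub fn generate(out_dir: &Path, protos: &[&Path]) -> std::io::Result<bool> {
    let status = protoc_command(out_dir, protos).status()?;
    Ok(status.success())
}
```

Keeping command construction separate from execution makes it possible to inspect or log the arguments without protoc being installed.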
Example
Take a look at the examples crate to see how we leverage pb-jelly-gen and build.rs to get started using protobufs in Rust!
Non-essential Crates
- pb-test contains integration tests and benchmarks. You don't need to worry about this one unless you want to contribute to this repository!
- examples contains some examples to help you get started
A Note On Scalability
We mention "scalability" as a feature. What does that mean? We take an opinionated stance that every module should be a crate, as opposed to generating Rust files 1:1 with proto files. We take this stance because rustc is parallel across crates, but not yet totally parallel within a crate. When we had all of our generated Rust code in a single crate, it was often that single crate that took the longest to compile. The solution to these long compile times was creating many crates!
The Name
pb-jelly is a shoutout to the jellyfish, known for its highly efficient locomotion. This library is capable of highly efficient locomotion of deserialized data. It is also a shoutout to the ability of the jellyfish to have substantial increases in population. This library handles generating a very large number of proto modules with complex dependencies by generating to multiple crates.
We also like the popular sandwich.
Contributing
First, contributions are greatly appreciated and highly encouraged. For legal reasons all outside contributors must agree to Dropbox's CLA. Thank you for your understanding.
Upcoming
Some of the features here require additional tooling to be useful, and that tooling is not yet public.
- Spec.toml is a stripped down templated Cargo.toml, which you can convert into Cargo.toml with a script in order to get consistent dependency versions in a multi-crate project. Currently, the script to convert Spec.toml -> Cargo.toml isn't yet available.
- Autogenerated BUILD files require additional tooling to convert BUILD.in-gen-proto~ to a BUILD file
Closed structs with public fields
- Adding fields to a proto file will lead to compiler errors. This can be a benefit in that it allows the compiler to identify all callsites that may need to be visited. However, it can make updating protos with many callsites a bit tedious. We opted to go this route to make it easier to add a new field and update all callsites with assistance from the compiler.
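The effect of closed structs with public fields can be sketched in plain Rust. The `UserProfile` struct and both constructor functions below are hypothetical stand-ins written for this example, not generated code, and deriving `Default` on the struct is an assumption made here.

```rust
// Hypothetical stand-in for a generated message struct with public fields.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct UserProfile {
    pub id: u64,
    pub name: String,
    // Imagine this field was just added to the proto file.
    pub email: String,
}

/// Exhaustive construction: every field is named, so adding a field to
/// `UserProfile` makes this function fail to compile (E0063, missing field)
/// until the callsite is updated.
pub fn exhaustive(id: u64, name: &str, email: &str) -> UserProfile {
    UserProfile {
        id,
        name: name.to_string(),
        email: email.to_string(),
    }
}

/// Tolerant construction: unnamed fields fall back to their defaults, so
/// this keeps compiling when new fields appear.
pub fn with_defaults(id: u64, name: &str) -> UserProfile {
    UserProfile {
        id,
        name: name.to_string(),
        ..Default::default()
    }
}
```

Callsites that should be revisited whenever the schema changes can use the exhaustive form, while callsites that only care about a few fields can opt out with the struct update syntax.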
Service Generation
- Generating stubs for gRPC clients and servers
Running the pbtest unit tests
- Clone Repo.
- Install Dependencies / Testing Dependencies. Use the appropriate package manager for your system.
  - protoc - part of Google's protobuf tools
    - macos: brew install protobuf
    - Linux (Fedora/CentOS/RHEL): dnf install protobuf protobuf-devel
  - Install Python
    - [if necessary] macos: brew install python3
- pb-jelly currently uses an experimental test framework that requires a nightly build of rust.
  - rustup default nightly
- cd pb-test
- ( cd pb_test_gen ; cargo run ) ; cargo test
Contributors
Dropboxers [incl former]
- @nipunn1313
- @rajatgoel
- @ParkMyCar
- @rbtying
- @goffrie
- @euroelessar
- @bradenaw
- @glaysche2
- @jiayixu
- @dyv
- @joshuawarner32
- @peterlvilim
- @ddeville
- @isho
- @benjaminp
- @grahamking
Non-Dropbox
Similar Projects
- rust-protobuf - Rust implementation of Google protocol buffers
- prost - PROST! a Protocol Buffers implementation for the Rust Language
- quick-protobuf - A rust implementation of protobuf parser
- serde-protobuf