  • Stars
    5,106
  • Rank 8,139 (Top 0.2 %)
  • Language
    Rust
  • License
    Apache License 2.0
  • Created almost 5 years ago
  • Updated about 1 year ago

Repository Details

Safe interop between Rust and C++

CXX: safe FFI between Rust and C++

This library provides a safe mechanism for calling C++ code from Rust and Rust code from C++, not subject to the many ways that things can go wrong when using bindgen or cbindgen to generate unsafe C-style bindings.

This doesn't change the fact that 100% of C++ code is unsafe. When auditing a project, you would be on the hook for auditing all the unsafe Rust code and all the C++ code. The core safety claim under this new model is that auditing just the C++ side would be sufficient to catch all problems, i.e. the Rust side can be 100% safe.

[dependencies]
cxx = "1.0"

[build-dependencies]
cxx-build = "1.0"

Compiler support: requires rustc 1.63+ and C++11 or newer


Guide

Please see https://cxx.rs for a tutorial, reference material, and example code.


Overview

The idea is that we define the signatures of both sides of our FFI boundary embedded together in one Rust module (the next section shows an example). From this, CXX receives a complete picture of the boundary to perform static analyses against the types and function signatures to uphold both Rust's and C++'s invariants and requirements.

If everything checks out statically, then CXX uses a pair of code generators to emit the relevant extern "C" signatures on both sides together with any necessary static assertions for later in the build process to verify correctness. On the Rust side this code generator is simply an attribute procedural macro. On the C++ side it can be a small Cargo build script if your build is managed by Cargo, or for other build systems like Bazel or Buck we provide a command line tool which generates the header and source file and should be easy to integrate.

The resulting FFI bridge operates at zero or negligible overhead, i.e. no copying, no serialization, no memory allocation, no runtime checks needed.

The FFI signatures are able to use native types from whichever side they please, such as Rust's String or C++'s std::string, Rust's Box or C++'s std::unique_ptr, Rust's Vec or C++'s std::vector, etc in any combination. CXX guarantees an ABI-compatible signature that both sides understand, based on builtin bindings for key standard library types to expose an idiomatic API on those types to the other language. For example when manipulating a C++ string from Rust, its len() method becomes a call of the size() member function defined by C++; when manipulating a Rust string from C++, its size() member function calls Rust's len().


Example

In this example we are writing a Rust application that wishes to take advantage of an existing C++ client for a large-file blobstore service. The blobstore supports a put operation for a discontiguous buffer upload. For example we might be uploading snapshots of a circular buffer which would tend to consist of 2 chunks, or fragments of a file spread across memory for some other reason.

A runnable version of this example is provided under the demo directory of this repo. To try it out, run cargo run from that directory.

#[cxx::bridge]
mod ffi {
    // Any shared structs, whose fields will be visible to both languages.
    struct BlobMetadata {
        size: usize,
        tags: Vec<String>,
    }

    extern "Rust" {
        // Zero or more opaque types which both languages can pass around but
        // only Rust can see the fields.
        type MultiBuf;

        // Functions implemented in Rust.
        fn next_chunk(buf: &mut MultiBuf) -> &[u8];
    }

    unsafe extern "C++" {
        // One or more headers with the matching C++ declarations. Our code
        // generators don't read it but it gets #include'd and used in static
        // assertions to ensure our picture of the FFI boundary is accurate.
        include!("demo/include/blobstore.h");

        // Zero or more opaque types which both languages can pass around but
        // only C++ can see the fields.
        type BlobstoreClient;

        // Functions implemented in C++.
        fn new_blobstore_client() -> UniquePtr<BlobstoreClient>;
        fn put(&self, parts: &mut MultiBuf) -> u64;
        fn tag(&self, blobid: u64, tag: &str);
        fn metadata(&self, blobid: u64) -> BlobMetadata;
    }
}

Now we simply provide Rust definitions of all the things in the extern "Rust" block and C++ definitions of all the things in the extern "C++" block, and get to call back and forth safely.
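
As a rough illustration of the Rust half, the items named in the extern "Rust" block could be defined in the parent module along the following lines. This is a sketch only: the field names chunks and pos and the sample data are assumptions made here, and the demo directory holds the actual definitions.

```rust
// Hypothetical Rust-side definitions for the extern "Rust" block above.
// The field names `chunks` and `pos` are assumptions for this sketch.
pub struct MultiBuf {
    chunks: Vec<Vec<u8>>,
    pos: usize,
}

// Matches the bridge signature `fn next_chunk(buf: &mut MultiBuf) -> &[u8]`.
// Returns a borrowed view of the next chunk, or an empty slice once every
// chunk has been handed out. No bytes are copied.
pub fn next_chunk(buf: &mut MultiBuf) -> &[u8] {
    let next = buf.chunks.get(buf.pos);
    buf.pos += 1;
    next.map_or(&[], Vec::as_slice)
}

fn main() {
    let mut buf = MultiBuf {
        chunks: vec![b"fearless".to_vec(), b"concurrency".to_vec()],
        pos: 0,
    };

    // Drain the buffer the way a caller on the other side might.
    let mut total = 0;
    loop {
        let chunk = next_chunk(&mut buf);
        if chunk.is_empty() {
            break;
        }
        total += chunk.len();
    }
    println!("consumed {} bytes", total);
}
```

Because next_chunk hands back a borrowed slice, the caller reads the bytes in place, which is the kind of no-copy crossing the Overview describes.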

The complete set of source files involved in the demo is in the demo directory of this repo.

To look at the code generated in both languages for the example by the CXX code generators:

# run Rust code generator and print to stdout
# (requires https://github.com/dtolnay/cargo-expand)
$ cargo expand --manifest-path demo/Cargo.toml

# run C++ code generator and print to stdout
$ cargo run --manifest-path gen/cmd/Cargo.toml -- demo/src/main.rs

Details

As seen in the example, the language of the FFI boundary involves 3 kinds of items:

  • Shared structs: their fields are made visible to both languages. The definition written within cxx::bridge is the single source of truth.

  • Opaque types: their fields are secret from the other language. These cannot be passed across the FFI by value but only behind an indirection, such as a reference &, a Rust Box, or a UniquePtr. An opaque type can be a type alias for an arbitrarily complicated generic language-specific type, depending on your use case.

  • Functions: implemented in either language, callable from the other language.

Within the extern "Rust" part of the CXX bridge we list the types and functions for which Rust is the source of truth. These all implicitly refer to the super module, the parent module of the CXX bridge. You can think of the two items listed in the example above as being like use super::MultiBuf and use super::next_chunk except re-exported to C++. The parent module will either contain the definitions directly for simple things, or contain the relevant use statements to bring them into scope from elsewhere.

Within the extern "C++" part, we list types and functions for which C++ is the source of truth, as well as the header(s) that declare those APIs. In the future it's possible that this section could be generated bindgen-style from the headers but for now we need the signatures written out; static assertions will verify that they are accurate.

Your function implementations themselves, whether in C++ or Rust, do not need to be defined as extern "C" ABI or no_mangle. CXX will put in the right shims where necessary to make it all work.
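
For contrast, the sketch below shows roughly what a hand-written C ABI shim looks like when the same job is done without CXX. It is not the code CXX generates; the names count_zero_bytes and count_zero_bytes_shim are invented for illustration.

```rust
// Ordinary safe Rust. With CXX, a signature like this could be listed in the
// bridge as-is, since &[T] is one of the builtin types.
pub fn count_zero_bytes(data: &[u8]) -> usize {
    data.iter().filter(|&&b| b == 0).count()
}

// Hypothetical hand-written shim, the kind of boilerplate CXX takes over.
// The slice is split into a raw pointer and a length, and the caller must
// promise that `ptr` is valid for `len` bytes. Nothing checks that promise.
// (A real export would also need a stable, unmangled symbol name.)
pub unsafe extern "C" fn count_zero_bytes_shim(ptr: *const u8, len: usize) -> usize {
    let data = unsafe { std::slice::from_raw_parts(ptr, len) };
    count_zero_bytes(data)
}

fn main() {
    let data = [0u8, 7, 0, 9];

    // Safe call: the compiler checks everything.
    println!("safe call: {}", count_zero_bytes(&data));

    // Shim call: correctness rests entirely on the caller.
    let n = unsafe { count_zero_bytes_shim(data.as_ptr(), data.len()) };
    println!("shim call: {}", n);
}
```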


Comparison vs bindgen and cbindgen

Notice that with CXX there is repetition of all the function signatures: they are typed out once where the implementation is defined (in C++ or Rust) and again inside the cxx::bridge module, though compile-time assertions guarantee these are kept in sync. This is different from bindgen and cbindgen where function signatures are typed by a human once and the tool consumes them in one language and emits them in the other language.

This is because CXX fills a somewhat different role. It is a lower level tool than bindgen or cbindgen in a sense; you can think of it as being a replacement for the concept of extern "C" signatures as we know them, rather than a replacement for a bindgen. It would be reasonable to build a higher level bindgen-like tool on top of CXX which consumes a C++ header and/or Rust module (and/or IDL like Thrift) as source of truth and generates the cxx::bridge, eliminating the repetition while leveraging the static analysis safety guarantees of CXX.

But note in other ways CXX is higher level than the bindgens, with rich support for common standard library types. Frequently with bindgen when we are dealing with an idiomatic C++ API we would end up manually wrapping that API in C-style raw pointer functions, applying bindgen to get unsafe raw pointer Rust functions, and replicating the API again to expose those idiomatically in Rust. That's a much worse form of repetition because it is unsafe all the way through.

By using a CXX bridge as the shared understanding between the languages, rather than extern "C" C-style signatures as the shared understanding, common FFI use cases become expressible using 100% safe code.

It would also be reasonable to mix and match, using CXX bridge for the 95% of your FFI that is straightforward and doing the remaining few oddball signatures the old fashioned way with bindgen and cbindgen, if for some reason CXX's static restrictions get in the way. Please file an issue if you end up taking this approach so that we know what ways it would be worthwhile to make the tool more expressive.


Cargo-based setup

For builds that are orchestrated by Cargo, you will use a build script that runs CXX's C++ code generator and compiles the resulting C++ code along with any other C++ code for your crate.

The canonical build script is as follows. The indicated line returns a cc::Build instance (from the usual widely used cc crate) on which you can set up any additional source files and compiler flags as normal.

# Cargo.toml

[build-dependencies]
cxx-build = "1.0"

// build.rs

fn main() {
    cxx_build::bridge("src/main.rs")  // returns a cc::Build
        .file("src/demo.cc")
        .std("c++11")
        .compile("cxxbridge-demo");

    println!("cargo:rerun-if-changed=src/main.rs");
    println!("cargo:rerun-if-changed=src/demo.cc");
    println!("cargo:rerun-if-changed=include/demo.h");
}

Non-Cargo setup

For use in non-Cargo builds like Bazel or Buck, CXX provides an alternate way of invoking the C++ code generator as a standalone command line tool. The tool is packaged as the cxxbridge-cmd crate on crates.io or can be built from the gen/cmd directory of this repo.

$ cargo install cxxbridge-cmd

$ cxxbridge src/main.rs --header > path/to/mybridge.h
$ cxxbridge src/main.rs > path/to/mybridge.cc

Safety

Be aware that the design of this library is intentionally restrictive and opinionated! It isn't a goal to be powerful enough to handle arbitrary signatures in either language. Instead this project is about carving out a reasonably expressive set of functionality about which we can make useful safety guarantees today and maybe extend over time. You may find that it takes some practice to use CXX bridge effectively as it won't work in all the ways that you are used to.

Some of the considerations that go into ensuring safety are:

  • By design, our paired code generators work together to control both sides of the FFI boundary. Ordinarily in Rust writing your own extern "C" blocks is unsafe because the Rust compiler has no way to know whether the signatures you've written actually match the signatures implemented in the other language. With CXX we achieve that visibility and know what's on the other side.

  • Our static analysis detects and prevents passing types by value that shouldn't be passed by value from C++ to Rust, for example because they may contain internal pointers that would be screwed up by Rust's move behavior.

  • To many people's surprise, it is possible to have a struct in Rust and a struct in C++ with exactly the same layout / fields / alignment / everything, and still not the same ABI when passed by value. This is a longstanding bindgen bug that leads to segfaults in absolutely correct-looking code (rust-lang/rust-bindgen#778). CXX knows about this and can insert the necessary zero-cost workaround transparently where needed, so go ahead and pass your structs by value without worries. This is made possible by owning both sides of the boundary rather than just one.

  • Template instantiations: for example in order to expose a UniquePtr<T> type in Rust backed by a real C++ unique_ptr, we have a way of using a Rust trait to connect the behavior back to the template instantiations performed by the other language.
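
To illustrate the point above about internal pointers and moves, here is a small standalone sketch. It uses no CXX APIs; the SelfReferential type is hypothetical and stands in for a C++ object that stores the address of one of its own fields, as some std::string implementations do for short strings.

```rust
// A stand-in for a C++ object that remembers the address of its own storage.
struct SelfReferential {
    storage: [u8; 16],
    storage_addr: usize, // address of `storage`, recorded after construction
}

impl SelfReferential {
    // True while the recorded address still refers to this object's storage.
    fn points_at_own_storage(&self) -> bool {
        self.storage_addr == self.storage.as_ptr() as usize
    }
}

fn main() {
    let mut on_stack = SelfReferential { storage: [0; 16], storage_addr: 0 };
    on_stack.storage_addr = on_stack.storage.as_ptr() as usize;
    assert!(on_stack.points_at_own_storage());

    // A Rust move is a plain byte copy. No move constructor runs, so the
    // recorded address still refers to the old stack location.
    let on_heap = Box::new(on_stack);
    assert!(!on_heap.points_at_own_storage());

    println!(
        "internal address valid after move: {}",
        on_heap.points_at_own_storage()
    );
}
```

Keeping such types behind an indirection, for example a reference or a UniquePtr, means Rust never relocates the object itself.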


Builtin types

In addition to all the primitive types (i32 <=> int32_t), the following common types may be used in the fields of shared structs and the arguments and returns of functions.

name in Rust      name in C++           restrictions
String            rust::String
&str              rust::Str
&[T]              rust::Slice<const T>  cannot hold opaque C++ type
&mut [T]          rust::Slice<T>        cannot hold opaque C++ type
CxxString         std::string           cannot be passed by value
Box<T>            rust::Box<T>          cannot hold opaque C++ type
UniquePtr<T>      std::unique_ptr<T>    cannot hold opaque Rust type
SharedPtr<T>      std::shared_ptr<T>    cannot hold opaque Rust type
[T; N]            std::array<T, N>      cannot hold opaque C++ type
Vec<T>            rust::Vec<T>          cannot hold opaque C++ type
CxxVector<T>      std::vector<T>        cannot be passed by value, cannot hold opaque Rust type
*mut T, *const T  T*, const T*          fn with a raw pointer argument must be declared unsafe to call
fn(T, U) -> V     rust::Fn<V(T, U)>     only passing from Rust to C++ is implemented so far
Result<T>         throw/catch           allowed as return type only

The C++ API of the rust namespace is defined by the include/cxx.h file in this repo. You will need to include this header in your C++ code when working with those types.

The following types are intended to be supported "soon" but are just not implemented yet. I don't expect any of these to be hard to make work but it's a matter of designing a nice API for each in its non-native language.

name in Rust      name in C++
BTreeMap<K, V>    tbd
HashMap<K, V>     tbd
Arc<T>            tbd
Option<T>         tbd
tbd               std::map<K, V>
tbd               std::unordered_map<K, V>

Remaining work

This is still early days for CXX; I am releasing it as a minimum viable product to collect feedback on the direction and invite collaborators. Please check the open issues.

Especially please report issues if you run into trouble building or linking any of this stuff. I'm sure there are ways to make the build aspects friendlier or more robust.

Finally, I know more about Rust library design than C++ library design so I would appreciate help making the C++ APIs in this project more idiomatic where anyone has suggestions.


License

Licensed under either of Apache License, Version 2.0 or MIT license at your option.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in this project by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.
