Language: Starlark
License: Apache License 2.0

LLVM toolchain for Bazel


The project is in a relatively stable state and in use for all code development at GRAIL and other organizations. Having said that, I am unable to give time to it at any regular cadence.

I rely on the community for maintenance and new feature implementations. If you are interested in being part of this project, please let me know and I can give you write access, so you can merge your changes directly.

If you feel like you have a better maintained fork or an alternative/derived implementation, please let me know and I can redirect people there.

– @siddharthab


Quickstart

Minimum bazel version: 4.2.1

To use this toolchain, include this section in your WORKSPACE:

load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

BAZEL_TOOLCHAIN_TAG = "0.8.2"
BAZEL_TOOLCHAIN_SHA = "0fc3a2b0c9c929920f4bed8f2b446a8274cad41f5ee823fd3faa0d7641f20db0"

http_archive(
    name = "com_grail_bazel_toolchain",
    sha256 = BAZEL_TOOLCHAIN_SHA,
    strip_prefix = "bazel-toolchain-{tag}".format(tag = BAZEL_TOOLCHAIN_TAG),
    canonical_id = BAZEL_TOOLCHAIN_TAG,
    url = "https://github.com/grailbio/bazel-toolchain/archive/refs/tags/{tag}.tar.gz".format(tag = BAZEL_TOOLCHAIN_TAG),
)

load("@com_grail_bazel_toolchain//toolchain:deps.bzl", "bazel_toolchain_dependencies")

bazel_toolchain_dependencies()

load("@com_grail_bazel_toolchain//toolchain:rules.bzl", "llvm_toolchain")

llvm_toolchain(
    name = "llvm_toolchain",
    llvm_version = "16.0.0",
)

load("@llvm_toolchain//:toolchains.bzl", "llvm_register_toolchains")

llvm_register_toolchains()

And add the following section to your .bazelrc file (not needed after this issue is closed):

build --incompatible_enable_cc_toolchain_resolution

Basic Usage

The toolchain can automatically detect your OS and arch type, and use the right pre-built binary LLVM distribution. See the section on "Bring Your Own LLVM" below for more options.

See in-code documentation in rules.bzl for available attributes to llvm_toolchain.

Advanced Usage

Per host architecture LLVM version

LLVM does not ship distributions for every host architecture in each version. Patch versions in particular often come with only a few prebuilt packages. As a result, a single version is probably not enough to cover all the hosts you want to support.

This can be solved by providing a target/version map with a default version. The example below uses the empty key to select 15.0.6 as the default version for all targets not listed explicitly. For those targets, this behaves like setting llvm_version = "15.0.6", in the same way the Quickstart example sets a single version. The two additional entries map their respective targets to a different version:

llvm_toolchain(
    name = "llvm_toolchain",
    llvm_versions = {
        "": "15.0.6",
        "darwin-aarch64": "15.0.7",
        "darwin-x86_64": "15.0.7",
    },
)

Customizations

We currently offer limited customizability through attributes of the llvm_toolchain_* rules. You can send us a PR to add more configuration attributes.

Most of the complexity of this project comes from making it generic for multiple use cases. For one-off experiments with new architectures, cross-compilations, new compiler features, etc., my advice would be to look at the toolchain configurations generated by this repo, and copy-paste/edit them to make your own in any package in your own workspace.

bazel query --output=build @llvm_toolchain//:all | grep -v -e '^#' -e '^  generator'

Besides defining your toolchain in your package BUILD file, and until this issue is resolved, you also need a way for bazel to access the tools in the LLVM distribution through paths relative to your package, without using .. up-references. To do this, create a symlink that uses up-references to point to the LLVM distribution directory, and create a wrapper script for clang so that the actual clang invocation does not go through the symlinked path. See the files in the @llvm_toolchain//: package as a reference.

# See generated files for reference.
ls -lR "$(bazel info output_base)/external/llvm_toolchain"

# Create symlink to LLVM distribution.
cd _your_package_directory_
ln -s ../....../external/llvm_toolchain_llvm llvm

# Create CC wrapper script.
mkdir bin
cp "$(bazel info output_base)/external/llvm_toolchain/bin/cc_wrapper.sh" bin/cc_wrapper.sh
vim bin/cc_wrapper.sh # Review to ensure relative paths, etc. are good.

See bazel tutorial for how CC toolchains work in general.

Selecting Toolchains

If toolchains are registered (see Quickstart section above), you do not need to do anything special for bazel to find the toolchain. You may want to check once with the --toolchain_resolution_debug flag to see which toolchains were selected by bazel for your target platform.

For specifying unregistered toolchains on the command line, please use the --extra_toolchains flag. For example, --extra_toolchains=@llvm_toolchain//:cc-toolchain-x86_64-linux.

We no longer support the --crosstool_top=@llvm_toolchain//:toolchain flag, and instead rely on the --incompatible_enable_cc_toolchain_resolution flag.

Bring Your Own LLVM

The following mechanisms are available for using an LLVM toolchain:

  1. Host OS information is used to find the right pre-built binary distribution from llvm.org, given the llvm_version or llvm_versions attribute. The LLVM toolchain archive is downloaded and extracted as a separate repository with the suffix _llvm. The detection logic for llvm_version is not perfect, so you may have to use llvm_versions for some host OS types and versions. We expect the detection logic to improve through community contributions, and we welcome PRs.
  2. You can use the urls attribute to specify your own URLs for each OS type, version and architecture. For example, you can specify a different URL for Arch Linux and a different one for Ubuntu. Just as with the option above, the archive is downloaded and extracted as a separate repository with the suffix _llvm.
  3. You can also specify your own bazel package paths or local absolute paths for each host os-arch pair through the toolchain_roots attribute. Note that the keys here are different and less granular than the keys in the urls attribute. When using a bazel package path, each of the values is typically a package in the user's workspace or configured through local_repository or http_archive; the BUILD file of the package should be similar to @com_grail_bazel_toolchain//toolchain:BUILD.llvm_repo. If you only use http_archive, consider using the urls attribute instead, which gives you more flexibility.
  4. All the above options rely on host OS information, and are not suited for docker based sandboxed builds or remote execution builds. Such builds will need a single distribution version specified through the distribution attribute, or URLs specified through the urls attribute with an empty key, or a toolchain root specified through the toolchain_roots attribute with an empty key.
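
As a rough illustration of the last option, the sketch below points the toolchain at an LLVM distribution that is already unpacked inside a build image, using toolchain_roots with an empty key. The repository name and the /opt/llvm-16 path are assumptions made for this example, and whether llvm_version must still be set alongside toolchain_roots should be checked against the in-code documentation in rules.bzl.

```starlark
# WORKSPACE (sketch). Assumes an LLVM 16 distribution has already been
# unpacked at /opt/llvm-16 in the build image; that path is hypothetical.
load("@com_grail_bazel_toolchain//toolchain:rules.bzl", "llvm_toolchain")

llvm_toolchain(
    name = "llvm_toolchain_with_system_llvm",  # hypothetical repository name
    llvm_version = "16.0.0",
    # The empty key is the catch-all entry, so the choice of toolchain root
    # does not depend on which host OS or architecture is detected.
    toolchain_roots = {
        "": "/opt/llvm-16",
    },
)
```

Because the root is a local absolute path, every machine that runs the build needs the distribution at the same location.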

Sysroots

A sysroot can be specified through the sysroot attribute. This can be either a path on the user's system, or a bazel filegroup like label. One way to create a sysroot is to use docker export to get a single archive of the entire filesystem for the image you want. Another way is to use the build scripts provided by the Chromium project.

Cross-compilation

The toolchain supports cross-compilation if you bring your own sysroot. When cross-compiling, we link against the libstdc++ from the sysroot (single-platform build behavior is to link against libc++ bundled with LLVM). The following pairs have been tested to work for some hello-world binaries:

  • {linux, x86_64} -> {linux, aarch64}
  • {linux, aarch64} -> {linux, x86_64}
  • {darwin, x86_64} -> {linux, x86_64}
  • {darwin, x86_64} -> {linux, aarch64}

A recommended approach would be to define two toolchains, one without sysroot for single-platform builds, and one with sysroot for cross-compilation builds. Then, when cross-compiling, explicitly specify the toolchain with the sysroot and the target platform. For example, see the WORKSPACE file and the test script for cross-compilation.

bazel build \
  --platforms=@com_grail_bazel_toolchain//platforms:linux-x86_64 \
  --extra_toolchains=@llvm_toolchain_with_sysroot//:cc-toolchain-x86_64-linux \
  //...
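
The two-toolchain setup recommended above might look roughly like the following in a WORKSPACE file. This is a sketch under assumptions: the sysroot label @my_linux_x86_64_sysroot//:sysroot is hypothetical and must be provided by your own repository rule, and the dictionary form of sysroot keyed by target os-arch pair should be verified against the documentation in rules.bzl.

```starlark
# WORKSPACE (sketch).
load("@com_grail_bazel_toolchain//toolchain:rules.bzl", "llvm_toolchain")

# Toolchain for single-platform builds: no sysroot.
llvm_toolchain(
    name = "llvm_toolchain",
    llvm_version = "16.0.0",
)

# Toolchain for cross-compilation builds: same LLVM version, plus a sysroot
# for the target platform.
llvm_toolchain(
    name = "llvm_toolchain_with_sysroot",
    llvm_version = "16.0.0",
    sysroot = {
        # Hypothetical label; replace with a filegroup-like target or a path
        # that holds the target's root filesystem.
        "linux-x86_64": "@my_linux_x86_64_sysroot//:sysroot",
    },
)

# Only the toolchain without a sysroot is registered by default.
load("@llvm_toolchain//:toolchains.bzl", "llvm_register_toolchains")

llvm_register_toolchains()
```

Leaving the sysroot toolchain unregistered means it is only used when it is named explicitly with --extra_toolchains, as in the command above.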

Supporting New Target Platforms

The following is a rough (untested) list of steps:

  1. To help us detect if you are cross-compiling or not, note the arch string as given by python3 -c 'import platform; print(platform.machine())'.
  2. Edit SUPPORTED_TARGETS in toolchain/internal/common.bzl with the os and the arch string from above.
  3. Add target_system_name, etc. in toolchain/cc_toolchain_config.bzl.
  4. For cross-compiling, add a platform bazel type for your target platform in platforms/BUILD.bazel, and add an appropriate sysroot entry to your llvm_toolchain repository definition.
  5. If not cross-compiling, bring your own LLVM (see section above) through the toolchain_roots or urls attribute.
  6. Test your build.
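
For step 4, a platform definition could be sketched as below. The linux-riscv64 target is a hypothetical example chosen only for illustration and is not claimed to be supported by this toolchain; the constraint values come from the standard @platforms repository.

```starlark
# platforms/BUILD.bazel (sketch). "linux-riscv64" is a hypothetical target.
platform(
    name = "linux-riscv64",
    constraint_values = [
        "@platforms//os:linux",
        "@platforms//cpu:riscv64",
    ],
)
```

The resulting label would then be passed to --platforms when cross-compiling, in the same way as the linux-x86_64 platform in the cross-compilation example.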

Sandbox

Sandboxing the toolchain introduces significant overhead (100ms per action, as of mid 2018). To reduce it, you can use --experimental_sandbox_base=/dev/shm. However, not every environment has enough shared memory available to hold all the files. If this is a concern, you can set the attribute for using absolute paths, which replaces the templated toolchain paths with absolute paths. When running bazel actions, these paths are available inside the sandbox as part of the read-only / mount. Note that this makes your builds non-hermetic.
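
A minimal sketch of the absolute-paths option follows. The attribute name absolute_paths is an assumption here, since the text above does not spell it out; confirm the exact name and default in rules.bzl before relying on it.

```starlark
# WORKSPACE (sketch).
load("@com_grail_bazel_toolchain//toolchain:rules.bzl", "llvm_toolchain")

llvm_toolchain(
    name = "llvm_toolchain",
    llvm_version = "16.0.0",
    # Assumed attribute name. Trades hermeticity for lower sandbox overhead
    # by referring to the toolchain through absolute paths.
    absolute_paths = True,
)
```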

Compatibility

The toolchain is tested to work with rules_go, rules_rust, and rules_foreign_cc.

Accessing tools

The LLVM distribution also provides several tools like clang-format. You can depend on these tools directly in the bin directory of the distribution. When not using the toolchain_roots attribute, the distribution is available in the repo with the suffix _llvm appended to the name you used for the llvm_toolchain rule. For example, @llvm_toolchain_llvm//:bin/clang-format is a valid and visible target in the quickstart example above.

When using the toolchain_roots attribute, there is currently no single target that you can reference, and you may have to alias the tools you want with a select clause in your workspace.

As a convenience, some targets are aliased appropriately in the configuration repo (as opposed to the LLVM distribution repo) for you to use and will work even when using toolchain_roots. The complete list is in the file aliases.bzl. If your repo is named llvm_toolchain, then they can be referenced as:

  • @llvm_toolchain//:omp
  • @llvm_toolchain//:clang-format
  • @llvm_toolchain//:llvm-cov
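
As an illustration of using one of these aliases, the genrule below runs clang-format as a build tool. The target and output file names are made up for this sketch; only the @llvm_toolchain//:clang-format label comes from the list above.

```starlark
# BUILD.bazel (sketch). Writes the clang-format version banner to a file.
genrule(
    name = "clang_format_version",  # hypothetical target name
    outs = ["clang_format_version.txt"],
    # $(location ...) expands to the path of the aliased clang-format binary.
    cmd = "$(location @llvm_toolchain//:clang-format) --version > $@",
    tools = ["@llvm_toolchain//:clang-format"],
)
```

Referencing the alias in the configuration repo rather than @llvm_toolchain_llvm//:bin/clang-format keeps the rule working when toolchain_roots is in use.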

Prior Art

Other examples of toolchain configuration:

https://bazel.build/tutorials/ccp-toolchain-config

https://github.com/vsco/bazel-toolchains
