  • Language: Go
  • License: MIT License

A distributed C++ compiler: like distcc, but faster

nocc — a distributed C++ compiler

nocc propagates a compiler invocation to a remote machine: nocc g++ 1.cpp calls g++ remotely, not locally.

nocc speeds up compilation of large C++ projects: when you have multiple remotes, tons of local jobs are parallelized between them.

But its most significant effect is greatly speeding up re-compilation across build agents in CI/CD and across developers working on the same project: they use shared remote caches. Once a cpp file has been compiled, the resulting obj is used by other agents without launching compilation.

nocc easily integrates into any build system, since the build system only needs to prefix the commands it executes with nocc.


The reason why nocc was created

nocc was created at VK.com to speed up KPHP compilation. KPHP is a PHP compiler: it converts PHP sources to C++. The VK.com codebase is huge: for now, we have about 150 000 autogenerated cpp files.

Our goal was to greatly improve the performance of the "C++ → binary" step.

Since 2014, we used distcc.
In 2019, we patched distcc to support precompiled headers. That gave us a 5x performance gain.
In 2021, we decided to implement a distcc replacement. In the end, we got a 2x to 9x speedup over the patched version.


Installation and configuration

The easiest way is to download prebuilt binaries: go to the releases page and download the latest .tar.gz for your system. You'll have 3 binaries after extracting.

You can also compile nocc from sources, see the installation page.

For a test launch (to make sure that everything works), proceed to this section.

For a list of command-line arguments and environment variables, visit the configuration page.


How does nocc work

Consider the following file named 1.cpp:

#include "1.h"

int square(int a) { 
  return a * a; 
}

where 1.h contains just

int square(int a);

When you run nocc g++ 1.cpp -o 1.o -c, the compilation is done remotely:

[Figure: one file]

What's actually happening here:

  • nocc parses the command-line invocation: input files, include dirs, cxx flags, etc.
  • for an input file (1.cpp), nocc finds all dependencies: it traverses all #include recursively (which results in just one file 1.h here)
  • nocc uploads files to a server and waits
  • nocc-server executes the same command-line (same cxx flags, but modified paths)
  • nocc-server pushes a compiled object file back
  • nocc saves 1.o — the same as if compiled locally

Besides an object file, nocc-server pushes the exitCode/stdout/stderr of the C++ compiler: the nocc process uses them as its own output.
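
The dependency step above can be sketched in a few lines. This is only an illustration of finding #include dependencies recursively without running a preprocessor, not nocc's actual parser: the function name collect_includes and the regular expression are assumptions, and the sketch handles only the quoted form of #include, ignoring conditional compilation and system headers.

```python
import os
import re
import tempfile

# Assumption: only the quoted form (#include "file.h") is handled in this sketch.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*"([^"]+)"', re.MULTILINE)


def collect_includes(path, include_dirs=(), seen=None):
    """Return the set of absolute paths of headers reachable from `path`."""
    if seen is None:
        seen = set()
    with open(path, encoding="utf-8", errors="replace") as f:
        text = f.read()
    for name in INCLUDE_RE.findall(text):
        # Look next to the including file first, then in the given include dirs.
        for base in (os.path.dirname(os.path.abspath(path)), *include_dirs):
            candidate = os.path.abspath(os.path.join(base, name))
            if os.path.isfile(candidate):
                if candidate not in seen:  # also protects against include cycles
                    seen.add(candidate)
                    collect_includes(candidate, include_dirs, seen)
                break
    return seen


if __name__ == "__main__":
    # Recreate the 1.cpp / 1.h pair from the text in a temporary folder.
    with tempfile.TemporaryDirectory() as d:
        with open(os.path.join(d, "1.h"), "w") as f:
            f.write("int square(int a);\n")
        with open(os.path.join(d, "1.cpp"), "w") as f:
            f.write('#include "1.h"\n\nint square(int a) { return a * a; }\n')
        deps = collect_includes(os.path.join(d, "1.cpp"))
        print(sorted(os.path.basename(p) for p in deps))
```

For the 1.cpp example, the traversal yields a single dependency, 1.h, which matches the description above.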

In production, you have multiple compilation servers

Conceptually, you can think of a working scheme like this:

[Figure: many files]

Lots of nocc processes are launched simultaneously: many more than you could launch with g++ locally.

Every nocc invocation handles exactly one .cpp -> .o compilation; this is by design. It does remote compilation and dies: nocc is just a front-end layer between any build system and a real C++ compiler.

For every invocation, a remote server is chosen, all dependencies are detected, missing dependencies are uploaded, and the server streams back a ready obj file. This happens in parallel for all command lines.

Actually, to be more efficient, all connections are served via one background nocc-daemon:

[Figure: daemon]

nocc-daemon is written in Go, whereas nocc is a very lightweight C++ wrapper whose only aim is to pipe the command line to the daemon, wait for the response, and die.

So, a final working scheme is the following:

  1. The very first nocc invocation starts nocc-daemon: the daemon serves grpc connections and does all the work of remote compilation.
  2. Every nocc invocation pipes a command line (g++ ...) to the daemon via a Unix socket; the daemon compiles it remotely and writes the resulting .o file, then the nocc process dies.
  3. nocc jobs start and die: a build system executes and balances them.
  4. nocc-daemon dies 15 seconds after nocc stops connecting (that is, after the compilation process finishes).
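
The idle timeout in step 4 could be modelled roughly as below. The 15-second value comes from the text; the IdleTracker class, its method names, and the injectable clock are hypothetical helpers made up for this illustration, and the real daemon is written in Go.

```python
import time

IDLE_TIMEOUT_SEC = 15  # from the text: the daemon dies 15 seconds after the last connection


class IdleTracker:
    """Remembers when a nocc wrapper last connected (hypothetical helper)."""

    def __init__(self, timeout=IDLE_TIMEOUT_SEC, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock  # injectable so the logic can be tested without sleeping
        self.last_seen = clock()

    def touch(self):
        # Call this whenever a nocc wrapper connects to the daemon.
        self.last_seen = self.clock()

    def should_exit(self):
        # True once no wrapper has connected for `timeout` seconds.
        return self.clock() - self.last_seen >= self.timeout
```

A daemon loop would call touch() on every incoming connection and poll should_exit() periodically, which is why a quick rebuild within the timeout window reuses the same daemon.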

For more info, see the nocc architecture page.


nocc is also a remote src/obj cache

The main idea behind nocc is that the 2nd, the 3rd, the Nth runs are faster than the first. Even if you clean a build directory, even on another machine, even in a renamed folder.

That's because of remote caches.
nocc does not upload files if they have already been uploaded — that's the src cache.
nocc does not compile files if they have already been compiled — that's the obj cache.
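
As a rough illustration of the obj cache idea, the sketch below derives a cache key from the compiler flags and the content hashes of the source file and its dependencies. The exact key composition used by nocc is not described here, so the functions obj_cache_key and ObjCache, and the choice to leave file paths out of the key, are assumptions made for the example.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def obj_cache_key(cxx_flags, source, deps):
    """cxx_flags: list of str; source: bytes; deps: dict of path -> bytes.

    Assumption: only flags and file contents go into the key, so the same
    sources in a renamed folder or on another machine give the same key.
    """
    h = hashlib.sha256()
    for flag in cxx_flags:
        h.update(flag.encode("utf-8") + b"\0")
    h.update(sha256_hex(source).encode("ascii"))
    # Sort the content hashes so the key does not depend on path names or order.
    for digest in sorted(sha256_hex(content) for content in deps.values()):
        h.update(digest.encode("ascii"))
    return h.hexdigest()


class ObjCache:
    """In-memory stand-in for a server-side obj cache (hypothetical)."""

    def __init__(self):
        self._objects = {}
        self.compilations = 0

    def get_or_compile(self, key, compile_fn):
        # Compile only on a cache miss; otherwise return the stored object.
        if key not in self._objects:
            self.compilations += 1
            self._objects[key] = compile_fn()
        return self._objects[key]
```

With such a key, a second client asking for the same sources and flags gets the stored object and no compiler is launched, which is the behaviour described above.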

[Figure: second run]

Such an approach dramatically decreases compilation times if your CI has different build machines or your builds start from a fresh copy. Moreover, git branch switching and merging are also great targets for remote caching.


nocc and CMake

When CMake generates a buildfile for your C++ project, you typically launch the build process with make or ninja. These build systems launch and balance processes and keep doing it until all C++ files are compiled.

Our goal is to tell CMake to launch nocc g++ instead of g++ (or any other C++ compiler). This can be done with -DCMAKE_CXX_COMPILER_LAUNCHER:

cmake -DCMAKE_CXX_COMPILER_LAUNCHER=/path/to/nocc ..

Then building with make would look like this:

[Figure: make]

CMake sometimes invokes the C++ compiler with -MD/-MT flags to generate a dependency list. nocc supports them out of the box: depfiles are generated on the client side.


nocc and ninja

Ninja is a build system that CMake can easily use instead of make.

nocc works with ninja, but there are 2 points to care about:

  1. Explicitly set -j {jobs}. Typically, you don't do this with ninja, and it automatically spreads jobs across the machine's CPUs, but here {jobs} needs to be a huge number.
  2. There is an annoying defect: for some reason, ninja waits for the daemon to die. A workaround is to launch the daemon manually in advance. Read more about this problem.


nocc and KPHP

Originally, nocc was created to speed up compiling large KPHP projects, with lots of autogenerated C++ files. KPHP does not call make: it has a build system right inside itself.

To use nocc with KPHP, just set the KPHP_CXX=nocc g++ environment variable. Then nocc will be used for both C++ compilation and precompiled headers generation.


Precompiled headers support

nocc treats precompiled headers in a special way. When a client command to generate pch is executed,

nocc g++ -x c++-header -o all-headers.h.gch all-headers.h

then nocc emits all-headers.h.nocc-pch, whereas all-headers.h.gch is not produced at all. The .nocc-pch file is a text file containing all dependencies; the server compiles it into a real .gch/.pch.

Generating a .nocc-pch file is much faster than generating a real precompiled header, so it's acceptable to call it for every build — anyway, it will be compiled remotely only once.

Read more about nocc's own precompiled headers.


nocc vs ccache

It's not quite correct to compare nocc with ccache, as ccache is not intended to parallelize compilation on remotes. ccache can speed up compilation performed locally (especially useful when you switch git branches), but when it comes to compiling a huge number of C++ files from scratch, everything is still done locally.

nocc also greatly speeds up re-compilation when switching branches. But nocc does it in a conceptually different way: using remote caches.


nocc vs distcc

Because nocc was designed as a distcc replacement, a detailed analysis of their differences is given on the "compare with distcc" page.

That page includes an architecture overview, some info about patching distcc with pch support, and real build times from VK.com production.


What makes nocc so fast

The nocc architecture is specifically tuned to be as fast as possible for typical usage scenarios.

  • nocc-daemon keeps all connections alive, while nocc processes start and die during a build
  • to resolve all recursive #include, nocc does not invoke the preprocessor: it uses its own parser instead
  • nocc-server has the src cache: once 1.h is uploaded by any client, no other clients need to upload this file again (unless changed)
  • nocc-server has the obj cache: once 1.cpp is compiled by any client, all other clients receive 1.o without compilation (if all dependencies and flags match)
  • for a file.cpp, one and the same server is chosen every time to make remote caches reusable
  • shared precompiled headers: once 1.gch is compiled, no other build agent has to compile it locally
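
One way to get the "same server for the same file" property from the list above is to hash the file name and take the result modulo the number of servers. This is a sketch under assumptions: the function choose_server, the use of the base name only, and the server addresses are made up for illustration, and nocc's real selection rule may differ.

```python
import hashlib
import os


def choose_server(cpp_path, servers):
    """Pick a server deterministically from the file's base name."""
    # Assumption: hashing only the base name, so different checkouts of the
    # same project map a file to the same server.
    name = os.path.basename(cpp_path)
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


if __name__ == "__main__":
    servers = ["build-1:43210", "build-2:43210", "build-3:43210"]  # hypothetical hosts
    print(choose_server("/home/alice/project/1.cpp", servers))
    print(choose_server("/ci/workspace/project/1.cpp", servers))  # same server as above
```

A stable hash such as sha256 is used instead of Python's built-in hash(), which is randomized per process for strings and would break the "same server every time" property.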

Dig deeper into nocc architecture


FAQ

What are the conditions to make sure that a remote .o file would equal a local .o?

nocc assumes that all remotes have exactly the same version of the C++ compiler as the local machine. Given identical source files, this ensures that it makes no difference where the compilation was launched. Since linking is done locally, remotes are not required to have all the libs needed for linking.

What if I #include <re2.h> but it doesn't exist on remote?

Everything would still work. When nocc traverses dependencies, it also finds all system headers recursively; their hash sums are sent to the remote along with the cpp file info. If some system includes are missing on the remote (or differ from the local ones), they are also sent like regular files, saved to a /tmp folder that mirrors the client file structure, and discovered via special -isystem arguments added to the command line.
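
The "send hashes first, upload only what is missing or different" step can be sketched as follows. The function files_to_upload and the shape of its arguments are assumptions for illustration; the real exchange happens over grpc between nocc-daemon and nocc-server.

```python
import hashlib


def files_to_upload(client_files, server_hashes):
    """Return the paths the client still has to upload.

    client_files:  dict of path -> file content (bytes) on the client
    server_hashes: dict of path -> sha256 hex digest the server already has
    """
    to_upload = []
    for path, content in sorted(client_files.items()):
        digest = hashlib.sha256(content).hexdigest()
        # Upload when the server has no such file or has different content.
        if server_hashes.get(path) != digest:
            to_upload.append(path)
    return to_upload
```

A system header that is identical on both sides is never transferred, while a missing or differing one is treated like any other dependency.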

How does nocc handle linking commands?

Linking is done locally. All commands that are unsupported or malformed are also executed locally.

What happens if some servers are unavailable?

When nocc tries to compile 1.cpp remotely, but the server is unavailable, nocc falls back to local compilation. It does not try another server; this is intentional.
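
The fallback rule fits in a few lines. In this sketch, compile_with_fallback and its two callback parameters are hypothetical names, and a ConnectionError stands in for whatever failure an unavailable server produces.

```python
def compile_with_fallback(cmd, remote_compile, local_compile):
    """Try the chosen server once; on a connection failure, compile locally."""
    try:
        return remote_compile(cmd)
    except ConnectionError:
        # Deliberately no retry on another server, mirroring the rule above.
        return local_compile(cmd)
```

Not retrying elsewhere fits the rule that a given file always goes to the same server, since moving it would scatter cache entries across servers.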

Does nocc support clang?

Theoretically, it should make no difference which compiler is used: g++, or clang++, or /usr/bin/c++, etc. Even .pch files are supposed to work, as pch compilation is done remotely. Small tests for clang work well, but it hasn't been tested well in production, as we use only g++ in KPHP and VK.com for now.

What is the optimal job/server count?

The number we settled on at VK.com is "launch ~20 jobs per server". For example, we have 32 compilation servers, and we launch ~600 jobs for C++ compilation. This works well both when files are compiled and when they are just taken from the obj cache. Note that if you use a large number of parallel jobs, you'll probably have to increase ulimit -n, as nocc-daemon reads lots of files and keeps connections to all nocc C++ wrappers open simultaneously.
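
As a quick check of the arithmetic, the rule of thumb can be written down directly. The function name recommended_jobs is made up for this illustration; the factor of 20 is the approximate figure quoted above.

```python
JOBS_PER_SERVER = 20  # rule of thumb quoted above (approximate)


def recommended_jobs(server_count: int) -> int:
    """Number of parallel jobs to pass to the build system (e.g. make -j)."""
    return server_count * JOBS_PER_SERVER


if __name__ == "__main__":
    # 32 servers * 20 jobs = 640, in line with the ~600 jobs mentioned above.
    print(recommended_jobs(32))
```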

I get an error "compiling locally: rpc error: code = Unknown desc = file xxx.cpp was already uploaded, but now got another sha256 from client"

This error occurs in the following scenario: you compile a file, then quickly modify it and launch compilation again while the previous nocc-daemon is still running and the previous file structure is still mapped to servers. The compilation of that file is then done locally. In practice, such an error does not occur, as big projects take some time for linking/finalization after compilation (a daemon dies in 15 seconds).

Why did you name this tool "nocc"?

We already have a PHP linter named noverify and an architecture validation tool nocolor. That's why "nocc" — just because I like such naming :)
