
picobench


picobench is a tiny (micro) microbenchmarking library in a single header file.

It's designed to be easy to use and integrate and fast to compile while covering the most common features of a microbenchmarking library.

Example usage

Here's the complete code of a microbenchmark which compares adding elements to a std::vector with and without using reserve:

#define PICOBENCH_IMPLEMENT_WITH_MAIN
#include "picobench/picobench.hpp"

#include <vector>
#include <cstdlib> // for rand

// Benchmarking function written by the user:
static void rand_vector(picobench::state& s)
{
    std::vector<int> v;
    for (auto _ : s)
    {
        v.push_back(rand());
    }
}
PICOBENCH(rand_vector); // Register the above function with picobench

// Another benchmarking function:
static void rand_vector_reserve(picobench::state& s)
{
    std::vector<int> v;
    v.reserve(s.iterations());
    for (auto _ : s)
    {
        v.push_back(rand());
    }
}
PICOBENCH(rand_vector_reserve);

The output of this benchmark might look like this:

===============================================================================
    Name (* = baseline)   |   Dim   |  Total ms |  ns/op  |Baseline| Ops/second
===============================================================================
            rand_vector * |       8 |     0.001 |     167 |      - |  5974607.9
      rand_vector_reserve |       8 |     0.000 |      55 |  0.329 | 18181818.1
            rand_vector * |      64 |     0.004 |      69 |      - | 14343343.8
      rand_vector_reserve |      64 |     0.002 |      27 |  0.400 | 35854341.7
            rand_vector * |     512 |     0.017 |      33 |      - | 30192239.7
      rand_vector_reserve |     512 |     0.012 |      23 |  0.710 | 42496679.9
            rand_vector * |    4096 |     0.181 |      44 |      - | 22607850.9
      rand_vector_reserve |    4096 |     0.095 |      23 |  0.527 | 42891848.9
            rand_vector * |    8196 |     0.266 |      32 |      - | 30868196.3
      rand_vector_reserve |    8196 |     0.207 |      25 |  0.778 | 39668749.5
===============================================================================

...which tells us that reserve yields a noticeable performance gain, though the effect becomes less pronounced as the number of inserted elements grows.

Documentation

To use picobench, you need to include picobench.hpp by either copying it inside your project or adding this repo as a submodule to yours.

In one compilation unit (.cpp file) in the module (typically the benchmark executable) in which you use picobench, you need to define PICOBENCH_IMPLEMENT_WITH_MAIN (or PICOBENCH_IMPLEMENT if you want to write your own main function).

Creating benchmarks

A benchmark is a function you've written with the signature void (picobench::state& s). You need to register the function with the macro PICOBENCH(func_name), whose only argument is the function's name, as shown in the example above.

The library will run the benchmark function several times with different numbers of iterations, to simulate different problem spaces, then collect the results in a report.

Typically a benchmark has a loop. To run the loop, use the picobench::state argument in a range-based for loop in your function. The time spent looping is measured for the benchmark. You can have initialization/deinitialization code outside of the loop and it won't be measured.

You can have multiple benchmarks in multiple files. All of them will be run when the executable starts.

Use state::iterations as shown in the example to make initialization based on how many iterations the loop will make.

If you don't want the automatic time measurement, you can use state::start_timer and state::stop_timer to manually measure it, or use the RAII class picobench::scope for semi-automatic measurement.

Here's an example of a couple of benchmarks which do not use the range-based for loop for time measurement:

void my_func(); // Function you want to benchmark
static void benchmark_my_func(picobench::state& s)
{
    s.start_timer(); // Manual start
    for (int i=0; i<s.iterations(); ++i)
        my_func();
    s.stop_timer(); // Manual stop
}
PICOBENCH(benchmark_my_func);

void my_func2();
static void benchmark_my_func2(picobench::state& s)
{
    custom_init(); // Some user-defined initialization
    picobench::scope scope(s); // Constructor starts measurement. Destructor stops it
    for (int i=0; i<s.iterations(); ++i)
        my_func2();
}
PICOBENCH(benchmark_my_func2);

Custom main function

If you write your own main function, you need to add the following to it in order to run the benchmarks:

    picobench::runner runner;
    // Optionally parse command line
    runner.parse_cmd_line(argc, argv);
    return runner.run();

For even finer control of the run, you can call the individual steps explicitly instead of run:

    picobench::runner runner;
    // Optionally parse command line
    runner.parse_cmd_line(argc, argv);
    if (runner.should_run()) // Cmd line may have disabled benchmarks
    {
        runner.run_benchmarks();
        auto report = runner.generate_report();
        // Then to output the data in the report use
        report.to_text(std::cout); // Default
        // or
        report.to_text_concise(std::cout); // No iterations breakdown
        // or
        report.to_csv(std::cout); // Outputs in CSV format. Most detailed
    }

Instead of std::cout you may want to use another std::ostream instance of your choice.

As mentioned above report.to_text_concise(ostream) outputs a report without the iterations breakdown. With the first example of benchmarking adding elements to a std::vector, the output would be this:

===============================================================================
    Name (* = baseline)   |  ns/op  | Baseline |  Ops/second
===============================================================================
            rand_vector * |      36 |        - |  27427782.7
      rand_vector_reserve |      24 |    0.667 |  40754573.7
===============================================================================

Note that in this case the information that the effect of using reserve gets less prominent with more elements is lost.

Suites

You can optionally create suites of benchmarks. If you don't, all benchmarks in the module are assumed to be in the default suite.

To create a suite, write PICOBENCH_SUITE("suite name in quotes"); and every benchmark below this line will be a part of this suite. You can have benchmarks in many files in the same suite; just use the same string for its name.

Baseline

All benchmarks in a suite are assumed to be related and one of them is dubbed a "baseline". In the report at the end, all others will be compared to it.

By default the first benchmark added to a suite is the baseline, but you can change this by adding .baseline() to the registration like so: PICOBENCH(my_benchmark).baseline().
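As an illustrative sketch (the string-building benchmarks below are hypothetical, not part of the library), a suite with an explicitly chosen baseline might look like this:

```cpp
#define PICOBENCH_IMPLEMENT_WITH_MAIN
#include "picobench/picobench.hpp"

#include <string>

PICOBENCH_SUITE("string building");

// Build a string by appending characters one at a time
static void append_chars(picobench::state& s)
{
    for (auto _ : s)
    {
        std::string str;
        for (int i = 0; i < 100; ++i) str += 'x';
    }
}
PICOBENCH(append_chars);

// Build the string with a single fill constructor; explicitly dubbed the baseline
static void construct_filled(picobench::state& s)
{
    for (auto _ : s)
    {
        std::string str(100, 'x');
    }
}
PICOBENCH(construct_filled).baseline();
```

In the report, append_chars will be compared against construct_filled rather than against whichever benchmark happened to be registered first.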

Samples

Sometimes the code being benchmarked is very sensitive to external factors such as syscalls (which include memory allocation and deallocation). Those external factors can take greatly different times between runs. In such cases several samples of a benchmark might be needed to measure the time more precisely. By default the library takes two samples of each benchmark, but you can change this by adding .samples(n) to the registration like so: PICOBENCH(my_benchmark).samples(10).

Note that the time written to the report is the one of the fastest sample.

Benchmark results

You can set a result for a benchmark using state::set_result. Here is an example of this:

int myfunc(); // Function you want to benchmark, returning an int
static void my_benchmark(picobench::state& s)
{
    int sum = 0;
    for (auto _ : s)
    {
        sum += myfunc();
    }
    s.set_result(sum);
}
PICOBENCH(my_benchmark);

By default results are not used. You can think of them as data sinks. Optionally however you can use them in two ways.

  • Compare across samples: By calling runner::set_compare_results_across_samples you will make the library compare results between the different samples of a benchmark and trigger an error if they differ.
  • Compare across benchmarks: By calling runner::set_compare_results_across_benchmarks you can make a more complex comparison which will compare the results from all benchmarks in a suite. You can use this if you compare different ways of calculating the same result.
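In a custom main function, enabling these checks might look like the sketch below. The boolean argument is an assumption for illustration; check picobench.hpp for the exact signatures.

```cpp
picobench::runner runner;
// Trigger an error if samples of the same benchmark yield different results
runner.set_compare_results_across_samples(true); // argument assumed
// Also compare results between benchmarks in the same suite
runner.set_compare_results_across_benchmarks(true); // argument assumed
runner.run_benchmarks();
auto report = runner.generate_report();
```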

By default results are compared by simple equality, but you can provide your own comparison function as an argument to runner::generate_report. Here is an example:

void my_benchmark(picobench::state& s)
{
    my_vector2 result;
    for (auto _ : s)
        result += my_vector_op();
    s.set_result(
        // new to preserve value past this function
        reinterpret_cast<result_t>(new my_vector2(result))); 
}

bool compare_vectors(result_t a, result_t b)
{
    auto v1 = reinterpret_cast<my_vector2*>(a);
    auto v2 = reinterpret_cast<my_vector2*>(b);
    return v1->x == v2->x && v1->y == v2->y;
}

...

auto report = runner.generate_report(compare_vectors);

Other options

Other characteristics of a benchmark are:

  • Iterations: (or "problem spaces") a vector of integers describing the set of iterations to be made for a benchmark. Set with .iterations({i1, i2, i3...}). The default is {8, 64, 512, 4096, 8196}.
  • Label: a string which is used for this benchmark in the report instead of the function name. Set with .label("my label")
  • User data: a user-defined number (uintptr_t) assigned to a benchmark, which can be accessed via state::user_data

You can combine the options by concatenating them like this: PICOBENCH(my_func).label("My Function").samples(2).iterations({1000, 10000, 50000});

If you write your own main function, you can set the default iterations and samples for all benchmarks with runner::set_default_state_iterations and runner::set_default_samples before calling runner::run_benchmarks.

If you parse the command line or use the library-provided main function you can also set the iterations and samples with command line args:

  • --iters=1000,5000,10000 will set the iterations for benchmarks which don't explicitly override them
  • --samples=5 will set the samples for benchmarks which don't explicitly override them
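For example, assuming your benchmark executable is built as ./mybench (a hypothetical name), a run overriding both defaults might look like this:

```shell
./mybench --iters=1000,5000,10000 --samples=5
```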

Other command line arguments

If you're using the library-provided main function, it will also handle the following command line arguments:

  • --out-fmt=<txt|con|csv> - sets the output report format to either full text, concise text or csv.
  • --output=<filename> - writes the output report to a given file
  • --compare-results - will compare results from benchmarks and trigger an error if they don't match.
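For instance, to write the most detailed (CSV) report to a file instead of stdout (again assuming a hypothetical ./mybench executable):

```shell
./mybench --out-fmt=csv --output=results.csv
```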

Misc

  • The runner randomizes the order of the benchmarks. To get the same order on every run and every platform, pass an integer seed to runner::run_benchmarks.

Here's another example of a custom main function incorporating the above:

#define PICOBENCH_IMPLEMENT
#include "picobench/picobench.hpp"
...
int main()
{
    // User-defined code which makes global initializations
    custom_global_init();

    picobench::runner runner;
    // Disregard command-line for simplicity

    // Two sets of iterations
    runner.set_default_state_iterations({10000, 50000});

    // One sample per benchmark because the huge numbers are expected to compensate
    // for external factors
    runner.set_default_samples(1);

    // Run the benchmarks with some seed which guarantees the same order every time
    auto report = runner.run_benchmarks(123);

    // Output to a file (requires #include <fstream>)
    std::ofstream out("my.csv");
    report.to_csv(out);

    return 0;
}

Contributing

Contributions in the form of issues and pull requests are welcome.

License

This software is distributed under the MIT Software License.

See the accompanying file LICENSE.txt.

Copyright Β© 2017-2023 Borislav Stanimirov
