  • Stars: 2,715
  • Rank: 16,212 (Top 0.4 %)
  • Language: C++
  • License: MIT License
  • Created over 5 years ago
  • Updated over 3 years ago

Repository Details

Demonstration of various hardware effects.

Hardware effects

This repository demonstrates various hardware effects that can degrade application performance in surprising ways and that may be very hard to explain without knowledge of the low-level CPU and OS architecture. For each effect I try to create a proof-of-concept program that is as small as possible so that it can be understood easily.

Related repository with GPU hardware effects: https://github.com/kobzol/hardware-effects-gpu

These effects depend heavily on your CPU microarchitecture and model, so the demonstration programs may not showcase the slowdown on your CPU, but I try to make them as general as I can. That said, the examples target x86-64 processors (Intel and AMD) and may not make sense on other CPU architectures. I focus on effects that should be observable on commodity (desktop/notebook) hardware, so I don't include things like NUMA effects here (although in a few years they might be common even in personal computers). The code is mainly tested on Linux.

Currently the following effects are demonstrated:

  • 4k aliasing
  • bandwidth saturation
  • branch misprediction
  • branch target misprediction
  • cache conflicts
  • cache/memory hierarchy bandwidth
  • data dependencies
  • denormal floating point numbers
  • DRAM refresh interval
  • false sharing
  • hardware prefetching
  • hardware store elimination
  • memory-bound program
  • misaligned accesses
  • non-temporal stores
  • software prefetching
  • store buffer capacity
  • write combining

Every example directory has a README that explains the individual effects.

Isolating these hardware effects can be very tricky, so it's possible that some of the examples are actually demonstrating something else entirely (or nothing at all :) ). If you have a better explanation of what is happening, please let me know in the issues. Ideally the code would be written in assembly, but that would lower its readability. I wrote it in C++ in a way that (hopefully) forces the compiler to emit the instructions that I want (even with -O3).

Benchmarking

For all benchmarks I recommend turning off frequency scaling, hyper-threading, Turbo mode, address space randomization and anything else that can increase noise. I'm using the following commands:

$ sudo bash -c "echo 0 > /proc/sys/kernel/randomize_va_space"           # address randomization
$ sudo bash -c "echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo" # Turbo mode
$ sudo bash -c "echo 0 > /sys/devices/system/cpu/cpuX/online"           # hyper-threading
$ ...                                                                   # for all hyper-threading CPUs
$ sudo cpupower frequency-set --governor performance                    # frequency scaling

You can find more tips here.

Build

$ mkdir build
$ cd build
$ cmake -DCMAKE_BUILD_TYPE=Release ..
$ make -j

If you want to use the benchmark scripts (written in Python 3), you should also install the Python dependencies:

$ pip install -r requirements.txt

Docker

You can download a prebuilt image:

$ docker pull kobzol/hardware-effects

or build it yourself:

$ docker build -t hardware-effects .

Then run it:

# interactive run
$ docker run --rm -it hardware-effects

# directly launch a program
$ docker run hardware-effects build/branch-misprediction/branch-misprediction 1

License

MIT

More Repositories

  1. cargo-wizard (Rust, 583 stars): Cargo subcommand for configuring Cargo projects for best performance.
  2. cargo-pgo (Rust, 445 stars): Cargo subcommand for optimizing Rust binaries/libraries with PGO and BOLT.
  3. rust-delegate (Rust, 388 stars): Rust method delegation with less boilerplate.
  4. hardware-effects-gpu (C++, 304 stars): Demonstration of various hardware effects on CUDA GPUs.
  5. cargo-remark (Rust, 111 stars): Cargo subcommand for viewing LLVM optimization remarks.
  6. davis (TypeScript, 56 stars): Assembly debugger written in Angular 2.
  7. rust-course-fei (Rust, 13 stars): Rust course taught at FEI VŠB-TUO.
  8. sigmod-2018 (C++, 12 stars): Code for the SIGMOD 2018 programming contest. Finished at 2nd place.
  9. debug-visualizer (Python, 10 stars): Program memory visualizer for GDB/LLDB (bachelor thesis).
  10. sigmod-2019 (C++, 8 stars): Code for the SIGMOD 2019 programming contest. Finished at 2nd place.
  11. llvm-instrument (C++, 6 stars): LLVM instrumentation.
  12. rustlang.cz (HTML, 6 stars): Web that gathers information about the Rust community in the Czech Republic.
  13. advent-of-code (Python, 4 stars): Advent of Code solutions.
  14. cuda-profile (C++, 3 stars): Instrumentation-based profiler for CUDA (master thesis).
  15. sigmod-2016 (C++, 3 stars): Code for the SIGMOD 2016 programming contest. Finished at 14th place.
  16. cfggen (Python, 3 stars): Python configuration generator.
  17. llvm-se (C++, 2 stars): Static analysis using symbolic execution on top of LLVM IR.
  18. talks (Python, 2 stars): Source code and slides for my public talks.
  19. handmade-quake (C, 2 stars): Quake recreated by following the tutorial from Philip Buuck (https://www.youtube.com/channel/UCXgjH2-Mrb3-h1_iWurz7dQ).
  20. kobzol (2 stars)
  21. async-iterator-examples (Rust, 2 stars): Examples of Rust async iterators.
  22. kobzol.github.io (HTML, 2 stars): Blog about programming stuff.
  23. rust-web-app-demo (Rust, 2 stars): Demo of a small newsletter web app in Rust.
  24. Spaceships (Java, 1 star): Android (Java) 2D game project made as a school assignment.
  25. Ghrab-Robot (C, 1 star): Project of the robotics club of Gymnázium Ostrava-Hrabůvka.
  26. cuda-graph (C++, 1 star): BFS implemented in CUDA.
  27. Computer-Graphics-I (C++, 1 star): Code for subject Computer Graphics I at VSB-TUO.
  28. agu (C++, 1 star): Algorithmisation of Geometrical Problems VSB-TUO course.
  29. ZPG-project (C, 1 star): Project for ZPG (Principles of Computer Graphics).
  30. turret (Java, 1 star): School project, (somehow modified) clone of Tower defense.
  31. sigmod-2017 (C++, 1 star): Code for the SIGMOD 2017 programming contest. Finished at 15th place.
  32. valgrind-se (C, 1 star): Symbolic execution in Valgrind. Based on https://github.com/spirali/aislinn.
  33. arduino-tetris (C++, 1 star): Classic tetris game displayed on 8x8 LED Matrix (MAX72xx) on Arduino.
  34. elsie-gallery (Python, 1 star)
  35. mkdocs-nedoc-plugin (Python, 1 star): Mkdocs plugin for the nedoc Python API documentation generator.
  36. rust-cmd-spawn-bench (Python, 1 star): Benchmark for process spawning in Rust, on Linux.
  37. pyladies-extended (Jupyter Notebook, 1 star)