Prof

Self-contained C/C++ profiler library for Linux.

Prof offers a quick way to measure performance events (CPU clock cycles, cache misses, branch mispredictions, etc.) of C/C++ code snippets. Prof is just a wrapper around the perf_event_open system call; its main goal is to be easy to set up and painless to use for targeted optimizations, namely, when the hot spot has already been identified. Prof is in no way a replacement for a fully-fledged profiler like perf, gprof, callgrind, etc.
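
To give an idea of what a wrapper around perf_event_open has to deal with, here is a minimal sketch of counting CPU cycles with the raw system call. It is not Prof's implementation: the names measure_cycles and spin are hypothetical, and the attribute settings (user-space only, counter initially disabled) are just one reasonable choice.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper (not part of Prof): count the CPU cycles spent in
 * work() on the calling thread. Returns 0 and stores the count in *cycles,
 * or -1 if the counter is unavailable (e.g. missing permissions or PMU). */
static int measure_cycles(void (*work)(void), uint64_t *cycles)
{
    struct perf_event_attr attr;
    int fd;
    int rc = -1;

    assert(work != NULL && cycles != NULL);

    memset(&attr, 0, sizeof(attr));
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof(attr);
    attr.config = PERF_COUNT_HW_CPU_CYCLES;
    attr.disabled = 1;       /* start stopped; enabled explicitly below */
    attr.exclude_kernel = 1; /* count user-space events only */
    attr.exclude_hv = 1;

    /* pid = 0 and cpu = -1: this thread on any CPU; no group, no flags */
    fd = (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL);
    if (fd == -1)
        return -1;

    if (ioctl(fd, PERF_EVENT_IOC_RESET, 0) == 0 &&
        ioctl(fd, PERF_EVENT_IOC_ENABLE, 0) == 0) {
        work();
        /* with the default read_format, read() yields a single 64-bit value */
        if (ioctl(fd, PERF_EVENT_IOC_DISABLE, 0) == 0 &&
            read(fd, cycles, sizeof(*cycles)) == (ssize_t)sizeof(*cycles))
            rc = 0;
    }
    close(fd);
    return rc;
}

/* Hypothetical workload used only for illustration. */
static void spin(void)
{
    volatile unsigned i;
    for (i = 0; i < 1000000u; i++) {
    }
}
```

Calling measure_cycles(spin, &n) from main and printing n on success gives roughly the kind of figure the minimal example below reports; a return value of -1 usually points to the permission setup described under Setup.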

Please be aware that Prof uses __attribute__((constructor)) to make setup as straightforward as possible, so it cannot be included more than once.
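
For readers unfamiliar with the attribute, the sketch below shows how __attribute__((constructor)) behaves with GCC and Clang: the marked function runs automatically before main, so no explicit initialization call is needed. The names sketch_setup and sketch_is_ready are hypothetical and this is not Prof's code; that a header defining such setup code would clash if included twice is a plausible reading of the limitation above, not something confirmed here.

```c
#include <assert.h>

static int setup_done = 0;

/* Runs automatically before main() when built with GCC or Clang. */
__attribute__((constructor))
static void sketch_setup(void)
{
    assert(!setup_done); /* expected to run exactly once */
    setup_done = 1;
}

/* Returns nonzero once the constructor has run. */
static int sketch_is_ready(void)
{
    return setup_done;
}
```

By the time main starts, sketch_is_ready() already returns 1 without any call to sketch_setup in user code.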

Examples

Minimal

The following snippet prints the rough number of CPU clock cycles spent executing the code between the two Prof calls:

#include "prof.h"

int main()
{
    PROF_START();
    // slow code goes here...
    PROF_STDOUT();
}

Custom options

The following snippet instead counts both read and write misses (faults) of the level 1 data cache that occur in the userland code between each pair of Prof calls, accumulating them over two separate regions:

#include <inttypes.h>
#include <stdio.h>

#define PROF_USER_EVENTS_ONLY
#define PROF_EVENT_LIST \
    PROF_EVENT_CACHE(L1D, READ, MISS) \
    PROF_EVENT_CACHE(L1D, WRITE, MISS)
#include "prof.h"

int main()
{
    // one accumulator per event, in PROF_EVENT_LIST order
    uint64_t faults[2] = { 0 };

    PROF_START();
    // slow code goes here...
    PROF_DO(faults[index] += counter);

    // fast or uninteresting code goes here...

    PROF_START();
    // slow code goes here...
    PROF_DO(faults[index] += counter);

    // PRIu64 is the portable printf format for uint64_t
    printf("Total L1 faults: R = %" PRIu64 "; W = %" PRIu64 "\n", faults[0], faults[1]);
}

Installation

Just include prof.h. Here is a quick way to fetch the latest version:

wget -q https://raw.githubusercontent.com/cyrus-and/prof/master/prof.h

Setup

Since Prof uses perf_event_open, make sure you have permission to access the performance counters: either run the program as superuser (discouraged) or set the value of perf_event_paranoid appropriately, for example:

$ echo 1 | sudo tee /proc/sys/kernel/perf_event_paranoid

Optionally make it permanent with:

$ echo 'kernel.perf_event_paranoid=1' | sudo tee /etc/sysctl.d/local.conf

See man perf_event_open for more information.
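
If a program should check the setting itself before measuring, something along these lines may help. This is only a sketch: read_int_setting and print_paranoid_level are hypothetical helpers, not part of Prof, and which values are acceptable depends on the events being counted (see man perf_event_open).

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: parse one integer from a procfs-style file.
 * Returns 0 on success, -1 if the file cannot be opened or parsed. */
static int read_int_setting(const char *path, int *value)
{
    FILE *f;
    int ok;

    assert(path != NULL && value != NULL);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    ok = (fscanf(f, "%d", value) == 1);
    fclose(f);
    return ok ? 0 : -1;
}

/* Print the current perf_event_paranoid level; returns 0 on success. */
static int print_paranoid_level(void)
{
    int level;

    if (read_int_setting("/proc/sys/kernel/perf_event_paranoid", &level) != 0) {
        fprintf(stderr, "cannot read perf_event_paranoid\n");
        return -1;
    }
    printf("perf_event_paranoid = %d\n", level);
    return 0;
}
```

Calling print_paranoid_level() at startup makes a permission problem visible early instead of surfacing as a failed measurement.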

API

PROF_START()

Reset the counters and (re)start counting the events.

The events to be monitored are specified by setting the PROF_EVENT_LIST macro, before including prof.h, to a list of PROF_EVENT_* invocations; it defaults to counting the number of CPU clock cycles.

If the PROF_USER_EVENTS_ONLY macro is defined before including prof.h, then kernel and hypervisor events are excluded from the count.

PROF_EVENT(type, config)

Specify an event to be monitored; type and config are defined in the documentation of the perf_event_open system call.

PROF_EVENT_HW(config)

Same as PROF_EVENT but for hardware events; the PERF_COUNT_HW_ prefix must be omitted from config.

PROF_EVENT_SW(config)

Same as PROF_EVENT but for software events; the PERF_COUNT_SW_ prefix must be omitted from config.

PROF_EVENT_CACHE(cache, op, result)

Same as PROF_EVENT but for cache events; the prefixes PERF_COUNT_HW_CACHE_, PERF_COUNT_HW_CACHE_OP_ and PERF_COUNT_HW_CACHE_RESULT_ must be omitted from cache, op and result, respectively. As before, cache, op and result are defined in the documentation of the perf_event_open system call.
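
The sketch below illustrates how a list of such invocations could expand into (type, config) pairs. The cache config encoding (cache id, op id shifted by 8, result id shifted by 16) comes from the perf_event_open documentation; everything prefixed with SKETCH_ or sketch_ is hypothetical and only mimics the shape of PROF_EVENT_LIST, without claiming to match how Prof expands it.

```c
#include <assert.h>
#include <linux/perf_event.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical macro: build the perf_event_open config value for a cache
 * event from the unprefixed names, as (id) | (op << 8) | (result << 16). */
#define SKETCH_CACHE_CONFIG(cache, op, result)                 \
    ((uint64_t)PERF_COUNT_HW_CACHE_##cache |                   \
     ((uint64_t)PERF_COUNT_HW_CACHE_OP_##op << 8) |            \
     ((uint64_t)PERF_COUNT_HW_CACHE_RESULT_##result << 16))

struct sketch_event {
    uint32_t type;
    uint64_t config;
};

/* An event list written in the same style as PROF_EVENT_LIST. */
#define SKETCH_EVENT_LIST               \
    SKETCH_EVENT_CACHE(L1D, READ, MISS) \
    SKETCH_EVENT_CACHE(L1D, WRITE, MISS)

/* Each list entry expands to one array initializer. */
#define SKETCH_EVENT_CACHE(cache, op, result) \
    { PERF_TYPE_HW_CACHE, SKETCH_CACHE_CONFIG(cache, op, result) },

static const struct sketch_event sketch_events[] = { SKETCH_EVENT_LIST };

static size_t sketch_event_count(void)
{
    return sizeof(sketch_events) / sizeof(sketch_events[0]);
}

static const struct sketch_event *sketch_event_at(size_t i)
{
    assert(i < sketch_event_count());
    return &sketch_events[i];
}
```

The position of an entry in the list determines its position in the resulting array, which is consistent with counters being reported in list order.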

PROF_STOP()

Stop counting the events. The counter array can then be accessed with PROF_COUNTERS.

PROF_COUNTERS

Access the counter array. The order of the counters is the same as that of the events defined in PROF_EVENT_LIST. Elements of this array are 64-bit unsigned integers.

PROF_DO(block)

Stop counting the events and execute the code provided by block for each event. Within block: index refers to the position of the event in the counter array accessed through PROF_COUNTERS; counter is the actual value of the counter. index is a 64-bit unsigned integer.

PROF_CALL(callback)

Same as PROF_DO except that callback is the name of a callable object (e.g. a function) which, for each event, is called with the two parameters index and counter.

PROF_FILE(file)

Stop counting the events and write to file (a stdio.h FILE *) as many lines as there are events in PROF_EVENT_LIST. Each line contains index and counter (as defined by PROF_DO) separated by a tab character. If there is only one event then index is omitted.
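
The output format described above can be sketched as follows. sketch_write_counters is a hypothetical helper that only mirrors the documented layout (one line per event, tab-separated, index omitted for a single event); it is not taken from prof.h.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write "index<TAB>counter" per line, or just
 * "counter" when there is a single event.
 * Returns 0 on success, -1 on a write error. */
static int sketch_write_counters(FILE *file, const uint64_t *counters,
                                 size_t n)
{
    size_t i;

    assert(file != NULL && counters != NULL);
    for (i = 0; i < n; i++) {
        int rc = (n == 1)
            ? fprintf(file, "%" PRIu64 "\n", counters[i])
            : fprintf(file, "%zu\t%" PRIu64 "\n", i, counters[i]);
        if (rc < 0)
            return -1;
    }
    return 0;
}
```

Because each line is plain tab-separated text, the output can be fed directly to tools such as cut or awk when comparing runs.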

PROF_STDOUT()

Same as PROF_FILE except that file is stdout.

PROF_STDERR()

Same as PROF_FILE except that file is stderr.

License

Copyright (c) 2020 Andrea Cardaci [email protected]

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
