  • Stars: 1,642
  • Rank: 28,464 (Top 0.6 %)
  • Language: Rust
  • License: Other
  • Created almost 6 years ago
  • Updated 4 months ago

Repository Details

A Rust port of FlameGraph

Inferno is a port of parts of the flamegraph toolkit to Rust, with the aim of improving the performance of the original flamegraph tools. The primary focus is on speeding up the stackcollapse-* tools that process output from various profiling tools into the "folded" format expected by the flamegraph plotting tool. So far, the focus has been on parsing profiling results from perf and DTrace. At the time of writing, inferno-collapse-perf is ~20x faster than stackcollapse-perf.pl and inferno-collapse-dtrace is ~20x faster than stackcollapse.pl (see compare.sh).

It is developed in part through live coding sessions, which you can find on YouTube.

Using Inferno

As a library

Inferno provides a library interface through the inferno crate. This will let you collapse stacks and produce flame graphs without going through the command line, and is intended for integration with external Rust tools like cargo-flamegraph.

As a binary

First of all, you may want to look into cargo flamegraph, which deals with much of the infrastructure for you!

If you want to use Inferno directly, then build your application in release mode and with debug symbols, and then run a profiler to gather profiling data. Once you have the data, pass it through the appropriate Inferno "collapser". Depending on your platform, this will look something like

$ # Linux
# perf record --call-graph dwarf target/release/mybin
$ perf script | inferno-collapse-perf > stacks.folded

or

$ # macOS
$ target/release/mybin &
$ pid=$!
# dtrace -x ustackframes=100 -n "profile-97 /pid == $pid/ { @[ustack()] = count(); } tick-60s { exit(0); }"  -o out.user_stacks
$ cat out.user_stacks | inferno-collapse-dtrace > stacks.folded

You can also use inferno-collapse-guess which should work on both perf and DTrace samples. In the end, you'll end up with a "folded stack" file. You can pass that file to inferno-flamegraph to generate a flame graph SVG:

$ cat stacks.folded | inferno-flamegraph > flamegraph.svg

You'll end up with an image like this:

[Image: colorized flamegraph output]
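
To make the "folded" format more concrete: each line is a call stack with its frames joined by `;` (outermost first), followed by a space and the number of samples seen for that stack. The sketch below is only an illustration of that format, not Inferno's implementation; it assumes the stacks have already been extracted from the profiler output, and the function names `fold_stacks` and `parse_folded_line` are hypothetical.

```rust
use std::collections::BTreeMap;

/// Collapse call stacks (outermost frame first) into "folded" lines:
/// frames joined by ';', then a space, then the number of samples.
/// Hypothetical helper for illustration only.
fn fold_stacks(stacks: &[Vec<&str>]) -> Vec<String> {
    // BTreeMap keeps the output sorted by stack, which makes it deterministic.
    let mut counts: BTreeMap<String, u64> = BTreeMap::new();
    for stack in stacks {
        *counts.entry(stack.join(";")).or_insert(0) += 1;
    }
    counts
        .into_iter()
        .map(|(stack, n)| format!("{} {}", stack, n))
        .collect()
}

/// Split one folded line back into its frames and its sample count.
/// Returns None if the line has no trailing numeric count.
fn parse_folded_line(line: &str) -> Option<(Vec<&str>, u64)> {
    // The count is whatever follows the last space on the line.
    let (stack, count) = line.trim_end().rsplit_once(' ')?;
    let count: u64 = count.parse().ok()?;
    Some((stack.split(';').collect(), count))
}
```

Given three sampled stacks, two of which are `main -> parse`, `fold_stacks` produces one line per distinct stack with its count, and `parse_folded_line` recovers the frames and count from such a line. The real collapsers do considerably more work, since they first have to parse the profiler-specific output.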

Obtaining profiling data

To profile your application, you'll need to have a "profiler" installed. This will likely be perf or bpftrace on Linux, and DTrace on macOS. There are some great instructions on how to get started with these tools on Brendan Gregg's CPU Flame Graphs page.

On Linux, you may need to tweak a kernel config such as

$ echo 0 | sudo tee /proc/sys/kernel/perf_event_paranoid

to get profiling to work.
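
If you would rather check the current setting before changing it, a small program can read the same file. This is a minimal sketch, assuming a Linux system where `/proc/sys/kernel/perf_event_paranoid` exists; the helper names are hypothetical, and `is_relaxed` only compares against the `0` written by the command above. Whether a particular value actually blocks profiling depends on your kernel and distribution.

```rust
use std::fs;

/// Parse the contents of /proc/sys/kernel/perf_event_paranoid,
/// which is a single (possibly negative) integer followed by a newline.
fn parse_paranoid_level(contents: &str) -> Option<i32> {
    contents.trim().parse().ok()
}

/// Read the current level. Returns None if the file is missing
/// (for example, on a non-Linux system) or cannot be parsed.
fn read_paranoid_level() -> Option<i32> {
    let contents = fs::read_to_string("/proc/sys/kernel/perf_event_paranoid").ok()?;
    parse_paranoid_level(&contents)
}

/// True if the level is at or below the 0 that the command above writes.
/// This is an assumption-laden shortcut, not a statement about kernel policy.
fn is_relaxed(level: i32) -> bool {
    level <= 0
}
```

Calling `read_paranoid_level()` and passing the result to `is_relaxed` tells you whether the setting is already at least as permissive as the value suggested above.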

Performance

Comparison to the Perl implementation

To run Inferno's performance comparison, run ./compare.sh. It requires hyperfine, and you must make sure you also check out Inferno's submodules. In general, Inferno's perf and dtrace collapsers are ~20x faster than stackcollapse-*, and the sample collapser is ~10x faster.

Benchmarks

Inferno includes criterion benchmarks in benches/. Criterion saves its results in target/criterion/ and uses them to detect changes in performance, which should make it easy to catch performance regressions while developing bugfixes and improvements.

You can run the benchmarks with cargo bench. Some results (YMMV):

My desktop computer (AMD Ryzen 5 2600X) gets (/N means N cores):

collapse/dtrace/1       time:   [8.2767 ms 8.2817 ms 8.2878 ms]
                        thrpt:  [159.08 MiB/s 159.20 MiB/s 159.29 MiB/s]
collapse/dtrace/12      time:   [3.8631 ms 3.8819 ms 3.9019 ms]
                        thrpt:  [337.89 MiB/s 339.63 MiB/s 341.28 MiB/s]

collapse/perf/1         time:   [16.386 ms 16.401 ms 16.416 ms]
                        thrpt:  [182.37 MiB/s 182.53 MiB/s 182.70 MiB/s]
collapse/perf/12        time:   [4.8056 ms 4.8254 ms 4.8460 ms]
                        thrpt:  [617.78 MiB/s 620.41 MiB/s 622.97 MiB/s]

collapse/sample         time:   [8.9132 ms 8.9196 ms 8.9264 ms]
                        thrpt:  [155.49 MiB/s 155.61 MiB/s 155.72 MiB/s]

flamegraph              time:   [16.071 ms 16.118 ms 16.215 ms]
                        thrpt:  [38.022 MiB/s 38.250 MiB/s 38.363 MiB/s]

My laptop (Intel Core i7-8650U) gets:

collapse/dtrace/1       time:   [8.3612 ms 8.3839 ms 8.4114 ms]
                        thrpt:  [156.74 MiB/s 157.25 MiB/s 157.68 MiB/s]
collapse/dtrace/8       time:   [3.4623 ms 3.4826 ms 3.5014 ms]
                        thrpt:  [376.54 MiB/s 378.58 MiB/s 380.79 MiB/s]

collapse/perf/1         time:   [15.723 ms 15.756 ms 15.798 ms]
                        thrpt:  [189.51 MiB/s 190.01 MiB/s 190.41 MiB/s]
collapse/perf/8         time:   [6.1391 ms 6.1554 ms 6.1715 ms]
                        thrpt:  [485.09 MiB/s 486.36 MiB/s 487.65 MiB/s]

collapse/sample         time:   [9.3194 ms 9.3429 ms 9.3719 ms]
                        thrpt:  [148.10 MiB/s 148.56 MiB/s 148.94 MiB/s]

flamegraph              time:   [16.490 ms 16.503 ms 16.518 ms]
                        thrpt:  [37.324 MiB/s 37.358 MiB/s 37.388 MiB/s]
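
As a reading aid for the numbers above, the following sketch derives two quantities from the reported medians: the multi-core speedup (single-core time divided by multi-core time) and the input size implied by a time and a throughput. The helper names are hypothetical, and the arithmetic assumes that criterion's `thrpt` is simply input size divided by time.

```rust
/// Ratio of single-core time to multi-core time (both in milliseconds).
fn speedup(single_ms: f64, multi_ms: f64) -> f64 {
    single_ms / multi_ms
}

/// Input size in MiB implied by a time in milliseconds and a
/// throughput in MiB/s (size = seconds * throughput).
fn implied_input_mib(time_ms: f64, thrpt_mib_s: f64) -> f64 {
    time_ms / 1000.0 * thrpt_mib_s
}
```

For the desktop perf results, `speedup(16.401, 4.8254)` comes out at roughly 3.4x on 12 cores, and `implied_input_mib(16.401, 182.53)` suggests an input of about 3 MiB. The laptop perf results give `speedup(15.756, 6.1554)`, roughly 2.6x on 8 cores.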

License

Inferno is a port of @brendangregg's awesome original FlameGraph project, written in Perl, and owes its existence and pretty much all of its functionality entirely to that project. Like FlameGraph, Inferno is licensed under the CDDL 1.0 to avoid any licensing issues. Specifically, the CDDL 1.0 grants

a world-wide, royalty-free, non-exclusive license under intellectual property rights (other than patent or trademark) Licensable by Initial Developer, to use, reproduce, modify, display, perform, sublicense and distribute the Original Software (or portions thereof), with or without Modifications, and/or as part of a Larger Work; and under Patent Claims infringed by the making, using or selling of Original Software, to make, have made, use, practice, sell, and offer for sale, and/or otherwise dispose of the Original Software (or portions thereof).

as long as the source is made available along with the license (3.1), both of which are true since you're reading this file!
