  • Stars: 4,420
  • Rank 9,685 (Top 0.2 %)
  • Language: OCaml
  • License: MIT License
  • Created almost 3 years ago
  • Updated 9 months ago



magic-trace

Overview

magic-trace collects and displays high-resolution traces of what a process is doing. People have used it to:

  • figure out why an application running in production handles some requests slowly while simultaneously handling a sea of uninteresting requests,
  • look at what their code is actually doing instead of what they think it's doing,
  • get a history of what their application was doing before it crashed, instead of a mere stacktrace at that final instant,
  • ...and much more!

magic-trace:

  • has 2%-10% overhead,
  • doesn't require application changes to use,
  • traces every function call with ~40ns resolution, and
  • renders a timeline of call stacks going back (a configurable) ~10ms.

You use it like perf: point it to a process and off it goes. The key difference from perf is that instead of sampling call stacks throughout time, magic-trace uses Intel Processor Trace to snapshot a ring buffer of all control flow leading up to a chosen point in time [1]. Then, you can explore an interactive timeline of what happened.

You can point magic-trace at a function such that when your application calls it, magic-trace takes a snapshot. Alternatively, attach it to a running process and detach it with Ctrl+C to see a trace of an arbitrary point in your program.

Testimonials

"Magic-trace is one of the simplest command-line debugging tools I have ever used."

  • Francis Ricci, Jane Street

"Magic-trace is not just for performance. The tool gives insight directly into what happens in your program, when, and why. Consider using it for all your introspective goals!"

  • Andrew Hunter, Jane Street

"I use perf a ton, and I think that both perf and magic-trace give perspectives that the other doesn't. The benefit I got from magic-trace was entirely based on the fact that it works in slices at any zoom level, so I was able to see all the function calls that a 70ns function was performing, which was invisible in perf."

  • Doug Patti, Jane Street

more testimonials...

Install

  1. Make sure the system you want to trace is supported. The constraints that most commonly trip people up are: VMs are mostly not supported, Intel only (Skylake [2] or later), Linux only.

  2. Grab a release binary from the latest release page.

    1. If downloading the prebuilt binary (not the package), run chmod +x magic-trace [3]
    2. If downloading the package, run sudo dpkg -i magic-trace*.deb

    Then, test it by running magic-trace -help, which should bring up some help text.

Getting started

  1. Here's a sample C program to try out. It's a slightly modified version of the example in man 3 dlopen. Download that, build it with gcc demo.c -ldl -o demo, then leave it running with ./demo. We're going to use that program to learn how dlopen works.

  2. Run magic-trace attach -pid $(pidof demo). When you see the message that it's successfully attached, wait a couple seconds and Ctrl+C magic-trace. It will output a file called trace.fxt in your working directory.

  3. Open magic-trace.org, click "Open trace file" in the top-left-hand corner and give it the trace file generated in the previous step.

  4. That should have expanded into a trace. Zoom in until you can see an individual loop through dlopen/dlsym/cos/printf/dlclose.
    • W zooms into wherever your mouse cursor is pointed (you'll need to zoom in a bunch to see anything useful),
    • S zooms out,
    • A moves left,
    • D moves right, and
    • scroll wheel moves your viewport up and down the stack. You'll only need to scroll to see particularly deep stack traces, so it's probably not useful for this example.

  5. Click and drag on the white space around the call stacks to measure. Plant flags by clicking in the timeline along the top. Using the measurement tool, measure how long it takes to run cos. On my screen it takes ~5.7us.

Congratulations, you just magically traced your first program!

In contrast to traditional perf workflows, magic-trace excels at hypothesis generation. For example, you might notice that taking 6us to run cos is a really long time! If you zoom in even more, you'll see that there are actually five pink "[untraced]" cells in there. If you re-run magic-trace with root and pass it -trace-include-kernel, you'll see stacktraces for those. They're page fault handlers! The demo program actually calls cos twice. If you zoom in even more near the end of the 6us cos call, you'll see that the second call takes far less time and does not page fault.

How to use it

magic-trace continuously records control flow into a ring buffer. Upon some sort of trigger, it takes a snapshot of that buffer and reconstructs call stacks.
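
To make the ring-buffer idea concrete, here is a minimal conceptual sketch in C. It is not magic-trace's implementation (the real buffer is filled by Intel PT and decoded via perf); the names `ring_t`, `event_t`, `ring_record`, and `ring_snapshot`, and the tiny capacity, are hypothetical and chosen only for illustration. It shows the two properties the paragraph above relies on: new events overwrite the oldest ones, and a snapshot returns the most recent history in order.

```c
#include <assert.h>
#include <stddef.h>

#define RING_CAPACITY 4 /* tiny on purpose; a real trace buffer is far larger */

/* Hypothetical record for one control-flow event (e.g. a branch target). */
typedef struct {
    unsigned long address;
} event_t;

typedef struct {
    event_t slots[RING_CAPACITY];
    size_t next;  /* index the next event will be written to */
    size_t count; /* number of valid events, at most RING_CAPACITY */
} ring_t;

/* Record an event, overwriting the oldest one once the buffer is full. */
void ring_record(ring_t *r, unsigned long address) {
    r->slots[r->next].address = address;
    r->next = (r->next + 1) % RING_CAPACITY;
    if (r->count < RING_CAPACITY) {
        r->count++;
    }
}

/* Copy the buffered events into out, oldest first.
   Returns the number of events copied. */
size_t ring_snapshot(const ring_t *r, event_t *out, size_t out_len) {
    assert(out_len >= r->count);
    size_t oldest = (r->next + RING_CAPACITY - r->count) % RING_CAPACITY;
    for (size_t i = 0; i < r->count; i++) {
        out[i] = r->slots[(oldest + i) % RING_CAPACITY];
    }
    return r->count;
}
```

Recording six events into this four-slot buffer and then taking a snapshot yields only the last four, which mirrors why a trace covers a bounded window of time leading up to the trigger.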

There are two ways to take a snapshot:

  1. We just did this one: Ctrl+C magic-trace. If magic-trace terminates without already having taken a snapshot, it takes a snapshot of the end of the program.

  2. You can also trigger snapshots when the application calls a function. To do so, pass magic-trace the -trigger flag.

    • -trigger ? brings up a fuzzy-finding selector that lets you choose from all symbols in your executable,
    • -trigger SYMBOL selects a specific, fully mangled, symbol you know ahead of time, and
    • -trigger . selects the default symbol magic_trace_stop_indicator.

Stop indicators are powerful. Here are some ideas for where you might want to place one:

  • If you're using an asynchronous runtime, any time a scheduler cycle takes too long.
  • In a server, when a request takes a surprisingly long time.
  • After the garbage collector runs, to see what it's doing and what it interrupted.
  • After a compiler pass has completed.

You may leave the stop indicator in production code. It doesn't need to do anything in particular; magic-trace just needs the name. It is just an empty, but not inlined, function. It will cost ~10us to call, but only when magic-trace actually uses it to take a snapshot.
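
As a rough illustration of the "slow request" idea, the sketch below defines a stop indicator in C and calls it only when a measured duration exceeds a budget. The symbol name `magic_trace_stop_indicator` comes from the text above; everything else (`now_ns`, `report_request_duration`, the budget parameter) is a hypothetical helper, and `__attribute__((noinline))` plus the empty `asm` statement assume GCC or Clang.

```c
#define _POSIX_C_SOURCE 199309L /* for clock_gettime */
#include <stdint.h>
#include <time.h>

/* Empty marker function. noinline keeps it as a real symbol, and the empty
   asm statement discourages the compiler from discarding calls to it. */
__attribute__((noinline)) void magic_trace_stop_indicator(void) {
    __asm__ volatile("");
}

/* Hypothetical helper: monotonic clock reading in nanoseconds. */
int64_t now_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000LL + (int64_t)ts.tv_nsec;
}

/* Hypothetical helper: call after handling a request. Hits the stop
   indicator and returns 1 if the request ran over budget, else returns 0. */
int report_request_duration(int64_t elapsed_ns, int64_t budget_ns) {
    if (elapsed_ns > budget_ns) {
        magic_trace_stop_indicator();
        return 1;
    }
    return 0;
}
```

A caller would read `now_ns()` before and after handling a request and pass the difference to `report_request_duration`. Per the flag description above, running magic-trace with -trigger . should then select this symbol as the snapshot trigger.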

Documentation

More documentation is available on the magic-trace wiki.

Discussion

Join us on Discord to chat synchronously, or the GitHub discussion group to do so asynchronously.

Contributing

If you'd like to contribute:

  1. read the build instructions,
  2. set up your editor,
  3. take a quick tour through the codebase, then
  4. hit up the issue tracker for a good starter project.

Privacy policy

magic-trace does not send your code or derivatives of your code (including traces) anywhere.

magic-trace.org is a lightly modified fork of Perfetto, and runs entirely in your browser. As far as we can tell, it does not send your trace anywhere. If you're worried about that changing one day, set up your own local copy of the Perfetto UI and use that instead.

Acknowledgements

Tristan Hume is the original author of magic-trace. He wrote it while working at Jane Street, who currently maintains it.

Intel PT is the foundational technology upon which magic-trace rests. We'd like to thank the people at Intel for their years-long efforts to make it available, despite its slow uptake in the greater software community.

magic-trace would not be possible without perf's extensive support for Intel PT. perf does most of the work in interpreting Intel PT's output, and magic-trace likely wouldn't exist were it not for their efforts. Thank you, perf developers.

magic-trace.org is a fork of Perfetto, with minor modifications. We'd like to thank the people at Google responsible for it. It's a high quality codebase that solves a hard problem well.

The ideas behind magic-trace are in no way unique. We've written down a list of prior art that has influenced its design.

Footnotes

  1. perf can do this too, but that's not how most people use it. In fact, if you peek under the hood you'll see that magic-trace uses perf to drive Intel PT. ↩

  2. Strictly speaking, anything newer than Broadwell, but this is not a platform we regularly test on, and timing resolution is worse (~1us). ↩

  3. https://github.com/actions/upload-artifact/issues/38 ↩
