fgtrace - The Full Go Tracer

fgtrace is an experimental profiler/tracer that captures wallclock timelines for each goroutine. It's very similar to the Chrome profiler.

⚠️ fgtrace may cause noticeable stop-the-world pauses in your applications. It is intended for dev and testing environments for now.

Quick Start

To capture an fgtrace of your program, add the one-liner shown below. It creates an fgtrace.json file in the current working directory, which you can view by opening it in the Perfetto UI.

package main

import "github.com/felixge/fgtrace"

func main() {
	defer fgtrace.Config{Dst: fgtrace.File("fgtrace.json")}.Trace().Stop()

	// <code to trace>
}

Alternatively, you can configure fgtrace as an http.Handler and request traces on demand by hitting http://localhost:1234/debug/fgtrace?seconds=30&hz=100.

package main

import (
	"net/http"
	"github.com/felixge/fgtrace"
)

func main() {
	http.DefaultServeMux.Handle("/debug/fgtrace", fgtrace.Config{})
	http.ListenAndServe(":1234", nil)
}
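If you would rather fetch a trace from code than from a browser or curl, a small client along these lines should work. This is only a sketch: the helper names traceURL and downloadTrace are made up for illustration, and the only things taken from the text above are the /debug/fgtrace path and the seconds and hz query parameters.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"os"
	"strconv"
)

// traceURL builds the request URL for the fgtrace handler shown above.
// (Hypothetical helper; only the path and query parameters come from the README.)
func traceURL(base string, seconds, hz int) string {
	q := url.Values{}
	q.Set("seconds", strconv.Itoa(seconds))
	q.Set("hz", strconv.Itoa(hz))
	return base + "/debug/fgtrace?" + q.Encode()
}

// downloadTrace requests a trace and writes the response body to path.
// The request blocks for as long as the handler is recording.
func downloadTrace(rawURL, path string) error {
	res, err := http.Get(rawURL)
	if err != nil {
		return err
	}
	defer res.Body.Close()
	if res.StatusCode != http.StatusOK {
		return fmt.Errorf("unexpected status: %s", res.Status)
	}
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = io.Copy(f, res.Body)
	return err
}

func main() {
	u := traceURL("http://localhost:1234", 5, 100)
	if err := downloadTrace(u, "fgtrace.json"); err != nil {
		fmt.Fprintln(os.Stderr, "download failed:", err)
		return
	}
	fmt.Println("wrote fgtrace.json")
}
```

The resulting file can be opened in the Perfetto UI just like the one written by the one-liner.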

For more advanced use cases, have a look at the API Documentation.

Comparison with Similar Tools

Below is a simple program that spends its time sleeping, requesting a website, capturing the response body and then hashing it a few times.

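package main

import (
	"bytes"
	"crypto/sha1"
	"io"
	"net/http"
	"time"
)

func main() {
	// Sleep ten times for 10ms each.
	for i := 0; i < 10; i++ {
		time.Sleep(10 * time.Millisecond)
	}

	// Request a website.
	res, err := http.Get("https://github.com/")
	if err != nil {
		panic(err)
	}
	defer res.Body.Close()

	// Capture the response body.
	var buf bytes.Buffer
	if _, err := io.Copy(&buf, res.Body); err != nil {
		panic(err)
	}

	// Hash the body a few times.
	for i := 0; i < 1000; i++ {
		sha1.Sum(buf.Bytes())
	}
}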

Now let's have a look at how fgtrace and other tools allow you to understand the performance of such a program.

fgtrace

Looking at our main goroutine (G1), we can easily recognize the operations of the program, their order, and how long they are taking (~100ms time.Sleep, ~65ms http.Get, ~30ms io.Copying the response and ~300ms calling sha1.Sum to hash it).

However, it's important to note that this data is captured by sampling goroutine stack traces rather than actual tracing. Therefore fgtrace does not know that there were ten time.Sleep() function calls lasting 10ms each. Instead it just merges its samples into one big time.Sleep() call that appears to take 100ms.

Another detail is the virtual goroutine state indicators on top, e.g. sleep, select, sync.Cond.Wait and running/runnable. These are not part of the real stack traces and are meant to help distinguish On-CPU activity (running/runnable) from Off-CPU states. You can disable them via configuration.

To break down the latency of our main goroutine, we can also look at other goroutines used by the program. E.g. below is a closer look at how the http.Get operation is broken down into resolving the IP address, connecting to it, and performing a TLS handshake.

So as you can see, fgtrace offers an intuitive, yet powerful way to understand the operation of Go programs. However, since it always captures the activity of all goroutines and has no information about how they communicate with each other, it may create overwhelming amounts of data in some cases.

fgprof

You can think of fgprof as a simplified version of fgtrace. Instead of capturing a timeline for each goroutine, it aggregates the same data into a single profile as shown in the flame graph below.

This means that the x-axis represents duration rather than time, so function calls are ordered alphabetically rather than chronologically. E.g. notice how time.Sleep is shown after sha1.Sum in the graph above even though it's the first operation completed by our program.

Additionally, the data of all goroutines ends up in the same graph, which can be difficult to read without a good understanding of the underlying code and the number of goroutines involved.

Despite these disadvantages, fgprof may still be useful in certain situations where the detail provided by the timeline may be overwhelming and a simpler view of the average program behavior is desirable. Additionally fgprof under Go 1.19 has less negative impact on the performance of the profiled program than fgtrace.

runtime/trace

The runtime/trace package is a true execution tracer that is capable of capturing even more detailed information than fgtrace. However, it's mostly designed to understand the decisions made by the Go scheduler. So the default timeline is focused on how goroutines are scheduled onto the CPU (processors). This means only the sha1.Sum operation stands out in green, and full stack traces can only be seen by clicking on the individual scheduler activities.

The goroutine analysis view offers a more useful breakdown. Here we can see that our goroutine is spending 271ms in Execution on CPU, but it's not clear from this view alone that this is the sha1.Sum operation. Our networking activity (http.Get and io.Copy) gets grouped into Sync block rather than Network wait because the networking is done through channels via other goroutines. And our time.Sleep activity is shown as a grey component of the bar diagram, but not explicitly listed in the table. So while a lot of information is available here, it's difficult to interpret for casual users.

Last but not least, it's possible to click on the goroutine id in the view above in order to see a timeline for the individual goroutine, as well as the other goroutines it is communicating with. However, this view is also CPU-centric, so it remains difficult to understand the sleep and networking operations of our program.

That being said, some of the limitations of runtime/trace could probably be resolved with changes to the UI (see gotraceui) or by converting the traces into a format that Perfetto UI can understand, which might be a fun project for another time.

How it Works

The current implementation of fgtrace is incredibly hacky. It calls runtime.Stack() at a regular frequency (default 100 Hz) to capture textual stack traces of all goroutines and parses them using the gostackparse package. Each call to runtime.Stack() is a blocking stop-the-world operation, so it scales very poorly to programs using ten thousand or more goroutines.

After the data is captured, it is converted into the Trace Event Format which is one of the data formats understood by Perfetto UI.

The Future

fgtrace is mostly a "Do Things that Don't Scale" kind of project. If enough people like it, it will motivate me and perhaps others to invest in putting it on a solid technical foundation.

The Go team has previously declined the idea of adding wallclock profiling capabilities similar to fgprof (which is similar to fgtrace) to the Go project and is more likely to invest in runtime/trace going forward.

That being said, I still think fgtrace can help by:

  1. Showing the usefulness of stack-trace/wallclock focused timeline views in addition to the CPU-centric views used by runtime/trace.
  2. Starting a conversation (link to GH issue will follow ...) to offer more powerful goroutine profiling APIs to allow user-space tooling like this to thrive without having to hack around the existing APIs while reducing their overhead.

License

fgtrace is licensed under the MIT License.
