  • Stars
    2,754
  • Rank 15,874 (Top 0.4 %)
  • Language
    Rust
  • License
    MIT License
  • Created almost 5 years ago
  • Updated 5 months ago

Repository Details

Techniques and numbers for estimating system performance from first principles

Napkin Math

The goal of this project is to collect software, numbers, and techniques to quickly estimate the expected performance of systems from first principles. For example, how quickly can you read 1 GB of memory? By composing these resources you should be able to answer interesting questions like: how much should you expect to pay for log storage for an application serving 100,000 RPS?

The best introduction to this skill is through my talk at SRECON.

The best way to practise napkin math in the grand domain of computers is to work on your own problems. The second-best is to subscribe to this newsletter, where you'll get a problem every few weeks to practise on. It should only take you a few minutes to solve each one as your facility with these techniques improves.

The archive of problems to practise with is here. The solution to each problem is in the following newsletter.

Numbers

The numbers below are rounded from runs on a bare-metal Intel Xeon E-2236 3.4 GHz with 12 (virtual) cores.

Note 1: Some throughput and latency numbers don't line up. This is intentional, for ease of calculation.

Note 2: Take the numbers with a grain of salt. E.g. for I/O, fio is the state-of-the-art. I am continuously updating these numbers as I learn more to improve accuracy and as hardware improves.

| Operation | Latency | Throughput | 1 MiB | 1 GiB |
| --- | --- | --- | --- | --- |
| Sequential Memory R/W (64 bytes) | 0.5 ns | | | |
| -- Single Thread, No SIMD | | 10 GiB/s | 100 μs | 100 ms |
| -- Single Thread, SIMD | | 20 GiB/s | 50 μs | 50 ms |
| -- Threaded, No SIMD | | 30 GiB/s | 35 μs | 35 ms |
| -- Threaded, SIMD | | 35 GiB/s | 30 μs | 30 ms |
| Hashing, not crypto-safe (64 bytes) | 25 ns | 2 GiB/s | 500 μs | 500 ms |
| Random Memory R/W (64 bytes) | 50 ns | 1 GiB/s | 1 ms | 1 s |
| Fast Serialization [8] [9] † | N/A | 1 GiB/s | 1 ms | 1 s |
| Fast Deserialization [8] [9] † | N/A | 1 GiB/s | 1 ms | 1 s |
| System Call | 500 ns | N/A | N/A | N/A |
| Hashing, crypto-safe (64 bytes) | 500 ns | 200 MiB/s | 10 ms | 10 s |
| Sequential SSD read (8 KiB) | 1 μs | 4 GiB/s | 200 μs | 200 ms |
| Context Switch [1] [2] | 10 μs | N/A | N/A | N/A |
| Sequential SSD write, -fsync (8 KiB) | 10 μs | 1 GiB/s | 1 ms | 1 s |
| TCP Echo Server (32 KiB) | 10 μs | 4 GiB/s | 200 μs | 200 ms |
| Decompression [11] | N/A | 1 GiB/s | 1 ms | 1 s |
| Compression [11] | N/A | 500 MiB/s | 2 ms | 2 s |
| Sequential SSD write, +fsync (8 KiB) | 1 ms | 10 MiB/s | 100 ms | 2 min |
| Sorting (64-bit integers) | N/A | 200 MiB/s | 5 ms | 5 s |
| Random SSD Read (8 KiB) | 100 μs | 70 MiB/s | 15 ms | 15 s |
| Serialization [8] [9] | N/A | 100 MiB/s | 10 ms | 10 s |
| Deserialization [8] [9] | N/A | 100 MiB/s | 10 ms | 10 s |
| Proxy: Envoy/ProxySQL/Nginx/HAProxy | 50 μs | ? | ? | ? |
| Network within same region [6] | 250 μs | 100 MiB/s | 10 ms | 10 s |
| {MySQL, Memcached, Redis, ..} Query | 500 μs | ? | ? | ? |
| Random HDD Read (8 KiB) | 10 ms | 0.7 MiB/s | 2 s | 30 min |
| Network between regions [6] | Varies | 25 MiB/s | 40 ms | 40 s |
| Network NA East <-> West | 60 ms | 25 MiB/s | 40 ms | 40 s |
| Network EU West <-> NA East | 80 ms | 25 MiB/s | 40 ms | 40 s |
| Network NA West <-> Singapore | 180 ms | 25 MiB/s | 40 ms | 40 s |
| Network EU West <-> Singapore | 160 ms | 25 MiB/s | 40 ms | 40 s |

†: "Fast serialization/deserialization" is typically a simple wire protocol that just dumps bytes, or a very efficient environment. Standard serialization such as JSON will typically be of the slower kind. We include both here because serialization/deserialization is a very broad topic, with extremely different performance characteristics depending on data and implementation.
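To show how the throughput column translates into the 1 MiB and 1 GiB columns, here is a minimal sketch that divides a payload size by a sustained throughput. The helper name `transfer_secs` and the choice of rows are illustrative assumptions, not part of the benchmark suite; the throughput figures are copied from the table above.

```rust
const MIB: f64 = 1024.0 * 1024.0;
const GIB: f64 = 1024.0 * MIB;

/// Seconds needed to move `bytes` at a sustained rate of `bytes_per_sec`.
/// Hypothetical helper for illustration only.
fn transfer_secs(bytes: f64, bytes_per_sec: f64) -> f64 {
    bytes / bytes_per_sec
}

fn main() {
    // (operation, throughput in bytes per second), taken from the table.
    let rows = [
        ("Sequential memory, single thread, no SIMD", 10.0 * GIB),
        ("Sequential SSD read", 4.0 * GIB),
        ("Random SSD read", 70.0 * MIB),
    ];
    for (name, throughput) in rows {
        println!("{}: 1 GiB in ~{:.2} s", name, transfer_secs(GIB, throughput));
    }
}
```

The results will not match the table exactly, because the table rounds aggressively to keep the mental arithmetic easy.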

Run the suite with ./run, which builds with the right optimization levels. You won't get the right numbers if you compile in debug mode. You can help this project by adding new suites and filling in the blanks.

Note: I'm currently porting the benchmarks over to Criterion.rs, so some are in bench/ now. You can run those by uncommenting the relevant line in ./run.

I am aware of some inefficiencies in this suite. I intend to improve my skills in this area to ensure the numbers are the upper bound of the performance you may be able to squeeze out in production. I find it highly unlikely that any of them are more than 2-3x off, which shouldn't be a problem for most users.

Cost Numbers

Approximate numbers that should be consistent between Cloud providers.

| What | Amount | $ / Month | $ / Hour |
| --- | --- | --- | --- |
| CPU | 1 | $10 | $0.02 |
| Memory | 1 GB | $1 | |
| SSD | 1 GB | $0.1 | |
| Disk | 1 GB | $0.01 | |
| S3, GCS, .. | 1 GB | $0.01 | |
| Network | 1 GB | $0.01 | |
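As a rough illustration of how these prices compose, the sketch below adds up the monthly cost of a single machine. The machine shape (8 CPUs, 32 GB of memory, 500 GB of SSD) and the function name `monthly_cost` are assumptions made up for this example; only the unit prices come from the table.

```rust
/// Approximate monthly unit prices in dollars, from the table above.
const DOLLARS_PER_CPU: f64 = 10.0;
const DOLLARS_PER_GB_MEMORY: f64 = 1.0;
const DOLLARS_PER_GB_SSD: f64 = 0.1;

/// Ballpark monthly cost of a machine with the given resources.
/// Hypothetical helper for illustration only.
fn monthly_cost(cpus: f64, memory_gb: f64, ssd_gb: f64) -> f64 {
    cpus * DOLLARS_PER_CPU + memory_gb * DOLLARS_PER_GB_MEMORY + ssd_gb * DOLLARS_PER_GB_SSD
}

fn main() {
    // Assumed machine shape, chosen only to show the arithmetic.
    let cost = monthly_cost(8.0, 32.0, 500.0);
    println!("~${:.0} per month", cost);
}
```

With these assumed inputs the CPUs dominate the bill ($80 of roughly $160), which is the kind of observation a napkin estimate is meant to surface.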

Compression Ratios

These ratios are drawn from a few sources. [3] [4] [5] Note that compression speeds (but generally not ratios) vary by an order of magnitude depending on the algorithm and the level of compression (which trades speed for compression).

I typically ballpark that each additional 1x in compression ratio decreases throughput by 10x. E.g. we can get a 2x ratio on English Wikipedia at ~200 MiB/s, 3x at ~20 MiB/s, and 4x at 1 MB/s.

| What | Compression Ratio |
| --- | --- |
| HTML | 2-3x |
| English | 2-4x |
| Source Code | 2-4x |
| Executables | 2-3x |
| RPC | 5-10x |
| SSL | -2% [10] |
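The ballpark above (each additional 1x of ratio costs roughly 10x in speed) can be written as a small formula. This is one possible reading of that rule of thumb, anchored at the quoted 2x ratio at ~200 MiB/s; the function name and the exact exponential form are assumptions, not something the benchmarks measure.

```rust
/// Anchor point quoted in the text: about a 2x ratio at ~200 MiB/s.
const BASE_RATIO: f64 = 2.0;
const BASE_MIB_PER_SEC: f64 = 200.0;

/// Ballpark compression throughput in MiB/s for a target ratio,
/// dropping 10x for every additional 1x of ratio (assumed model).
fn compression_mib_per_sec(ratio: f64) -> f64 {
    BASE_MIB_PER_SEC / 10f64.powf(ratio - BASE_RATIO)
}

fn main() {
    for ratio in [2.0, 3.0, 4.0] {
        let speed = compression_mib_per_sec(ratio);
        // Time to compress 1 GiB (1024 MiB) at that speed.
        println!("{}x: ~{:.0} MiB/s, 1 GiB in ~{:.0} s", ratio, speed, 1024.0 / speed);
    }
}
```

At 4x this model gives about 2 MiB/s where the text quotes 1 MB/s. For napkin purposes both land in the same order of magnitude, which is all the rule is meant to deliver.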

Techniques

  • Don't overcomplicate. If you are basing your calculation on more than 6 assumptions, you're likely making it harder than it should be.
  • Keep the units. They work as a good checksum. Wolfram Alpha has terrific support if you need a hand converting e.g. KiB to TiB.
  • Calculate with exponents. A lot of back-of-the-envelope calculations are done with just coefficients and exponents, e.g. c * 10^e. Your goal is to get within an order of magnitude, and that is just e; c matters a lot less. Only worrying about single-digit coefficients and exponents makes it much easier on a napkin (not to speak of all the zeros you avoid writing).
  • Perform Fermi decomposition. Write down things you can guess at until you can start to hint at an answer. When you want to know the cost of storage for logging, you're going to want to know how big a log line is, how many of those you have per second, what that costs, and so on.
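The last two techniques can be combined in a short sketch: decompose the logging-cost question into guessable factors, and multiply them as c * 10^e values. The `Napkin` type is hypothetical, and the inputs other than the 100,000 RPS and the $0.01 per GB disk price are assumptions: one log line of about 1 KB per request, a month of about 2.6 * 10^6 seconds, 1 GB taken as 10^9 bytes, and 30-day retention so that the stored volume is one month of logs.

```rust
/// A positive napkin number: coefficient times a power of ten, c * 10^e.
/// Hypothetical type for illustration only.
#[derive(Debug, Clone, Copy)]
struct Napkin {
    c: f64,
    e: i32,
}

impl Napkin {
    fn new(c: f64, e: i32) -> Self {
        Napkin { c, e }.normalized()
    }

    /// Keep the coefficient in [1, 10) so the exponent carries the magnitude.
    fn normalized(mut self) -> Self {
        while self.c >= 10.0 {
            self.c /= 10.0;
            self.e += 1;
        }
        while self.c > 0.0 && self.c < 1.0 {
            self.c *= 10.0;
            self.e -= 1;
        }
        self
    }

    /// Multiply coefficients, add exponents.
    fn times(self, other: Napkin) -> Napkin {
        Napkin::new(self.c * other.c, self.e + other.e)
    }

    fn value(self) -> f64 {
        self.c * 10f64.powi(self.e)
    }
}

/// Fermi decomposition of the monthly log storage cost.
fn logging_cost_per_month() -> Napkin {
    let requests_per_sec = Napkin::new(1.0, 5); // 100,000 RPS, from the text
    let bytes_per_line = Napkin::new(1.0, 3); // assumed: ~1 KB, one line per request
    let secs_per_month = Napkin::new(2.6, 6); // assumed: ~30 days
    let dollars_per_byte = Napkin::new(1.0, -11); // $0.01 per GB of disk, 1 GB = 10^9 bytes
    requests_per_sec
        .times(bytes_per_line)
        .times(secs_per_month)
        .times(dollars_per_byte)
}

fn main() {
    let cost = logging_cost_per_month();
    println!("~{:.1} * 10^{} dollars per month (~${:.0})", cost.c, cost.e, cost.value());
}
```

Under these assumptions the answer is on the order of a few thousand dollars per month. Changing any single guess by 2x moves the coefficient but rarely the exponent, which is why the exponent is the part worth getting right.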

Resources
