pq - Parse and Query log files as time series


Project is actively being developed!

Why

I often find myself staring at Nginx or Envoy access logs flooding my screens with real-time data. My only wish at that moment is to be able to aggregate these lines somehow and analyze the output at a slower pace. Ideally, with some familiar and concise query language. Something like this would do:

tail -f /var/log/nginx/access.log | \
  pq 'nginx:combined | select sum(sum_over_time(content_len{status_code=~"2.."}[1s])) by (method) / 1024'

How

The idea is pretty straightforward - most of the log files around are essentially time series. If we could parse an input stream into a series of structured records, we would be able to query the derived stream with PromQL-like expressions.

pq reads the input stream line by line, applies some decoding and mapping, and produces such a stream of structured records.

Simply put, pq turns lines into key-value objects (dictionaries). While keys are always strings, values can be of the following types:

  • metric (or tag) - entries with lower cardinality
  • value (or field) - entries with higher cardinality
  • timestamp - the one that makes the input stream a time series.
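
For illustration, here is how a single access log line could conceptually map to such a record (the field roles below follow the examples used later in this README):

172.17.0.1 - - [01/Jan/2021:00:00:01 +0000] "GET / HTTP/1.1" 200 612

# conceptually becomes
{
  time:        01/Jan/2021:00:00:01 +0000,  # timestamp - makes the stream a time series
  method:      "GET",                       # metric (tag) - low cardinality
  status_code: "200",                       # metric (tag) - low cardinality
  content_len: 612,                         # value (field) - high cardinality
}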

Having a stream of timestamped records, pq can query it with its own query language. The query language and the query execution model are highly influenced by Prometheus. The query results can be printed with one of the supported formatters (human-readable, JSON, Prometheus API) or displayed on the screen in an interactive way.

Usage

Interactive:

docker logs -f nginx | pq -i '
/[^\[]+\[([^]]+)].+?\s+"([^\s]+)[^"]*?"\s+(\d+)\s+(\d+).*/
| map { .0:ts, .1 as method, .2:str as status_code, .3 as content_len }
| select sum(count_over_time(__line__[1s])) by (method)'

For further analysis (JSON):

docker logs nginx | pq '
/[^\[]+\[([^]]+)].+?\s+"([^\s]+)[^"]*?"\s+(\d+)\s+(\d+).*/
| map { .0:ts, .1 as method, .2:str as status_code, .3 as content_len }
| select count_over_time(__line__[1s])
| to_json' > result.jsonl

You can also visualize JSON results using the simplistic plotting utility:

[RPS plot]

A better usage example is under construction... See this article for some screencasts.

Installation

For now only the following method is supported:

cargo install --git https://github.com/iximiuz/pq

It requires Cargo and Rust, and it should work on any platform supported by the Rust ecosystem.

Eventually, more installation methods will be added (brew, apt, dnf, etc).

Documentation

pq accepts a program as its only required argument. A program must start with a decoder clause, which can be followed by a mapper clause and then by a query clause. An optional formatter can be applied at the end:

pq '<decoder>'
pq '<decoder> | <formatter>'
pq '<decoder> | map <mapper>'
pq '<decoder> | select <query>'
pq '<decoder> | map <mapper> | select <query>'
pq '<decoder> | map <mapper> | select <query> | <formatter>'

Decoders

Currently supported input decoders:

  • regex /.../ - uses a regex with match groups to split lines into fields
  • JSON json - expects a JSONL input stream (see the sketch below)
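
For instance, a JSONL stream could be decoded and queried like this (a sketch; the time, method, and len field names are assumptions about the input):

docker logs app | pq 'json
| map { .time:ts, .method, .len:num as content_len }
| select sum(sum_over_time(content_len[1s])) by (method)'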

Decoders coming soon:

  • CSV
  • logfmt (aka scanf)
  • Prometheus
  • InfluxDB
  • Nginx
  • Apache
  • Envoy
  • etc...

Mappers

The result of decoding is a stream of raw entries. Depending on the decoder and the input stream, an entry can be a tuple or a dictionary. The following syntax is used to map an entry to a full-fledged record that can then be used at the query stage.

...for a tuple entry:

<decoder> | map { .0, .1, .3  }          // picks the first, second, and fourth elements of the tuple
                                         // produces the following object: { f0: <val>, f1: <val>, f3: <val> }

<decoder> | map { .0 as foo, .1 as bar } // produces object { foo: <val>, bar: <val> }

...for a dictionary entry:

<decoder> | map { .foo, .bar  }  // filters out all other fields
                                 // produces the following object: { foo: <val>, bar: <val> }

<decoder> | map { .foo as qux }  // produces object { qux: <foo's val> }

Record fields are strongly typed. Entry fields may or may not be typed. Suffixing a field name with :str, :num, or :ts applies dynamic type casting:

<decoder> | map { .foo:str as qux, .bar:num as abc, .baz:ts }

The timestamp type also supports an optional format specifier: :ts [optional format like "%Y-%m-%d"]. If the format of a timestamp field is not provided, pq will try its best to guess the format based on the input samples.
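
For example, nginx-style timestamps could be parsed with an explicit format (a sketch; the exact strftime pattern is an assumption):

<decoder> | map { .0:ts "%d/%b/%Y:%H:%M:%S %z", .1 as method }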

Query language

The query language is heavily influenced by PromQL. Hopefully, existing PromQL skills are fully transferable.

Normally, a query starts with a metric selector:

  • body_bytes - matches all records with the body_bytes metric.
  • body_bytes{method="GET"} - takes only GET requests.
  • body_bytes{method!="GET", status_code=~"5.."} - takes failed non-GET requests.

A query is executed with a given frequency (1 second by default), and a plain selector returns the most recent sample from the stream. To get multiple samples, a time duration can be appended:

  • body_bytes[1s] - returns one-second buckets of samples
  • body_bytes{status_code!="200"}[1h30m15s5ms] - returns all non-200 records for the past ~1h30m.

An operator or a function can be applied to a selector.

Supported operators:

  • arithmetic + - / * ^ %: body_bytes{method="GET"} + body_bytes{method="POST"} or body_bytes{} / 1024
  • comparison == != <= < >= >: body_bytes{} > 1000
  • aggregation avg() bottomk() count() group() max() min() sum() topk(): min(body_bytes)
  • coming soon - more aggregations quantile() stddev() stdvar()
  • coming soon - logical and unless or

Supported functions:

  • avg_over_time(selector[duration])
  • count_over_time(selector[duration])
  • last_over_time(selector[duration])
  • min_over_time(selector[duration])
  • max_over_time(selector[duration])
  • sum_over_time(selector[duration])
  • coming soon - other well-known functions...

Most of these expressions can be combined. For example:

sum(sum_over_time(content_len[1s])) by (method) / 1024
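
Reading this expression inside out:

sum_over_time(content_len[1s])    # range function: per-series sums over 1-second buckets
sum(...) by (method)              # aggregation: collapses series, keeping only the method label
... / 1024                        # scalar arithmetic: bytes -> kilobytes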

Formatters

Currently supported output formatters:

  • human-readable (implicit, used by default)
  • JSON to_json
  • Prometheus API-like to_promapi (see the example below)
  • interactive via -i flag.
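
For example, the pipeline from the Usage section could emit Prometheus API-like output instead of JSON (a sketch reusing the same decoder and mapper):

docker logs nginx | pq '
/[^\[]+\[([^]]+)].+?\s+"([^\s]+)[^"]*?"\s+(\d+)\s+(\d+).*/
| map { .0:ts, .1 as method, .2:str as status_code, .3 as content_len }
| select count_over_time(__line__[1s])
| to_promapi' > result.json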

Formatters coming soon:

  • PromQL

Command-line flags and options

pq also accepts some optional command-line flags and named arguments:

FLAGS:
    -i, --interactive
    -v, --verbose

OPTIONS:
    -I, --interval <interval>  # same meaning as in Prometheus
    -b, --lookback <lookback>  # same meaning as in Prometheus
    -s, --since <since>
    -u, --until <until>
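
For instance, a saved log could be analyzed over a fixed time window (a sketch; the timestamp format accepted by --since/--until is an assumption):

pq -s '2021-01-01T00:00:00' -u '2021-01-01T01:00:00' '
/[^\[]+\[([^]]+)].+?\s+"([^\s]+)[^"]*?"\s+(\d+)\s+(\d+).*/
| map { .0:ts, .1 as method, .2:str as status_code, .3 as content_len }
| select count_over_time(__line__[1s])' < access.log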

Interactive Mode Demo

The setup consists of a web server and a number of concurrent clients generating traffic.

# Launch a test web server.
docker run -p 55055:80 --rm --name test_server nginx

# In another terminal, start pouring some well-known but diverse traffic.
# Note: `-q` sets the per-worker query rate (QPS) and `-c` the number of concurrent workers.
hey -n 1000000 -q 80 -c 2 -m GET http://localhost:55055/ &
hey -n 1000000 -q 60 -c 2 -m GET http://localhost:55055/qux &
hey -n 1000000 -q 40 -c 2 -m POST http://localhost:55055/ &
hey -n 1000000 -q 20 -c 2 -m PUT http://localhost:55055/foob &
hey -n 1000000 -q 10 -c 2 -m PATCH http://localhost:55055/ &

The access log in the first terminal looks impossible to analyze in real time, right? Interactive pq mode to the rescue!

Per-second HTTP request rate with a by (method, status_code) breakdown

docker logs -n 1000 -f test_server | \
    pq '/[^\[]+\[([^]]+)]\s+"([^\s]+)[^"]*?"\s+(\d+)\s+(\d+).*/
        | map { .0:ts, .1 as method, .2:str as status_code, .3 as content_len } 
        | select count_over_time(__line__[1s])' \
    -i

[RPS screencast]

Per-second traffic (in KB/s) aggregated by method

A slightly more advanced query - aggregate by HTTP method only:

docker logs -n 1000 -f test_server | \
    pq '/[^\[]+\[([^]]+)]\s+"([^\s]+)[^"]*?"\s+(\d+)\s+(\d+).*/
        | map { .0:ts, .1 as method, .2:str as status_code, .3 as content_len } 
        | select sum(sum_over_time(content_len[1s])) by (method) / 1024' \
    -i

[BPS screencast]

For more use cases, see the tests/scenarios folder.

Development

Contributions are always welcome!

# Build it with
make

# Test it with
make test-all
make test-e2e

# Run a certain e2e test
E2E_CASE=vector_matching_one_to_one_010 make test-e2e

Glossary

  • Time series - a stream of timestamped values (aka samples) sharing the same metric name and, optionally, the same set of labels (i.e., a unique combination of key-value pairs).
  • Metric name - a human-readable name of a measurement. E.g., http_requests_total, content_length, etc.
  • Metric type - counter, gauge, histogram, and summary.
  • Label - a dimension of the measurement. E.g. method, url, etc.
  • Sample - aka data point - a (value, timestamp) tuple. The value is always a float64, and the timestamp always has millisecond precision.
  • Instant vector - a type of expression evaluation - a set of time series (vector) containing a single sample for each time series, all sharing the same timestamp.
  • Range vector - a type of expression evaluation - a set of time series containing a range of data points over time for each time series.
  • Scalar and string - two other expression evaluation results.
  • Vector selector - an expression of the form <metric_name>[{label1=value1[, label2=value2, ...]}][[time_duration]].
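
A few concrete instances of this form, composed from the examples above:

body_bytes                         # metric name only
body_bytes{method="GET"}           # with a label matcher
body_bytes{method="GET"}[5m]       # with a time duration (range vector)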
