  • Stars: 115
  • Rank 305,916 (Top 7 %)
  • Language: JavaScript
  • License: Other
  • Created almost 13 years ago
  • Updated over 1 year ago

Repository Details

Binary data pattern matching

Byte-wise matching for Node.JS

Gives a compact syntax for parsing and constructing byte buffers, derived from Erlang's bit syntax.

var bitsyntax = require('bitsyntax');
var pattern = bitsyntax.matcher('len:8/integer, str:len/binary');
var bound = pattern(Buffer.from([4, 0x41, 0x42, 0x43, 0x44]));
bound.str
// => <Buffer 41 42 43 44>

A typical use of this is parsing byte streams from sockets. For example, size-prefixed frames:

var framePattern = bitsyntax.matcher('len:32/integer, frame:len/binary, rest/binary');
socket.on('data', function process(data) {
  var m;
  if (m = framePattern(data)) {
    emit('frame', m.frame);
    process(m.rest);
  }
  else {
    stashForNextData(data);
  }
});
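
The snippet above leaves emit and stashForNextData undefined. As a rough sketch of what the buffering might look like, the following reproduces the same framing by hand with Node's built-in Buffer methods instead of bitsyntax, so that it runs without the library installed. The names makeFrameSplitter, onFrame and push are hypothetical and not part of bitsyntax.

```javascript
// Hand-written equivalent of 'len:32/integer, frame:len/binary, rest/binary',
// plus the buffering that stashForNextData() stands for in the text.
// makeFrameSplitter, onFrame and push are illustrative names only.
function makeFrameSplitter(onFrame) {
  let stash = Buffer.alloc(0); // bytes received but not yet part of a complete frame

  return function push(chunk) {
    let data = Buffer.concat([stash, chunk]);
    // Loop rather than recurse, so many small frames cannot exhaust the stack.
    while (data.length >= 4) {
      const len = data.readUInt32BE(0);   // len:32/integer (big-endian, unsigned)
      if (data.length < 4 + len) break;   // frame:len/binary is not complete yet
      onFrame(data.subarray(4, 4 + len)); // frame
      data = data.subarray(4 + len);      // rest/binary
    }
    stash = data; // keep the incomplete tail for the next chunk
  };
}

const frames = [];
const push = makeFrameSplitter((frame) => frames.push(frame.toString('latin1')));
push(Buffer.from([0, 0, 0, 2, 0x68]));       // header plus half of a two-byte frame
push(Buffer.from([0x69, 0, 0, 0, 1, 0x21])); // the other half, then a complete frame
console.log(frames); // [ 'hi', '!' ]
```

The stash is prepended to each new chunk before matching, which is the step the original snippet only hints at.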

Patterns can also be used to construct byte buffers from supplied values:

var spdyDataFrame = require('bitsyntax')
  .builder('streamId:32, flags:8, length:24, data/binary');

// bin is assumed to be a Buffer holding the frame payload
spdyDataFrame({streamId:5, flags:0, length:bin.length, data:bin});
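
To make the resulting byte layout concrete, here is a minimal sketch that assembles the same fields by hand with Node's Buffer methods. It does not call bitsyntax; buildDataFrame is a hypothetical helper, and the layout assumes the defaults described later in this document (unsigned, big-endian integers).

```javascript
// Hand-written equivalent of 'streamId:32, flags:8, length:24, data/binary'.
// buildDataFrame is an illustrative name, not part of bitsyntax.
function buildDataFrame(streamId, flags, data) {
  const header = Buffer.alloc(8);
  header.writeUInt32BE(streamId, 0);      // streamId:32
  header.writeUInt8(flags, 4);            // flags:8
  header.writeUIntBE(data.length, 5, 3);  // length:24 (three bytes, big-endian)
  return Buffer.concat([header, data]);   // data/binary
}

const frame = buildDataFrame(5, 0, Buffer.from('hey'));
console.log(frame);
// <Buffer 00 00 00 05 00 00 00 03 68 65 79>
```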

One or more segments of a pattern may also be supplied in multiple arguments, if that is more convenient; this makes it easier to split a long pattern over lines:

var p = bitsyntax.matcher('size:8, payload:size/binary',
                          'rest/binary');

API

matcher

Compiles a pattern, given as a string (or several strings), into a function. Given a buffer and optionally an environment, that function returns either a map of bindings or false. The environment contains values for bound variables in the pattern (if there are any).

var p = bitsyntax.matcher('header:headerSize/binary, rest/binary');
var b = p(Buffer.from([1, 2, 3, 4, 5]), {headerSize: 3});
b.header
// => <Buffer 01 02 03>

A matcher will return false if the supplied buffer does not match the pattern; for example, if it has too few bytes, or a literal is not present.

var p = bitsyntax.matcher('"foo=", str/binary');
p(Buffer.from("bar=humbug"));
// => false
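
For intuition about the two failure cases mentioned (too few bytes, or a missing literal), the same check can be written out by hand. This is only a sketch using Node's Buffer API, not how bitsyntax is implemented; matchFooEquals is a hypothetical name.

```javascript
// Hand-written equivalent of the pattern '"foo=", str/binary'.
// matchFooEquals is an illustrative name, not part of bitsyntax.
function matchFooEquals(buf) {
  const literal = Buffer.from('foo=', 'utf8');
  if (buf.length < literal.length) return false;                        // too few bytes
  if (!buf.subarray(0, literal.length).equals(literal)) return false;   // literal not present
  return { str: buf.subarray(literal.length) };                         // str/binary: the remainder
}

const miss = matchFooEquals(Buffer.from('bar=humbug')); // false
const hit = matchFooEquals(Buffer.from('foo=humbug'));  // { str: <Buffer 68 75 6d 62 75 67> }
console.log(miss, hit && hit.str.toString('utf8'));
```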

parse and match

When composed, equivalent to matcher; may be useful if you want to examine the internal structure of patterns.

parse takes strings as for matcher, and returns the internal representation of the pattern. match takes this representation, a buffer, and optionally an environment, and returns the bindings or false (as with matcher).

var p = bitsyntax.parse('header:headerSize/binary',
                        'rest/binary');
var b = bitsyntax.match(p, Buffer.from([1, 2, 3, 4, 5]),
                        {headerSize: 3});
b.header
// => <Buffer 01 02 03>

builder

Takes a pattern and returns a function that will construct a byte buffer, given values for the variables mentioned in the pattern.

var cons = bitsyntax.builder('size:8, bin/binary');
cons({size:6, bin:Buffer.from('foobar')});
// => <Buffer 06 66 6f 6f 62 61 72>

Patterns supplied to builders are slightly different to patterns supplied for matching, as noted below.

build

Takes a parsed pattern and a map of variable values, and returns a buffer. As with match, may be useful to debug patterns.

var pattern = bitsyntax.parse('size:8, bin:size/binary');
bitsyntax.build(pattern, {size:6, bin: Buffer.from('foobar')});
// => <Buffer 06 66 6f 6f 62 61 72>

write

Writes variable values into a buffer, at an offset, according to the parsed pattern given. Returns the finishing offset, i.e., the supplied offset plus the number of bytes written.

var pattern = bitsyntax.parse('size:8, bin/binary');
var buf = Buffer.alloc(7);
bitsyntax.write(buf, 0, pattern,
                {size:6, bin: Buffer.from('foobar')});
// => 7
buf
// => <Buffer 06 66 6f 6f 62 61 72>

Patterns

Patterns are sequences of segments, each matching a value. Segments have the general form

 value:size/type_specifier_list

The size and type specifier list may be omitted, giving three extra variations:

value
value:size
value/type_specifier_list

The type specifier list is a list of keywords separated by hyphens. Type specifiers are described below.

Patterns are generally supplied as strings, with a comma-separated series of segments.

Variable or value

The first part of a segment gives a variable name or a literal value. If a variable name is given, the value matched by the segment will be bound to that variable name for the rest of the pattern. If a literal value is given, the matched value must equal that value. If a variable's value is given in the environment, the matched value must equal the provided value.

When used in a builder, the literal value will be copied into the buffer according to the type it is given. A variable name indicates a slot into which a value supplied to the builder will be copied.

The special variable name _ discards the value matched; i.e., it simply skips over the appropriate number of bits in the input. '_' is not allowed in builder patterns.

Size and unit

The size of a segment is given following the value or variable, separated with a colon:

foo:32

The unit is given in the list of specifiers as 'unit' and an integer from 0..256, separated by a colon:

foo:4/integer-unit:8

The size is the number of units in the value; the unit is given as a number of bits. Unit can be of use, for example, when you want to match integers of a number of bytes rather than a number of bits.

For integers and floats, the default unit is 1 bit; to keep things aligned on byte boundaries, unit * size must currently be a multiple of 8. For binaries the default unit is 8, and the unit must be a multiple of 8.

If the size is omitted and the type is integer, the size defaults to 8. If the size is omitted and the type is binary, the segment will match all remaining bytes in the input; such a segment may only be used at the end of a pattern, when matching.

The size may also be given as an integer variable matched earlier in the pattern, as in the example given at the top. When constructing, a size may be a variable referring to the supplied environment.

In builders, numbers will be rounded, masked or padded to fit the size and units given; for example, '256:8' gives the binary Buffer<00> because the lowest eight bits are 0; '255:16' gives the binary Buffer<00 ff>.
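
The arithmetic behind those two results can be checked with plain Node Buffer calls. This sketch does not use a bitsyntax builder; it only illustrates the masking and padding described, on the assumption that the builder behaves as stated.

```javascript
// '256:8': keep only the lowest eight bits of 256 (0x100), which are all zero.
const masked = Buffer.alloc(1);
masked.writeUInt8(256 & 0xff, 0);

// '255:16': 255 fits in one byte, so the 16-bit big-endian field
// is padded with a leading zero byte.
const padded = Buffer.alloc(2);
padded.writeUInt16BE(255, 0);

console.log(masked, padded);
// <Buffer 00> <Buffer 00 ff>
```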

Type name specifier

One of integer, binary, string, float. If not given, the default is integer.

An integer is a big- or little-endian, signed or unsigned integer. Integers up to 32 bits are supported. Signed integers are in two's complement format. In JavaScript, only integers between -(2^53) and 2^53 can be represented exactly, and bitwise operators are only defined on 32-bit signed integers.
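
Both JavaScript limits are easy to observe directly. The following lines use only standard language features and are independent of bitsyntax:

```javascript
// Above 2^53, adjacent integers collapse to the same double-precision value.
const limit = Math.pow(2, 53);
const collides = (limit === limit + 1);        // true

// Bitwise operators first convert their operands to 32-bit signed integers.
const wrapped = 0xffffffff | 0;                // -1
const truncated = Math.pow(2, 32) | 0;         // 0

// An unsigned right shift by zero reinterprets the 32-bit result as unsigned.
const unsignedAgain = (0xffffffff | 0) >>> 0;  // 4294967295

console.log(collides, wrapped, truncated, unsignedAgain);
```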

A binary is simply a byte buffer; usually this will result in a slice of the input buffer being returned, so beware mutation.
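
The mutation hazard comes from shared memory between a slice and its source. A small sketch with Node's Buffer API alone (subarray stands in here for the slice a matcher would usually return, which is an assumption about the library's behaviour based on the sentence above):

```javascript
const input = Buffer.from([4, 0x41, 0x42, 0x43, 0x44]);
const view = input.subarray(1, 5); // shares memory with input
const copy = Buffer.from(view);    // independent copy of the same bytes

input[1] = 0x5a; // mutate the original buffer after "matching"

console.log(view.toString('latin1')); // ZBCD (the view sees the change)
console.log(copy.toString('latin1')); // ABCD (the copy does not)
```

Copying a matched binary before the input buffer is reused avoids this kind of surprise.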

A string is a UTF8 string consisting of the given number of bytes.

A float is a 32- or 64-bit IEEE754 floating-point value (this is the standard JavaScript uses, as do Java and Erlang).

Endianness specifier

Integers may be big- or little-endian; this refers to which 'end' of the bytes making up the integer is most significant. In network protocols integers are usually big-endian, meaning the first (left-most) byte is the most significant, but this is not always the case.

A specifier of big means the integer will be parsed (or written into the result) as big-endian, and little means the integer will be parsed or written as little-endian. The default is big-endian.
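
The difference is easiest to see by reading the same four bytes both ways. This uses Node's built-in Buffer readers rather than bitsyntax, purely as an illustration of the two byte orders:

```javascript
const fourBytes = Buffer.from([0, 0, 0, 54]);

// Big-endian: the first byte is the most significant.
const asBig = fourBytes.readUInt32BE(0);    // 54

// Little-endian: the last byte is the most significant, so 54 lands in the top byte.
const asLittle = fourBytes.readUInt32LE(0); // 54 * 2^24 = 905969664

console.log(asBig, asLittle);
```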

Signedness specifier

Integer segments may include a specifier of signed or unsigned. A signed integer is parsed as two's complement format. The default is unsigned.

Signedness is ignored in builders.
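
As an illustration of what the signed specifier changes when parsing, the same bytes can be read both ways with Node's Buffer readers (this sketch does not call bitsyntax):

```javascript
const raw = Buffer.from([0xff, 0xff, 0xff, 0xfe]);

const asUnsigned = raw.readUInt32BE(0); // 4294967294
const asSigned = raw.readInt32BE(0);    // -2 (two's complement)

console.log(asUnsigned, asSigned);
```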

Literal strings

A quoted string appearing in a pattern is a shorthand for the bytes in its UTF8 encoding. For example,

"foobar", _/binary

matches any buffer that starts with the bytes 0x66, 0x6f, 0x6f, 0x62, 0x61, 0x72.

When used in a builder, a quoted string is copied into the result as the bytes of its UTF8 encoding.

Examples

In the following the matched bytes are given in array notation for convenience. Bear in mind that match() actually takes a buffer for the bytes to match against. The phrase "returns X as Y" or "binds X as Y" means the return value is an object with value X mapped to the key Y.

54

Matches the single byte 54.

54:32

Matches the bytes [0,0,0,54].

54:32/little

Matches the bytes [54,0,0,0].

54:4/unit:8

Matches the bytes [0,0,0,54].

int:32/signed

Matches a binary of four bytes, and returns a signed 32-bit integer as int.

len:16, str:len/binary

Matches a binary of 2 + len bytes, and returns an unsigned 16-bit integer as len and a buffer of length len as str.

len:16, _:len/binary, rest/binary

Matches a binary of at least 2 + len bytes, binds an unsigned 16-bit integer as len, ignores the next len bytes, and binds the remaining (possibly zero-length) binary as rest.

s:8, key:s/binary, value/binary

When given the environment {s:6, key: "foobar"}, will match a binary starting with [6, 0x66, 0x6f, 0x6f, 0x62, 0x61, 0x72, ...].
