

Repository Details

The bit level data interchange format for serializing data structures (long term maintenance).

The bit level data interchange format


Introduction

Bitproto is a fast, lightweight and easy-to-use bit level data interchange format for serializing data structures.

The protocol description syntax looks like the great Protocol Buffers, but at the bit level:

proto example

message Data {
    uint3 the = 1
    uint3 bit = 2
    uint5 level = 3
    uint4 data = 4
    uint11 interchange = 6
    uint6 format = 7
}  // 32 bits => 4B

The Data above is called a message. It consists of 6 fields (3 + 3 + 5 + 4 + 11 + 6 = 32 bits) and will occupy a total of 4 bytes after encoding.

This image shows the layout of data fields in the encoded bytes buffer:

(Figure: docs/_static/images/data-encoding-sample.png)

Code Example

Code example to encode a bitproto message in C:

struct Data data = {.the = 7,
                    .bit = 7,
                    .level = 31,
                    .data = 15,
                    .interchange = 2047,
                    .format = 63};
unsigned char s[BYTES_LENGTH_DATA] = {0};
EncodeData(&data, s);
// length of s is 4, and the hex format is
// 0xFF 0xFF 0xFF 0xFF

And the decoding example:

struct Data d = {0};
DecodeData(&d, s);
// values of d's fields are now:
// 7 7 31 15 2047 63

Simple and green, isn't it?

The code patterns for bitproto encoding are nearly identical in C, Go and Python.

Features

  • Supports bit level data serialization, born for embedded development.
  • Supports protocol extensibility, for forward-compatibility.
  • Easy to start, syntax is similar to the well-known protobuf.
  • Supports languages: C (without dynamic memory allocation), Go, Python.
  • Blazing fast encoding/decoding (see the benchmark).
  • The size and arrangement of the encoded data are known exactly: fields are compact, without a single bit gap.

Schema Example

A simple overview of the bitproto schema grammar:

proto pen

// Constant value
const SIZE = 2 * 3;

// Bit level enum.
enum Color : uint3 {
    COLOR_UNKNOWN = 0
    COLOR_RED = 1
    COLOR_BLUE = 2
    COLOR_GREEN = 3
}

// Type alias
type Timestamp = int64

// Composite structure
message Pen {
    Color color = 1
    Timestamp produced_at = 2
    uint3 number = 3
    uint13 value = 4
}

message Box {
    // Fixed-size array
    Pen[SIZE] pens = 1;
}

Run the bitproto compiler to generate C files:

$ bitproto c pen.bitproto

This generates two files: pen_bp.h and pen_bp.c.

We can have an overview of the generated code for the C language:

// Constant value
#define SIZE 6

// Bit level enum.
typedef uint8_t Color; // 3bit

#define COLOR_UNKNOWN 0
#define COLOR_RED 1
#define COLOR_BLUE 2
#define COLOR_GREEN 3

// Type alias
typedef int64_t Timestamp; // 64bit

// Number of bytes to encode struct Pen
#define BYTES_LENGTH_PEN 11

// Composite structure
struct Pen {
    Color color; // 3bit
    Timestamp produced_at; // 64bit
    uint8_t number; // 3bit
    uint16_t value; // 13bit
};

// Number of bytes to encode struct Box
#define BYTES_LENGTH_BOX 63

struct Box {
    // Fixed-size array
    struct Pen pens[6]; // 498bit
};

You can check out the example directory for a larger example.

Why bitproto?

There is protobuf, why bitproto?

Origin

bitproto was originally made while I was working on embedded programs for microcontrollers, where there are usually many programming constraints:

  • tight communication size.
  • limited compiled code size.
  • preferably no dynamic memory allocation.

Protobuf isn't a native fit for the embedded field: it doesn't target ANSI C out of the box.

Scenario

It's recommended to use bitproto over protobuf when:

  • You are working on or with microcontrollers.
  • You want bit-level message fields.
  • You want to know exactly how many bytes the encoded data will occupy.

For scenarios other than the above, I recommend using protobuf over bitproto.

Vs Protobuf

The differences between bitproto and protobuf are:

  • bitproto supports bit level data serialization, like the bit fields in C.

  • bitproto doesn't use any dynamic memory allocation. Few protobuf C implementations support this; nanopb is one that does.

  • bitproto doesn't support variable-sized data; all types are fixed-size.

    bitproto doesn't encode type or size reflection information into the buffer. It encodes only the data itself: the encoded data is arranged as it would be in memory, with fixed size and no padding. Think of setting the aligned attribute to 1 on structs in C.

  • Protobuf handles forward compatibility well. For bitproto, this was the main shortcoming until v0.4.0. Since that version, bitproto supports message extensibility by adding two bytes that indicate the message size at the head of the message's encoded buffer. This breaks the traditional data layout design by encoding some minimal size reflection information, so it is designed as an optional feature.

Known Shortcomings

  • bitproto doesn't support variable-sized types. For example, a uint37 always occupies 37 bits even if you assign it a small value like 1.

    This means there will be many zero bytes if the meaningful data occupies only a small part of the type. For instance, n-1 bytes are left zero if only one byte of an n-byte type is used.

    In practice this rarely matters, since communication with embedded devices usually involves few bytes, and the protocol itself is meant to be designed tight and compact. If it does matter, consider wrapping the encoded buffer in a compression mechanism such as zlib.

  • bitproto can't provide its best encoding performance together with extensibility.

    bitproto has an optimization mode that generates plain encoding/decoding statements directly at code-generation time: since all types in bitproto are fixed-size, how to encode them can be determined at that point. This mode gives a huge performance improvement, but I still haven't found a way to make it work together with bitproto's extensibility mechanism.

Documentation and Links

Documentation:

Editor syntax highlighting plugins:

Faq:

Blog posts:

License

BSD3
