Concise Encoding
The secure data format for a modern world
Solving today's problems
Times are different from the carefree days that brought us XML and JSON:
Security (protect your data and infrastructure)
State actors, criminal organizations and mercenaries are now actively hacking governments, companies and individuals to steal secrets, plant malware, and hold your data hostage.
The existing ad-hoc data formats are too loosely defined to be secure, and can't be fixed because they're not versioned.
Concise Encoding is designed for security, and is versioned so that it can be updated to handle new threats.
Efficiency (but not at the cost of convenience)
We send so much data now that efficiency is critical, but switching to binary means giving up the ease of text formats.
... or does it?
Concise Encoding gives you ease and efficiency with its 1:1 compatible text and binary formats.
Types (because stringifying everything is wasteful)
Lack of types forces everyone to add extra encoding steps to send their data, which invites bugs, reduces compatibility, and opens even more security holes.
We live in the 21st century - base64 should be a footnote in history by now!
Concise Encoding supports all of the common types natively. No more encoding things into strings.
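To make the cost of stringifying concrete, here is a small Python sketch of what sending raw bytes through JSON involves. The payload and the "data" key are made up for illustration; only the standard library is used.

```python
import base64
import json

payload = bytes(range(256)) * 4  # 1024 bytes of arbitrary binary data

# JSON has no bytes type, so the payload must be turned into a string first.
encoded = base64.b64encode(payload).decode("ascii")
document = json.dumps({"data": encoded})

# The receiver has to know, out of band, that "data" holds base64 and undo it.
decoded = base64.b64decode(json.loads(document)["data"])

print(len(payload), len(encoded))  # 1024 1368
```

The base64 detour adds roughly a third to the size, and both sides must agree on it separately from the format itself.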
Compared to other formats
Features
Type | CE | XML | JSON | BSON | CBOR | Protobufs | Thrift | ASN.1 | Ion |
---|---|---|---|---|---|---|---|---|---|
Int Max Size (bits) | ∞ | ∞ | 53 | 64 | 64 | 64 | 64 | 64 | ∞ |
Float Max Size (bits) | ∞ | ∞ | 64 | 128 | 64 | 64 | 64 | 64 | ∞ |
Subsecond Precision | ns | ns | ns | ns | ns | ns | |||
Ad-hoc | β | βοΈ | |||||||
Little Endian | β | ||||||||
Non-string map keys | |||||||||
Size Optimization | β | β | β | ||||||
Cyclic Data | β | β | β | ||||||
Time Zones | βοΈ | β | β | β | |||||
Records | βοΈ | β | β | ||||||
Bin + Txt | βοΈ | β | β | β | |||||
Versioned | βοΈ | β |
- Ad-hoc: Supports ad-hoc data (does not require a schema).
- Little Endian: Uses little-endian byte order (most modern CPUs are little-endian, so little-endian formats need no byte swapping to encode or decode).
- Size Optimization: The most common types and values use less space.
- Cyclic Data: Supports cyclic (recursive) data structures.
- Time Zones: Time types support real time zones.
- Records: Records separate definition and instance for frequently occurring structures.
- Bin + Txt: Has twin binary and text formats that are 1:1 convertible to each other without data loss.
- Versioned: Documents are versioned to the specification they adhere to. (Ion supports versioning in the binary format only).
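The 53-bit integer limit listed for JSON can be seen with a short experiment. This Python sketch assumes a decoder that stores every number as an IEEE 754 binary64 float, as JavaScript does; Python's own json module keeps integers exact, so the rounding is simulated with float().

```python
import json

big = 2**53 + 1  # 9007199254740993

# Python's json module round-trips the integer exactly.
round_tripped = json.loads(json.dumps(big))

# A decoder backed by binary64 floats has only 53 bits of significand,
# so the same value is silently rounded to a neighbouring integer.
as_double = float(big)
lost_precision = int(as_double) != big

print(round_tripped == big, lost_precision)  # True True
```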
Type Support
Type | CE | XML | JSON | BSON | CBOR | Protobufs | Thrift | ASN.1 | Ion |
---|---|---|---|---|---|---|---|---|---|
Boolean | β | βοΈ | βοΈ | ||||||
Integer | β | βοΈ | βοΈ | ||||||
Binary Float | βοΈ | βοΈ | |||||||
Bfloat | β | β | β | ||||||
Decimal Float | βοΈ | ||||||||
NaN, Infinity | βοΈ | βοΈ | βοΈ | ||||||
Universal ID | βοΈ | βοΈ | |||||||
Timestamp | |||||||||
Resource ID | βοΈ | β | β | ||||||
String | |||||||||
Bytes | β | βοΈ | βοΈ | ||||||
List | β | βοΈ | βοΈ | ||||||
Map | β | ||||||||
Edge | β | β | β | β | β | ||||
Node | β | ||||||||
Record | β | β | |||||||
Typed Arrays | βοΈ | ||||||||
Reference | βοΈ | ||||||||
Remote Ref | β | β | |||||||
Comment | β | β | |||||||
Null | β | βοΈ | |||||||
Media | βοΈ | β | β | β | |||||
Custom | βοΈ | β |
Specifications and Code
Specifications
- Concise Encoding Structure (describes the structure and rules that both formats follow)
- Concise Binary Encoding (CBE) (describes the binary format encoding)
- Concise Text Encoding (CTE) (describes the text format encoding)
Note: Most applications will only need the binary format. The text format is only required in places where a human must get involved, and this can often be handled by a simple command-line tool.
Design
The Design Document explains the design choices behind Concise Encoding.
Grammar
Implementations
Go Implementation (reference implementation)
Tools
Enctool: a tool for converting between formats
Draft Specification
Although Concise Encoding is nearing a release, it's currently a draft specification and thus subject to change. Please use a version of 0 for now to avoid compatibility issues with existing documents when version 1 is released.
Note: When version 1 is released, 0 will no longer be a valid version number.
Examples
All examples are valid Concise Text Encoding documents that can be transparently 1:1 converted to/from Concise Binary Encoding.
Numeric Types
c1
{
"boolean" = true
"binary int" = -0b10001011
"octal int" = 0o644
"decimal int" = -10000000
"hex int" = 0xfffe0001
"very long int" = 100000000000000000000000000000000000009
"decimal float" = -14.125
"hex float" = 0x5.1ec4p+20
"very long flt" = 4.957234990634579394723460546348e+100000
"not-a-number" = nan
"infinity" = inf
"neg infinity" = -inf
}
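As a sanity check on the numeric notations above, the following Python sketch reads the same literals with standard-library parsers. It is not a CTE parser; it only shows what values the literals denote and where a 64-bit float runs out of range.

```python
import math
from decimal import Decimal

# Base 0 tells int() to honour the 0b / 0o / 0x prefix.
binary_int = int("-0b10001011", 0)   # -139
octal_int = int("0o644", 0)          # 420
hex_int = int("0xfffe0001", 0)       # 4294836225

# Python integers are arbitrary precision, so the long value stays exact.
very_long_int = int("100000000000000000000000000000000000009")

# Hex floats: 0x5.1ec4 scaled by 2**20.
hex_float = float.fromhex("0x5.1ec4p+20")  # 5368896.0

# A binary64 float overflows here, while Decimal keeps every digit.
text = "4.957234990634579394723460546348e+100000"
as_float = float(text)      # overflows to infinity
as_decimal = Decimal(text)  # exact

print(binary_int, octal_int, hex_int, hex_float, math.isinf(as_float))
```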
String and String-Like
c1
{
"string" = "Strings support escape sequences: \n \t \[1f415]"
"url" = @"https://example.com/"
"email" = @"mailto:me@example.com"
}
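The `\[1f415]` sequence in the string example appears to name a Unicode code point in hexadecimal. Assuming that reading, a minimal Python sketch of expanding such escapes could look like this; `expand_codepoint_escapes` is a hypothetical helper, not part of any Concise Encoding library, and it ignores the other escape sequences.

```python
import re

def expand_codepoint_escapes(text: str) -> str:
    r"""Replace each \[hex] sequence with the character at that code point."""
    return re.sub(
        r"\\\[([0-9a-fA-F]+)\]",
        lambda match: chr(int(match.group(1), 16)),
        text,
    )

# U+1F415 is the DOG emoji.
print(expand_codepoint_escapes(r"dog: \[1f415]"))
```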
Other Basic Types
c1
{
"uuid" = f1ce4567-e89b-12d3-a456-426655440000
"date" = 2019-07-01
"time" = 18:04:00.948/Europe/Prague
"timestamp" = 2010-07-15/13:28:15.415942344
"null" = null
"media" = @application/x-sh[23 21 2f 62 69 6e 2f 73 68 0a 0a
65 63 68 6f 20 68 65 6c 6c 6f 20 77 6f 72 6c 64 0a]
}
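To see what the media value carries, the hex listing can be decoded with a few lines of Python. This sketch only interprets the bytes shown above; it does not parse CTE.

```python
import uuid

media_hex = """23 21 2f 62 69 6e 2f 73 68 0a 0a
65 63 68 6f 20 68 65 6c 6c 6f 20 77 6f 72 6c 64 0a"""

# Drop all whitespace before decoding so the listing can be pasted as-is.
script = bytes.fromhex("".join(media_hex.split())).decode("ascii")
print(script)  # a small shell script: shebang line, blank line, echo command

# The uuid value is an ordinary 128-bit UUID.
identifier = uuid.UUID("f1ce4567-e89b-12d3-a456-426655440000")
print(len(identifier.bytes))  # 16
```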
Containers
c1
{
"list" = [1 2.5 "a string"]
"map" = {"one"=1 2="two" "today"=2020-09-10}
"bytes" = @u8x[01 ff de ad be ef]
"int16 array" = @i16[7374 17466 -9957]
"uint16 hex" = @u16x[91fe 443a 9c15]
"float32 array" = @f32[1.5e10 -8.31e-12]
}
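Typed arrays store fixed-width elements, which is where byte order matters. The Python sketch below packs the int16 array from the example in little-endian and big-endian order for comparison. It illustrates element layout only and is an assumption about the idea, not a reproduction of the CBE wire format, which also carries type and length information.

```python
import struct

int16_values = [7374, 17466, -9957]

# "<" selects little-endian, ">" big-endian; "h" is a signed 16-bit integer.
little = struct.pack("<3h", *int16_values)
big = struct.pack(">3h", *int16_values)

print(little.hex())  # ce1c3a441bd9
print(big.hex())     # 1cce443ad91b

# Unpacking with the matching byte order restores the original values.
restored = list(struct.unpack("<3h", little))
```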
Records
c1
@vehicle<"make" "model" "drive" "sunroof"> // type
[
@vehicle{"Ford" "Explorer" "4wd" true } // instance
@vehicle{"Toyota" "Corolla" "fwd" false} // instance
]
Which is equivalent to:
c1
[
{
"make" = "Ford"
"model" = "Explorer"
"drive" = "4wd"
"sunroof" = true
}
{
"make" = "Toyota"
"model" = "Corolla"
"drive" = "fwd"
"sunroof" = false
}
]
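The equivalence between the record form and the expanded maps can be sketched in Python by pairing the record's keys with each instance's values. The variable names are illustrative only.

```python
# Record type: the keys are declared once.
vehicle_keys = ("make", "model", "drive", "sunroof")

# Record instances: values only, in the same order as the keys.
instances = [
    ("Ford", "Explorer", "4wd", True),
    ("Toyota", "Corolla", "fwd", False),
]

# Expanding a record is a matter of zipping keys with values.
expanded = [dict(zip(vehicle_keys, values)) for values in instances]

for vehicle in expanded:
    print(vehicle)
```

The saving grows with the number of instances, since the key names appear once instead of once per map.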
Trees
c1
/* The tree:
*
* 2
* / \
* 5 7
* / /|\
* 9 6 1 2
* / / \
* 4 8 5
*
*/
(2
(7
2
1
(6
5
8
)
)
(5
(9
4
)
)
)
Notice how, when rotated 90°, the document resembles the tree it represents.
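For readers who want to work with the same tree in code, here is a Python sketch that mirrors the nesting of the document above and walks it depth-first. The tuple representation is an assumption made for this example, not something the specification prescribes.

```python
# Each inner node is (value, children); a bare int is a leaf.
# Children are listed in the same order as in the document above.
tree = (2, [
    (7, [2, 1, (6, [5, 8])]),
    (5, [(9, [4])]),
])

def walk(node, depth=0):
    """Yield (depth, value) pairs in document order."""
    if isinstance(node, tuple):
        value, children = node
        yield depth, value
        for child in children:
            yield from walk(child, depth + 1)
    else:
        yield depth, node

# Printing with indentation reproduces the shape of the text document.
for depth, value in walk(tree):
    print("    " * depth + str(value))
```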
Graphs
c1
/* The weighted graph:
*
* b
* /|\
* 4 1 1
* / | \
* a-3-c-4-d
*
*/
{
"vertices" = [
&a:{}
&b:{}
&c:{}
&d:{}
]
"edges" = [
@($a {"weight"=4 "direction"="both"} $b)
@($a {"weight"=3 "direction"="both"} $c)
@($b {"weight"=1 "direction"="both"} $c)
@($b {"weight"=1 "direction"="both"} $d)
@($c {"weight"=4 "direction"="both"} $d)
]
}
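Once decoded, an edge list like the one above is typically turned into an adjacency structure. The Python sketch below does that for the example graph, treating every edge as bidirectional because each one is marked "direction"="both". The tuple layout is an assumption for illustration.

```python
from collections import defaultdict

# (source, target, weight), copied from the "edges" list above.
edges = [
    ("a", "b", 4),
    ("a", "c", 3),
    ("b", "c", 1),
    ("b", "d", 1),
    ("c", "d", 4),
]

adjacency = defaultdict(dict)
for source, target, weight in edges:
    adjacency[source][target] = weight
    adjacency[target][source] = weight  # "direction" = "both"

for vertex in sorted(adjacency):
    print(vertex, adjacency[vertex])
```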
References
c1
{
// Entire map will be referenced later as $id1
"marked object" = &id1:{
"recursive" = $id1
}
"ref1" = $id1
"ref2" = $id1
// Reference pointing to part of another document.
"outside ref" = $"https://xyz.com/document.cte#some_id"
}
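The practical value of references shows up when data is cyclic. This Python sketch builds the same shape in memory (a map that contains itself, referenced twice more) and shows that the standard json module refuses to serialize it. It says nothing about how a Concise Encoding library exposes references.

```python
import json

# A map that refers to itself, like &id1 / $id1 above.
marked = {}
marked["recursive"] = marked

document = {"marked object": marked, "ref1": marked, "ref2": marked}

# All three entries point at the same object, not at copies.
same_object = document["ref1"] is document["marked object"]["recursive"]

# JSON has no way to express the cycle, so encoding fails.
try:
    json.dumps(document)
    json_ok = True
except ValueError:
    json_ok = False

print(same_object, json_ok)  # True False
```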
Custom Types
c1
{
// Custom types are user-defined, with user-supplied codecs.
// In this example, we assume that custom type 12 is registered
// via a schema to a custom "complex number" type.
"custom text" = @12"2.94+3i"
"custom binary" = @12[04 f6 28 3c 40 00 00 40 40]
}
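Because custom types rely on user-supplied codecs, the decoding side might look something like the Python sketch below. The binary layout is an assumption inferred from the example bytes: one leading tag byte followed by the real and imaginary parts as little-endian 32-bit floats. Both function names are hypothetical.

```python
import struct

def decode_complex_text(payload: str) -> complex:
    # Python spells the imaginary unit "j" rather than "i".
    return complex(payload.replace("i", "j"))

def decode_complex_binary(payload: bytes) -> complex:
    # Assumed layout: 1 tag byte, then real and imaginary float32 values.
    _tag, real, imag = struct.unpack("<Bff", payload)
    return complex(real, imag)

from_text = decode_complex_text("2.94+3i")
from_binary = decode_complex_binary(bytes.fromhex("04f6283c4000004040"))

print(from_text, from_binary)
```

The two results differ slightly in the real part, because a 32-bit float cannot represent 2.94 exactly.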
License
Copyright (c) 2018-2023 Karl Stenerud. All rights reserved.
Distributed under the Creative Commons Attribution License (license deed).