• Stars: 139
• Rank: 262,954 (Top 6 %)
• Language: JavaScript
• License: MIT License
• Created over 8 years ago
• Updated 4 months ago

Repository Details

Binary-to-text encoding highly optimised for UTF-16

base32768

Base32768 is a binary encoding optimised for UTF-16-encoded text. This JavaScript module, base32768, is the first implementation of this encoding.

The efficiency chart speaks for itself. Efficiency ratings are averaged over long inputs. Higher is better.

                                     Efficiency
                   Encoding          UTF‑8   UTF‑16  UTF‑32   Bytes per Tweet *
ASCII‑constrained  Unary / Base1     0%      0%      0%       1
                   Binary            13%     6%      3%       35
                   Hexadecimal       50%     25%     13%      140
                   Base64            75%     38%     19%      210
                   Base85 †          80%     40%     20%      224
BMP‑constrained    HexagramEncode    25%     38%     19%      105
                   BrailleEncode     33%     50%     25%      140
                   Base2048          56%     69%     34%      385
                   Base32768         63%     94%     47%      263
Full Unicode       Ecoji             31%     31%     31%      175
                   Base65536         56%     64%     50%      280
                   Base131072 ‡      53%+    53%+    53%      297

* New-style "long" Tweets, up to 280 Unicode characters, give or take Twitter's complex "weighting" calculation.
† Base85 is listed for completeness, but all variants use characters which are considered hazardous for general use in text: escape characters, brackets, punctuation, etc.
‡ Base131072 is a work in progress, not yet ready for general use.
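
The ASCII-constrained rows of the chart can be reproduced with simple arithmetic: efficiency is the payload bits carried by each character divided by the bits that character occupies in the target encoding. The sketch below is illustrative only; the efficiency helper and the rows list are names invented here and are not part of the base32768 module.

```javascript
// Illustrative helper (not part of base32768): payload bits carried per
// character, divided by the bits that character occupies once encoded.
function efficiency (payloadBits, encodedBits) {
  return payloadBits / encodedBits
}

// Payload bits per character for some encodings from the chart.
const rows = [
  { name: 'Binary', payloadBits: 1 },
  { name: 'Hexadecimal', payloadBits: 4 },
  { name: 'Base64', payloadBits: 6 }
]

// An ASCII character occupies 8, 16 and 32 bits in UTF-8, UTF-16 and UTF-32.
for (const { name, payloadBits } of rows) {
  const cells = [8, 16, 32].map(bits => (100 * efficiency(payloadBits, bits)) + '%')
  console.log(name, cells.join(' '))
}

// Base32768 carries 15 payload bits in each 16-bit UTF-16 code unit.
console.log('Base32768 UTF-16', (100 * efficiency(15, 16)) + '%')
```

The chart rounds these figures to whole percentages, so 37.5% appears as 38% and 93.75% as 94%.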

Base32768 uses only "safe" Unicode code points - no unassigned code points, no whitespace, no control characters, etc.

Installation

npm install base32768

Usage

import { encode, decode } from 'base32768'

const uint8Array = new Uint8Array([104, 101, 108, 108, 111, 32, 119, 111, 114, 108, 100])
const str = encode(uint8Array)
console.log(str)
// 6 code points, '媒腻㐤┖ꈳ埳'

const uint8Array2 = decode(str)
console.log(uint8Array2)
// [104, 101, 108, 108, 111, 32, 119, 111, 114, 108, 100]
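
The bytes in the example above are the UTF-8 encoding of the text 'hello world'. Since encode takes a Uint8Array rather than a string, text has to be converted to bytes first. One way to do that, sketched here with the standard TextEncoder and TextDecoder classes (this is an assumption about how a caller might prepare input, not something the module requires):

```javascript
// TextEncoder and TextDecoder are built into modern browsers and Node.js.
const bytes = new TextEncoder().encode('hello world')
console.log(Array.from(bytes))
// [104, 101, 108, 108, 111, 32, 119, 111, 114, 108, 100]

// `bytes` is a Uint8Array, so it could be passed to encode(). The Uint8Array
// returned by decode() could be turned back into text the same way:
const text = new TextDecoder().decode(bytes)
console.log(text)
// 'hello world'
```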

API

base32768.encode(uint8Array)

Encodes a Uint8Array and returns a Base32768 String. Note that every Node.js Buffer is a Uint8Array.

The string is suitable for passing safely through almost any "Unicode-clean" text-handling API. This string contains no special characters and is immune to Unicode normalization. Give or take some padding characters, the output string has 1 character per 15 bits of input.

All characters are chosen from the Basic Multilingual Plane. This means that when encoded as UTF-16, all characters occupy 16 bits. Thus, there are 16 bits of output UTF-16 text per 15 bits of input, an efficiency of 93.75%.
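
As a rough check of these figures, the sketch below estimates the output length for a given input size and measures the output string from the Usage example in each Unicode encoding. The estimateLength helper is hypothetical and does not call the module; it assumes one character per 15 bits with a final partial group rounded up to a whole character, which is consistent with the 11-byte example producing 6 code points.

```javascript
// Hypothetical helper, not part of the module: estimated number of output
// characters, assuming one character per 15 bits of input with any final
// partial group rounded up to a whole character.
function estimateLength (byteLength) {
  return Math.ceil((byteLength * 8) / 15)
}

console.log(estimateLength(11)) // 6, matching the Usage example

// Measure the Usage example's output string in each Unicode encoding.
const encoded = '媒腻㐤┖ꈳ埳'
const codePoints = [...encoded].length
const sizes = {
  utf8: new TextEncoder().encode(encoded).length,
  utf16: encoded.length * 2, // two bytes per UTF-16 code unit
  utf32: codePoints * 4 // four bytes per code point
}
console.log(codePoints, sizes)
```

For this short input, 11 bytes become 12 bytes of UTF-16, about 92%, slightly below 93.75% because the final character is only partly filled. The gap shrinks as inputs get longer.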

base32768.decode(str)

Decodes a Base32768 String and returns a Uint8Array containing the original binary data. Note that a Uint8Array can be converted to a Node.js Buffer like so:

const buffer = Buffer.from(uint8Array.buffer, uint8Array.byteOffset, uint8Array.byteLength)
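
The byteOffset and byteLength arguments matter because a Uint8Array may be a view onto only part of its underlying ArrayBuffer. The Node.js sketch below illustrates this with a made-up array; whether decode ever returns such a partial view is not stated here, so treat the three-argument form as the defensive choice.

```javascript
// A Uint8Array that views only bytes 2..4 of a larger ArrayBuffer.
const backing = new Uint8Array([0, 1, 2, 3, 4, 5])
const view = backing.subarray(2, 5)

// Passing only the ArrayBuffer wraps all 6 bytes, not just the view.
const whole = Buffer.from(view.buffer)
console.log(whole.length) // 6

// Passing the offset and length wraps exactly the bytes in the view.
const exact = Buffer.from(view.buffer, view.byteOffset, view.byteLength)
console.log(exact.length) // 3

// The Buffer shares memory with the Uint8Array: no bytes are copied.
view[0] = 99
console.log(exact[0]) // 99
```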

License

MIT

More Repositories

1. base65536 (JavaScript, 2,078 stars): Unicode's answer to Base64
2. base2048 (JavaScript, 833 stars): Binary encoding optimised for Twitter
3. hatetris (TypeScript, 831 stars): Tetris which always gives you the worst piece
4. greenery (Python, 331 stars): Regular expression manipulation library
5. fastjson (JavaScript, 107 stars): Single-tweet, standards-compliant, high-performance JSON stack
6. loco (PHP, 89 stars): Parsing library for PHP
7. base131072 (JavaScript, 84 stars): Binary-to-text encoding optimised for Twitter & UTF-32
8. base1 (JavaScript, 71 stars): Binary encoding inspired by unary numbers
9. abcdefghijklmnopqrstuvwxyz (JavaScript, 71 stars): The English alphabet
10. hexagram-encode (JavaScript, 54 stars): Represent binary data using I Ching hexagrams
11. t-a-i (JavaScript, 43 stars): Converts Unix milliseconds to and from International Atomic Time (TAI) milliseconds
12. braille-encode (JavaScript, 41 stars): Represent binary data as Braille
13. scp-3125 (HTML, 35 stars): Source code for the SCP Foundation wiki entry "SCP-3125"
14. big-roman (JavaScript, 18 stars): Big Roman numerals
15. tetris (C, 17 stars): Attempt to find a brute-force solution to Tetris
16. safe-code-point (JavaScript, 16 stars): Ascertains whether a Unicode code point is 'safe' for the purposes of encoding binary data
17. hyperoperate (JavaScript, 12 stars): Hyperoperations for JavaScript!
18. broken-promises-aplus (JavaScript, 12 stars): Compliant Promises/A+ implementation which doesn't actually work
19. base65537 (JavaScript, 11 stars): It's one better
20. minify-numeric-literal (JavaScript, 6 stars): Minify numeric literals for JavaScript
21. big-round (JavaScript, 5 stars): Custom rounding behaviour for JavaScript BigInt arithmetic
22. tai-date (JavaScript, 4 stars): A TaiDate stores an instant in TAI, the same way that a Date stores an instant in Unix time
23. base65536-test (4 stars): Language-agnostic test case files for the Base65536 encoding
24. base65536-stream (JavaScript, 3 stars): Streaming implementation of the Base65536 encoding
25. green-reg-exp (JavaScript, 3 stars): A little library for manipulating regular expressions.
26. green-fsm (JavaScript, 2 stars): A basic little library for finite state machines
27. green-parse (JavaScript, 1 star): A little recursive descent parsing library