• Stars
    star
    104
  • Rank 330,604 (Top 7 %)
  • Language
    Rust
  • License
    Other
  • Created about 6 years ago
  • Updated over 1 year ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

JumpRope

Because inserting into a string should be fast.

A rope is a data structure for efficiently editing large strings, or for processing editing traces.

As far as I know, JumpRope is the world's fastest rope implementation.

Unlike traditional strings, JumpRope allows you to:

  • Efficiently insert or delete arbitrary keystrokes from anywhere in the document. Using real world editing traces, jumprope can process about 35-40 million edits per second.
  • Index using unicode character offsets or wchar offsets (like you find in JS and other languages). Jumprope can efficiently convert between these formats.

JumpRope is optimized for large strings like source code files and text documents. If your strings are very small (less than 100 bytes), you should probably just use Rust's built in std String or a small-string-optimized string library like SmartString.

JumpRope is similar to ropey. Ropey supports a few more features (like converting line/column positions). However, jumprope is about 3x faster than ropey when processing real editing operations (see below) and jumprope compiles to a smaller wasm bundle. (Ropey is 30kb brotli compressed, vs 18kb for jumprope).

API documentation

Jumprope on crates.io

Add this to Cargo.toml to use:

jumprope = "1.0.0"

Usage

JumpRope isn't a drop-in replacement for string, but it supports many similar methods. The most important additions are the insert, remove and replace methods - which let you edit strings in-place in (typically) log(n) time relative to the size of the existing document.

use jumprope::JumpRope;

fn main() {
    let mut rope = JumpRope::from("Some large text document");
    rope.insert(5, "really "); // "Some really large text document"
    rope.replace(0..4, "My rad");  // "My rad really large text document"
    assert_eq!(rope, "My rad really large text document");

    // Extract to a string
    let s: String = rope.to_string();
    assert_eq!(s, "My rad really large text document");
}

You can read content back out of a rope by:

  • Converting the rope to a string using rope.to_string() (requires allocations)
  • Iterating over characters using rope.chars()
  • (Fastest) iterating over &str chunks with rope.substrings(). This returns an iterator over contained &str items in the document.

If you want to read a subsection of the rope, you can use rope.slice_substrings(10..20) to read all the content within a given range in the rope. Eg:

fn main() {
    let rope = JumpRope::from("xxxGreetings!xxx");

    let string = rope.slice_substrings(3..13).collect::<String>();
    assert_eq!(string, "Greetings!");
}

For more details, see JumpRope API documentation

Wchar conversion

In some languages (notably Javascript, Java and C#) strings are measured by the number of 2-byte "characters" needed when encoding the string using UTF16.

This is awkward because its difficult to efficiently convert between unicode character offsets (used by jumprope, diamond types and other editors) and these editing locations. The naive approach is an O(n) operation.

Jumprope supports doing this conversion in O(log n) time, by adding extra indexing information to the skip list. This feature is disabled by default, because the extra bookkeeping slows down jumprope by about 15%.

To use this feature, enable the wchar_conversion feature flag:

jumprope = { version = "1.0.0", features = ["wchar_conversion"] }

This feature flag enables a bunch of extra wchar-related methods for interacting with a document:

  • rope.len_wchars() -> usize: Return the length of the string in wchars.
  • rope.chars_to_wchars(chars: usize) -> usize: Convert a char offset to a wchar offset
  • rope.wchars_to_chars(wchars: usize) -> usize: Convert a wchar index back to a unicode character count
  • rope.insert_at_wchar(pos_wchar: usize, content: &str): Insert content at the specified wchar offset
  • rope.remove_at_wchar(range: Range<usize>): Remove the specified range, specified using wchar offsets
  • rope.replace_at_wchar(range: Range<usize>, content: &str): Replace the specified range with content

See documentation on docs.rs for more information about these methods.

Buffered strings

JumpRope also has an API for buffered edits. Usually when humans edit a string, they insert or delete runs of characters. If you merge these editing runs together before applying them, jumprope is about 10x faster again.

Jumprope provides a wrapper API to do this transparently in the form of JumpRopeBuf. JumpRopeBuf does a best-effort attempt to merge incoming writes together before flushing (writing) them to the contained jumprope object.

This API may be missing some methods found on JumpRope. You can usually work around any missing methods by calling rope.borrow() or rope.as_mut() to flush pending changes and access a pointer to the underlying rope. But please file issues if you find any missing functions, because adding direct implementations will usually result in better performance.

See JumpRopeBuf module documentation for usage.

History / motivation

This code is based on an older skiplist based C rope library I wrote several years ago as an excuse to play with skip lists. It has a few notable differences:

  • Instead of simply being implemented as a skiplist, jumprope is a skiplist where each leaf node contains a Gap Buffer.
  • Jumprope is faster. (See table below)

Benchmarks

Running the editing traces from crdt-benchmarks, jumprope is faster than any other library in cargo that I know of:

Running on a single core of a Ryzen 5800X:

Dataset Raw string XiRope Ropey librope (C) Jumprope
automerge-paper 3908.13 ms 518.75 ms 25.16 ms 16.28 ms 6.66 ms
rustcode 569.44 ms DNF 4.71 ms 3.93 ms 1.66 ms
sveltecomponent 41.05 ms 24.83 ms 2.31 ms 1.59 ms 0.59 ms
seph-blog1 1238.44 ms DNF 13.04 ms 10.01 ms 3.81 ms

Full criterion report is here.

I tried AnRope as well, but it crashed while processing these datasets.

LICENSE

Licensed under the ISC license:

Copyright 2018 Joseph Gentle

Permission to use, copy, modify, and/or distribute this software for any purpose with or without fee is hereby granted, provided that the above copyright notice and this permission notice appear in all copies.

THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.

More Repositories

1

ShareJS

Collaborative editing in any app
JavaScript
4,929
star
2

noisejs

Javascript 2D Perlin & Simplex noise functions
JavaScript
1,569
star
3

diamond-types

The world's fastest CRDT. WIP.
Rust
1,194
star
4

Chipmunk-js

Port of slembcke/Chipmunk-Physics to Javascript
JavaScript
537
star
5

node-browserchannel

An implementation of a google browserchannel server in node.js
CoffeeScript
287
star
6

librope

UTF-8 rope library for C
C
254
star
7

statecraft

Manage state with finesse
TypeScript
177
star
8

sephsplace

My own version of r/place, done in a weekend
JavaScript
130
star
9

reference-crdts

Simple, tiny spec-compliant reference implementations of Yjs and Automerge's list types.
TypeScript
98
star
10

jumprope

Fast string editing in Javascript using skip lists
JavaScript
86
star
11

steamdance

A multiplayer steam-based CPU simulator
JavaScript
45
star
12

editing-traces

Real world text editing traces for benchmarking CRDT and Rope data structures
JavaScript
39
star
13

schemaboi

Mergable, efficient data schema for intercompatible apps
TypeScript
33
star
14

braid-protocol

TypeScript
28
star
15

textot.rs

Text operational transform library, for rust. Compatible with libot, ottypes/text.
Rust
26
star
16

sharedb

ShareJS server, in C.
C
24
star
17

appstate

CoffeeScript
24
star
18

replica

Local first application platform built using CRDTs
TypeScript
18
star
19

wangjs

A javascript Wang tile implementation
CoffeeScript
17
star
20

eg-walker-reference

Simple reimplementation of the diamond types sequence CRDT in simple, pure, unoptimized typescript.
TypeScript
16
star
21

mail-viewer

Pure browser email viewer
JavaScript
15
star
22

diamond-js

Javascript wrapper bindings for diamond types
14
star
23

fdb-tuple

Pure javascript FoundationDB tuple encoder and decoder
TypeScript
12
star
24

node-timerstub

Stubbed out timer objects for fast unit testing in nodejs
CoffeeScript
11
star
25

gmail-jmap

Simple JMAP wrapper for gmail
TypeScript
10
star
26

braid-db

Simple database prototype
Rust
10
star
27

unicount

JavaScript
9
star
28

react-static-example

JavaScript
9
star
29

mime-to-jmap

Tools to convert RFC2822 email messages to JMAP
C
9
star
30

unsplash

JavaScript
8
star
31

egwalker-paper

Eg-walker paper, experiments and data.
Rust
8
star
32

tanksalot

CoffeeScript
7
star
33

lightwave

Fork of Torben Weis's lightwave experimental branch - http://code.google.com/p/lightwave/
Go
7
star
34

TP2

An implementation of OT for text which supports TP2 (for p2p editing)
CoffeeScript
6
star
35

text-crdt-rust2

Rust
6
star
36

miniohm

JavaScript
6
star
37

simple-crdt-text

Simple typescript reference implementation for a simple RGA based plain text CRDT
TypeScript
6
star
38

axidraw

Python
5
star
39

livedb-foundation

CoffeeScript
5
star
40

dt-simple-wiki

World's simplest wiki on top of diamond types
JavaScript
5
star
41

crdt-examples

CRDT examples from a DWEB talk
JavaScript
5
star
42

glassbeadtimer

Svelte
4
star
43

killdinohitler

Stealth jam game
CoffeeScript
4
star
44

lovehot

Lua
4
star
45

keykitten

Stores your public key.
4
star
46

ARTIST

Michael Brough and Andi McClure's 10 second artist game, allowing a starting image
Lua
4
star
47

skiplistrs

Rust
4
star
48

map2

Tuple map type for javascript (a,b) -> x
TypeScript
3
star
49

gossip

Simple proof-of-concept gossip protocol for CRDTs
TypeScript
3
star
50

tp2stuff

CoffeeScript
3
star
51

set2

Tuple set type for javascript
JavaScript
3
star
52

resolvable

TypeScript
2
star
53

libot-swift

Swift
2
star
54

roboname

A simple robot-themed name generator for nodejs
CoffeeScript
2
star
55

braid-explorer

UI experiment with braid for data first software
TypeScript
2
star
56

space

Spaaaaaaaace
JavaScript
2
star
57

robot-party

Robots run robots.
CoffeeScript
2
star
58

streamtoiter

TypeScript
2
star
59

Attila

a little game prototype
CoffeeScript
2
star
60

wave-prototype

Simple prototype of a wave-like email service
TypeScript
2
star
61

braid-kernel

TypeScript
2
star
62

canvasproxy

CoffeeScript
2
star
63

space-server

C
2
star
64

census-couch

A simple couchapp site for exploring census data. Way incomplete.
JavaScript
1
star
65

state-client

TypeScript
1
star
66

Clickboss

Click on the boss.
CoffeeScript
1
star
67

FactoryFactoryImpl

The code from my minecraft machine that makes anything
1
star
68

braid-cli

JavaScript
1
star
69

mars2

JavaScript
1
star
70

boilerplate-sim

Boilerplate simulator reference implementation
CoffeeScript
1
star
71

ietf-sync-prototype

JavaScript
1
star
72

boilerplate-gif

CoffeeScript
1
star
73

mtgox

CoffeeScript
1
star
74

boilerplate-jit

CoffeeScript
1
star
75

raider

Simple ECS-style game prototype for multiplayer dungeon raid type things. https://home.seph.codes/public/game/
Rust
1
star
76

state-server

TypeScript
1
star
77

Cocontroller

Userland USB controller for XBox360 controllers on a mac.
Objective-C
1
star
78

boilerplate-compiler

CoffeeScript
1
star
79

visbindiff

CoffeeScript
1
star
80

rle-utils

Javascript utilities for internal run-length encoding
TypeScript
1
star
81

mars

CSS
1
star
82

boilerbot

CoffeeScript
1
star