• Stars: 248
• Rank: 163,560 (Top 4 %)
• Language: Rust
• License: Apache License 2.0
• Created over 9 years ago
• Updated about 1 year ago

Repository Details

Compact string type for zero-copy parsing

tendril

Warning: This library is at a very early stage of development, and it contains a substantial amount of unsafe code. Use at your own risk!

Introduction

Tendril is a compact string/buffer type, optimized for zero-copy parsing. Tendrils have the semantics of owned strings, but are sometimes views into shared buffers. When you mutate a tendril, an owned copy is made if necessary. Further mutations occur in-place until the string becomes shared, e.g. with clone() or subtendril().
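
The copy-on-mutation behaviour described above can be illustrated with the standard library alone. The sketch below is not tendril's implementation: `CowText`, `is_shared`, and `cow_demo` are hypothetical names, and the buffer is a plain `Rc<String>`. It only shows the general pattern of "copy if shared, otherwise mutate in place".

```rust
use std::rc::Rc;

/// Hypothetical stand-in for a tendril: it behaves like an owned string but
/// is backed by a reference-counted buffer. This is NOT tendril's real layout.
#[derive(Clone)]
pub struct CowText {
    buf: Rc<String>,
}

impl CowText {
    pub fn new(s: &str) -> Self {
        CowText { buf: Rc::new(s.to_owned()) }
    }

    /// True when another handle points at the same buffer.
    pub fn is_shared(&self) -> bool {
        Rc::strong_count(&self.buf) > 1
    }

    /// Append text. `Rc::make_mut` clones the buffer only if it is shared;
    /// otherwise the mutation happens in place.
    pub fn push_str(&mut self, s: &str) {
        Rc::make_mut(&mut self.buf).push_str(s);
    }

    pub fn as_str(&self) -> &str {
        &self.buf
    }
}

/// Walks through the share, copy, then mutate-in-place cycle.
pub fn cow_demo() -> (String, String) {
    let mut a = CowText::new("hello");
    let b = a.clone(); // a and b now share one buffer
    assert!(a.is_shared());
    a.push_str(", world"); // shared, so a gets its own copy first
    assert!(!a.is_shared());
    a.push_str("!"); // unshared, so this mutates in place
    (a.as_str().to_owned(), b.as_str().to_owned())
}
```

After `cow_demo` runs, the first handle holds the extended text while the clone still sees the original contents, which is the "semantics of owned strings" the paragraph refers to.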

Buffer sharing is accomplished through thread-local (non-atomic) reference counting, which has very low overhead. The Rust type system will prevent you at compile time from sending a tendril between threads. (See below for thoughts on relaxing this restriction.)
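
The standard library's `Rc` and `Arc` show the same trade-off, so they make a reasonable analogy (this is an illustration with std types, not tendril's own code). `Rc` uses non-atomic counting and is not `Send`; `Arc` pays for atomic counting and may cross threads. The function names below are hypothetical.

```rust
use std::rc::Rc;
use std::sync::Arc;
use std::thread;

/// Sharing within one thread: cloning an `Rc` is a plain, non-atomic increment.
pub fn share_locally() -> usize {
    let buf: Rc<str> = Rc::from("shared buffer");
    let _view = Rc::clone(&buf);
    Rc::strong_count(&buf)
}

/// Sending a handle to another thread needs atomic counting, which `Arc` provides.
pub fn share_across_threads() -> usize {
    let buf: Arc<str> = Arc::from("shared buffer");
    let view = Arc::clone(&buf);
    // Replacing `Arc` with `Rc` here fails to compile, because `Rc<str>`
    // does not implement `Send`.
    thread::spawn(move || view.len()).join().unwrap()
}
```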

Whereas String allocates in the heap for any non-empty string, Tendril can store small strings (up to 8 bytes) in-line, without a heap allocation. Tendril is also smaller than String on 64-bit platforms: 16 bytes versus 24. Option<Tendril> is the same size as Tendril, thanks to NonZero.
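
The size claims can be checked for a comparable layout with `std::mem::size_of`. The `Handle` struct below is a hypothetical 16-byte shape (one non-zero word plus two 32-bit fields); its field names are illustrative and do not mirror tendril's real definition. The numbers in the comments assume a 64-bit target and current compiler behaviour.

```rust
use std::mem::size_of;
use std::num::NonZeroUsize;

/// Hypothetical 16-byte handle: one non-zero word plus two 32-bit fields.
/// Field names are illustrative only.
#[allow(dead_code)]
pub struct Handle {
    ptr_or_tag: NonZeroUsize,
    len: u32,
    aux: u32,
}

/// Returns the sizes of `String`, `Handle`, and `Option<Handle>` in bytes.
/// On a 64-bit target these are expected to be 24, 16, and 16: the
/// non-zero field gives `Option` a spare bit pattern to encode `None`.
pub fn sizes() -> (usize, usize, usize) {
    (
        size_of::<String>(),
        size_of::<Handle>(),
        size_of::<Option<Handle>>(),
    )
}
```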

The maximum length of a tendril is 4 GB. The library will panic if you attempt to go over the limit.
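
A 4 GB ceiling is what a 32-bit length field would give. As a sketch under that assumption (the helper names are hypothetical and not part of the library), length bookkeeping with an overflow check might look like this:

```rust
use std::convert::TryFrom;

/// Computes the new length after appending `extra` bytes, or `None` if the
/// result does not fit in a 32-bit length field.
pub fn checked_grown_len(current: u32, extra: usize) -> Option<u32> {
    u32::try_from(extra)
        .ok()
        .and_then(|e| current.checked_add(e))
}

/// Panicking variant, matching the behaviour the text describes.
pub fn grown_len(current: u32, extra: usize) -> u32 {
    checked_grown_len(current, extra)
        .expect("buffer length would exceed the 4 GB limit")
}
```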

Formats and encoding

Tendril uses phantom types to track a buffer's format. This determines at compile time which operations are available on a given tendril. For example, Tendril<UTF8> and Tendril<Bytes> can be borrowed as &str and &[u8] respectively.
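
The phantom-type technique can be sketched in a few lines of standard Rust. Everything below is illustrative: `Buf`, `Utf8`, `Bytes`, and the method names are hypothetical stand-ins modelled on the description, not the library's actual API.

```rust
use std::marker::PhantomData;
use std::str;

/// Format markers (hypothetical names modelled on the text's UTF8 and Bytes).
pub struct Utf8;
pub struct Bytes;

/// A buffer tagged at compile time with its format `F`.
pub struct Buf<F> {
    data: Vec<u8>,
    _format: PhantomData<F>,
}

impl Buf<Bytes> {
    pub fn from_bytes(data: Vec<u8>) -> Self {
        Buf { data, _format: PhantomData }
    }

    pub fn as_bytes(&self) -> &[u8] {
        &self.data
    }

    /// Validate once, then retag the same storage as UTF-8.
    pub fn into_utf8(self) -> Result<Buf<Utf8>, Buf<Bytes>> {
        if str::from_utf8(&self.data).is_ok() {
            Ok(Buf { data: self.data, _format: PhantomData })
        } else {
            Err(self)
        }
    }
}

impl Buf<Utf8> {
    pub fn as_str(&self) -> &str {
        // Only `into_utf8` constructs a Buf<Utf8>, so the bytes are valid.
        str::from_utf8(&self.data).expect("Buf<Utf8> always holds valid UTF-8")
    }
}

pub fn format_demo() -> (usize, String) {
    let raw = Buf::<Bytes>::from_bytes(vec![0x68, 0x69]);
    let n = raw.as_bytes().len();
    // `raw.as_str()` would not compile: Buf<Bytes> has no such method.
    let text = raw.into_utf8().ok().expect("input was valid UTF-8");
    (n, text.as_str().to_owned())
}
```

The marker type occupies no space at run time; it exists only so the compiler can pick which `impl` block applies.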

Tendril also integrates with rust-encoding and has preliminary support for WTF-8 buffers.

Plans for the future

Ropes

html5ever will use Tendril as a zero-copy text representation. It would be good to preserve this all the way through to Servo's DOM. This would reduce memory consumption, and possibly speed up text shaping and painting. However, DOM text may conceivably be larger than 4 GB, and will anyway not be contiguous in memory around e.g. a character entity reference.

Solution: Build a rope on top of these strings and use that as Servo's representation of DOM text. We can perhaps do text shaping and/or painting in parallel for different chunks of a rope. html5ever can additionally use this rope type as a replacement for BufferQueue.

Because the underlying buffers are reference-counted, the bulk of this rope is already a persistent data structure. Consider what happens when appending two ropes to get a "new" rope. A vector-backed rope would copy a vector of small structs, one for each chunk, and would bump the corresponding refcounts. But it would not copy any of the string data.

If we want more sharing, then a 2-3 finger tree could be a good choice. We would probably stick with VecDeque for ropes under a certain size.
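
The vector-backed append described above can be sketched with reference-counted chunks in a `VecDeque`. This is a toy model under stated assumptions: `Rope`, `concat`, and `flatten` are hypothetical names, and `Rc<str>` stands in for a tendril.

```rust
use std::collections::VecDeque;
use std::rc::Rc;

/// Hypothetical rope: a sequence of reference-counted, immutable chunks.
#[derive(Clone)]
pub struct Rope {
    chunks: VecDeque<Rc<str>>,
}

impl Rope {
    pub fn from_chunks(parts: &[&str]) -> Self {
        Rope { chunks: parts.iter().map(|p| Rc::from(*p)).collect() }
    }

    /// Build a "new" rope holding the chunks of `self` followed by `other`.
    /// Only the small `Rc` handles are copied; each clone bumps a refcount
    /// and no string data is duplicated.
    pub fn concat(&self, other: &Rope) -> Rope {
        let mut chunks = self.chunks.clone();
        chunks.extend(other.chunks.iter().cloned());
        Rope { chunks }
    }

    /// Copy all chunks into one contiguous `String` (for inspection only).
    pub fn flatten(&self) -> String {
        let mut out = String::new();
        for chunk in &self.chunks {
            out.push_str(chunk);
        }
        out
    }
}

pub fn rope_demo() -> (String, usize) {
    let a = Rope::from_chunks(&["zero-", "copy "]);
    let b = Rope::from_chunks(&["parsing"]);
    let c = a.concat(&b);
    // The first chunk of `a` is now referenced by both `a` and `c`.
    (c.flatten(), Rc::strong_count(&a.chunks[0]))
}
```

Because the inputs `a` and `b` remain valid and unchanged after `concat`, the structure is persistent in the sense the text uses.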

UTF-16 compatibility

SpiderMonkey expects text to be in UCS-2 format for the most part. The semantics of JavaScript strings are difficult to implement on UTF-8. This also applies to HTML parsing via document.write. Also, passing SpiderMonkey a string that isn't contiguous in memory will incur additional overhead and complexity, if not a full copy.

Solution: Use WTF-8 in parsing and in the DOM. Servo will convert to contiguous UTF-16 when necessary. The conversion can easily be parallelized, if we find a practical need to convert huge chunks of text all at once.
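
As a rough illustration of the "convert to contiguous UTF-16 when necessary" step, the sketch below flattens a list of text chunks into one `Vec<u16>` using `str::encode_utf16` from the standard library. It assumes well-formed UTF-8 input; real WTF-8 buffers can also carry unpaired surrogates, which `&str` cannot represent, so this is a simplification. The function name is hypothetical.

```rust
/// Flatten text chunks into one contiguous UTF-16 buffer.
pub fn to_contiguous_utf16(chunks: &[&str]) -> Vec<u16> {
    // UTF-16 never needs more code units than UTF-8 needs bytes,
    // so the total byte length is a safe capacity upper bound.
    let capacity: usize = chunks.iter().map(|c| c.len()).sum();
    let mut out = Vec::with_capacity(capacity);
    for chunk in chunks {
        out.extend(chunk.encode_utf16());
    }
    out
}
```

Each chunk is encoded independently of the others, which is why the work could be split across threads if a large conversion ever made that worthwhile.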

Source span information

Some html5ever API consumers want to know the originating location in the HTML source file(s) of each token or parse error. An example application would be a command-line HTML validator with diagnostic output similar to rustc's.

Solution: Accept some metadata along with each input string. The type of metadata is chosen by the API consumer; it defaults to (), which has size zero. For any non-inline string, we can provide the associated metadata as well as a byte offset.
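
One way to picture this proposal is a generic input type whose metadata parameter defaults to `()`. The types and functions below (`Input`, `Span`, `locate`) are hypothetical and exist only to show that the default costs nothing and that a location can be reported as metadata plus a byte offset.

```rust
use std::mem::size_of;

/// Input text paired with caller-chosen metadata `M` (defaults to `()`).
pub struct Input<M = ()> {
    pub text: String,
    pub meta: M,
}

/// A source location: the metadata of the originating input plus a byte offset.
pub struct Span<'a, M: 'a> {
    pub meta: &'a M,
    pub offset: usize,
}

impl<M> Input<M> {
    /// Report where `needle` first occurs, together with this input's metadata.
    pub fn locate<'a>(&'a self, needle: &str) -> Option<Span<'a, M>> {
        self.text
            .find(needle)
            .map(|offset| Span { meta: &self.meta, offset })
    }
}

/// With the default `()` metadata, the wrapper is no larger than the text itself.
pub fn metadata_is_free() -> bool {
    size_of::<Input<()>>() == size_of::<String>()
}

pub fn locate_demo() -> Option<(&'static str, usize)> {
    // Here the metadata is a file name; a validator could use any type.
    let input = Input { text: String::from("<p>hi</b>"), meta: "index.html" };
    let found = input.locate("</b>").map(|span| (*span.meta, span.offset));
    found
}
```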
