Immutable strings, in Rust.

Immutable Strings

This crate offers a cheaply cloneable and sliceable UTF-8 string type. It is inspired by the bytes crate, which offers zero-copy byte slices, and the im crate which offers immutable copy-on-write data structures. It offers a standard-library String-compatible API.

Internally, the crate uses a standard library string stored in a smart pointer, and a range into that String. This allows for cheap zero-copy cloning and slicing of the string. This is especially useful for parsing operations, where a large string needs to be sliced into a lot of substrings.
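The layout described above can be sketched with the standard library alone. This is a minimal illustration, not the crate's actual definition: the `SharedStr` type, its field names and the `shares_buffer_with` helper are hypothetical and exist only for this example.

```rust
use std::ops::Range;
use std::sync::Arc;

/// Hypothetical stand-in for the layout described above (not the crate's real type).
#[derive(Clone, Debug)]
pub struct SharedStr {
    /// Shared, reference-counted backing buffer.
    buffer: Arc<String>,
    /// Byte range of `buffer` that this value refers to.
    range: Range<usize>,
}

impl SharedStr {
    /// Allocates once: the input is copied into a new shared buffer.
    pub fn new(text: &str) -> Self {
        SharedStr {
            buffer: Arc::new(text.to_string()),
            range: 0..text.len(),
        }
    }

    /// Borrows the visible part of the buffer.
    pub fn as_str(&self) -> &str {
        &self.buffer[self.range.clone()]
    }

    /// Zero-copy slice: bumps the reference count and narrows the range.
    /// `range` is relative to this value. Panics if it is out of bounds
    /// or does not fall on UTF-8 character boundaries.
    pub fn slice(&self, range: Range<usize>) -> Self {
        // String indexing validates bounds and character boundaries for us.
        let _ = &self.as_str()[range.clone()];
        let start = self.range.start + range.start;
        let end = start + (range.end - range.start);
        SharedStr {
            buffer: Arc::clone(&self.buffer),
            range: start..end,
        }
    }

    /// True when both values point into the same allocation.
    pub fn shares_buffer_with(&self, other: &Self) -> bool {
        Arc::ptr_eq(&self.buffer, &other.buffer)
    }
}
```

With this sketch, `SharedStr::new("Hello, World").slice(7..12)` yields a value whose `as_str()` is `"World"` while still pointing at the original allocation. Cloning is just as cheap, because the derived `Clone` copies only the pointer and the range.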

TL;DR: This crate offers an ImString type that acts as a String (in that it can be modified and used in the same way), an Arc<String> (in that it is cheap to clone) and an &str (in that it is cheap to slice) all in one, owned type.

Diagram of ImString Internals

This crate offers a safe API that ensures that every string and every string slice is UTF-8 encoded. It does not allow slicing of strings within UTF-8 multibyte sequences. It offers try_* functions for every operation that can fail, so that panics can be avoided. It also uses extensive unit testing with full test coverage to ensure that there is no unsoundness.
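To illustrate what a non-panicking slice has to check, here is a small sketch built on the standard library's `str::is_char_boundary`. The free function `try_slice` and the `SliceError` type are assumptions made for this example and are not taken from the crate's API.

```rust
use std::ops::Range;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum SliceError {
    /// The range is reversed or extends past the end of the string.
    InvalidRange,
    /// One end of the range falls inside a multibyte UTF-8 sequence.
    NotCharBoundary,
}

/// Returns the requested substring, or an error instead of panicking.
pub fn try_slice(text: &str, range: Range<usize>) -> Result<&str, SliceError> {
    if range.start > range.end || range.end > text.len() {
        return Err(SliceError::InvalidRange);
    }
    if !text.is_char_boundary(range.start) || !text.is_char_boundary(range.end) {
        return Err(SliceError::NotCharBoundary);
    }
    // Both checks passed, so this indexing cannot panic.
    Ok(&text[range])
}
```

For the input `"héllo"`, where `é` occupies bytes 1 and 2, the range `0..3` succeeds, whereas `0..2` is rejected because it would cut the two-byte sequence in half.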

Features

Efficient Cloning: The crate's architecture enables low-cost (zero-copy) cloning, making it ideal for strings that are widely shared.

Efficient Slicing: The crate's architecture enables low-cost (zero-copy) slice creation, making it ideal for parsing operations where one large input string is sliced into many smaller strings.

Copy on Write: Despite being cheap to clone and slice, strings can still be mutated using copy-on-write. For strings that are not shared, an optimisation mutates them in place, safely avoiding unnecessary copying.
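The copy-on-write behaviour can be approximated with `Arc::get_mut` and `Arc::make_mut` from the standard library. The helper `push_str_cow` below is a hypothetical name used for this sketch; how the crate implements the optimisation internally is not shown here.

```rust
use std::sync::Arc;

/// Appends `suffix` to a shared string, copying the data only when it is shared.
/// Returns `true` if the mutation happened in place.
pub fn push_str_cow(shared: &mut Arc<String>, suffix: &str) -> bool {
    // `get_mut` returns `Some` only when no other `Arc` or `Weak` pointer
    // refers to the same allocation.
    let in_place = Arc::get_mut(shared).is_some();
    // `make_mut` hands out the existing string when it is unique and
    // clones it into a fresh allocation when other references exist.
    Arc::make_mut(shared).push_str(suffix);
    in_place
}
```

Calling this on a freshly created `Arc<String>` mutates the buffer directly. Once a second `Arc` points at the same buffer, the call writes to a private copy and leaves the other holder's contents unchanged.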

Compatibility: The API is designed to closely resemble Rust's standard library String, facilitating smooth integration and being almost a drop-in replacement. It also integrates with many popular Rust crates, such as serde, peg and nom.

Generic over Storage: The crate is flexible in terms of how the data is stored. It allows for using Arc<String> for multithreaded applications and Rc<String> for single-threaded use, providing adaptability to different storage requirements and avoiding the need to pay for atomic operations when they are not needed.
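One way to make a string type generic over its smart pointer is a small trait implemented for both `Arc<String>` and `Rc<String>`. The sketch below assumes nothing about the crate's real abstraction: the `Storage` trait, the `Shared` type and the two aliases are hypothetical names chosen for this example.

```rust
use std::ops::Deref;
use std::rc::Rc;
use std::sync::Arc;

/// Hypothetical storage abstraction: any cheaply cloneable pointer to a `String`.
pub trait Storage: Clone + Deref<Target = String> {
    /// Moves the string into a new shared allocation.
    fn wrap(text: String) -> Self;
}

impl Storage for Arc<String> {
    fn wrap(text: String) -> Self {
        Arc::new(text)
    }
}

impl Storage for Rc<String> {
    fn wrap(text: String) -> Self {
        Rc::new(text)
    }
}

/// A shared string that is generic over how its buffer is stored.
#[derive(Clone)]
pub struct Shared<S: Storage> {
    data: S,
}

impl<S: Storage> Shared<S> {
    pub fn new(text: &str) -> Self {
        Shared { data: S::wrap(text.to_string()) }
    }

    pub fn as_str(&self) -> &str {
        self.data.as_str()
    }
}

/// Atomic reference counting, usable across threads.
pub type Threadsafe = Shared<Arc<String>>;
/// Non-atomic reference counting for single-threaded use.
pub type Local = Shared<Rc<String>>;
```

Code written against `Shared<S>` works with either alias, so a single-threaded program can pick `Local` and skip the cost of atomic reference counting.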

Safety: The crate enforces that all strings and string slices are UTF-8 encoded. Any methods that might violate this are marked as unsafe. All methods that can fail have a try_* variant that will not panic. Use of safe functions cannot result in unsound behaviour.

Example

use imstr::ImString;

// Create new ImString, allocates data.
let mut string = ImString::from("Hello, World");

// Edit: happens in-place (because this is the only reference).
string.push_str("!");

// Clone: this is zero-copy.
let clone = string.clone();

// Slice: this is zero-copy.
let hello = string.slice(0..5);
assert_eq!(hello, "Hello");

// Slice: this is zero-copy.
let world = string.slice(7..12);
assert_eq!(world, "World");

// Here we have to copy only the part that the slice refers to so it can be modified.
let hello = hello + "!";
assert_eq!(hello, "Hello!");

Optional Features

The following optional features can be enabled using feature flags.

| Feature | Description |
|---------|-------------|
| serde | Serialize and deserialize ImString fields as strings with the serde crate. |
| peg | Use ImString as the data structure that is parsed with the peg crate. See peg-list.rs for an example. |
| nom | Allow ImString to be used to build parsers with nom. See nom-json.rs for an example. |

Similar

This is a comparison of this crate to other, similar crates. The comparison is made on these features:

  • Cheap Clone: is it a zero-copy operation to clone a string?
  • Cheap Slice 🍕: is it possible to cheaply slice a string?
  • Mutable: is it possible to modify strings?
  • Generic Storage: is it possible to swap out the storage mechanism?
  • String Compatible: is it compatible with String?

Here is the data, with links to the crates for further examination:

| Crate | Cheap Clone | Cheap Slice | Mutable | Generic Storage | String Compatible | Notes |
|-------|-------------|-------------|---------|-----------------|-------------------|-------|
| imstr | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | This crate. |
| tendril | ✔️ | ✔️ | ✔️ | ✔️ | ❌ | Complex implementation. API not quite compatible with String, but otherwise closest to what this crate does. |
| immut_string | ✔️ | ❌ | 🟡 (no optimization) | ❌ | ❌ | Simply a wrapper around Arc<String>. |
| immutable_string | ✔️ | ❌ | ❌ | ❌ | ❌ | Wrapper around Arc<str>. |
| arccstr | ✔️ | ❌ | ❌ | ❌ | ❌ | Not UTF-8 (null-terminated C string). Hand-written Arc implementation. |
| implicit-clone | ✔️ | ❌ | ❌ | 🟡 | ✔️ | Immutable string library. Has sync and unsync variants. |
| semistr | ❌ | ❌ | ❌ | ❌ | ❌ | Stores short strings inline. |
| quetta | ✔️ | ✔️ | ❌ | ❌ | ❌ | Wrapper around Arc<String> that can be sliced. |
| bytesstr | ✔️ | 🟡 | ❌ | ❌ | ❌ | Wrapper around Bytes. Cannot be directly sliced. |
| fast-str | ✔️ | ❌ | ❌ | ❌ | ❌ | Looks like there could be some unsafety. |
| flexstr | ✔️ | ❌ | ❌ | ✔️ | ❌ | |
| bytestring | ✔️ | 🟡 | ❌ | ❌ | ❌ | Wrapper around Bytes. Used by actix. Can be indirectly sliced using slice_ref(). |
| arcstr | ✔️ | ✔️ | ❌ | ❌ | ❌ | Can store string literal as &'static str. |
| cowstr | ✔️ | ❌ | ✔️ | ❌ | ❌ | Reimplements Arc, custom allocation strategy. |
| strck | ❌ | ❌ | ❌ | ✔️ | ❌ | Typechecked string library. |

License

MIT, see LICENSE.md.
