  • Stars: 214
  • Rank: 184,678 (Top 4 %)
  • Language: Rust
  • License: Apache License 2.0
  • Created about 8 years ago
  • Updated 7 months ago

Repository Details

Counting occurrences of a given byte or UTF-8 characters in a slice of memory – fast

bytecount

Counting bytes really fast

This uses the "hyperscreamingcount" algorithm by Joshua Landau to count bytes faster than anything else. The newlinebench repository has further benchmarks for old versions of this repository.

To use bytecount in your crate, if you have cargo-edit, run cargo add bytecount in a terminal with the crate root as the current directory. Otherwise, edit your Cargo.toml manually and add bytecount = "0.6.3" to your [dependencies] section.

In your crate root (lib.rs or main.rs, depending on whether you are writing a library or an application), add extern crate bytecount;. On the 2018 edition and later this declaration is optional. Now you can use bytecount::count as follows:

extern crate bytecount;

fn main() {
    let mytext = "some potentially large text, perhaps read from disk?";
    // Count how often the byte b' ' occurs in the text.
    let spaces = bytecount::count(mytext.as_bytes(), b' ');
    println!("{} spaces", spaces);
}
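As a rough cross-check of what is being counted, the following std-only sketch computes the same two quantities the slow way: the occurrences of one byte, and the number of UTF-8 characters (every byte that is not a continuation byte of the form 10xxxxxx starts a character). The names naive_count_ref and naive_num_chars_ref are hypothetical and not part of bytecount, and the character count assumes valid UTF-8 input.

```rust
/// Reference byte counter: compares every byte against `needle`.
/// Hypothetical helper for illustration, not part of bytecount.
pub fn naive_count_ref(haystack: &[u8], needle: u8) -> usize {
    haystack.iter().filter(|&&b| b == needle).count()
}

/// Reference UTF-8 character counter. Assumes `utf8` holds valid UTF-8:
/// continuation bytes look like 0b10xx_xxxx, every other byte starts a
/// new character. Hypothetical helper for illustration.
pub fn naive_num_chars_ref(utf8: &[u8]) -> usize {
    utf8.iter().filter(|&&b| (b & 0xC0) != 0x80).count()
}
```

A reference like this is mostly useful in tests, where its result can be compared against the fast implementation on the same input.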

bytecount supports two Cargo features that use the capabilities of modern CPUs to speed up counting considerably. To let your users enable them, add the following to your Cargo.toml:

[features]
runtime-dispatch-simd = ["bytecount/runtime-dispatch-simd"]
generic-simd = ["bytecount/generic-simd"]

The first, runtime-dispatch-simd, enables detection of SIMD capabilities at runtime, which allows using the SSE2 and AVX2 codepaths, but cannot be used with no_std.

Your users can then compile with runtime dispatch using:

cargo build --release --features runtime-dispatch-simd
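The general idea of runtime dispatch can be sketched with the standard library's is_x86_feature_detected! macro, which lives in std rather than core. This is an illustration only, not bytecount's actual code: count_scalar, count_sse2 and count_dispatch are hypothetical names, and the SSE2 kernel is a plain compare-and-popcount loop rather than the crate's algorithm.

```rust
/// Scalar fallback used on every architecture.
fn count_scalar(haystack: &[u8], needle: u8) -> usize {
    haystack.iter().filter(|&&b| b == needle).count()
}

/// Simple SSE2 kernel (hypothetical): compares 16 bytes at a time and
/// counts the set bits of the resulting comparison mask.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "sse2")]
unsafe fn count_sse2(haystack: &[u8], needle: u8) -> usize {
    use std::arch::x86_64::*;

    let needles = _mm_set1_epi8(needle as i8);
    let mut total = 0usize;
    let mut chunks = haystack.chunks_exact(16);
    for chunk in &mut chunks {
        // Unaligned load, so the slice needs no particular alignment.
        let block = _mm_loadu_si128(chunk.as_ptr() as *const __m128i);
        let eq = _mm_cmpeq_epi8(block, needles);
        // One mask bit per byte lane; set bits mark matching bytes.
        total += (_mm_movemask_epi8(eq) as u32).count_ones() as usize;
    }
    // Fewer than 16 bytes are left over; count them the scalar way.
    total + count_scalar(chunks.remainder(), needle)
}

/// Picks a code path at runtime based on what the CPU reports.
pub fn count_dispatch(haystack: &[u8], needle: u8) -> usize {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("sse2") {
            // SAFETY: SSE2 support was checked on the line above.
            return unsafe { count_sse2(haystack, needle) };
        }
    }
    count_scalar(haystack, needle)
}
```

Because the check happens at runtime, a single binary built this way can run on CPUs without the faster instructions and still use them where they are available.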

The second, generic-simd, uses packed_simd to provide a fast architecture-agnostic SIMD codepath, but requires a nightly compiler.

Your users can compile with this codepath using:

cargo build --release --features generic-simd

Building for a more specific architecture will also improve performance. You can do this with

RUSTFLAGS="-C target-cpu=native" cargo build --release

The scalar algorithm is explained in depth here.
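For readers without access to that explanation, here is a minimal sketch of word-at-a-time (SWAR) byte counting, the general family of tricks a fast scalar counter can build on. It is an assumption-laden illustration, not the crate's source: the names LO, HI, matches_in_word and count_swar are hypothetical, and a tuned implementation would keep per-lane counters instead of calling count_ones on every word.

```rust
const LO: u64 = 0x0101_0101_0101_0101; // 0x01 in every byte lane
const HI: u64 = 0x8080_8080_8080_8080; // 0x80 in every byte lane

/// Counts how many of the 8 bytes in `word` equal `needle`.
fn matches_in_word(word: u64, needle: u8) -> u32 {
    // After the XOR, a byte lane is zero exactly where it matched.
    let x = word ^ (LO * needle as u64);
    // Per lane: the top bit ends up set iff the lane is non-zero.
    // Adding 0x7f to the low 7 bits cannot carry into the next lane.
    let nonzero = ((x & !HI) + !HI) | x;
    // Invert and keep only the top bits: one set bit per matching lane.
    (!nonzero & HI).count_ones()
}

/// Counts `needle` in `haystack`, eight bytes at a time.
pub fn count_swar(haystack: &[u8], needle: u8) -> usize {
    let mut chunks = haystack.chunks_exact(8);
    let mut total = 0usize;
    for chunk in &mut chunks {
        let mut buf = [0u8; 8];
        buf.copy_from_slice(chunk);
        // Byte order does not matter: we only count matching lanes.
        total += matches_in_word(u64::from_ne_bytes(buf), needle) as usize;
    }
    // Handle the last (fewer than 8) bytes one by one.
    total + chunks.remainder().iter().filter(|&&b| b == needle).count()
}
```

The point of the design is that one XOR, one add and a few bitwise operations test eight bytes at once without any branches.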

Note: Versions up to 0.4.0 work with Rust 1.20.0 or later. Versions from 0.5.0 up to 0.6.0 require Rust 1.26 or later, and at least 1.27.2 to use SIMD. Versions from 0.6.0 onward require Rust 1.32.0 or later.

License

Licensed under either of the following, at your discretion:

  • Apache License 2.0
  • MIT license

More Repositories

  1. flame (Rust, 672 stars): An intrusive flamegraph profiling tool for rust.
  2. mutagen (Rust, 621 stars): Breaking your Rust code for fun and profit
  3. flamer (Rust, 364 stars): A compiler plugin to insert flame calls
  4. momo (Rust, 239 stars): A Rust proc_macro_attribute to outline conversions from generic functions
  5. stdx-dev (120 stars): Rust's missing development batteries
  6. metacollect (Rust, 115 stars): A lint to collect some crate metadata
  7. overflower (Rust, 103 stars): A Rust compiler plugin and support library to annotate overflow behavior
  8. compact_arena (Rust, 76 stars): A crate with indexed arenas with small memory footprint
  9. optional (Rust, 35 stars): A small crate to provide space-efficient Option<_> replacements
  10. serdebench (Rust, 30 stars)
  11. newlinebench (Rust, 21 stars)
  12. partition (Rust, 14 stars): partition slices in-place by a predicate
  13. compressbench (Rust, 9 stars): A benchmark of Rust compression libraries
  14. smallvectune (Rust, 9 stars)
  15. llogiq.github.io (HTML, 8 stars): My github page
  16. pathsep (Rust, 7 stars): A os agnostic way to get a path separator in macros
  17. extra_lints (Rust, 7 stars): more lints for rust (now subsumed in rust-clippy)
  18. arraymap (Rust, 6 stars): Adds a trait to map functions over arrays
  19. bsdiff-rs (Rust, 4 stars): A Rust BSDiff port
  20. arraymapbench (Rust, 2 stars): A benchmark of various map methods
  21. twirer (Rust, 2 stars): A short program I use to collect and filter the core changes for This Week In Rust
  22. picnic (Rust, 1 star): Your Picnic Is On Fire!
  23. openpgp (Rust, 1 star)
  24. rangeset (Rust, 1 star): (WIP) a RangeSet implementation
  25. typetree (Rust, 1 star): a data structure within Rust's type system
  26. rust-stockfighter (Rust, 1 star): Simple Rust Wrapper for stockfigher
  27. ternary (Rust, 1 star): Kleene logic in Rust's type system
  28. lib_json (Rust, 1 star): Allocationless Json Parsing
  29. wom (Rust, 1 star): Write-Only Memory for Rust