  • Stars: 100
  • Rank 340,703 (Top 7 %)
  • Language: Rust
  • Created about 3 years ago
  • Updated over 1 year ago

Repository Details

Parallel iterator processing library for Rust

See [IteratorExt] (latest IteratorExt on docs.rs) for supported operations.

Notable features

  • drop-in replacement for standard iterators(*)
    • preserves order
    • lazy, somewhat like single-threaded iterators
    • panic propagation
  • support for iterating over borrowed values using scoped threads
  • backpressure
  • profiling methods (useful for analyzing pipelined processing bottlenecks)
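
The text above does not say how the library implements backpressure, so the following is only a minimal sketch of the general idea, using a bounded channel from the standard library. The helper name `items_accepted_before_full` is hypothetical and not part of pariter. It shows that a bounded buffer accepts only `capacity` items before a producer would have to wait for the consumer.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};

/// Hypothetical helper (not part of pariter): counts how many items a
/// producer can queue in a bounded channel before it would have to wait.
fn items_accepted_before_full(capacity: usize) -> usize {
    // `_rx` is kept alive so the channel stays connected, but nothing
    // is ever received from it.
    let (tx, _rx) = sync_channel::<usize>(capacity);
    let mut accepted = 0;
    loop {
        match tx.try_send(accepted) {
            Ok(()) => accepted += 1,
            // The buffer is full. A blocking `send` would pause the
            // producer at this point until the consumer catches up.
            Err(TrySendError::Full(_)) => return accepted,
            Err(TrySendError::Disconnected(_)) => {
                unreachable!("the receiver is still alive")
            }
        }
    }
}
```

With a blocking `send` instead of `try_send`, a fast producer is slowed down to the pace of the slowest consumer, which keeps memory use bounded.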

When to use and alternatives

This library is a good general-purpose solution for adding multi-threaded processing to existing iterator-based code. When you have a chain of iterator steps and would like to process one or more of them in parallel to speed things up, this library goes a long way toward being as close to a drop-in replacement as possible in all aspects.

The implementation is based on spawning thread-pools of worker threads and sending them work using channels, then receiving and sorting the results to turn them into a normal iterator again.
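
As an illustration of that description, here is a simplified sketch built only on the standard library. It is not pariter's actual code: the function `ordered_parallel_map` is hypothetical, it is eager rather than lazy, and it uses unbounded channels. It shows the three steps named above: tagging items with their index and sending them to workers, computing in parallel, and re-ordering the results as they come back.

```rust
use std::collections::BTreeMap;
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical helper (not part of pariter): maps `f` over `items` on
/// `num_threads` worker threads and returns results in input order.
fn ordered_parallel_map<T, R, F>(items: Vec<T>, num_threads: usize, f: F) -> Vec<R>
where
    T: Send + 'static,
    R: Send + 'static,
    F: Fn(T) -> R + Send + Sync + 'static,
{
    let (work_tx, work_rx) = mpsc::channel::<(usize, T)>();
    let (result_tx, result_rx) = mpsc::channel::<(usize, R)>();
    // std's `Receiver` is single-consumer, so workers share it via a mutex.
    let work_rx = Arc::new(Mutex::new(work_rx));
    let f = Arc::new(f);

    let workers: Vec<_> = (0..num_threads.max(1))
        .map(|_| {
            let work_rx = Arc::clone(&work_rx);
            let result_tx = result_tx.clone();
            let f = Arc::clone(&f);
            thread::spawn(move || loop {
                // The lock is released at the end of this statement,
                // so it is not held while `f` runs.
                let job = work_rx.lock().unwrap().recv();
                match job {
                    Ok((index, item)) => {
                        if result_tx.send((index, f(item))).is_err() {
                            break; // nobody is listening for results
                        }
                    }
                    Err(_) => break, // work channel closed: no more items
                }
            })
        })
        .collect();
    drop(result_tx); // from here on only the workers hold result senders

    let total = items.len();
    for pair in items.into_iter().enumerate() {
        work_tx.send(pair).expect("at least one worker is alive");
    }
    drop(work_tx); // signal the workers that no more work is coming

    // Results arrive in arbitrary order; hold them back until the next
    // expected index is available.
    let mut pending = BTreeMap::new();
    let mut out = Vec::with_capacity(total);
    for (index, value) in result_rx {
        pending.insert(index, value);
        while let Some(value) = pending.remove(&out.len()) {
            out.push(value);
        }
    }
    for worker in workers {
        worker.join().expect("worker thread panicked");
    }
    out
}
```

Every item crosses two channels and passes through the re-ordering map, which is where the per-item overhead comes from.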

Sending iterator items through channels is fast, but not free. Make sure to parallelize only operations that are heavy enough to justify the overhead of sending data through channels, e.g. operations involving IO or CPU-heavy computation.

You can run cargo bench, or view /docs/bench-report/report/index.html locally for the criterion.rs benchmark report. As a rule of thumb, each call to the function being parallelized should take more than 200ns for the parallelization to outweigh the overhead.
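
For a quick first check against that rule of thumb, a rough measurement with the standard library can be enough. This is only a sketch: the helpers `average_call_time` and `worth_parallelizing` and the constant `MIN_WORTHWHILE` are hypothetical names, and a simple wall-clock average is far less reliable than a criterion.rs benchmark.

```rust
use std::time::{Duration, Instant};

/// Assumed threshold, taken from the 200ns rule of thumb above.
const MIN_WORTHWHILE: Duration = Duration::from_nanos(200);

/// Hypothetical helper: average wall-clock time of a single call to `f`,
/// measured over `iterations` consecutive calls.
fn average_call_time<F: FnMut()>(iterations: u32, mut f: F) -> Duration {
    assert!(iterations > 0, "need at least one iteration");
    let start = Instant::now();
    for _ in 0..iterations {
        f();
    }
    start.elapsed() / iterations
}

/// Hypothetical helper: applies the rule of thumb to a measured time.
fn worth_parallelizing(per_call: Duration) -> bool {
    per_call > MIN_WORTHWHILE
}
```

Call it as `worth_parallelizing(average_call_time(10_000, || { /* one call to your step */ }))`. Results close to the threshold deserve a proper benchmark, since the compiler may optimize away work whose result is unused.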

When you have a lot of items already stored in a collection that you want to "roll over and perform some simple computation" on, you probably want to use rayon instead. It's a library optimized for parallelizing the processing of whole chunks of a larger set of data, which minimizes per-item overhead. A downside is that converting rayon's iterators back to an ordered sequential iterator is non-trivial.

Usage

Adding new operations based on the existing code should be relatively easy, so PRs are welcome.

In short, if you have:

# fn step_a(x: usize) -> usize {
#   x * 7
# }
# 
# fn filter_b(x: &usize) -> bool {
#   x % 2 == 0
# }
# 
# fn step_c(x: usize) -> usize {
#   x + 1
# }
assert_eq!(
  (0..10)
    .map(step_a)
    .filter(filter_b)
    .map(step_c)
    .collect::<Vec<_>>(),
  vec![1, 15, 29, 43, 57]
);

You can change it to:

use pariter::IteratorExt as _;
# fn step_a(x: usize) -> usize {
#   x * 7
# }
# 
# fn filter_b(x: &usize) -> bool {
#   x % 2 == 0
# }
# 
# fn step_c(x: usize) -> usize {
#   x + 1
# }
assert_eq!(
  (0..10)
    .map(step_a)
    .filter(filter_b)
    .parallel_map(step_c)
    .collect::<Vec<_>>(),
  vec![1, 15, 29, 43, 57]
);

and it will run faster (conditions apply), because step_c will run in parallel on multiple threads.

Iterating over borrowed values

Hitting a "borrowed value does not live long enough" error? It looks like you are iterating over values containing borrowed references. Sending them to other threads for processing could lead to memory-safety issues. But no problem, we've got you covered.

First, if the values you are iterating over can be cheaply cloned, just try adding .cloned() to turn them into owned values.

If you can't, you can use the scoped-threads API from the [crossbeam] crate:

use pariter::{IteratorExt as _, scope};
# fn step_a(x: &usize) -> usize {
#   *x * 7
# }
#
# fn filter_b(x: &usize) -> bool {
#   x % 2 == 0
# }
#
# fn step_c(x: usize) -> usize {
#   x + 1
# }
let v : Vec<_> = (0..10).collect();

scope(|scope| {
  assert_eq!(
    v
      .iter() // iterating over `&usize` now, `parallel_map` will not work
      .parallel_map_scoped(scope, step_a)
      .filter(filter_b)
      .map(step_c)
      .collect::<Vec<_>>(),
    vec![1, 15, 29, 43, 57]
  );
});

// or:

assert_eq!(
  scope(|scope| {
    v.iter()
      .parallel_map_scoped(scope, step_a)
      .filter(filter_b)
      .map(step_c)
      .collect::<Vec<_>>()
  })
  .expect("handle errors properly in production code"),
  vec![1, 15, 29, 43, 57]
);

The additional scope argument comes from [crossbeam::thread::scope] and is there to enforce memory-safety. Just wrap your iterator chain in a scope wrapper that does not outlive the borrowed value, and everything will work smoothly.
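
To see why a scope makes borrowing safe, here is a small sketch of the same idea using `std::thread::scope` from the standard library (available since Rust 1.63) rather than crossbeam. The helper `sum_in_two_threads` is a hypothetical example and not part of pariter. All threads spawned inside the scope are joined before it returns, so they can never outlive the borrowed slice.

```rust
use std::thread;

/// Hypothetical helper: sums a borrowed slice on two scoped threads.
fn sum_in_two_threads(values: &[u64]) -> u64 {
    let (left, right) = values.split_at(values.len() / 2);
    thread::scope(|s| {
        // Both closures borrow from `values`. This compiles only because
        // the scope guarantees the threads finish before it returns.
        let left_sum = s.spawn(|| left.iter().sum::<u64>());
        let right_sum = s.spawn(|| right.iter().sum::<u64>());
        left_sum.join().expect("left worker panicked")
            + right_sum.join().expect("right worker panicked")
    })
}
```

Replacing `s.spawn` with a plain `std::thread::spawn` here would be rejected by the compiler, because an unscoped thread requires `'static` data.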

Customizing settings

If you need to change settings like buffer sizes and number of threads:

# use pariter::IteratorExt as _;
assert_eq!(
  (0..10)
    .map(|x| x + 1)
    .parallel_filter_custom(|o| o.threads(16), |x| *x == 5)
    .map(|x| x / 2)
    .collect::<Vec<_>>(),
  vec![2]
);

Status & plans

I keep needing this exact functionality, so I've cleaned up my ad-hoc code and put it in a proper library. I'm usually very busy, so if you want something added, please submit a PR.

I'm open to sharing or transferring ownership and maintenance into reputable hands.

More Repositories

  1. rdedup (Rust, 829 stars): Data deduplication engine, supporting optional compression and public key encryption.
  2. mioco.pre-0.9 (Rust, 457 stars): Scalable, coroutine-based, asynchronous IO handling library for Rust programming language. (aka MIO COroutines).
  3. breeze (Rust, 202 stars): An experimental, kakoune-inspired CLI-centric text/code editor with |-shaped cursor (in Rust)
  4. rhex (Rust, 142 stars): ASCII terminal hexagonal map roguelike written in Rust
  5. mioco (Rust, 141 stars): [no longer maintained] Scalable, coroutine-based, fibers/green-threads for Rust. (aka MIO COroutines).
  6. rust-bitcoin-indexer (Rust, 122 stars): Powerful & versatile Bitcoin Indexer, in Rust
  7. rustyhex (Rust, 90 stars): Simple roguelike written in Rust language.
  8. async-readline (Rust, 75 stars): Asynchronous readline-like interface (PoC ATM)
  9. titanos (Rust, 73 stars): Titanos is an exercise in writing OS kernel in Rust programming language.
  10. hex2d-rs (Rust, 62 stars): Helper library for working with 2d hex-grid maps
  11. stpl (Rust, 37 stars): Super templates (html, etc.) with plain-Rust; no macros, no textfiles
  12. sniper (Rust, 36 stars): Educational Rust implementation of Auction Sniper from Growing Object-Oriented Software, Guided By Tests
  13. tokio-fiber (Rust, 31 stars): Async fibers for Rust `futures` and `tokio` using coroutines. Attempt at a `mioco` replacement.
  14. dotr (Rust, 26 stars): Very simple dotfile manager
  15. text-minimap (Rust, 24 stars): Generate text minimap/preview using Braille Patterns
  16. titanium-draft (Rust, 24 stars): Titanium is an exercise in writing OS kernel in Rust programming language.
  17. overflow-proof (Rust, 23 stars): Monadic checked arithmetic for Rust
  18. boulder-dash.rs (Rust, 21 stars): A remake of the classic Boulder Dash game in Rust, using Amethyst.rs engine
  19. docker-source-checksum (Rust, 21 stars): Deterministic source-based docker image checksum
  20. vim-smarttabs (Vim Script, 18 stars): Vim Smart Tabs
  21. colerr (Rust, 17 stars): Colorize stderr
  22. tagwiki (Rust, 16 stars): A wiki in which you link to pages by specifying hashtags they contain.
  23. htmx-sorta (Rust, 13 stars): Rust + htmx + tailwind + nix + redb + twind demo web app
  24. xmppconsole (C, 12 stars): Simple readline and libstrophe based CLI XMPP client.
  25. rust-bin-template (Shell, 12 stars): Rust project template with CI-built releases
  26. fs-dir-cache (Rust, 11 stars): A CLI tool for CIs and build scripts, making file system based caching easy and correct (locking, eviction, etc.)
  27. atifand.sh (Shell, 9 stars): Automatic fan speed control for ATI video cards using catalyst driver under Linux.
  28. cbuf-rs (Rust, 8 stars): Circular Buffer for Rust
  29. recycle-rs (Rust, 7 stars): Simple allocated buffers reuse for Rust
  30. ttcms (PHP, 7 stars): Tiny Trivial CMS (with Markdown)
  31. mioecho (Rust, 7 stars): Rust mio echo server.
  32. chain-monitor (Rust, 6 stars): web-UI based tool to monitor chain heights of various blockchains as reported by different sources
  33. convi (Rust, 5 stars): Convi - convenient (but safe) conversion (From-like) traits for Rust
  34. hyper-get (Rust, 4 stars): curl-like tool/library written in Rust using Hyper
  35. dpcgoban (Java, 4 stars): Simple and efficient J2ME application for playing Go.
  36. block-iter (Rust, 4 stars): Another experiment in Bitcoin indexing
  37. rclist-rs (Rust, 4 stars): Rust library: `RcList` is read-only, append only list (log), that can share common tail (history) with other `RcList`-s.
  38. node-tcp-echo-server (JavaScript, 3 stars): The simplest node tcp echo server.
  39. brainwiki (Rust, 3 stars): Moved to `tagwiki`. A wiki for my brain
  40. redb-bincode (Rust, 3 stars): A wrapper around `redb` that makes it store everything as `bincode`d
  41. muttmailer (Shell, 2 stars): Script for managing mass mailing from the console.
  42. titanium.rs (Rust, 2 stars): A library for writing embedded (baremetal) software in Rust
  43. ratelimit-gcra (Rust, 2 stars): GCRA rate limit algorithm extracted from `redis-cell`
  44. rust-default (Rust, 2 stars): `use default::default;` for your Rust crate
  45. dack (D, 2 stars): Dack is Ack(grep-like tool)-like tool.
  46. hex2d-dpcext-rs (Rust, 2 stars): dpc's hacky extensions to hex2d-rs library
  47. boardgametimer (Java, 2 stars): Android Board Game Timer
  48. dedns (1 star): Dedns is a public censorship-resistant, tamper-proof, 0-setup identity system.
  49. insideout (Rust, 1 star): Wrap composed types inside-out (library for Rust)
  50. rust-htmx-template (Rust, 1 star): My simple template web app repo
  51. todo (Nix, 1 star): My tiny "productivity" system
  52. flakes (Nix, 1 star): My flake related utilities and templates
  53. dpc (1 star)
  54. rdup-du (Rust, 1 star): rdup backup disk usage estimator
  55. livecd-editor (Makefile, 1 star): Simple Makefile based LiveCD .iso editor (for debian/ubuntu)
  56. crane-deps-rebuild-repro (Nix, 1 star): You don't want to know