Stretto
Stretto is a pure Rust implementation for https://github.com/dgraph-io/ristretto.
A high performance thread-safe memory-bound Rust cache.
English | 简体中文
Features
- Internal Mutability - Do not need to use
Arc<RwLock<Cache<...>>
for concurrent code, you just needCache<...>
orAsyncCache<...>
- Sync and Async - Stretto support sync and runtime agnostic async.
- In sync, Cache starts two extra OS level threads. One is policy thread, the other is writing thread.
- In async, AsyncCache starts two extra green threads. One is policy thread, the other is writing thread.
- Store policy Stretto only store the value, which means the cache does not store the key.
- High Hit Ratios - with Dgraph's developers unique admission/eviction policy pairing, Ristretto's performance is best in class.
- Eviction: SampledLFU - on par with exact LRU and better performance on Search and Database traces.
- Admission: TinyLFU - extra performance with little memory overhead (12 bits per counter).
- Fast Throughput - use a variety of techniques for managing contention and the result is excellent throughput.
- Cost-Based Eviction - any large new item deemed valuable can evict multiple smaller items (cost could be anything).
- Fully Concurrent - you can use as many threads as you want with little throughput degradation.
- Metrics - optional performance metrics for throughput, hit ratios, and other stats.
- Simple API - just figure out your ideal
CacheBuilder
/AsyncCacheBuilder
values and you're off and running.
Table of Contents
Installation
- Use Cache.
[dependencies]
stretto = "0.8"
or
[dependencies]
stretto = { version = "0.8", features = ["sync"] }
- Use AsyncCache
[dependencies]
stretto = { version = "0.8", features = ["async"] }
- Use both Cache and AsyncCache
[dependencies]
stretto = { version = "0.8", features = ["full"] }
Related
If you want some basic caches implementation(no_std), please see https://crates.io/crates/caches.
Usage
Example
Sync
use std::time::Duration;
use stretto::Cache;
fn main() {
let c = Cache::new(12960, 1e6 as i64).unwrap();
// set a value with a cost of 1
c.insert("a", "a", 1);
// set a value with a cost of 1 and ttl
c.insert_with_ttl("b", "b", 1, Duration::from_secs(3));
// wait for value to pass through buffers
c.wait().unwrap();
// when we get the value, we will get a ValueRef, which contains a RwLockReadGuard
// so when we finish use this value, we must release the ValueRef
let v = c.get(&"a").unwrap();
assert_eq!(v.value(), &"a");
v.release();
// lock will be auto released when out of scope
{
// when we get the value, we will get a ValueRef, which contains a RwLockWriteGuard
// so when we finish use this value, we must release the ValueRefMut
let mut v = c.get_mut(&"a").unwrap();
v.write("aa");
assert_eq!(v.value(), &"aa");
// release the value
}
// if you just want to do one operation
let v = c.get_mut(&"a").unwrap();
v.write_once("aaa");
let v = c.get(&"a").unwrap();
assert_eq!(v.value(), &"aaa");
v.release();
// clear the cache
c.clear().unwrap();
// wait all the operations are finished
c.wait().unwrap();
assert!(c.get(&"a").is_none());
}
Async
Stretto support runtime agnostic AsyncCache
, the only thing you need to do is passing a spawner
when building the AsyncCache
.
use std::time::Duration;
use stretto::AsyncCache;
#[tokio::main]
async fn main() {
// In this example, we use tokio runtime, so we pass tokio::spawn when constructing AsyncCache
let c: AsyncCache<&str, &str> = AsyncCache::new(12960, 1e6 as i64, tokio::spawn).unwrap();
// set a value with a cost of 1
c.insert("a", "a", 1).await;
// set a value with a cost of 1 and ttl
c.insert_with_ttl("b", "b", 1, Duration::from_secs(3)).await;
// wait for value to pass through buffers
c.wait().await.unwrap();
// when we get the value, we will get a ValueRef, which contains a RwLockReadGuard
// so when we finish use this value, we must release the ValueRef
let v = c.get(&"a").unwrap();
assert_eq!(v.value(), &"a");
// release the value
v.release(); // or drop(v)
// lock will be auto released when out of scope
{
// when we get the value, we will get a ValueRef, which contains a RwLockWriteGuard
// so when we finish use this value, we must release the ValueRefMut
let mut v = c.get_mut(&"a").unwrap();
v.write("aa");
assert_eq!(v.value(), &"aa");
// release the value
}
// if you just want to do one operation
let v = c.get_mut(&"a").unwrap();
v.write_once("aaa");
let v = c.get(&"a").unwrap();
println!("{}", v);
assert_eq!(v.value(), &"aaa");
v.release();
// clear the cache
c.clear().await.unwrap();
// wait all the operations are finished
c.wait().await.unwrap();
assert!(c.get(&"a").is_none());
}
Config
The CacheBuilder
struct is used when creating Cache instances if you want to customize the Cache settings.
num_counters
num_counters
is the number of 4-bit access counters to keep for admission and eviction. Dgraph's developers have seen good performance in setting this to 10x the number of items you expect to keep in the cache when full.
For example, if you expect each item to have a cost of 1 and max_cost
is 100, set num_counters
to 1,000. Or, if you use variable cost values but expect the cache to hold around 10,000 items when full, set num_counters to 100,000. The important thing is the number of unique items in the full cache, not necessarily the max_cost
value.
max_cost
max_cost
is how eviction decisions are made. For example, if max_cost is 100 and a new item with a cost of 1 increases total cache cost to 101, 1 item will be evicted.
max_cost
can also be used to denote the max size in bytes. For example, if max_cost is 1,000,000 (1MB) and the cache is full with 1,000 1KB items, a new item (that's accepted) would cause 5 1KB items to be evicted.
max_cost
could be anything as long as it matches how you're using the cost values when calling insert
.
key_builder
pub trait KeyBuilder {
type Key: Hash + Eq + ?Sized;
/// hash_index is used to hash the key to u64
fn hash_index<Q>(&self, key: &Q) -> u64
where
Self::Key: core::borrow::Borrow<Q>,
Q: Hash + Eq + ?Sized;
/// if you want a 128bit hashes, you should implement this method,
/// or leave this method return 0
fn hash_conflict<Q>(&self, key: &Q) -> u64
where
Self::Key: core::borrow::Borrow<Q>,
Q: Hash + Eq + ?Sized;
{ 0 }
/// build the key to 128bit hashes.
fn build_key<Q>(&self, k: &Q) -> (u64, u64)
where
Self::Key: core::borrow::Borrow<Q>,
Q: Hash + Eq + ?Sized;
{
(self.hash_index(k), self.hash_conflict(k))
}
}
KeyBuilder is the hashing algorithm used for every key. In Stretto, the Cache will never store the real key.
The key will be processed by KeyBuilder
. Stretto has two default built-in key builder,
one is TransparentKeyBuilder
, the other is DefaultKeyBuilder
. If your key implements TransparentKey
trait,
you can use TransparentKeyBuilder
which is faster than DefaultKeyBuilder
. Otherwise, you should use DefaultKeyBuilder
You can also write your own key builder for the Cache, by implementing KeyBuilder
trait.
Note that if you want 128bit hashes you should use the full (u64, u64)
,
otherwise just fill the u64
at the 0
position, and it will behave like
any 64bit hash.
buffer_size
buffer_size
is the size of the insert buffers. The Dgraph's developers find that 32 * 1024 gives a good performance.
If for some reason you see insert performance decreasing with lots of contention (you shouldn't), try increasing this value in increments of 32 * 1024. This is a fine-tuning mechanism and you probably won't have to touch this.
metrics
Metrics is true when you want real-time logging of a variety of stats. The reason this is a CacheBuilder flag is because there's a 10% throughput performance overhead.
ignore_internal_cost
Set to true indicates to the cache that the cost of internally storing the value should be ignored. This is useful when the cost passed to set is not using bytes as units. Keep in mind that setting this to true will increase the memory usage.
cleanup_duration
The Cache will cleanup the expired values every 500ms by default.
update_validator
pub trait UpdateValidator: Send + Sync + 'static {
type Value: Send + Sync + 'static;
/// should_update is called when a value already exists in cache and is being updated.
fn should_update(&self, prev: &Self::Value, curr: &Self::Value) -> bool;
}
By default, the Cache will always update the value if the value already exists in the cache, this trait is for you to check if the value should be updated.
callback
pub trait CacheCallback: Send + Sync + 'static {
type Value: Send + Sync + 'static;
/// on_exit is called whenever a value is removed from cache. This can be
/// used to do manual memory deallocation. Would also be called on eviction
/// and rejection of the value.
fn on_exit(&self, val: Option<Self::Value>);
/// on_evict is called for every eviction and passes the hashed key, value,
/// and cost to the function.
fn on_evict(&self, item: Item<Self::Value>) {
self.on_exit(item.val)
}
/// on_reject is called for every rejection done via the policy.
fn on_reject(&self, item: Item<Self::Value>) {
self.on_exit(item.val)
}
}
CacheCallback is for customize some extra operations on values when related event happens.
coster
pub trait Coster: Send + Sync + 'static {
type Value: Send + Sync + 'static;
/// cost evaluates a value and outputs a corresponding cost. This function
/// is ran after insert is called for a new item or an item update with a cost
/// param of 0.
fn cost(&self, val: &Self::Value) -> i64;
}
Cost is a trait you can pass to the CacheBuilder in order to evaluate
item cost at runtime, and only for the insert
calls that aren't dropped (this is
useful if calculating item cost is particularly expensive, and you don't want to
waste time on items that will be dropped anyways).
To signal to Stretto that you'd like to use this Coster trait:
- Set the Coster field to your own Coster implementation.
- When calling
insert
for new items or item updates, use acost
of 0.
hasher
The hasher for the Cache, default is SeaHasher.
Acknowledgements
- Thanks Dgraph's developers for providing amazing Go version Ristretto implementation.
Licensed under either of LicenseApache License, Version 2.0 or MIT license at your option.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in this project by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.