Data Processing benchmark featuring Rust, Go, Swift, Zig, Julia etc.
Problem:
Given a list of posts, compute the top 5 related posts for each post based on the number of shared tags.
Steps
Read the posts JSON file.
Iterate over the posts and populate a map of tag -> List<int>, where each int is the index of a post that has that tag.
Iterate over the posts and for each post:
    Create a map: PostIndex -> int to track the number of shared tags.
    For each tag of the current post, iterate over the posts that have that tag.
    For each of those posts, increment its shared tag count in the map.
Sort the related posts by the number of shared tags, highest first.
Write the top 5 related posts for each post to a new JSON file.
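The steps above can be sketched in Python as follows. This is a minimal illustration, not the repository's reference implementation. The field names `_id`, `tags` and `related`, the function names, excluding a post from its own related list, and breaking ties by post index are all assumptions made for the example.

```python
import json
from collections import defaultdict

TOP_N = 5  # number of related posts to keep per post


def related_posts(posts, top_n=TOP_N):
    """Return, for each post, its top_n related posts by shared tag count."""
    # Step 2: tag -> list of indices of the posts that carry that tag.
    tag_map = defaultdict(list)
    for i, post in enumerate(posts):
        for tag in post["tags"]:
            tag_map[tag].append(i)

    results = []
    for i, post in enumerate(posts):
        # Step 3: post index -> number of tags shared with the current post.
        shared = defaultdict(int)
        for tag in post["tags"]:
            for j in tag_map[tag]:
                if j != i:  # assumption: a post is not related to itself
                    shared[j] += 1

        # Step 4: highest shared count first; ties broken by index (assumption).
        top = sorted(shared.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]

        results.append({
            "_id": post["_id"],  # assumed field name
            "tags": post["tags"],
            "related": [posts[j] for j, _ in top],
        })
    return results


def main(in_path, out_path):
    """Steps 1 and 5: read the posts JSON file and write the results."""
    with open(in_path, encoding="utf-8") as f:
        posts = json.load(f)
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(related_posts(posts), f)
```

Calling `main("posts.json", "related.json")` (hypothetical file names) would run the whole pipeline. A full sort is the simplest way to pick the top 5; keeping only the five best candidates while scanning the counts avoids sorting every related post, at the cost of slightly more code.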
Run Benchmark
./run.sh go | rust | python | all
# windows (powershell)
./run.ps1 go | rust | python | all
# OR
pwsh ./run.ps1 go | rust | python | all
# Docker (check the dockerfile for available variables)
docker build -t databench .
# OR
docker pull ghcr.io/jinyus/databench:latest
# THEN
docker run -e TEST_NAME=all -it --rm databench
Rules
No:
FFI (including assembly inlining)
Unsafe code blocks
Custom benchmarking
Disabling runtime checks (bounds etc)
Specific hardware targeting
SIMD for single threaded solutions
Hardcoding number of posts
Lazy evaluation (Unless results are computed at runtime and timed)