• Stars
    star
    226
  • Rank 176,514 (Top 4 %)
  • Language
    Rust
  • License
    MIT License
  • Created about 8 years ago
  • Updated 4 months ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

A fast BVH using SAH in rust

bvh

CI Docs Status codecov license Crates.io Crates.io

A crate which exports rays, axis-aligned bounding boxes, and binary bounding volume hierarchies.

About

This crate can be used for applications which contain intersection computations of rays with primitives. For this purpose a binary tree BVH (Bounding Volume Hierarchy) is of great use if the scene which the ray traverses contains a huge number of primitives. With a BVH the intersection test complexity is reduced from O(n) to O(log2(n)) at the cost of building the BVH once in advance. This technique is especially useful in ray/path tracers. For use in a shader this module also exports a flattening procedure, which allows for iterative traversal of the BVH.

Example

use bvh::aabb::{Aabb, Bounded};
use bvh::bounding_hierarchy::BHShape;
use bvh::bvh::Bvh;
use bvh::ray::Ray;
use nalgebra::{Point3, Vector3};

let origin = Point3::new(0.0,0.0,0.0);
let direction = Vector3::new(1.0,0.0,0.0);
let ray = Ray::new(origin, direction);

struct Sphere {
    position: Point3<f32>,
    radius: f32,
    node_index: usize,
}

impl Bounded<f32, 3> for Sphere {
    fn aabb(&self) -> Aabb<f32, 3> {
        let half_size = Vector3::new(self.radius, self.radius, self.radius);
        let min = self.position - half_size;
        let max = self.position + half_size;
        Aabb::with_bounds(min, max)
    }
}

impl BHShape<f32, 3> for Sphere {
    fn set_bh_node_index(&mut self, index: usize) {
        self.node_index = index;
    }
    fn bh_node_index(&self) -> usize {
        self.node_index
    }
}

let mut spheres = Vec::new();
for i in 0..1000u32 {
    let position = Point3::new(i as f32, i as f32, i as f32);
    let radius = (i % 10) as f32 + 1.0;
    spheres.push(Sphere {
        position: position,
        radius: radius,
        node_index: 0,
    });
}

let bvh = Bvh::build(&mut spheres);
let hit_sphere_aabbs = bvh.traverse(&ray, &spheres);

Explicit SIMD

This crate features some manually written SIMD instructions, currently only for the x86_64 architecture. While nalgebra provides us with generic SIMD optimization (and it does a great job for the most part) - some important functions, such as ray-aabb-intersection have been optimized by hand.

The currently optimized intersections for ray-aabb are: Type: f32, Dimension: 2,3,4 Type: f64, Dimension: 2,3,4

To enable these optimziations, you must build with the nightly toolchain and enable the simd flag.

Optimization

This crate provides BVH updating, which is also called optimization. With BVH optimization you can mutate the shapes on which the BVH is built and update the tree accordingly without rebuilding it completely. This method is very useful when there are only very few changes to a huge scene. When the major part of the scene is static, it is faster to update the tree, instead of rebuilding it from scratch.

Drawbacks

First of all, optimizing is not helpful if more than half of the scene is not static. This is due to how optimizing takes place: Given a set of indices of all shapes which have changed, the optimize procedure tries to rotate fixed constellations in search for a better surface area heuristic (SAH) value. This is done recursively from bottom to top while fixing the AABBs in the inner nodes of the BVH. Which is why it is inefficient to update the BVH in comparison to rebuilding, when a lot of shapes have moved.

Another problem with updated BVHs is, that the resulting BVH is not optimal. Assume that the scene is composed of two major groups separated by a large gap. When one shape moves from one group to another, the updating procedure will not be able to find a sequence of bottom-up rotations which inserts the shape deeply into the other branch.

The following benchmarks are run with two different datasets:

  • A randomly generated scene with unit sized cubes containing a total of (1200, 12000, and 120000 triangles).
  • Sponza, a popular scene for benchmarking graphics engines.

Intersection via traversal of the list of triangles

test testbase::bench_intersect_120k_triangles_list                       ... bench:     653,607 ns/iter (+/- 18,796)
test testbase::bench_intersect_sponza_list                               ... bench:     542,108 ns/iter (+/- 8,705)

This is the most naive approach to intersecting a scene with a ray. It defines the baseline.

Intersection via traversal of the list of triangles with AABB checks

test testbase::bench_intersect_120k_triangles_list_aabb                  ... bench:     229,088 ns/iter (+/- 6,727)
test testbase::bench_intersect_sponza_list_aabb                          ... bench:     107,514 ns/iter (+/- 1,511)

AABB checks are cheap, compared to triangle-intersection algorithms. Therefore, preceeding AABB checks increase intersection speed by filtering negative results a lot faster.

Build of a BVH from scratch

test flat_bvh::bench::bench_build_1200_triangles_flat_bvh                ... bench:     538,474 ns/iter (+/- 4,001)
test flat_bvh::bench::bench_build_12k_triangles_flat_bvh                 ... bench:   6,373,530 ns/iter (+/- 37,217)
test flat_bvh::bench::bench_build_120k_triangles_flat_bvh                ... bench:  74,005,254 ns/iter (+/- 564,271)
test bvh::bvh::bench::bench_build_1200_triangles_bvh                     ... bench:     510,408 ns/iter (+/- 5,240)
test bvh::bvh::bench::bench_build_12k_triangles_bvh                      ... bench:   5,982,294 ns/iter (+/- 31,480)
test bvh::bvh::bench::bench_build_120k_triangles_bvh                     ... bench:  70,182,316 ns/iter (+/- 1,266,142)
test bvh::bvh::bench::bench_build_sponza_bvh                             ... bench:  46,802,305 ns/iter (+/- 184,644)

Flatten a BVH

test flat_bvh::bench::bench_flatten_120k_triangles_bvh                   ... bench:   3,891,505 ns/iter (+/- 42,360)

As you can see, building a BVH takes a long time. Building a BVH is only useful if the number of intersections performed on the scene exceeds the build duration. This is the case in applications such as ray and path tracing, where the minimum number of intersections is 1280 * 720 for an HD image.

Intersection via BVH traversal

test flat_bvh::bench::bench_intersect_1200_triangles_flat_bvh            ... bench:         168 ns/iter (+/- 2)
test flat_bvh::bench::bench_intersect_12k_triangles_flat_bvh             ... bench:         397 ns/iter (+/- 4)
test flat_bvh::bench::bench_intersect_120k_triangles_flat_bvh            ... bench:         913 ns/iter (+/- 11)
test bvh::bvh::bench::bench_intersect_1200_triangles_bvh                 ... bench:         157 ns/iter (+/- 2)
test bvh::bvh::bench::bench_intersect_12k_triangles_bvh                  ... bench:         384 ns/iter (+/- 6)
test bvh::bvh::bench::bench_intersect_120k_triangles_bvh                 ... bench:         858 ns/iter (+/- 14)
test bvh::bvh::bench::bench_intersect_sponza_bvh                         ... bench:       1,428 ns/iter (+/- 17)
test ray::bench::bench_intersects_aabb                                   ... bench:      34,920 ns/iter (+/- 240)
test ray::bench::bench_intersects_aabb_branchless                        ... bench:      34,867 ns/iter (+/- 214)
test ray::bench::bench_intersects_aabb_naive                             ... bench:      34,958 ns/iter (+/- 259)

These performance measurements show that traversing a BVH is much faster than traversing a list.

Optimization

Warning The optimization benchmarks here are no longer current, and perform around 1/4 as fast as the current implementation. This section needs to be revisited.

The benchmarks for how long it takes to update the scene also contain a randomization process which takes some time.

test bvh::optimization::bench::bench_optimize_bvh_120k_00p               ... bench:   1,123,662 ns/iter (+/- 3,797)
test bvh::optimization::bench::bench_optimize_bvh_120k_01p               ... bench:   6,584,151 ns/iter (+/- 1,375,770)
test bvh::optimization::bench::bench_optimize_bvh_120k_10p               ... bench:  39,725,368 ns/iter (+/- 12,175,627)
test bvh::optimization::bench::bench_optimize_bvh_120k_50p               ... bench: 167,396,675 ns/iter (+/- 55,555,366)
test bvh::optimization::bench::bench_randomize_120k_50p                  ... bench:   3,397,073 ns/iter (+/- 14,335)

This is the place where you have to differentiate between rebuilding the tree from scratch or trying to optimize the old one. These tests show the impact of moving around a particular percentage of shapes (10p => 10%). It is important to note that the randomization process here moves triangles around indiscriminately. This will also lead to cases where the BVH would have to be restructured completely.

Intersection after the optimization

These intersection tests are grouped by dataset and by the BVH generation method.

  • _after_optimize uses a BVH which was kept up to date with calls to optimize, while
  • _with_rebuild uses the same triangle data as _after_optimize, but constructs a BVH from scratch.

120K Triangles

test bvh::optimization::bench::bench_intersect_120k_after_optimize_00p   ... bench:         857 ns/iter (+/- 8)
test bvh::optimization::bench::bench_intersect_120k_after_optimize_01p   ... bench:     139,767 ns/iter (+/- 10,031)
test bvh::optimization::bench::bench_intersect_120k_after_optimize_10p   ... bench:   1,307,082 ns/iter (+/- 315,000)
test bvh::optimization::bench::bench_intersect_120k_after_optimize_50p   ... bench:   2,098,761 ns/iter (+/- 568,199)

test bvh::optimization::bench::bench_intersect_120k_with_rebuild_00p     ... bench:         858 ns/iter (+/- 8)
test bvh::optimization::bench::bench_intersect_120k_with_rebuild_01p     ... bench:         917 ns/iter (+/- 9)
test bvh::optimization::bench::bench_intersect_120k_with_rebuild_10p     ... bench:       1,749 ns/iter (+/- 21)
test bvh::optimization::bench::bench_intersect_120k_with_rebuild_50p     ... bench:       1,988 ns/iter (+/- 35)

Sponza

test bvh::optimization::bench::bench_intersect_sponza_after_optimize_00p ... bench:       1,433 ns/iter (+/- 17)
test bvh::optimization::bench::bench_intersect_sponza_after_optimize_01p ... bench:       2,745 ns/iter (+/- 56)
test bvh::optimization::bench::bench_intersect_sponza_after_optimize_10p ... bench:       3,729 ns/iter (+/- 97)
test bvh::optimization::bench::bench_intersect_sponza_after_optimize_50p ... bench:       5,598 ns/iter (+/- 199)

test bvh::optimization::bench::bench_intersect_sponza_with_rebuild_00p   ... bench:       1,426 ns/iter (+/- 45)
test bvh::optimization::bench::bench_intersect_sponza_with_rebuild_01p   ... bench:       1,540 ns/iter (+/- 29)
test bvh::optimization::bench::bench_intersect_sponza_with_rebuild_10p   ... bench:       1,880 ns/iter (+/- 41)
test bvh::optimization::bench::bench_intersect_sponza_with_rebuild_50p   ... bench:       2,375 ns/iter (+/- 54)

This set of tests shows the impact of randomly moving triangles around and producing degenerated trees. The 120K Triangles dataset has been updated randomly. The Sponza scene was updated using a method which has a maximum offset distance for shapes. This simulates a more realistic scenario.

We also see that the Sponza scene by itself contains some structures which can be tightly wrapped in a BVH. By mowing those structures around we destroy the locality of the triangle groups which leads to more branches in the BVH requiring a check, thus leading to a higher intersection duration.

Running the benchmark suite

The benchmark suite uses features from the test crate and therefore cannot be run on stable rust. Using a nightly toolchain, run cargo bench --features bench.

More Repositories

1

genact

πŸŒ€ A nonsense activity generator
Rust
7,449
star
2

miniserve

🌟 For when you really just want to serve some files over HTTP right now!
Rust
4,653
star
3

rofi-calc

πŸ–© Do live calculations in rofi!
C
756
star
4

upload-release-action

Upload files to a GitHub release
TypeScript
429
star
5

cargo-profiler

Cargo subcommand to profile binaries
Rust
426
star
6

rust-web-boilerplate

Rust web template for modern web backend applications
Rust
286
star
7

wmfocus

Visually focus windows by label
Rust
195
star
8

flamejam

A generic game jam application with ratings and comments using Flask
HTML
140
star
9

memefs

Mount your memes using FUSE
Rust
131
star
10

glsl-language-server

Language server implementation for GLSL
C++
104
star
11

proxyboi

A super simple reverse proxy with TLS support
Rust
81
star
12

dwarf_fortress_unfuck

Unfucking Dwarf Fortress
C++
80
star
13

dummyhttp

Super simple HTTP server that replies a fixed body with a fixed response code
Rust
56
star
14

pseudoform

Pseudoform is a community-driven collaboration project that aims to create an involving and brain-melting first-person puzzle-solving game.
C++
38
star
15

trac0r

A fast real time physically based renderer
C++
29
star
16

upx-action

Strips and runs upx on binaries
JavaScript
25
star
17

mt940-rs

A MT940 parser in Rust
Rust
22
star
18

python-web-boilerplate

Python web template for modern web backend applications
Python
21
star
19

keycloak-http-webhook-provider

A Keycloak provider that posts events to a URL via HTTP POST as JSON
Java
20
star
20

derp

The derp game engine in D
D
20
star
21

gamejam

This is a gamejam project for all kinds of jams.
Lua
19
star
22

wiresmith

Auto-config WireGuard clients into a mesh
Rust
17
star
23

lglive

live.linuX-gamers live gaming distro
Shell
16
star
24

site24x7_exporter

A Prometheus compatible exporter for site24x7.com
Rust
13
star
25

crabcluster

A simple integrated container orchestration solution
Rust
12
star
26

pytest-colordots

Colorizes the progress indicators
Python
12
star
27

dotfiles

Various dotfiles from my machines
Vim Script
12
star
28

minimal-examples

A bunch of small standalone programs
C++
11
star
29

proby

πŸ“‘ Check whether hosts are reachable on certain ports and return result on HTTP
Rust
10
star
30

playgrounds

Just some playing around with physics.
C++
9
star
31

Spacescape

A space skybox tool using Ogre3D
C++
8
star
32

fints-institute-db

A library and CLI tool to access FinTS access information for many German banks
Rust
6
star
33

talks

Talks for conferences and such
HTML
5
star
34

fints-rs

A compliant FinTS implementation
Rust
5
star
35

vulkanology

Test Vulkan compute shaders using Rust
Rust
5
star
36

qponies

Desktop ponies using qml
C++
5
star
37

NoisyHunter

Implementation of http://www.squidi.net/three/entry.php?id=85
C++
5
star
38

Pseudoform-2

C++
4
star
39

piplayer

Receive a command, play a song
Rust
4
star
40

bankthing

πŸ’° A thing that does stuff with banks
Python
3
star
41

instacheer

Instant cheer for your desktop!
C++
3
star
42

benchy

FOSS cross-platform desktop benchmarking tool
C++
3
star
43

MegaGong

School gong for my class
C++
2
star
44

ilswlol

Wer weiss?
Python
2
star
45

arkenon

Stuff with spaceships
C++
2
star
46

pong4p

4-player multiplayer pong
Python
2
star
47

minitraderoute

A Mini Metro-inspired space trading game
Rust
2
star
48

txt2vid

A text to video renderer
2
star
49

windowcrap

Experimental crap for your desktop!
C++
2
star
50

esp32-toys

Just playing around with ESP32
Python
2
star
51

ssl-proxy-docker

Dockerized version of https://github.com/suyashkumar/ssl-proxy
Dockerfile
2
star
52

rust-web-experiments

Various experiments with rust in the web
Rust
2
star
53

overpower

CLI tool to benchmark web servers with nice output
Rust
2
star
54

rust-timescale-sqlx-rocket

A little integration test of Rust, TimescaleDB, sqlx, and Rocket
Rust
2
star
55

lolpi

A simple implementation of Bellard's formula for calculating PI.
C++
2
star
56

uefi-playground

Experiments with UEFI
Makefile
2
star
57

innojam13

IGJam #13
Rust
1
star
58

cargo-sweb

Simple and convenient cargo subcommand to develop web applications
1
star
59

gameservers

Instant game servers for various games
Shell
1
star
60

taxi-simulation

Schizl
Rust
1
star
61

innojam8

InnoGames Game Jam 8
JavaScript
1
star
62

infinerator

Image generator to generate every possible combination of a given resolution and color depth
C++
1
star
63

pink-fluffy-unicorns

Interactive Visual Computing using POVRay
TeX
1
star
64

qemu-espressif-docker

A container image containing espressif-qemu
Dockerfile
1
star
65

caca

Copy And Convert Audio
Python
1
star
66

fredjam2018

Rust
1
star
67

svenstaro

My GitHub profile page
1
star
68

gamedev-starter-packs

Starter packs for various languages
Lua
1
star
69

docker-archlinux-bootstrap

Arch Linux Docker base image that is generated from the official bootstrap
Shell
1
star
70

epaper-frame

A color e-ink-based picture frame driven by an ESP32
Rust
1
star
71

docker-phoronix-test-suite

A Docker image featuring a ready-to-run installation of the Phoronix Test Suite
Dockerfile
1
star
72

uni-projekt

Projekt Mikrocomputer
Python
1
star
73

rust-python-experiment

Experiments involving Rust + Python
Rust
1
star
74

strapon

Modern C++14 helpers for game stuff
C++
1
star