Fast pseudorandom number generator.
- Optimized for speed.
- Performance scales on multiple CPUs.
It abuses sync.Pool for maintaining "per-CPU" pseudorandom number generators.
TODO: firgure out how to use real per-CPU pseudorandom number generators.
$ GOMAXPROCS=1 go test -bench=. github.com/valyala/fastrand
goos: linux
goarch: amd64
pkg: github.com/valyala/fastrand
BenchmarkUint32n 50000000 29.7 ns/op
BenchmarkRNGUint32n 200000000 6.50 ns/op
BenchmarkRNGUint32nWithLock 100000000 21.5 ns/op
BenchmarkMathRandInt31n 50000000 31.8 ns/op
BenchmarkMathRandRNGInt31n 100000000 17.9 ns/op
BenchmarkMathRandRNGInt31nWithLock 50000000 30.2 ns/op
PASS
ok github.com/valyala/fastrand 10.634s
$ GOMAXPROCS=2 go test -bench=. github.com/valyala/fastrand
goos: linux
goarch: amd64
pkg: github.com/valyala/fastrand
BenchmarkUint32n-2 100000000 17.6 ns/op
BenchmarkRNGUint32n-2 500000000 3.36 ns/op
BenchmarkRNGUint32nWithLock-2 50000000 32.0 ns/op
BenchmarkMathRandInt31n-2 20000000 51.2 ns/op
BenchmarkMathRandRNGInt31n-2 100000000 11.0 ns/op
BenchmarkMathRandRNGInt31nWithLock-2 20000000 91.0 ns/op
PASS
ok github.com/valyala/fastrand 9.543s
$ GOMAXPROCS=4 go test -bench=. github.com/valyala/fastrand
goos: linux
goarch: amd64
pkg: github.com/valyala/fastrand
BenchmarkUint32n-4 100000000 14.2 ns/op
BenchmarkRNGUint32n-4 500000000 3.30 ns/op
BenchmarkRNGUint32nWithLock-4 20000000 88.7 ns/op
BenchmarkMathRandInt31n-4 10000000 145 ns/op
BenchmarkMathRandRNGInt31n-4 200000000 8.35 ns/op
BenchmarkMathRandRNGInt31nWithLock-4 20000000 102 ns/op
PASS
ok github.com/valyala/fastrand 11.534s
As you can see, fastrand.Uint32n
scales on multiple CPUs, while rand.Int31n
doesn't scale. Their performance is comparable on GOMAXPROCS=1
,
but fastrand.Uint32n
runs 3x faster than rand.Int31n
on GOMAXPROCS=2
and 10x faster than rand.Int31n
on GOMAXPROCS=4
.