  • Stars
    23
  • Rank 1,016,462 (Top 21 %)
  • Language
    C++
  • License
    MIT License
  • Created almost 9 years ago
  • Updated over 3 years ago

Repository Details

Infrastructure for simultaneous orbital and attitude propagation, with attitude-based real-time analytical aerodynamics simulation

More Repositories

1. CRC (C++, 219 stars)
   Fastest CRC32 for x86, Intel and AMD, + comprehensive derivation and discussion of various approaches

2. RGB2Y (C++, 89 stars)
   Fastest CPU (AVX/SSE) RGB to grayscale: 2-4x faster than OpenCV. For image processing/computer vision.

3. KFAST (C, 74 stars)
   Implementation of FAST feature detector for computer vision (Rosten 2006) using AVX2 to outperform canonical implementation by up to 600%.

4. SortingNetworks (Assembly, 42 stars)
   Fastest CPU SIMD (SSE4) sorting networks for small integer arrays (2-6 elements), also optimal amd64 assembly and notes on getting compilers to generate optimal sorting networks.

5. KORAL (C++, 38 stars)
   Novel extreme-performance CPU-GPU cooperative feature detector-descriptor for computer vision.

6. FastArrayOps (Assembly, 36 stars)
   Extremely fast x86 / AVX2 assembly implementations of common operations for linear arrays: checking whether array contains element, finding index of element, finding min/max element, finding index of min/max element.

7. LATCH (C++, 34 stars)
   Fastest CPU implementation of the LATCH 512-bit binary feature descriptor; fully scale- and rotation-invariant

8. CLATCH (C++, 32 stars)
   Insanely fast CUDA LATCH: fully scale- and rotation-invariant 512-bit binary descriptor for computer vision

9. CUDAKfNN (C++, 25 stars)
   Fastest CUDA SIFT or other 128-float vector matcher for computer vision

10. FastDivide (C++, 22 stars)
    Divide 64-bit integers faster than hardware. Or precompute for a given denom and quickly divide repeatedly.

11. KLERP (C++, 19 stars)
    Fastest CPU (AVX2) Bilinear and Nearest-Neighbor Interpolation: 25-100% faster than OpenCV. For computer vision / image processing.

12. CUDAK2NN (C++, 14 stars)
    Insanely fast CUDA 2NN 512-bit binary descriptor matcher for computer vision

13. CUDARGB2Y (C++, 14 stars)
    Fastest CUDA RGB to grayscale: 5-30x faster than OpenCV. For image processing/computer vision.

14. KNES (C++, 14 stars)
    Complete, lightweight NES emulator in C++, speedcoded in 3 days.

15. KfNN (C++, 13 stars)
    Fastest CPU (AVX/SSE) SIFT or other 128-float vector matcher for computer vision

16. CUDALERP (C++, 12 stars)
    Fast CUDA (GPU) Bilinear and Nearest-Neighbor Interpolation at high accuracy - uint8_t data

17. BoxBlur (C++, 10 stars)
    Fastest CPU (AVX/SSE) Horizontal Box Blur for image processing and computer vision

18. K2NN (C++, 10 stars)
    Fast bruteforce and Multi-Index Hash (MIH) accelerated 2NN matchers for 512-bit binary descriptors for computer vision

19. CUDAHammingMean (C++, 9 stars)
    Fastest GPU implementation of a brute-force Hamming-weight matrix sum/mean for 512-bit binary descriptors.

20. ULATCH (C++, 9 stars)
    Fastest CPU implementation of the LATCH 512-bit binary feature descriptor for computer vision (upright)

21. CUDAFLERP (C++, 9 stars)
    Fast CUDA (GPU) Bilinear and Nearest-Neighbor Interpolation at high accuracy - float32 data

22. FastThreadPool (C++, 8 stars)
    Fast lock-free thread pool

23. UCLATCH (C++, 8 stars)
    Insanely fast CUDA LATCH 512-bit binary descriptor for computer vision (upright)

24. FastIntegerSqrt (C++, 7 stars)
    Fastest implementations of 32-bit and 64-bit integer square roots for x86-64

25. FeatureAngle (C++, 7 stars)
    Extremely fast SSE gradient (angle of rotation) computation of grayscale features in an image, for image processing and computer vision.

26. popcount (C++, 6 stars)
    Fastest possible x86 implementation of popcount/population count/Hamming weight/counting set bits

27. BitOps (C++, 6 stars)
    Basic, efficient, header-only bit ops and bit array primitives for modern x86. Tests provided.

28. MATLAB-KDrag (MATLAB, 6 stars)
    Orbital and attitude propagator with B-dot and *dynamic* aerodynamic drag simulation, including torque computation for aero-stabilized bodies.

29. CUDAKfNN_packed (C++, 5 stars)
    Fastest CUDA SIFT or other 128-float *packed as uint8_t* vector matcher for computer vision

30. EllipticCurveFactorization (C++, 5 stars)
    Fast, single-file, MIT-licensed large integer factorization using ECM combined with other techniques.

31. PyCruiseControl (Python, 5 stars)
    Modified divorced PID controller applied to car cruise control and accompanying physics simulation and visualizations

32. ArduinoPhysics (C++, 5 stars)
    Realtime 2D physics and collision detection on an Arduino with 60 fps output to a Sharp memory LCD.

33. MemoryOrder (C++, 5 stars)
    Demos of 3 ways even the strong memory model of x86 can exhibit architectural memory reordering, leading to bugs

34. PrimeSieve (C++, 4 stars)
    Super fast, dynamically expanding prime sieve for primality queries, forward or backward iteration

35. ModularSqrt (C++, 4 stars)
    Fast modular square root of primes and prime powers, including 2. Interface uses GMP bigints.

36. KFAST_OpenMVG (C++, 4 stars)
    Custom version of KFAST for integration into OpenMVG

37. smart_tm (C++, 3 stars)
    a smart, leap-second- and leap-day-aware, fast, 64-bit-capable replacement for the ctime 'tm' struct

38. KHALF (C++, 3 stars)
    Optimized special-case bilinear interpolation, halving the width and not changing the height, for computer vision dual-frame display.

39. MATLABCruiseControl (MATLAB, 3 stars)
    Modified divorced PID controller applied to car cruise control and accompanying physics simulation and visualizations - MATLAB port

40. Factorization-Primality (C++, 3 stars)
    Extremely fast, single-file factorization and primality testing for 32-bit and 64-bit integers on x86.

41. SMC-Demo (Assembly, 3 stars)
    Minimal demo of self-modifying code on Windows. Still doable, still useful.

42. UnsignedIntegralToFloatingPoint (3 stars)
    Notes on fast standards-compliant conversion of U32/U64 to and from float/double, which compilers do not get right.

43. SingleLinePythonSudoku (2 stars)
    Single-line Python Sudoku solver

44. Boids_SDL (C++, 2 stars)
    Numerical simulation of flocking behavior using pure CPU and SDL.

45. Sudoku (C++, 2 stars)
    Fast sudoku solver with detection of no solution/single solution/multiple solutions/invalid initial board

46. SolveModularQuadratic (C++, 2 stars)
    Generate all solutions to a modular quadratic equation. Supports any modulus. Interface uses GMP bigints.

47. CudaBoids (Cuda, 2 stars)
    Numerical simulation of flocking behavior using CUDA and OpenGL

48. Schematic (C++, 2 stars)
    Basic toy Lisp interpreter in a few hundred lines of C++.

49. Leftpack (C++, 1 star)
    Fast AVX2 leftpack/compress implementations (keep and contiguously pack a subset of elements)

50. U128 (C++, 1 star)
    Fast unsigned 128-bit integer class for MSVC since it doesn't natively support __uint128_t yet

51. FastDivide128 (C++, 1 star)
    Getting __udivti3 or __umodti3 errors? Just want faster division/modulo for 128-bit ints on Clang? Look no further.