  • Stars: 239
  • Rank 168,763 (Top 4 %)
  • Language: C++
  • License: MIT License
  • Created almost 7 years ago
  • Updated over 1 year ago

Repository Details

Blazing fast memory allocator designed for video games

smmalloc

About

smmalloc is a fast and efficient "proxy" allocator designed to handle many small allocations and deallocations in heavily multithreaded scenarios. It was created for applications where performance is critical, such as video games, and is designed to speed up a typical C++ memory allocation pattern: many small allocations and deallocations. Being a proxy allocator means that smmalloc handles only specific allocation sizes and passes all other allocations through to a generic heap allocator.
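The proxy idea can be illustrated with a minimal, single-threaded sketch. This is not smmalloc's implementation: the class `ProxyAllocatorSketch`, its methods, and the 16-byte bucket step are hypothetical names and assumptions chosen for illustration, and alignment requests are ignored. Small sizes are served from fixed-size buckets carved out of one arena, everything else goes to `std::malloc`, and ownership is decided purely from the pointer's address range.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical illustration only; none of these names come from smmalloc.
// Not thread-safe. bucketBytes is assumed to be a multiple of 16.
class ProxyAllocatorSketch {
public:
    static constexpr std::size_t kStep = 16;  // assumed bucket granularity

    // Bucket b serves blocks of (b + 1) * 16 bytes: 16, 32, 48, ...
    ProxyAllocatorSketch(std::size_t bucketCount, std::size_t bucketBytes)
        : bucketBytes_(bucketBytes),
          arena_(static_cast<unsigned char*>(std::malloc(bucketCount * bucketBytes))),
          freeLists_(bucketCount) {
        if (arena_ == nullptr) throw std::bad_alloc();
        for (std::size_t b = 0; b < bucketCount; ++b) {
            const std::size_t slot = (b + 1) * kStep;
            unsigned char* base = arena_ + b * bucketBytes_;
            // Thread every slot of this bucket onto an intrusive free list.
            for (std::size_t off = 0; off + slot <= bucketBytes_; off += slot) {
                void** node = reinterpret_cast<void**>(base + off);
                *node = freeLists_[b];
                freeLists_[b] = node;
            }
        }
    }
    ~ProxyAllocatorSketch() { std::free(arena_); }
    ProxyAllocatorSketch(const ProxyAllocatorSketch&) = delete;
    ProxyAllocatorSketch& operator=(const ProxyAllocatorSketch&) = delete;

    void* allocate(std::size_t size) {
        if (size != 0 && size <= freeLists_.size() * kStep) {
            const std::size_t b = (size + kStep - 1) / kStep - 1;
            if (void* p = freeLists_[b]) {            // handled size, bucket not empty
                freeLists_[b] = *static_cast<void**>(p);
                return p;
            }
        }
        return std::malloc(size);                     // pass through to the generic heap
    }

    void deallocate(void* p) {
        if (p == nullptr) return;
        if (!owns(p)) { std::free(p); return; }       // block came from the fallback heap
        const std::size_t b = (addr(p) - addr(arena_)) / bucketBytes_;
        *static_cast<void**>(p) = freeLists_[b];      // push back onto the bucket free list
        freeLists_[b] = p;
    }

    // The address range alone tells whether a block is ours: no per-block header.
    bool owns(const void* p) const {
        const std::uintptr_t a = addr(p), lo = addr(arena_);
        return a >= lo && a < lo + freeLists_.size() * bucketBytes_;
    }

private:
    static std::uintptr_t addr(const void* p) { return reinterpret_cast<std::uintptr_t>(p); }

    std::size_t bucketBytes_;
    unsigned char* arena_;
    std::vector<void*> freeLists_;
};
```

In this sketch, a 19-byte request is rounded up to the 32-byte bucket, while a 4096-byte request exceeds the largest bucket and is forwarded to `std::malloc`. `deallocate` routes each block back to the right place by checking its address, which is one way an allocator can avoid storing per-allocation metadata.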

Commercial games using smmalloc

  • Warface (PS4, X1, Nintendo Switch)

Features

  • Near-zero size overhead for handled allocations (the allocator does not store any per-allocation metadata)
  • Blazing fast (~40 cycles per operation) when the allocator thread cache is used (configurable)
  • Very simple to integrate on any platform: make your current allocator the fallback allocator and let smmalloc handle specific allocations to speed up your application.
  • Highly scalable in multithreaded environments.
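To show why a thread cache can help with scalability, here is a generic sketch of the pattern, assuming a mutex-protected shared pool with a small per-thread cache in front of it. `SharedPool` and `ThreadCache` are hypothetical names and this is not smmalloc's code; it only illustrates how most operations can avoid touching shared state.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical names; a generic thread-cache pattern, not smmalloc's code.
class SharedPool {
public:
    SharedPool(std::size_t blockSize, std::size_t blockCount)
        : storage_(blockSize * blockCount) {
        for (std::size_t i = 0; i < blockCount; ++i)
            free_.push_back(storage_.data() + i * blockSize);
    }
    // Slow path: every call takes the lock, so threads contend here.
    void* pop() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (free_.empty()) return nullptr;
        void* p = free_.back();
        free_.pop_back();
        return p;
    }
    void push(void* p) {
        std::lock_guard<std::mutex> lock(mutex_);
        free_.push_back(p);
    }
    std::size_t available() {
        std::lock_guard<std::mutex> lock(mutex_);
        return free_.size();
    }

private:
    std::mutex mutex_;
    std::vector<unsigned char> storage_;
    std::vector<void*> free_;
};

// One instance per thread (for example held in a thread_local variable).
class ThreadCache {
public:
    ThreadCache(SharedPool& pool, std::size_t capacity)
        : pool_(pool), capacity_(capacity) {}
    // Return cached blocks to the shared pool when the thread is done.
    ~ThreadCache() {
        for (void* p : local_) pool_.push(p);
    }
    void* allocate() {
        if (!local_.empty()) {            // fast path: no lock, no sharing
            void* p = local_.back();
            local_.pop_back();
            return p;
        }
        return pool_.pop();               // cache miss: fall back to the shared pool
    }
    void release(void* p) {
        if (local_.size() < capacity_) local_.push_back(p);  // keep it local
        else pool_.push(p);                                  // cache full: give it back
    }

private:
    SharedPool& pool_;
    std::size_t capacity_;
    std::vector<void*> local_;
};
```

With this layout, a thread that repeatedly allocates and frees small blocks mostly hits its local vector and only takes the lock when the cache is empty or full. The cache capacity is the tuning knob: a larger cache means fewer trips to the shared pool, at the cost of more memory parked per thread.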

Performance

Here is an example performance comparison of several allocators.
Platform: Windows 10, Intel Core i7-2600

| Allocator / Threads        | 1         | 2        | 3        | 4        | 5        |
|----------------------------|-----------|----------|----------|----------|----------|
| crt                        | 23853865  | 15410410 | 15993655 | 14124607 | 14636381 |
| rpmalloc                   | 75866414  | 52689298 | 52606215 | 46058909 | 38706739 |
| hoard                      | 65922032  | 46605339 | 42874516 | 34404618 | 27629651 |
| ltalloc                    | 62525965  | 52315981 | 41634992 | 33557726 | 27333887 |
| smmalloc                   | 92615384  | 70584046 | 66352324 | 47087501 | 38303161 |
| smmalloc (no thread cache) | 49295774  | 25991465 | 11342809 | 8615216  | 6455889  |
| dlmalloc + mutex           | 31313394  | 5858632  | 3824636  | 3354672  | 2135141  |
| dlmalloc                   | 106304079 | 0        | 0        | 0        | 0        |

Here is an example performance comparison of several allocators.
Platform: PlayStation 4

| Allocator / Threads        | 1        | 2        | 3        | 4        |
|----------------------------|----------|----------|----------|----------|
| mspace                     | 4741379  | 956729   | 457264   | 366920   |
| crt                        | 4444444  | 853385   | 419009   | 332095   |
| ltalloc                    | 28571429 | 25290698 | 19248174 | 14683637 |
| smmalloc                   | 36065574 | 29333333 | 25202412 | 21868691 |
| smmalloc (no thread cache) | 22916667 | 8527132  | 5631815  | 4198497  |
| dlmalloc + mutex           | 8058608  | 1848623  | 579845   | 564604   |
| dlmalloc                   | 35483871 | 0        | 0        | 0        |

Usage

_sm_allocator_create - create allocator instance
_sm_allocator_destroy - destroy allocator instance
_sm_allocator_thread_cache_create - create thread cache for current thread
_sm_allocator_thread_cache_destroy - destroy thread cache for current thread
_sm_malloc - allocate aligned memory block
_sm_free - free memory block
_sm_realloc - reallocate memory block
_sm_msize - get usable memory size

Tiny code example

// create an allocator that handles 16-, 32-, 48- and 64-byte allocations (4 buckets, 16 MB each)
sm_allocator space = _sm_allocator_create(4, (16 * 1024 * 1024));

// allocate 19 bytes with 16-byte alignment
void* p = _sm_malloc(space, 19, 16);

// free memory
_sm_free(space, p);

// destroy allocator
_sm_allocator_destroy(space);

More Repositories

  1. TaskScheduler (C++, 557 stars): Cross-platform, fiber-based, multi-threaded task scheduler designed for video games.
  2. ArcadeCarPhysics (C#, 327 stars): Arcade Car Physics - Vehicle Simulation for Unity3D
  3. slot_map (C++, 246 stars): A slot map is a high-performance associative container with persistent unique 32/64 bit keys to access stored values.
  4. ecs (C++, 183 stars): Thoughts about entity-component-system
  5. Zmeya (C++, 102 stars): Zmeya is a header-only C++11 binary serialization library designed for games and performance-critical applications
  6. Goofy (C++, 70 stars): Goofy - Realtime DXT1/ETC1 encoder
  7. Quaternions-Revisited (C++, 31 stars): Sample code for a 'Quaternions revisited' article from GPU Pro 5
  8. Sandworm (C++, 20 stars): Embeddable preprocessor based on clang. Created to compile ubershaders, but can preprocess anything you like.
  9. GpuZen2 (C#, 17 stars): Sample code for the article 'Real-Time Layered Materials Compositing Using Spatial Clustering Encoding'
  10. z8 (JavaScript, 11 stars): Z8 : fantasy 8-bit system
  11. Ninja-Ripper-Maya-Importer- (Python, 11 stars): Ninja Ripper Importer for Autodesk Maya
  12. tony-mc-mapface-fuse (10 stars): A cool-headed display transform - ported to DaVinci Resolve
  13. FastFiberIdea (C++, 9 stars)
  14. StaticVector (C++, 5 stars)
  15. sm_hash_map (C++, 5 stars)
  16. AimExplorer (HTML, 4 stars): Aim Acceleration Explorer in FPS games
  17. DxtSplitter (C++, 3 stars)
  18. RobloxAvatarExporter (Python, 3 stars)
  19. FixMultimonWindows (C++, 3 stars)
  20. CellularAutomata (C++, 3 stars): GPU Cellular Automata fun
  21. ECalc (HTML, 2 stars): Expression Calculator (Similar to Far Manager plugin)
  22. DominantColorsExtractor (Python, 2 stars)
  23. Sputnik (Python, 2 stars)
  24. GitCheatSheet (2 stars)
  25. BuildConfigurationTool (C#, 1 star)
  26. AT-Linker (C++, 1 star)
  27. sergeymakeev.github.io (JavaScript, 1 star): github pages
  28. ExcaliburHash (C++, 1 star)
  29. XMQ_Client (C++, 1 star): Example of XMQ client.
  30. MOD_XMQ (Erlang, 1 star): Custom extension for XMPP protocol to provide message broker features.