  • Stars: 585
  • Rank 76,419 (Top 2%)
  • Language: C
  • License: MIT License
  • Created almost 9 years ago
  • Updated 8 months ago

Repository Details

More than 50% faster on average than the standard memcpy, measured with gcc 4.9 and MSVC 2012.

Build SSE

with gcc:

gcc -O3 -msse2 FastMemcpy.c -o FastMemcpy

with msvc:

cl -nologo -arch:SSE2 -O2 FastMemcpy.c

Build AVX

with gcc:

gcc -O3 -mavx FastMemcpy_Avx.c -o FastMemcpy_Avx

with msvc:

cl -nologo -arch:AVX -O2 FastMemcpy_Avx.c

Features

  • About 50% faster on average than the standard memcpy (MSVC 2012 or gcc 4.9)
  • Small copies are optimized with a jump table
  • Medium copies are optimized with SSE2 vector copies
  • Huge copies are optimized with cache prefetch and movntdq (non-temporal stores)
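
The size-based strategy in the list above can be sketched roughly as follows. This is a minimal illustration, not the code from FastMemcpy.c: the names `copy_tiny` and `sketch_memcpy` and the 16-byte threshold are assumptions made for this example, and the huge-copy path is left out. When SSE2 is not available the sketch simply falls back to the standard memcpy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define SKETCH_HAVE_SSE2 1
#else
#define SKETCH_HAVE_SSE2 0
#endif

/* Hypothetical helper: copies 0..16 bytes. A dense switch on the size is
 * typically lowered to a jump table, and the fixed-size memcpy calls
 * usually become plain register moves. */
static void copy_tiny(unsigned char *d, const unsigned char *s, size_t n)
{
    assert(n <= 16);
    switch (n) {
    case 0: break;
    case 1: d[0] = s[0]; break;
    case 2: memcpy(d, s, 2); break;
    case 3: memcpy(d, s, 2); d[2] = s[2]; break;
    case 4: memcpy(d, s, 4); break;
    case 5: case 6: case 7:
        /* two 4-byte moves whose ranges overlap in the middle */
        memcpy(d, s, 4);
        memcpy(d + n - 4, s + n - 4, 4);
        break;
    default:
        /* 8..16 bytes: two 8-byte moves whose ranges may overlap */
        memcpy(d, s, 8);
        memcpy(d + n - 8, s + n - 8, 8);
        break;
    }
}

/* Hypothetical entry point: dispatches on size. The source and destination
 * must not overlap, as with memcpy. */
void *sketch_memcpy(void *dst, const void *src, size_t size)
{
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;

    if (size <= 16) {
        copy_tiny(d, s, size);
        return dst;
    }
#if SKETCH_HAVE_SSE2
    /* Medium sizes: 16 bytes per iteration with unaligned SSE2 moves. */
    while (size >= 16) {
        __m128i v = _mm_loadu_si128((const __m128i *)s);
        _mm_storeu_si128((__m128i *)d, v);
        d += 16; s += 16; size -= 16;
    }
    copy_tiny(d, s, size);   /* fewer than 16 bytes remain */
#else
    memcpy(d, s, size);      /* no SSE2: use the standard routine */
#endif
    return dst;
}
```

A production version would also align the destination before the vector loop and switch to non-temporal stores above some size threshold.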

Reference

Using Block Prefetch for Optimized Memory Performance

The article focuses only on copying huge, aligned memory blocks. You need to handle the other cases yourself.
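
For the aligned, huge-block case, a copy loop with software prefetch and non-temporal stores might look like the sketch below. It is an assumption-laden illustration rather than the article's exact block-prefetch scheme or this repository's code: the function name `copy_huge_aligned`, the 64-byte step, and the 256-byte prefetch distance are all chosen for the example. `_mm_stream_si128` is the SSE2 intrinsic that corresponds to movntdq.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define SKETCH_HAVE_SSE2 1
#else
#define SKETCH_HAVE_SSE2 0
#endif

/* Hypothetical helper: dst and src must be 16-byte aligned and must not
 * overlap. Any tail shorter than 64 bytes is handed to memcpy. */
void copy_huge_aligned(void *dst, const void *src, size_t size)
{
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;

    assert(((uintptr_t)d & 15) == 0);
    assert(((uintptr_t)s & 15) == 0);

#if SKETCH_HAVE_SSE2
    while (size >= 64) {
        __m128i v0, v1, v2, v3;

        /* Ask for data a few cache lines ahead, while staying in bounds. */
        if (size > 256)
            _mm_prefetch((const char *)(s + 256), _MM_HINT_NTA);

        v0 = _mm_load_si128((const __m128i *)(s + 0));
        v1 = _mm_load_si128((const __m128i *)(s + 16));
        v2 = _mm_load_si128((const __m128i *)(s + 32));
        v3 = _mm_load_si128((const __m128i *)(s + 48));

        /* Non-temporal stores write around the cache (movntdq). */
        _mm_stream_si128((__m128i *)(d + 0), v0);
        _mm_stream_si128((__m128i *)(d + 16), v1);
        _mm_stream_si128((__m128i *)(d + 32), v2);
        _mm_stream_si128((__m128i *)(d + 48), v3);

        d += 64; s += 64; size -= 64;
    }
    _mm_sfence();   /* order the streaming stores before later accesses */
#endif
    memcpy(d, s, size);   /* remaining bytes, or everything without SSE2 */
}
```

Unaligned pointers would first need a short scalar or unaligned-vector prologue to reach a 16-byte boundary, which is one of the "other cases" the article leaves out.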

Results

Results with gcc 4.9 (MSVC 2012 gives similar results):
 
benchmark(size=32 bytes, times=16777216):
result(dst aligned, src aligned): memcpy_fast=81ms memcpy=281 ms
result(dst aligned, src unalign): memcpy_fast=88ms memcpy=254 ms
result(dst unalign, src aligned): memcpy_fast=87ms memcpy=245 ms
result(dst unalign, src unalign): memcpy_fast=81ms memcpy=258 ms

benchmark(size=64 bytes, times=16777216):
result(dst aligned, src aligned): memcpy_fast=91ms memcpy=364 ms
result(dst aligned, src unalign): memcpy_fast=95ms memcpy=336 ms
result(dst unalign, src aligned): memcpy_fast=96ms memcpy=353 ms
result(dst unalign, src unalign): memcpy_fast=99ms memcpy=346 ms

benchmark(size=512 bytes, times=8388608):
result(dst aligned, src aligned): memcpy_fast=124ms memcpy=242 ms
result(dst aligned, src unalign): memcpy_fast=166ms memcpy=555 ms
result(dst unalign, src aligned): memcpy_fast=168ms memcpy=602 ms
result(dst unalign, src unalign): memcpy_fast=174ms memcpy=614 ms

benchmark(size=1024 bytes, times=4194304):
result(dst aligned, src aligned): memcpy_fast=119ms memcpy=171 ms
result(dst aligned, src unalign): memcpy_fast=182ms memcpy=442 ms
result(dst unalign, src aligned): memcpy_fast=163ms memcpy=466 ms
result(dst unalign, src unalign): memcpy_fast=168ms memcpy=472 ms

benchmark(size=4096 bytes, times=524288):
result(dst aligned, src aligned): memcpy_fast=68ms memcpy=82 ms
result(dst aligned, src unalign): memcpy_fast=94ms memcpy=226 ms
result(dst unalign, src aligned): memcpy_fast=134ms memcpy=216 ms
result(dst unalign, src unalign): memcpy_fast=84ms memcpy=188 ms

benchmark(size=8192 bytes, times=262144):
result(dst aligned, src aligned): memcpy_fast=55ms memcpy=70 ms
result(dst aligned, src unalign): memcpy_fast=75ms memcpy=192 ms
result(dst unalign, src aligned): memcpy_fast=79ms memcpy=223 ms
result(dst unalign, src unalign): memcpy_fast=91ms memcpy=219 ms

benchmark(size=1048576 bytes, times=2048):
result(dst aligned, src aligned): memcpy_fast=181ms memcpy=165 ms
result(dst aligned, src unalign): memcpy_fast=192ms memcpy=303 ms
result(dst unalign, src aligned): memcpy_fast=218ms memcpy=310 ms
result(dst unalign, src unalign): memcpy_fast=183ms memcpy=307 ms

benchmark(size=4194304 bytes, times=512):
result(dst aligned, src aligned): memcpy_fast=263ms memcpy=398 ms
result(dst aligned, src unalign): memcpy_fast=269ms memcpy=433 ms
result(dst unalign, src aligned): memcpy_fast=306ms memcpy=497 ms
result(dst unalign, src unalign): memcpy_fast=285ms memcpy=417 ms

benchmark(size=8388608 bytes, times=256):
result(dst aligned, src aligned): memcpy_fast=287ms memcpy=421 ms
result(dst aligned, src unalign): memcpy_fast=288ms memcpy=430 ms
result(dst unalign, src aligned): memcpy_fast=285ms memcpy=510 ms
result(dst unalign, src unalign): memcpy_fast=291ms memcpy=440 ms

benchmark random access:
memcpy_fast=487ms memcpy=1000ms
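
A timing harness of the kind that could produce rows like those above is sketched here. It is a simplified assumption, not the benchmark shipped with the project: `bench_copy`, `speedup_ratio`, the 64-byte padding, and the use of `clock()` are choices made for this example, and "aligned" here only means "at the address returned by malloc".

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Any function with memcpy's signature can be measured. */
typedef void *(*copy_fn)(void *dst, const void *src, size_t size);

/* Hypothetical helper: runs `times` (>= 1) copies of `size` bytes and
 * returns the elapsed CPU time in milliseconds, or -1.0 on bad arguments,
 * allocation failure, or a wrong copy. An offset of 0 keeps the malloc
 * alignment; an offset such as 1 deliberately breaks it. */
double bench_copy(copy_fn fn, size_t size, long times,
                  size_t dst_off, size_t src_off)
{
    const size_t pad = 64;
    unsigned char *src, *dst;
    clock_t t0, t1;
    long i;
    int ok;

    if (dst_off >= pad || src_off >= pad)
        return -1.0;
    src = (unsigned char *)malloc(size + pad);
    dst = (unsigned char *)malloc(size + pad);
    if (src == NULL || dst == NULL) {
        free(src);
        free(dst);
        return -1.0;
    }
    memset(src, 0x5a, size + pad);
    memset(dst, 0, size + pad);

    t0 = clock();
    for (i = 0; i < times; i++)
        fn(dst + dst_off, src + src_off, size);
    t1 = clock();

    /* Check the result so a broken copy routine is not reported as fast. */
    ok = memcmp(dst + dst_off, src + src_off, size) == 0;
    free(src);
    free(dst);
    return ok ? (double)(t1 - t0) * 1000.0 / CLOCKS_PER_SEC : -1.0;
}

/* How many times faster the candidate is than the baseline. */
double speedup_ratio(double baseline_ms, double fast_ms)
{
    return baseline_ms / fast_ms;
}
```

Under these assumptions, `bench_copy(memcpy, 32, 16777216, 0, 1)` would correspond to the "dst aligned, src unalign" row of the 32-byte benchmark, and `speedup_ratio(281.0, 81.0)` puts the first reported row at roughly 3.5 times faster. An optimizing compiler may shorten such a loop, so a careful benchmark would also add compiler barriers or vary the buffers between iterations.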

About

skywind

http://www.skywind.me
