AVX / AVX2 Intrinsics Example Code
Quick Start
Compile
$ make
All source files under src/ are compiled, and the resulting binaries are written to the bin/ directory of each subdirectory.
Run
To compile and run everything in one step, execute this command at the project root directory:
$ make run
The output of every program is then printed to your terminal.
Clean
To remove all output files, run the following command at the project root directory:
$ make clean
All output files are then deleted.
Initialization Intrinsics
Initialization with Scalar Values
Loading Data to Memory
Arithmetic Intrinsics
Addition and Subtraction
- _mm256_add_ps
- _mm256_add_pd
- _mm256_add_epi8 (AVX2)
- _mm256_add_epi16 (AVX2)
- _mm256_add_epi32 (AVX2)
- _mm256_add_epi64 (AVX2)

- _mm256_sub_ps
- _mm256_sub_pd
- _mm256_sub_epi8 (AVX2)
- _mm256_sub_epi16 (AVX2)
- _mm256_sub_epi32 (AVX2)
- _mm256_sub_epi64 (AVX2)
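As a rough illustration of the addition and subtraction intrinsics listed above, the sketch below adds eight floats with _mm256_add_ps and subtracts eight 32-bit integers with _mm256_sub_epi32. The helper names add8_ps and sub8_epi32 are hypothetical and not taken from this project. The target attribute is a GCC/Clang extension, assumed here so the file compiles without -mavx or -mavx2.

```c
#include <immintrin.h>

/* Hypothetical helper: out[i] = a[i] + b[i] for 8 floats (AVX).
   Unaligned loads/stores are used so the arrays need no special alignment. */
__attribute__((target("avx")))
void add8_ps(const float *a, const float *b, float *out)
{
    __m256 va = _mm256_loadu_ps(a);
    __m256 vb = _mm256_loadu_ps(b);
    _mm256_storeu_ps(out, _mm256_add_ps(va, vb));
}

/* Hypothetical helper: out[i] = a[i] - b[i] for 8 32-bit integers (AVX2).
   Integer add/sub wraps around on overflow. */
__attribute__((target("avx2")))
void sub8_epi32(const int *a, const int *b, int *out)
{
    __m256i va = _mm256_loadu_si256((const __m256i *)a);
    __m256i vb = _mm256_loadu_si256((const __m256i *)b);
    _mm256_storeu_si256((__m256i *)out, _mm256_sub_epi32(va, vb));
}
```

Callers should confirm that the CPU supports the instruction set first, for example with __builtin_cpu_supports("avx2") on GCC or Clang.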
- _mm256_adds_epi8 (AVX2)
- _mm256_adds_epi16 (AVX2)
- _mm256_adds_epu8 (AVX2)
- _mm256_adds_epu16 (AVX2)

- _mm256_subs_epi8 (AVX2)
- _mm256_subs_epi16 (AVX2)
- _mm256_subs_epu8 (AVX2)
- _mm256_subs_epu16 (AVX2)
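The adds/subs intrinsics above saturate instead of wrapping: a result that would overflow is clamped to the limits of the element type. A minimal sketch, assuming GCC or Clang (for the target attribute) and using the hypothetical helper names adds32_epi8 and subs32_epu8:

```c
#include <immintrin.h>
#include <stdint.h>

/* Hypothetical helper: saturating signed 8-bit add over 32 lanes (AVX2).
   Each result is clamped to [-128, 127], e.g. 100 + 100 gives 127. */
__attribute__((target("avx2")))
void adds32_epi8(const int8_t *a, const int8_t *b, int8_t *out)
{
    __m256i va = _mm256_loadu_si256((const __m256i *)a);
    __m256i vb = _mm256_loadu_si256((const __m256i *)b);
    _mm256_storeu_si256((__m256i *)out, _mm256_adds_epi8(va, vb));
}

/* Hypothetical helper: saturating unsigned 8-bit subtract over 32 lanes (AVX2).
   Each result is clamped at 0, e.g. 10 - 20 gives 0. */
__attribute__((target("avx2")))
void subs32_epu8(const uint8_t *a, const uint8_t *b, uint8_t *out)
{
    __m256i va = _mm256_loadu_si256((const __m256i *)a);
    __m256i vb = _mm256_loadu_si256((const __m256i *)b);
    _mm256_storeu_si256((__m256i *)out, _mm256_subs_epu8(va, vb));
}
```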
- _mm256_hadds_epi16 (AVX2)
- _mm256_hsubs_epi16 (AVX2)
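The horizontal intrinsics above combine adjacent pairs of 16-bit elements with saturation. They work on each 128-bit lane separately, so pair results from the first operand and the second operand alternate in blocks of four. The sketch below shows that layout; the helper names hadds16_epi16 and hsubs16_epi16 are hypothetical, and the target attribute assumes GCC or Clang.

```c
#include <immintrin.h>
#include <stdint.h>

/* Hypothetical helper: saturating horizontal add of 16-bit pairs (AVX2).
   Output layout for inputs a[0..15], b[0..15]:
     out[0..3]   = a0+a1,   a2+a3,   a4+a5,   a6+a7
     out[4..7]   = b0+b1,   b2+b3,   b4+b5,   b6+b7
     out[8..11]  = a8+a9,   a10+a11, a12+a13, a14+a15
     out[12..15] = b8+b9,   b10+b11, b12+b13, b14+b15 */
__attribute__((target("avx2")))
void hadds16_epi16(const int16_t *a, const int16_t *b, int16_t *out)
{
    __m256i va = _mm256_loadu_si256((const __m256i *)a);
    __m256i vb = _mm256_loadu_si256((const __m256i *)b);
    _mm256_storeu_si256((__m256i *)out, _mm256_hadds_epi16(va, vb));
}

/* Hypothetical helper: same layout, but each pair is computed as
   (even element) - (odd element), e.g. out[0] = a0 - a1. */
__attribute__((target("avx2")))
void hsubs16_epi16(const int16_t *a, const int16_t *b, int16_t *out)
{
    __m256i va = _mm256_loadu_si256((const __m256i *)a);
    __m256i vb = _mm256_loadu_si256((const __m256i *)b);
    _mm256_storeu_si256((__m256i *)out, _mm256_hsubs_epi16(va, vb));
}
```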
Multiplication and Division
- _mm256_mullo_epi16 (AVX2)
- _mm256_mullo_epi32 (AVX2)

- _mm256_mulhi_epi16 (AVX2)
- _mm256_mulhi_epu16 (AVX2)

- _mm256_mulhrs_epi16 (AVX2)
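Multiplying two 16-bit integers yields a 32-bit product, which is why the list above has separate intrinsics for the low half (mullo), the high half (mulhi), and a rounded, scaled result (mulhrs, which computes (((a*b) >> 14) + 1) >> 1 and is commonly used for Q15 fixed-point math). The sketch below shows the pieces; the helper names mul16_lo_hi and mulhrs16 are hypothetical, and the target attribute assumes GCC or Clang.

```c
#include <immintrin.h>
#include <stdint.h>

/* Hypothetical helper: for 16 signed 16-bit lanes, store the low and high
   16 bits of each 32-bit product a[i] * b[i] (AVX2).
   Example: 300 * 300 = 90000 = 0x00015F90, so lo = 0x5F90 and hi = 1. */
__attribute__((target("avx2")))
void mul16_lo_hi(const int16_t *a, const int16_t *b, int16_t *lo, int16_t *hi)
{
    __m256i va = _mm256_loadu_si256((const __m256i *)a);
    __m256i vb = _mm256_loadu_si256((const __m256i *)b);
    _mm256_storeu_si256((__m256i *)lo, _mm256_mullo_epi16(va, vb));
    _mm256_storeu_si256((__m256i *)hi, _mm256_mulhi_epi16(va, vb));
}

/* Hypothetical helper: rounded Q15 multiply over 16 lanes (AVX2).
   Example: 16384 (0.5 in Q15) times 16384 gives 8192 (0.25 in Q15). */
__attribute__((target("avx2")))
void mulhrs16(const int16_t *a, const int16_t *b, int16_t *out)
{
    __m256i va = _mm256_loadu_si256((const __m256i *)a);
    __m256i vb = _mm256_loadu_si256((const __m256i *)b);
    _mm256_storeu_si256((__m256i *)out, _mm256_mulhrs_epi16(va, vb));
}
```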
Fused Multiply and Add (FMA)
- _mm_fmadd_ps (FMA)
- _mm_fmadd_pd (FMA)
- _mm256_fmadd_ps (FMA)
- _mm256_fmadd_pd (FMA)
- _mm_fmadd_ss (FMA)
- _mm_fmadd_sd (FMA)

- _mm_fmsub_ps (FMA)
- _mm_fmsub_pd (FMA)
- _mm256_fmsub_ps (FMA)
- _mm256_fmsub_pd (FMA)
- _mm_fmsub_ss (FMA)
- _mm_fmsub_sd (FMA)

- _mm_fnmadd_ps (FMA)
- _mm_fnmadd_pd (FMA)
- _mm256_fnmadd_ps (FMA)
- _mm256_fnmadd_pd (FMA)
- _mm_fnmadd_ss (FMA)
- _mm_fnmadd_sd (FMA)

- _mm_fnmsub_ps (FMA)
- _mm_fnmsub_pd (FMA)
- _mm256_fnmsub_ps (FMA)
- _mm256_fnmsub_pd (FMA)
- _mm_fnmsub_ss (FMA)
- _mm_fnmsub_sd (FMA)
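The four families above differ only in the signs applied to the product and the addend, and each computes its result with a single rounding step. A minimal sketch of the 256-bit single-precision variants follows. The helper name fma_variants8 is hypothetical, and the target attribute assumes GCC or Clang; the CPU must support FMA at run time.

```c
#include <immintrin.h>

/* Hypothetical helper: apply the four FMA sign variants to 8 floats each.
     madd[i]  =  (a[i] * b[i]) + c[i]
     msub[i]  =  (a[i] * b[i]) - c[i]
     nmadd[i] = -(a[i] * b[i]) + c[i]
     nmsub[i] = -(a[i] * b[i]) - c[i] */
__attribute__((target("avx2,fma")))
void fma_variants8(const float *a, const float *b, const float *c,
                   float *madd, float *msub, float *nmadd, float *nmsub)
{
    __m256 va = _mm256_loadu_ps(a);
    __m256 vb = _mm256_loadu_ps(b);
    __m256 vc = _mm256_loadu_ps(c);
    _mm256_storeu_ps(madd,  _mm256_fmadd_ps(va, vb, vc));
    _mm256_storeu_ps(msub,  _mm256_fmsub_ps(va, vb, vc));
    _mm256_storeu_ps(nmadd, _mm256_fnmadd_ps(va, vb, vc));
    _mm256_storeu_ps(nmsub, _mm256_fnmsub_ps(va, vb, vc));
}
```

With a = 2, b = 3, and c = 1 in every lane, the four outputs are 7, 5, -5, and -7 respectively.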
- _mm_fmaddsub_ps (FMA)
- _mm_fmaddsub_pd (FMA)
- _mm256_fmaddsub_ps (FMA)
- _mm256_fmaddsub_pd (FMA)

- _mm_fmsubadd_ps (FMA)
- _mm_fmsubadd_pd (FMA)
- _mm256_fmsubadd_ps (FMA)
- _mm256_fmsubadd_pd (FMA)
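The fmaddsub and fmsubadd intrinsics above alternate between adding and subtracting the third operand from one element to the next. A minimal sketch of the 256-bit single-precision forms, using the hypothetical helper name fma_alternating8 and assuming GCC or Clang for the target attribute:

```c
#include <immintrin.h>

/* Hypothetical helper: alternating fused multiply-add/subtract on 8 floats.
     addsub[i] = a[i]*b[i] - c[i] for even i, a[i]*b[i] + c[i] for odd i
     subadd[i] = a[i]*b[i] + c[i] for even i, a[i]*b[i] - c[i] for odd i */
__attribute__((target("avx2,fma")))
void fma_alternating8(const float *a, const float *b, const float *c,
                      float *addsub, float *subadd)
{
    __m256 va = _mm256_loadu_ps(a);
    __m256 vb = _mm256_loadu_ps(b);
    __m256 vc = _mm256_loadu_ps(c);
    _mm256_storeu_ps(addsub, _mm256_fmaddsub_ps(va, vb, vc));
    _mm256_storeu_ps(subadd, _mm256_fmsubadd_ps(va, vb, vc));
}
```

With a = 2, b = 3, and c = 1 in every lane, addsub holds 5, 7, 5, 7, ... and subadd holds 7, 5, 7, 5, ...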
Permuting and Shuffling
Permuting
Copyright
This project is licensed under the BSD 3-Clause license.