  • Stars: 652
  • Rank 69,062 (Top 2 %)
  • License: GNU General Publi...
  • Created almost 12 years ago
  • Updated 6 months ago

Repository Details

Kernel patch enables compiler optimizations for additional CPUs.

kernel_compiler_patch

This patch adds optimization/tuning choices for kernel builds by exposing more micro-architecture options under:

 Processor type and features  --->
 Processor family --->

Why a specific patch?

The kernel uses its own set of CFLAGS, KCFLAGS. For example, see:

Alternative way to define a -march= option without this patch

As pointed out by codemac in this topic, one can simply export values for KCFLAGS and KCPPFLAGS before calling make to achieve the same result; see here.

export KCFLAGS=' -march=znver3 -mtune=znver3'
export KCPPFLAGS=' -march=znver3 -mtune=znver3'
make all
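Before exporting the flags, it may be worth confirming that the compiler in use accepts the chosen -march= value. The following is a minimal sketch and not part of the patch: march_supported is a hypothetical helper name, and it assumes the compiler understands -x c and -fsyntax-only, as gcc and clang do.

```shell
# Hypothetical helper (not part of the patch): succeed only if the given
# compiler accepts -march=<value>. It parses an empty C input and writes
# no output files.
march_supported() {
    march="$1"
    cc="${2:-gcc}"   # compiler to probe; defaults to gcc
    "$cc" -march="$march" -x c -fsyntax-only /dev/null >/dev/null 2>&1
}

# Usage: export the flags only when the probe succeeds.
# znver3 is the example value used above.
if march_supported znver3; then
    export KCFLAGS='-march=znver3 -mtune=znver3'
    export KCPPFLAGS='-march=znver3 -mtune=znver3'
fi
```

The second argument lets the same probe be pointed at clang or at a cross compiler instead of the default gcc.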

Expanded CPUs include

CPU Family | -march= | Min GCC Ver | Min Clang Ver
--- | --- | --- | ---
Native optimizations autodetected by GCC | native | 4.2 | 3.8
Generic 64-bit level v2 | x86-64-v2 | 11.1 | 12.0
Generic 64-bit level v3 | x86-64-v3 | 11.1 | 12.0
Generic 64-bit level v4 | x86-64-v4 | 11.1 | 12.0
AMD Improved K8-family | k8-sse3 | 9.3 | 9.0
AMD K10-family | amdfam10 | 9.3 | 9.0
AMD Family 10h (Barcelona) | barcelona | 9.3 | 9.0
AMD Family 14h (Bobcat) | btver1 | 9.3 | 9.0
AMD Family 16h (Jaguar) | btver2 | 9.3 | 9.0
AMD Family 15h (Bulldozer) | bdver1 | 9.3 | 9.0
AMD Family 15h (Piledriver) | bdver2 | 9.3 | 9.0
AMD Family 15h (Steamroller) | bdver3 | 9.3 | 9.0
AMD Family 15h (Excavator) | bdver4 | 9.3 | 9.0
AMD Family 17h (Zen) | znver1 | 9.3 | 9.0
AMD Family 17h (Zen 2) | znver2 | 9.3 | 9.0
AMD Family 19h (Zen 3) | znver3 | 10.3 | 12.0
AMD Family 19h (Zen 4) | znver4 | 13.0 | ???
Intel Bonnell family Atom | bonnell | 9.3 | 9.0
Intel Silvermont family Atom | silvermont | 9.3 | 9.0
Intel Goldmont family Atom (Apollo Lake and Denverton) | goldmont | 9.3 | 9.0
Intel Goldmont Plus family Atom (Gemini Lake) | goldmont-plus | 9.3 | 9.0
Intel 1st Gen Core i3/i5/i7-family (Nehalem) | nehalem | 9.3 | 9.0
Intel 1.5 Gen Core i3/i5/i7-family (Westmere) | westmere | 9.3 | 9.0
Intel 2nd Gen Core i3/i5/i7-family (Sandybridge) | sandybridge | 9.3 | 9.0
Intel 3rd Gen Core i3/i5/i7-family (Ivybridge) | ivybridge | 9.3 | 9.0
Intel 4th Gen Core i3/i5/i7-family (Haswell) | haswell | 9.3 | 9.0
Intel 5th Gen Core i3/i5/i7-family (Broadwell) | broadwell | 9.3 | 9.0
Intel 6th Gen Core i3/i5/i7-family (Skylake) | skylake | 9.3 | 9.0
Intel 6th Gen Core i7/i9-family (Skylake X) | skylake-avx512 | 9.3 | 9.0
Intel 8th Gen Core i3/i5/i7-family (Cannon Lake) | cannonlake | 9.3 | 9.0
Intel 10th Gen Core i7/i9-family (Ice Lake) | icelake-client | 9.3 | 9.0
Intel Xeon (Cascade Lake) | cascadelake | 10.2 | 10.0
Intel Xeon (Cooper Lake) | cooperlake | 10.2 | 10.0
Intel 3rd Gen 10nm++ i3/i5/i7/i9-family (Tiger Lake) | tigerlake | 10.2 | 10.0
Intel 4th Gen 10nm++ Xeon (Sapphire Rapids) | sapphirerapids | 11.1 | 12.0
Intel 11th Gen i3/i5/i7/i9-family (Rocket Lake) | rocketlake | 11.1 | 12.0
Intel 12th Gen i3/i5/i7/i9-family (Alder Lake) | alderlake | 11.1 | 12.0
Intel 13th Gen i3/i5/i7/i9-family (Raptor Lake) | raptorlake | 13.0 | 15.0.5
Intel 5th Gen 10nm++ Xeon (Emerald Rapids) | emeraldrapids | 13.0 | ???
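
The minimum-version columns can be checked mechanically. The sketch below is illustrative only: version_ge, min_gcc_for and gcc_can_target are hypothetical helper names, the lookup covers just a handful of rows copied from the table above, and the comparison assumes GNU sort with the -V option.

```shell
# Hypothetical helpers for illustration; the minimum versions are copied
# from a few rows of the table above, not from the patch itself.

# True when version $1 >= version $2 (relies on GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Print the minimum GCC version for a few -march= values.
min_gcc_for() {
    case "$1" in
        x86-64-v2|x86-64-v3|x86-64-v4) echo 11.1 ;;
        znver3)                        echo 10.3 ;;
        znver4|raptorlake)             echo 13.0 ;;
        skylake|haswell)               echo 9.3 ;;
        *) return 1 ;;                 # not in this abbreviated list
    esac
}

# True when GCC version $2 is new enough for -march=$1.
gcc_can_target() {
    min="$(min_gcc_for "$1")" || return 1
    version_ge "$2" "$min"
}
```

With GCC 7 or newer, the installed version can be passed in directly, for example gcc_can_target znver4 "$(gcc -dumpfullversion)".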

Benchmarks

Intro

Three machines were each tested twice using a make-based endpoint: once running a generic x86-64 kernel, and once running an otherwise identical kernel built with the optimized gcc options.

Conclusion

Running with this patch gives small but real speed increases, as judged by a make endpoint. The increases are on par with the speed increase that the upstream-sanctioned core2 option gives users, so not including additional options seems somewhat arbitrary to me.

Details

  1. Three test machines: Intel Xeon X3360, Intel Core i7-2620M, Intel Core i7-3770K.
  2. Each ran the make benchmark (linked below) 35 times while booted into a 'generic' kernel, then ran the same benchmark 35 times after booting into an optimized kernel. The optimization chosen for each machine:
    • X3360 = core2
    • i7-2620M = sandybridge
    • i7-3770K = ivybridge
  3. Results were analyzed by ANOVA; the plots show statistically significant, albeit small, differences.
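
The repeated-run procedure in step 2 can be sketched as a small timing loop. This is a hypothetical stand-in and not the benchmark script the author used: bench_runs is an invented name, and the timing relies on GNU date supporting %N for nanoseconds.

```shell
# Hypothetical stand-in for the benchmark script (which is not reproduced
# here): run a command N times and print the wall-clock time of each run
# in milliseconds, one value per line. Relies on GNU date's %N.
bench_runs() {
    runs="$1"
    shift
    i=0
    while [ "$i" -lt "$runs" ]; do
        start="$(date +%s%N)"
        "$@" >/dev/null 2>&1 || return 1   # stop if the command fails
        end="$(date +%s%N)"
        echo $(( (end - start) / 1000000 ))
        i=$(( i + 1 ))
    done
}
```

Running it once per booted kernel with a count of 35 and redirecting the output to a file would give one sample set per kernel, ready for the statistical analysis.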

Discussion

  1. All the assumptions for ANOVA are met:
    • Data are normally distributed, as shown in the normal quantile plots.
    • The population variances are fairly equal (Levene and Bartlett tests).
  2. The ANOVA plots clearly show significance.
    • Pair-wise analysis by Tukey-Kramer shows significance at the 0.05 level for all CPUs compared.

Below are the differences in median values:

CPU | Difference in median value
--- | ---
core2 | +87.5 ms
sandybridge | +79.7 ms
ivybridge | +257.2 ms
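
For readers who want to compute a median difference from their own timing files, a minimal sketch follows. The helper names median and median_diff are hypothetical, the input is assumed to be one number per line, and the subtraction order (second file minus first) is an arbitrary choice, since the source does not state which way the sign in the table is oriented.

```shell
# Hypothetical helper: print the median of the numbers read from stdin
# (one per line). For an even count, the two middle values are averaged.
median() {
    sort -n | awk '
        { v[NR] = $1 }
        END {
            if (NR == 0) exit 1
            if (NR % 2) print v[(NR + 1) / 2]
            else        print (v[NR / 2] + v[NR / 2 + 1]) / 2
        }'
}

# Print median(second file) - median(first file).
median_diff() {
    a="$(median < "$1")" || return 1
    b="$(median < "$2")" || return 1
    awk -v a="$a" -v b="$b" 'BEGIN { print b - a }'
}
```

A difference in medians alone says nothing about significance; that is what the ANOVA and Tukey-Kramer analysis described above is for.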

References

Credit

Legacy support

Support for older versions of the Linux kernel and of gcc can be found in the outdated_versions directory.

Data

Sandybridge vs. Generic

[Figure: 2620_m]

Ivybridge vs. Generic

[Figure: 3770_k]

Core2 vs. Generic

[Figure: x3360]
