No Sane Compiler Would Optimize Atomics
Abstract
False.
Compilers do optimize atomics and the memory accesses around atomics, and they utilize architecture-specific knowledge. My hobby is to encourage compilers to do more of this, programmers to rely on it, and hardware vendors to give us new atomic toys to optimize with. Oh, and standardize yet more close-to-the-metal concurrency and parallelism tools.
But, you say, surely volatile always means volatile, there’s nothing wrong with my benign races, nothing could even go wrong with non-temporal accesses, and who needs 6 memory orderings anyways‽ I’m glad you asked, let me tell you about my hobby…
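To make the "compilers do optimize atomics" claim concrete, here is a minimal sketch of one transformation the C++ memory model permits: merging two adjacent relaxed read-modify-write operations into one. The names `counter`, `bump_twice`, and `bump_twice_fused` are hypothetical and chosen only for illustration; whether any particular compiler actually performs this merge depends on the compiler, its version, and the target.

```cpp
#include <atomic>

// Hypothetical shared counter, used only for this illustration.
std::atomic<int> counter{0};

// As written in the source: two adjacent relaxed increments.
// Returns the counter value right after the second increment.
int bump_twice() {
    counter.fetch_add(1, std::memory_order_relaxed);
    return counter.fetch_add(1, std::memory_order_relaxed) + 1;
}

// A form the compiler is allowed to emit instead: a single increment by 2.
// This is valid because an execution in which no other thread observes the
// intermediate value is already one of the allowed executions of bump_twice().
int bump_twice_fused() {
    return counter.fetch_add(2, std::memory_order_relaxed) + 2;
}
```

The fused form never exposes the intermediate value to other threads, so code that relies on seeing it (a progress bar polling the counter, say) is relying on something the language does not promise.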
Talk Details
This talk is based on a paper I wrote for the C++ standards committee, no sane compiler would optimize atomics (abstract: false). The title and abstract are such perfectly flippant clickbait, yet you won't believe how there's actual content hiding behind the snark!
View the presentation: jfbastien.github.io/no-sane-compiler (press S for speaker notes, use ← and → to navigate backward / forward).
Video
The talk was given at CppCon 2016 and is available on YouTube.
It was previously given at C++Now 2016; that version is also on YouTube, but the recording isn't great.
References
A (non-comprehensive) list of references that went into creating this talk.
- No sane compiler would optimize atomics
- When should compilers optimize atomics?
- Agner's Software optimization resources
- Can Seqlocks Get Along With Programming Language Memory Models?
- C/C++11 mappings to processors
- Mathematizing C++ Concurrency
- Threads Cannot be Implemented as a Library
- Common Compiler Optimisations are Invalid in the C11 Memory Model and what we can do about it
- ARM Barrier Litmus Tests and Cookbook
- DR 476 (C volatile)
- CVE-2015-8550 paravirtualized drivers incautious about shared memory contents
- Compiler-Introduced Double-Fetch Vulnerabilities – Understanding XSA-155
- GCC Xtensa Options
- MSVC volatile (C++)
- Volatile: Almost Useless for Multi-Threaded Programming
- Should volatile Acquire Atomicity and Thread Visibility Semantics?
- Acquire and Release Fences
- LKML Memory corruption due to word sharing (Linus Torvalds)
- LKML Memory corruption due to word sharing (Hans Boehm)
- JVM pipeline blog
- Memory Model for Multithreaded C++
- Atomicity and Visibility in Tiny Embedded Systems
- Programming with Threads: Questions Frequently Asked by C and C++ Programmers
- Hogwild!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent
- GCC Built-in Functions for Memory Model Aware Atomic Operations
- GCC Legacy __sync Built-in Functions for Atomic Memory Access
- libc++ atomic
- compiler-rt atomic
- clang CGAtomic
- LLVM atomics
Meta
Built using reveal.js.