regehr/ub-canaries

Miscellaneous

Stars
172
Rank 221,201 (Top 5 %)
Language
C
License
MIT License
Created over 9 years ago
Updated almost 6 years ago

regehr/ub-canaries

regehr

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

collection of C/C++ programs that try to get compilers to exploit undefined behavior

-------------------------------------------------------------------------------

UB Canaries: A collection of C/C++ programs that detect undefined
behavior exploitation by compilers.

-------------------------------------------------------------------------------

To run all tests, type:

  run-canaries

For a complete list of command line options:

  run-canaries --help

-------------------------------------------------------------------------------

Each directory documents and tests an expectation: something a developer might
-- perhaps unreasonably! -- expect the compiler to do when faced with undefined
behavior. For example, the "uninitialized-variable" directory tests the
expectation that an uninitialized scalar variable will consistently return some
value that is legal for the variable's type.

If any test program in a directory does not display the expected behavior (for a
given compiler version + flags), then that compiler is considered to violate
that expectation.

The toplevel script run-canaries tests the expectations for a
specified collection of compilers and compiler flags. For now, you
need to edit that file to change the set of compilers and flags.

The output of run-canaries is, for each compiler / flag / canary
combination, a bit that tells us whether that compiler has been
observed to exploit that particular UB (in other words, whether it has
been observed violating the expectation). So here, for example, clang
has been seen to exploit signed integer overflows and uninitialized
variables, but not signed left shifts:

clang -O3 signed-left-shift 0
clang -O3 signed-integer-overflow 1
clang -O3 uninitialized-variable 1

-------------------------------------------------------------------------------

Guidelines for writing tests:

- Each test program should be entirely contained (except for standard header
  files) in a single compilation unit.

- An expectation should only be tested by looking at a program's stdout, never
  by looking at its assembly code or observing its memory usage or execution
  time.

- A test program foo.c may have one or more outputs corresponding to the
  "expected" case where the compiler does not exploit that UB. If there is one
  such file it should be called foo.output. If there are multiple files they
  should be called foo.output1, foo.output2, etc. If the actual output does not
  match any of these files, the compiler is assumed to have exploited the UB.

- Every test program must test only a single UB. In other words, each test
  program is written in a dialect of C that is completely standard except that a
  single behavior (signed integer overflow or whatever) is actually defined
  instead of undefined.

- Reliance on implementation-defined behavior is unavoidable, but please avoid
  gratuitous reliance such as assuming a particular size for int or long. It is
  OK to use the fixed-width types such as int32_t.

- A tricky issue to how much to expose to the optimizer and how much to hide.
  There are no particular good rules of thumb that I am aware of, you just have
  to try different things.

- An easier issue is *how* to hide from the optimizer. I suggest introducing a
  dependency on argc or on the value loaded from a volatile. Tests may assume
  that argc == 1.

-------------------------------------------------------------------------------

itc-benchmarks

static analysis benchmarks from Toyota ITC

compiler-crashes

60 artisanal compiler crashes

opt-fuzz

llvm opt fuzzer and bounded exhaustive test generator

nibble-sort

Many functions in C for sorting the nibbles in an 8-byte word

pldi22-llvm-tutorial

outline and links for PLDI 2022 tutorial

rb_tree_demo

code accompanying a blog post about fuzzing a red-black tree implementation: http://blog.regehr.org/archives/896

llvm-dataflow-info

print information from LLVM dataflow analyses

sudo-1.8.13

sudo for compiler bug demo

guided-tree-search

heuristically and dynamically sample (more) uniformly from large decision trees of unknown shape

solid_code_class

shared files for U of Utah CS 5959: Writing Solid Code

str2long_contest

code corresponding to a coding contest posted here: http://blog.regehr.org/archives/909

const_time

empirical measurement of code constructs that seem like they should have constant execution time regardless of values of inputs

llvm-pass-template

minimal out-of-tree LLVM pass

fs-fuzz

two simple fuzzers, one for UNIX filesystem operations, the other for C streams

random-testing-book

rc4-poc

proof of concept for local OpenSSL RC4 buffer overrun bug

cost-model

looking into devising good cost models for optimizing LLVM IR

optimizer-eval

compiler optimizer evaluation

python_rb_tree_demo

demonstration of fuzzing a red-black tree in Python for a blog entry

isolating-a-miscompilation

code accompanying a blog post about a compiler bug

sudoku

brute-force sudoku solver in C

llvm-dataflow-research

experimenting with better interval transfer functions for LLVM

advanced_os_class

shared repo for U of Utah CS 5962 spring 2014

assert_quiz

A quick quiz about how assertions should be used.

parse-arm

llvm-test-lvi

LLVM pass for testing the soundness of the LazyValueInfo analysis pass

calc-compiler

tiny language compiler for class

llvm-mc-info

just playing with llvm-mca

llvm-stress

fork of llvm-stress

cs6015-webscrapers

web scraper assignment for Utah CS 6015 Spring 2018

knownbits-compare

cs6960-fall17

repo for advanced OS, fall 2017, U of Utah