The system call intercepting library

syscall_intercept

Userspace syscall intercepting library.

Dependencies

Runtime dependencies

  • libcapstone -- the disassembly engine used under the hood

Build dependencies

Local build dependencies

  • C99 toolchain -- tested with recent versions of GCC and clang
  • cmake
  • perl -- for checking coding style
  • pandoc -- for generating the man page

Travis CI build dependencies

The Travis builds use scripts to generate Docker images, in which syscall_intercept is built and tested. These images are pushed to Dockerhub, to be reused in later Travis builds. The scripts expect four environment variables to be set in the Travis environment:

  • DOCKERHUB_REPO - where to store the Docker images used for building. For example, to refer to the Dockerhub repository at https://hub.docker.com/r/pmem/syscall_intercept, this variable should contain the string "pmem/syscall_intercept"
  • DOCKERHUB_USER - used for logging into Dockerhub
  • DOCKERHUB_PASSWORD - used for logging into Dockerhub
  • GITHUB_REPO - where the repository is available on GitHub (e.g. "pmem/syscall_intercept")

How to build

Building libsyscall_intercept requires cmake. Example:

cmake path_to_syscall_intercept -DCMAKE_INSTALL_PREFIX=/usr -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER=clang
make

alternatively:

ccmake path_to_syscall_intercept
make

There is an install target. For now, all it does is cp.

make install

Coming soon:

make test

Synopsis

#include <libsyscall_intercept_hook_point.h>
cc source.c -lsyscall_intercept -fpic -shared -o preloadlib.so

LD_PRELOAD=preloadlib.so ./application
Description:

The system call intercepting library provides a low-level interface for hooking Linux system calls in user space. This is achieved by hotpatching the machine code of the standard C library in the memory of a process. The user of this library can provide the functionality of almost any syscall in user space, using the very simple API specified in the libsyscall_intercept_hook_point.h header file:

int (*intercept_hook_point)(long syscall_number,
			long arg0, long arg1,
			long arg2, long arg3,
			long arg4, long arg5,
			long *result);

The user of the library shall assign the address of a callback function to the variable called intercept_hook_point. A non-zero return value from the callback signals to the intercepting library that the user ignored the system call, and the original syscall should be executed. A zero return value signals that the user takes over the system call. In this case, the result of the system call (the value stored in the RAX register after the system call) can be set via the *result pointer. To use the library, the intercepting code is expected to be loaded using the LD_PRELOAD feature provided by the system loader.

All syscalls issued by libc are intercepted. Syscalls made by code outside libc are not intercepted. In order to be able to issue syscalls that are not intercepted, a convenience function is provided by the library:

long syscall_no_intercept(long syscall_number, ...);
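
As a rough illustration of how the hook and syscall_no_intercept fit together, the sketch below counts write calls and forwards each one itself. The HAVE_SYSCALL_INTERCEPT macro and the syscall(2) fallback are assumptions added only so the sketch compiles without the library installed; count_writes and write_count are hypothetical names.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

#ifdef HAVE_SYSCALL_INTERCEPT
#include <libsyscall_intercept_hook_point.h>
#else
/*
 * Stand-in so the sketch builds without the library (an assumption
 * for illustration only). Note that syscall(2) reports failure as
 * -1 plus errno, which may differ from what the library returns.
 */
#define syscall_no_intercept syscall
#endif

static long write_count; /* hypothetical counter, not thread-safe */

static int
count_writes(long syscall_number,
		long arg0, long arg1,
		long arg2, long arg3,
		long arg4, long arg5,
		long *result)
{
	(void) arg3; (void) arg4; (void) arg5;

	if (syscall_number != SYS_write)
		return 1; /* not handled: the original syscall runs */

	write_count++;
	/* Issue the real write without re-entering the hook */
	*result = syscall_no_intercept(SYS_write, arg0, arg1, arg2);
	return 0; /* handled: *result is what the caller sees */
}

#ifdef HAVE_SYSCALL_INTERCEPT
static __attribute__((constructor)) void
init(void)
{
	intercept_hook_point = count_writes;
}
#endif
```

Forwarding through syscall_no_intercept rather than calling write() keeps the hook from intercepting its own request.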

Three environment variables control the operation of the library:

INTERCEPT_LOG -- when set, the library logs each syscall intercepted to a file. If it ends with "-" the path of the file is formed by appending a process id to the value provided in the environment variable. E.g.: initializing the library in a process with pid 123 when the INTERCEPT_LOG is set to "intercept.log-" will result in a log file named intercept.log-123.

INTERCEPT_LOG_TRUNC -- when set to 0, the log file from INTERCEPT_LOG is not truncated.

INTERCEPT_HOOK_CMDLINE_FILTER -- when set, the library checks the command line used to start the program. Hotpatching and syscall intercepting are done only if the last component of the command used to start the program is the same as the string provided in the environment variable. This can also be queried by the user of the library:

int syscall_hook_in_process_allowed(void);
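
The rules described for the three environment variables can be sketched in plain C. This is only an illustration of the documented behavior, not the library's actual code; format_log_path, log_truncation_enabled and cmdline_matches are hypothetical helper names.

```c
#include <stdio.h>
#include <string.h>

/*
 * INTERCEPT_LOG: a trailing '-' means "append the process id".
 * Returns 0 on success, -1 if value is empty or out is too small.
 */
static int
format_log_path(const char *value, long pid, char *out, size_t out_size)
{
	size_t len = strlen(value);
	int n;

	if (len == 0)
		return -1;

	if (value[len - 1] == '-')
		n = snprintf(out, out_size, "%s%ld", value, pid);
	else
		n = snprintf(out, out_size, "%s", value);

	return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}

/* INTERCEPT_LOG_TRUNC: only the value "0" disables truncation */
static int
log_truncation_enabled(const char *trunc_value)
{
	return !(trunc_value != NULL && strcmp(trunc_value, "0") == 0);
}

/*
 * INTERCEPT_HOOK_CMDLINE_FILTER: compare the last path component
 * of the command with the filter. An unset filter allows everything.
 */
static int
cmdline_matches(const char *command, const char *filter)
{
	const char *base;

	if (filter == NULL)
		return 1;

	base = strrchr(command, '/');
	base = (base != NULL) ? base + 1 : command;
	return strcmp(base, filter) == 0;
}
```

In a real process the pid argument would come from getpid(), and the values from getenv().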
Example:
#include <libsyscall_intercept_hook_point.h>
#include <syscall.h>
#include <errno.h>

static int
hook(long syscall_number,
			long arg0, long arg1,
			long arg2, long arg3,
			long arg4, long arg5,
			long *result)
{
	if (syscall_number == SYS_getdents ||
	    syscall_number == SYS_getdents64) {
		/*
		 * Prevent the application from
		 * using the getdents syscalls
		 * (newer glibc versions issue
		 * getdents64 instead of getdents).
		 * From the point of view of the
		 * calling process, it is as if the
		 * kernel returned the ENOTSUP error
		 * code from the syscall.
		 */
		*result = -ENOTSUP;
		return 0;
	} else {
		/*
		 * Ignore any other syscalls
		 * i.e.: pass them on to the kernel
		 * as would normally happen.
		 */
		return 1;
	}
}

static __attribute__((constructor)) void
init(void)
{
	// Set up the callback function
	intercept_hook_point = hook;
}
$ cc example.c -lsyscall_intercept -fpic -shared -o example.so
$ LD_LIBRARY_PATH=. LD_PRELOAD=example.so ls
ls: reading directory '.': Operation not supported

Under the hood:

Assumptions:

In order to handle syscalls in user space, the library relies on the following assumptions:

  • Each syscall made by the application is issued via libc
  • No other facility attempts to hotpatch libc in the same process
  • The libc implementation is already loaded in the process's memory space when the intercepting library is being initialized
  • The machine code in the libc implementation is suitable for the methods listed in this section
  • For some more basic assumptions, see the section on limitations.
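
One way to check the "libc is already loaded" assumption by hand is to scan /proc/self/maps for a mapping whose path mentions libc. This is a minimal sketch for experimentation, not how the library itself locates libc; is_object_mapped is a hypothetical helper.

```c
#include <stdio.h>
#include <string.h>

/*
 * Returns 1 if the current process has a mapping whose line in
 * /proc/self/maps contains needle, 0 if it has none, and -1 if
 * the maps file cannot be read.
 */
static int
is_object_mapped(const char *needle)
{
	char line[4096];
	int found = 0;
	FILE *maps = fopen("/proc/self/maps", "r");

	if (maps == NULL)
		return -1;

	while (fgets(line, sizeof(line), maps) != NULL) {
		if (strstr(line, needle) != NULL) {
			found = 1;
			break;
		}
	}

	fclose(maps);
	return found;
}
```

Calling is_object_mapped("libc") from a constructor in a preloaded library would show whether a libc mapping is present at that point. The substring match is deliberately loose and is an assumption about how libc is named on the system.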
Disassembly:

The library disassembles the text segment of the libc loaded into the memory space of the process it is initialized in. It locates all syscall instructions, and replaces each of them with a jump to a unique address. Since the syscall instruction of the x86_64 ISA occupies only two bytes, while a jmp with a 32-bit displacement needs five, the method involves locating other bytes close to the syscall suitable for overwriting. The destination of the jump (unique for each syscall) is a small routine, which accomplishes the following tasks:

  1. Optionally executes any instruction that originally preceded the syscall instruction, and was overwritten to make space for the jump instruction
  2. Saves the current state of all registers to the stack
  3. Translates the arguments (in the registers) from the Linux x86_64 syscall calling convention to the C ABI's calling convention used on x86_64
  4. Calls a function written in C (which in turn calls the callback supplied by the library user)
  5. Loads the values from the stack back into the registers
  6. Jumps back to libc, to the instruction following the overwritten part
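
Steps 3 and 4 can be pictured in C. On x86_64 Linux the kernel takes the syscall number in rax and the arguments in rdi, rsi, rdx, r10, r8 and r9, whereas the C calling convention passes the fourth integer argument in rcx. The sketch below is illustrative only: struct saved_regs, dispatch_to_hook and sample_hook are hypothetical names, and the real routine is written in assembly.

```c
#include <stddef.h>
#include <sys/syscall.h>

/* Hypothetical snapshot of the registers saved in step 2 */
struct saved_regs {
	long rax; /* syscall number on entry, return value on exit */
	long rdi, rsi, rdx, r10, r8, r9; /* syscall arguments 0..5 */
};

typedef int (*hook_fn)(long syscall_number,
		long arg0, long arg1, long arg2,
		long arg3, long arg4, long arg5,
		long *result);

/*
 * Returns 1 if the original syscall should still be executed,
 * 0 if the hook handled it and regs->rax now holds the result.
 */
static int
dispatch_to_hook(struct saved_regs *regs, hook_fn hook)
{
	long result = 0;

	if (hook == NULL)
		return 1;

	/*
	 * Passing the saved values as ordinary C arguments lets the
	 * compiler place them where the C ABI expects them, e.g. the
	 * value from r10 ends up in rcx.
	 */
	if (hook(regs->rax, regs->rdi, regs->rsi, regs->rdx,
			regs->r10, regs->r8, regs->r9, &result) != 0)
		return 1;

	regs->rax = result;
	return 0;
}

/* Hypothetical hook: answers getpid with its fourth argument */
static int
sample_hook(long syscall_number,
		long arg0, long arg1, long arg2,
		long arg3, long arg4, long arg5,
		long *result)
{
	(void) arg0; (void) arg1; (void) arg2;
	(void) arg4; (void) arg5;

	if (syscall_number != SYS_getpid)
		return 1;

	*result = arg3;
	return 0;
}
```

The sample hook exists only to make the r10 mapping visible: whatever was saved from r10 comes back in rax when the syscall number is SYS_getpid.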
In action:

Simple hotpatching: Replace a mov and a syscall instruction with a jmp instruction

Before:                         After:

db2a0 <__open>:                 db2a0 <__open>:
db2aa: mov $2, %eax           /-db2aa: jmp e0000
db2af: syscall                |
db2b1: cmp $-4095, %rax       | db2b1: cmp $-4095, %rax ---\
db2b7: jae db2ea              | db2b7: jae db2ea           |
db2b9: retq                   | db2b9: retq                |
                              | ...                        |
                              | ...                        |
                              \_...                        |
                                e0000: mov $2, %eax        |
                                ...                        |
                                e0100: call implementation /
                                ...                       /
                                e0200: jmp db2b1 ________/

Hotpatching using a trampoline jump: Replace a syscall instruction with a short jmp instruction, the destination of which is a regular jmp instruction. A short jmp instruction occupies only two bytes, so it fits in the place of a syscall instruction. This is needed when the instructions directly preceding or following the syscall instruction cannot be overwritten, leaving only the two bytes of the syscall instruction for patching. The hotpatching library looks for space for the trampoline jump in the padding found at the end of each routine. Since the start of every routine is aligned to 16 bytes, there is often padding between the end of one symbol and the start of the next. In the example below, this padding is filled with a 7-byte nop instruction (so the next symbol can start at the address 3f410).

Before:                         After:

3f3fe: mov %rdi, %rbx           3f3fe: mov %rdi, %rbx
3f401: syscall                /-3f401: jmp 3f408
3f403: jmp 3f415              | 3f403: jmp 3f415 ----------\
3f407: retq                   | 3f407: retq                |
                              \                            |
3f408: nopl 0x0(%rax,%rax,1)  /-3f408: jmp e1000           |
                              | ...                        |
                              | ...                        |
                              \_...                        |
                                e1000: nop                 |
                                ...                        |
                                e1100: call implementation /
                                ...                       /
                                e1200: jmp 3f403 ________/
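
The size difference between the two kinds of jump comes from their encodings: a short jmp is the opcode 0xEB followed by a signed 8-bit displacement, and a near jmp is 0xE9 followed by a signed 32-bit displacement, both relative to the end of the jmp instruction. The sketch below encodes them using addresses from the diagrams above; the function names are hypothetical and this is not the library's patching code.

```c
#include <stdint.h>

#define SHORT_JMP_SIZE 2 /* 0xEB + rel8 */
#define NEAR_JMP_SIZE 5  /* 0xE9 + rel32 */

/* Displacement measured from the end of the jmp instruction */
static int64_t
jmp_displacement(uint64_t from, uint64_t to, unsigned jmp_size)
{
	return (int64_t)(to - (from + jmp_size));
}

/* Returns 0 on success, -1 if the target is out of rel8 range */
static int
encode_short_jmp(uint8_t out[SHORT_JMP_SIZE], uint64_t from, uint64_t to)
{
	int64_t d = jmp_displacement(from, to, SHORT_JMP_SIZE);

	if (d < INT8_MIN || d > INT8_MAX)
		return -1;

	out[0] = 0xEB;
	out[1] = (uint8_t)d;
	return 0;
}

/* Returns 0 on success, -1 if the target is out of rel32 range */
static int
encode_near_jmp(uint8_t out[NEAR_JMP_SIZE], uint64_t from, uint64_t to)
{
	int64_t d = jmp_displacement(from, to, NEAR_JMP_SIZE);
	uint32_t bits;
	int i;

	if (d < INT32_MIN || d > INT32_MAX)
		return -1;

	bits = (uint32_t)d;
	out[0] = 0xE9;
	for (i = 0; i < 4; i++) /* displacement is little-endian */
		out[1 + i] = (uint8_t)(bits >> (8 * i));
	return 0;
}
```

With these helpers, a short jmp from 3f401 to the padding at 3f408 fits in two bytes, while a jump from 3f401 straight to e1000 is rejected as out of range. That is why the trampoline is needed when only the two bytes of the syscall instruction can be overwritten.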

Limitations:

  • Only Linux is supported
  • Only x86_64 is supported
  • Only tested with glibc, although perhaps it works with some other libc implementations as well
  • There are known issues with the following syscalls:
    • clone
    • rt_sigreturn

Debugging:

Besides logging, the most important thing during debugging is to make sure that syscalls made by the debugger itself are not intercepted. To achieve this, use the INTERCEPT_HOOK_CMDLINE_FILTER variable described above.

INTERCEPT_HOOK_CMDLINE_FILTER=ls \
	LD_PRELOAD=libsyscall_intercept.so \
	gdb ls

With this filtering, the intercepting library is not activated in the gdb process itself.
