  • Language
    Python
  • License
    MIT License
  • Created about 7 years ago
  • Updated 6 months ago


POSIX equivalent of Windows DLL import libraries

Motivation

In a nutshell, Implib.so is a simple equivalent of Windows DLL import libraries for POSIX shared libraries.

On Linux/Android, if you link against a shared library you normally use the -lxyz compiler option, which makes your application depend on libxyz.so. This causes libxyz.so to be forcibly loaded at program startup (and its constructors to be executed) even if you never call any of its functions.

If you instead want to delay loading of libxyz.so (e.g. it's unlikely to be used and you don't want to waste resources on it or slow down startup, or you want to select the best platform-specific implementation at runtime), you can remove the dependency from LDFLAGS and issue the dlopen call manually. But this would cause ld to fail because it won't be able to statically resolve symbols which are supposed to come from this shared library. At this point you have only two choices:

  • emit normal calls to library functions and suppress link errors from ld via -Wl,-z,nodefs; this is undesirable because you lose the ability to detect link errors for other libraries statically
  • load the necessary function addresses at runtime via dlsym and call them via function pointers; this isn't very convenient because you have to keep track of which symbols your program uses, properly cast function types and also somehow manage global function pointers
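
The second option can be sketched as follows. This is only an illustration: it uses cos from libm.so.6 (the glibc soname, assumed here) in place of a real libxyz.so, and the names cos_fn, cos_ptr, load_cos and call_cos are made up for the example.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Global function pointer that the program has to manage by hand. */
typedef double (*cos_fn)(double);
static cos_fn cos_ptr;

/* Load libm.so.6 (assumed glibc soname) and resolve cos manually.
   Returns 0 on success and -1 on failure. */
int load_cos(void) {
  void *h = dlopen("libm.so.6", RTLD_LAZY);
  if (!h)
    return -1;
  /* The cast has to match the real prototype; the compiler cannot check it. */
  cos_ptr = (cos_fn)dlsym(h, "cos");
  return cos_ptr ? 0 : -1;
}

/* Every call site goes through the pointer instead of calling cos() directly. */
double call_cos(double x) {
  assert(cos_ptr != NULL);  /* load_cos() must have succeeded first */
  return cos_ptr(x);
}
```

Every additional library function needs its own typedef, pointer and dlsym call, which is the bookkeeping described above.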

Implib.so provides an easy solution - link your program with a wrapper which

  • provides all necessary symbols to make linker happy
  • loads wrapped library on first call to any of its functions
  • redirects calls to library symbols

Generated wrapper code (often also called "shim" code or "shim" library) is analogous to Windows import libraries which achieve the same functionality for DLLs.
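
Conceptually, a wrapper for a single function behaves roughly like the C sketch below. It is not the code emitted by implib-gen.py (the generated libxyz.so.tramp.S is an assembly file, presumably so that the wrapper can keep the original symbol name and pass arguments through untouched). The names shim_cos, tramp_table, resolve_all and shim_load_count are hypothetical, and libm.so.6 (assumed glibc soname) stands in for the wrapped library.

```c
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

/* One slot per wrapped function; NULL until the library has been loaded. */
static void *tramp_table[1];
static int load_count;  /* only here to show that loading happens once */

/* Slow path: load the library and fill the table. */
static void resolve_all(void) {
  void *h = dlopen("libm.so.6", RTLD_LAZY);
  if (!h) {
    fprintf(stderr, "dlopen failed: %s\n", dlerror());
    exit(1);
  }
  tramp_table[0] = dlsym(h, "cos");
  if (!tramp_table[0]) {
    fprintf(stderr, "dlsym failed for cos\n");
    exit(1);
  }
  ++load_count;
}

/* The shim: a symbol the linker can resolve statically. It loads the
   library on the first call and then redirects to the real function. */
double shim_cos(double x) {
  if (!tramp_table[0])  /* not taken after the first call */
    resolve_all();
  return ((double (*)(double))tramp_table[0])(x);
}

/* Number of times the library was loaded so far. */
int shim_load_count(void) {
  return load_count;
}
```

Calling shim_cos twice loads the library only once; the second call takes the fast path straight to the real function.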

Implib.so can also be used to reduce the API provided by an existing shared library or to rename its exported symbols.

Implib.so was originally inspired by the Stack Overflow question "Is there an elegant way to avoid dlsym when using dlopen in C?".

Usage

A typical use-case would look like this:

$ implib-gen.py libxyz.so

This will generate code for host platform (presumably x86_64). For other targets do

$ implib-gen.py --target $TARGET libxyz.so

where TARGET can be any of

  • x86_64-linux-gnu, x86_64-none-linux-android
  • i686-linux-gnu, i686-none-linux-android
  • arm-linux-gnueabi, armel-linux-gnueabi, armv7-none-linux-androideabi
  • arm-linux-gnueabihf (ARM hardfp ABI)
  • aarch64-linux-gnu, aarch64-none-linux-android
  • mipsel-linux-gnu
  • mips64el-linux-gnuabi64
  • e2k-linux-gnu

The script generates two files, libxyz.so.tramp.S and libxyz.so.init.c, which need to be linked into your application (instead of -lxyz):

$ gcc myfile1.c myfile2.c ... libxyz.so.tramp.S libxyz.so.init.c ... -ldl

Note that you need to link against libdl.so. On ARM, if your app is compiled to Thumb code (which e.g. Ubuntu's arm-linux-gnueabihf-gcc does by default), you'll also need to add -mthumb-interwork.

The application can then freely call functions from libxyz.so without linking to it. The library will be loaded (via dlopen) on the first call to any of its functions. If you want to forcibly resolve all symbols (e.g. to avoid delays later on), you can call void libxyz_init_all().

The above command performs a lazy load, i.e. it loads the library on the first call to one of its symbols. If you want to load it at startup, run

$ implib-gen.py --no-lazy-load libxyz.so

If you don't want dlopen to be called automatically and prefer to load library yourself at program startup, run script as

$ implib-gen.py --no-dlopen libxyz.so

If you do want to load library via dlopen but would prefer to call it yourself (e.g. with custom parameters or with modified library name), run script as

$ cat mycallback.c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

#ifdef __cplusplus
extern "C"
#endif

// Callback that tries different library names
void *mycallback(const char *lib_name) {
  (void)lib_name;  // Unused: this callback tries its own list of names
  void *h;
  h = dlopen("libxyz.so", RTLD_LAZY);
  if (h)
    return h;
  h = dlopen("libxyz-stub.so", RTLD_LAZY);
  if (h)
    return h;
  fprintf(stderr, "dlopen failed: %s\n", dlerror());
  exit(1);
}

$ implib-gen.py --dlopen-callback=mycallback libxyz.so

(The callback must have the signature void *(*)(const char *lib_name) and return the handle of the loaded library.)

Normally symbols are located via the dlsym function, but this can be overridden with a custom callback by using --dlsym-callback (which must have the signature void *(*)(void *handle, const char *sym_name)).
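
A callback matching that signature might look like the sketch below. The name mysymcallback and the error handling are assumptions made for illustration; by analogy with --dlopen-callback it would presumably be passed as --dlsym-callback=mysymcallback.

```c
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

#ifdef __cplusplus
extern "C"
#endif
/* Hypothetical dlsym callback: defers to dlsym and aborts with a
   readable message if the symbol cannot be found. */
void *mysymcallback(void *handle, const char *sym_name) {
  void *addr = dlsym(handle, sym_name);
  if (!addr) {
    /* dlerror() may return NULL if the symbol exists but has a NULL value. */
    const char *err = dlerror();
    fprintf(stderr, "failed to resolve %s: %s\n", sym_name,
            err ? err : "symbol value is NULL");
    exit(1);
  }
  return addr;
}
```

For symbols that exist, the callback returns the same address as a plain dlsym call on the same handle; it only changes what happens on failure.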

Finally, to force library load and resolution of all symbols, call

void _LIBNAME_tramp_resolve_all(void);

Wrapping vtables

By default the tool does not try to wrap vtables exported from the library. This can be enabled via --vtables flag:

$ implib-gen.py --vtables ...

Reducing external interface of closed-source library

Sometimes you may want to reduce public interface of existing shared library (e.g. if it's a third-party lib which erroneously exports too many unrelated symbols).

To achieve this you can generate a wrapper with limited number of symbols and override the callback which loads the library to use dlmopen instead of dlopen (and thus does not pollute the global namespace):

$ cat mysymbols.txt
foo
bar
$ cat mycallback.c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

#ifdef __cplusplus
extern "C"
#endif

// Dlopen callback that loads library to dedicated namespace
void *mycallback(const char *lib_name) {
  void *h = dlmopen(LM_ID_NEWLM, lib_name, RTLD_LAZY | RTLD_DEEPBIND);
  if (h)
    return h;
  fprintf(stderr, "dlmopen failed: %s\n", dlerror());
  exit(1);
}

$ implib-gen.py --dlopen-callback=mycallback --symbol-list=mysymbols.txt libxyz.so
$ ... # Link your app with libxyz.so.tramp.S, libxyz.so.init.c and mycallback.c

Similar approach can be used if you want to provide a common interface for several libraries with partially intersecting interfaces (see this example for more details).

Renaming exported interface of closed-source library

Sometimes you may need to rename API of existing shared library to avoid name clashes.

To achieve this you can generate a wrapper with renamed symbols which call the old, non-renamed symbols in the original library, loaded via dlmopen instead of dlopen (to avoid polluting the global namespace):

$ cat mycallback.c
... Same as before ...
$ implib-gen.py --dlopen-callback=mycallback --symbol_prefix=MYPREFIX_ libxyz.so
$ ... # Link your app with libxyz.so.tramp.S, libxyz.so.init.c and mycallback.c

Linker wrapper

Generation of wrappers may be automated via the linker wrapper scripts/ld. Adding it to PATH (in front of the normal ld) will by default result in all dynamic libs (besides system ones) being replaced with wrappers. An explicit list of libraries can be specified by exporting the IMPLIBSO_LD_OPTIONS environment variable:

export IMPLIBSO_LD_OPTIONS='--wrap-libs attr,acl'

For more details run with

export IMPLIBSO_LD_OPTIONS=--help

At the moment the linker wrapper is only meant for testing.

Overhead

Implib.so overhead on a fast path boils down to

  • predictable direct jump to wrapper
  • load from trampoline table
  • predictable untaken direct branch to initialization code
  • predictable indirect jump to real function

This is very similar to a normal shared library call:

  • predictable direct jump to PLT stub
  • load from GOT
  • predictable indirect jump to real function

so it should have equivalent performance.

Limitations

The tool does not transparently support all features of POSIX shared libraries. In particular

  • it cannot provide wrappers for data symbols (except C++ virtual/RTTI tables)
  • it makes the first call to wrapped functions async-signal-unsafe (as it will call dlopen and library constructors)
  • it may change semantics if there are multiple definitions of the same symbol in different loaded shared objects (runtime symbol interposition is considered a bad practice though)
  • it may change semantics because shared library constructors are delayed until the library is loaded

The tool also lacks the following important features:

  • proper support for multi-threading
  • handling of symbol versions
  • support for OSX and RISC-V
  • keeping fast paths of shims together to reduce I$ pressure

None of these should be hard to add, so let me know if you need them.

Finally, there are some minor TODOs in code.

Related work

As mentioned in the introduction, import libraries are first-class citizens on the Windows platform.

Delay-loaded libraries were once present on OSX (via -lazy_lXXX and -lazy_library options).

Lazy loading is supported by Solaris shared libraries but was never implemented in Linux. There have been some discussions in libc-alpha but no patches were posted.

Implib.so-like functionality is used in OpenGL loading libraries e.g. GLEW via custom project-specific scripts.
