• Stars
    star
    6,806
  • Rank 5,799 (Top 0.2 %)
  • Language
    Python
  • License
    MIT License
  • Created over 9 years ago
  • Updated 4 months ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

A minimal init system for Linux containers

dumb-init

PyPI version

dumb-init is a simple process supervisor and init system designed to run as PID 1 inside minimal container environments (such as Docker). It is deployed as a small, statically-linked binary written in C.

Lightweight containers have popularized the idea of running a single process or service without normal init systems like systemd or sysvinit. However, omitting an init system often leads to incorrect handling of processes and signals, and can result in problems such as containers which can't be gracefully stopped, or leaking containers which should have been destroyed.

dumb-init enables you to simply prefix your command with dumb-init. It acts as PID 1 and immediately spawns your command as a child process, taking care to properly handle and forward signals as they are received.

Why you need an init system

Normally, when you launch a Docker container, the process you're executing becomes PID 1, giving it the quirks and responsibilities that come with being the init system for the container.

There are two common issues this presents:

  1. In most cases, signals won't be handled properly.

    The Linux kernel applies special signal handling to processes which run as PID 1.

    When processes are sent a signal on a normal Linux system, the kernel will first check for any custom handlers the process has registered for that signal, and otherwise fall back to default behavior (for example, killing the process on SIGTERM).

    However, if the process receiving the signal is PID 1, it gets special treatment by the kernel; if it hasn't registered a handler for the signal, the kernel won't fall back to default behavior, and nothing happens. In other words, if your process doesn't explicitly handle these signals, sending it SIGTERM will have no effect at all.

    A common example is CI jobs that do docker run my-container script: sending SIGTERM to the docker run process will typically kill the docker run command, but leave the container running in the background.

  2. Orphaned zombie processes aren't properly reaped.

    A process becomes a zombie when it exits, and remains a zombie until its parent calls some variation of the wait() system call on it. It remains in the process table as a "defunct" process. Typically, a parent process will call wait() immediately and avoid long-living zombies.

    If a parent exits before its child, the child is "orphaned", and is re-parented under PID 1. The init system is thus responsible for wait()-ing on orphaned zombie processes.

    Of course, most processes won't wait() on random processes that happen to become attached to them, so containers often end with dozens of zombies rooted at PID 1.

What dumb-init does

dumb-init runs as PID 1, acting like a simple init system. It launches a single process and then proxies all received signals to a session rooted at that child process.

Since your actual process is no longer PID 1, when it receives signals from dumb-init, the default signal handlers will be applied, and your process will behave as you would expect. If your process dies, dumb-init will also die, taking care to clean up any other processes that might still remain.

Session behavior

In its default mode, dumb-init establishes a session rooted at the child, and sends signals to the entire process group. This is useful if you have a poorly-behaving child (such as a shell script) which won't normally signal its children before dying.

This can actually be useful outside of Docker containers in regular process supervisors like daemontools or supervisord for supervising shell scripts. Normally, a signal like SIGTERM received by a shell isn't forwarded to subprocesses; instead, only the shell process dies. With dumb-init, you can just write shell scripts with dumb-init in the shebang:

#!/usr/bin/dumb-init /bin/sh
my-web-server &  # launch a process in the background
my-other-server  # launch another process in the foreground

Ordinarily, a SIGTERM sent to the shell would kill the shell but leave those processes running (both the background and foreground!). With dumb-init, your subprocesses will receive the same signals your shell does.

If you'd like for signals to only be sent to the direct child, you can run with the --single-child argument, or set the environment variable DUMB_INIT_SETSID=0 when running dumb-init. In this mode, dumb-init is completely transparent; you can even string multiple together (like dumb-init dumb-init echo 'oh, hi').

Signal rewriting

dumb-init allows rewriting incoming signals before proxying them. This is useful in cases where you have a Docker supervisor (like Mesos or Kubernetes) which always sends a standard signal (e.g. SIGTERM). Some apps require a different stop signal in order to do graceful cleanup.

For example, to rewrite the signal SIGTERM (number 15) to SIGQUIT (number 3), just add --rewrite 15:3 on the command line.

To drop a signal entirely, you can rewrite it to the special number 0.

Signal rewriting special case

When running in setsid mode, it is not sufficient to forward SIGTSTP/SIGTTIN/SIGTTOU in most cases, since if the process has not added a custom signal handler for these signals, then the kernel will not apply default signal handling behavior (which would be suspending the process) since it is a member of an orphaned process group. For this reason, we set default rewrites to SIGSTOP from those three signals. You can opt out of this behavior by rewriting the signals back to their original values, if desired.

One caveat with this feature: for job control signals (SIGTSTP, SIGTTIN, SIGTTOU), dumb-init will always suspend itself after receiving the signal, even if you rewrite it to something else.

Installing inside Docker containers

You have a few options for using dumb-init:

Option 1: Installing from your distro's package repositories (Debian, Ubuntu, etc.)

Many popular Linux distributions (including Debian (since stretch) and Debian derivatives such as Ubuntu (since bionic)) now contain dumb-init packages in their official repositories.

On Debian-based distributions, you can run apt install dumb-init to install dumb-init, just like you'd install any other package.

Note: Most distro-provided versions of dumb-init are not statically-linked, unlike the versions we provide (see the other options below). This is normally perfectly fine, but means that these versions of dumb-init generally won't work when copied to other Linux distros, unlike the statically-linked versions we provide.

Option 2: Installing via an internal apt server (Debian/Ubuntu)

If you have an internal apt server, uploading the .deb to your server is the recommended way to use dumb-init. In your Dockerfiles, you can simply apt install dumb-init and it will be available.

Debian packages are available from the GitHub Releases tab, or you can run make builddeb yourself.

Option 3: Installing the .deb package manually (Debian/Ubuntu)

If you don't have an internal apt server, you can use dpkg -i to install the .deb package. You can choose how you get the .deb onto your container (mounting a directory or wget-ing it are some options).

One possibility is with the following commands in your Dockerfile:

RUN wget https://github.com/Yelp/dumb-init/releases/download/v1.2.5/dumb-init_1.2.5_amd64.deb
RUN dpkg -i dumb-init_*.deb

Option 4: Downloading the binary directly

Since dumb-init is released as a statically-linked binary, you can usually just plop it into your images. Here's an example of doing that in a Dockerfile:

RUN wget -O /usr/local/bin/dumb-init https://github.com/Yelp/dumb-init/releases/download/v1.2.5/dumb-init_1.2.5_x86_64
RUN chmod +x /usr/local/bin/dumb-init

Option 5: Installing from PyPI

Though dumb-init is written entirely in C, we also provide a Python package which compiles and installs the binary. It can be installed from PyPI using pip. You'll want to first install a C compiler (on Debian/Ubuntu, apt-get install gcc is sufficient), then just pip install dumb-init.

As of 1.2.0, the package at PyPI is available as a pre-built wheel archive and does not need to be compiled on common Linux distributions.

Usage

Once installed inside your Docker container, simply prefix your commands with dumb-init (and make sure that you're using the recommended JSON syntax).

Within a Dockerfile, it's a good practice to use dumb-init as your container's entrypoint. An "entrypoint" is a partial command that gets prepended to your CMD instruction, making it a great fit for dumb-init:

# Runs "/usr/bin/dumb-init -- /my/script --with --args"
ENTRYPOINT ["/usr/bin/dumb-init", "--"]

# or if you use --rewrite or other cli flags
# ENTRYPOINT ["dumb-init", "--rewrite", "2:3", "--"]

CMD ["/my/script", "--with", "--args"]

If you declare an entrypoint in a base image, any images that descend from it don't need to also declare dumb-init. They can just set a CMD as usual.

For interactive one-off usage, you can just prepend it manually:

$ docker run my_container dumb-init python -c 'while True: pass'

Running this same command without dumb-init would result in being unable to stop the container without SIGKILL, but with dumb-init, you can send it more humane signals like SIGTERM.

It's important that you use the JSON syntax for CMD and ENTRYPOINT. Otherwise, Docker invokes a shell to run your command, resulting in the shell as PID 1 instead of dumb-init.

Using a shell for pre-start hooks

Often containers want to do some pre-start work which can't be done during build time. For example, you might want to template out some config files based on environment variables.

The best way to integrate that with dumb-init is like this:

ENTRYPOINT ["/usr/bin/dumb-init", "--"]
CMD ["bash", "-c", "do-some-pre-start-thing && exec my-server"]

By still using dumb-init as the entrypoint, you always have a proper init system in place.

The exec portion of the bash command is important because it replaces the bash process with your server, so that the shell only exists momentarily at start.

Building dumb-init

Building the dumb-init binary requires a working compiler and libc headers and defaults to glibc.

$ make

Building with musl

Statically compiled dumb-init is over 700KB due to glibc, but musl is now an option. On Debian/Ubuntu apt-get install musl-tools to install the source and wrappers, then just:

$ CC=musl-gcc make

When statically compiled with musl the binary size is around 20KB.

Building the Debian package

We use the standard Debian conventions for specifying build dependencies (look in debian/control). An easy way to get started is to apt-get install build-essential devscripts equivs, and then sudo mk-build-deps -i --remove to install all of the missing build dependencies automatically. You can then use make builddeb to build dumb-init Debian packages.

If you prefer an automated Debian package build using Docker, just run make builddeb-docker. This is easier, but requires you to have Docker running on your machine.

See also

More Repositories

1

elastalert

Easy & Flexible Alerting With ElasticSearch
Python
7,926
star
2

detect-secrets

An enterprise friendly way of detecting and preventing secrets in code.
Python
3,704
star
3

mrjob

Run MapReduce jobs on Hadoop or Amazon Web Services
Python
2,615
star
4

osxcollector

A forensic evidence collection & analysis toolkit for OS X
Python
1,858
star
5

paasta

An open, distributed platform as a service
Python
1,681
star
6

undebt

A fast, straightforward, reliable tool for performing massive, automated code refactoring
Python
1,634
star
7

MOE

A global, black box optimization engine for real world metric optimization.
C++
1,306
star
8

dockersh

A shell which places users into individual docker containers
Go
1,282
star
9

dataset-examples

Samples for users of the Yelp Academic Dataset
Python
1,189
star
10

yelp.github.io

A showcase of projects we've open sourced and open source projects we use
JavaScript
701
star
11

bravado

Bravado is a python client library for Swagger 2.0 services
Python
603
star
12

yelp-api

Examples of code using our v2 API
PHP
580
star
13

service-principles

A guide to service principles at Yelp for our service oriented architecture
423
star
14

swagger-gradle-codegen

πŸ’« A Gradle Plugin to generate your networking code from Swagger
Kotlin
413
star
15

mysql_streamer

MySQLStreamer is a database change data capture and publish system.
Python
409
star
16

pyleus

Pyleus is a Python framework for developing and launching Storm topologies.
Python
406
star
17

yelp-fusion

Yelp Fusion API
Python
401
star
18

docker-custodian

Keep docker hosts tidy
Python
355
star
19

android-school

The best videos from the Android community and beyond
350
star
20

Tron

Next generation batch process scheduling and management
Python
340
star
21

kafka-utils

Python
313
star
22

bento

DEPRECATED - A delicious framework for building modularized Android user interfaces, by Yelp.
Kotlin
306
star
23

Testify

A more pythonic testing framework.
Python
303
star
24

clusterman

Cluster Autoscaler for Kubernetes and Mesos
Python
295
star
25

kotlin-android-workshop

A Kotlin Workshop for engineers familiar with Java and Android development.
Kotlin
288
star
26

threat_intel

Threat Intelligence APIs
Python
264
star
27

nrtsearch

A high performance gRPC server on top of Apache Lucene
Java
254
star
28

python-gearman

Gearman API - Client, worker, and admin client interfaces
Python
242
star
29

py_zipkin

Provides utilities to facilitate the usage of Zipkin in Python
Python
225
star
30

fuzz-lightyear

A pytest-inspired, DAST framework, capable of identifying vulnerabilities in a distributed, micro-service ecosystem through chaos engineering testing and stateful, Swagger fuzzing.
Python
205
star
31

yelp-python

A Python library for the Yelp API
Python
182
star
32

venv-update

Synchronize your virtualenv quickly and exactly.
Python
178
star
33

firefly

Firefly is a web application aimed at powerful, flexible time series graphing for web developers.
JavaScript
171
star
34

amira

AMIRA: Automated Malware Incident Response & Analysis
Python
150
star
35

aactivator

Automatically source and unsource a project's environment
Python
145
star
36

YLTableView

Objective-C
144
star
37

love

A system to share your appreciation
Python
142
star
38

lemon-reset

Consistent, cross-browser React DOM tags, powered by CSS Modules. πŸ‹
JavaScript
131
star
39

dataloader-codegen

πŸ€– dataloader-codegen is an opinionated JavaScript library for automatically generating DataLoaders over a set of resources (e.g. HTTP endpoints).
TypeScript
110
star
40

bravado-core

Python
109
star
41

data_pipeline

Data Pipeline Clientlib provides an interface to tail and publish to data pipeline topics.
Python
109
star
42

detect-secrets-server

Python
108
star
43

yelp-ruby

A Ruby gem for communicating with the Yelp REST API
Ruby
105
star
44

swagger_spec_validator

Python
104
star
45

ybinlogp

A fast mysql binlog parser
C
97
star
46

beans

Bringing people together, one cup of coffee at a time
Python
93
star
47

casper

A fast web application platform built in Rust and Luau
Rust
90
star
48

schematizer

A schema store service that tracks and manages all the schemas used in the Data Pipeline
Python
86
star
49

requirements-tools

requirements-tools contains scripts for working with Python requirements, primarily in applications.
Python
81
star
50

osxcollector_output_filters

Filters that process and transform the output of osxcollector
Python
77
star
51

sensu_handlers

Custom Sensu Handlers to support a multi-tenant environment, allowing checks themselves to emit the type of handler behavior they need in the event json
Ruby
75
star
52

graphql-guidelines

GraphQL @ Yelp Schema Guidelines
Makefile
74
star
53

kegmate

Arduino/iPad powered kegerator
Objective-C
72
star
54

ephemeral-port-reserve

Find an unused port, reliably
Python
68
star
55

parcelgen

Helpful tool to make data objects easier for Android
Python
65
star
56

salsa

A tool for exporting iOS components into Sketch πŸ“±πŸ’Ž
Swift
62
star
57

yelp-ios

Objective-C
61
star
58

docker-observium

Observium docker image with both professional and community edition support, ldap auth, and easy plugin support.
ApacheConf
58
star
59

yelp-android

Java
55
star
60

terraform-provider-signalform

SignalForm is a terraform provider to codify SignalFx detectors, charts and dashboards
Go
44
star
61

mycroft

Python
42
star
62

pidtree-bcc

eBPF tool for logging process ancestry of outbound TCP connections
Python
41
star
63

terraform-provider-gitfile

Terraform provider for checking out git repositories and making changes
Go
40
star
64

ffmpeg-android

Shell
39
star
65

pushmanager

Pushmanager is a web application to manage source code deployments.
Python
38
star
66

zygote

A Python HTTP process management utility.
Python
38
star
67

yelp_kafka

An extension of the kafka-python package that adds features like multiprocess consumers.
Python
38
star
68

pgctl

Manage sets of developer services -- "playground control"
Python
31
star
69

EMRio

Elastic MapReduce instance optimizer
Python
31
star
70

s3mysqldump

Dump mysql tables to s3, and parse them
Python
31
star
71

android-varanus

A client-side Android library to monitor and limit network traffic sent by your apps
Kotlin
29
star
72

pyramid_zipkin

Pyramid tween to add Zipkin service spans
Python
29
star
73

puppet-netstdlib

A collection of Puppet functions for interacting with the network
Ruby
27
star
74

sqlite3dbm

sqlite-backed dictionary conforming to the dbm interface
Python
27
star
75

send_nsca

Pure-python NSCA client
Python
26
star
76

docker-push-latest-if-changed

Python
26
star
77

data_pipeline_avro_util

Provides a Pythonic interface for reading and writing Avro schemas
Python
26
star
78

cocoapods-readonly

Automatically locks all CocoaPod source files.
Ruby
26
star
79

uwsgi_metrics

Python
26
star
80

WebImageView

An enhanced and improved ImageView for Android that displays images loaded over the interwebs
Java
25
star
81

task_processing

Interfaces and shared infrastructure for generic task processing at Yelp.
Python
23
star
82

PushmasterApp

(Legacy) Yelp pushmaster application built on Google App Engine
Python
22
star
83

tlspretense-service

A Docker container that exposes tlspretense on a port.
Makefile
20
star
84

puppet-uchiwa

Puppet module for installing Uchiwa
Ruby
20
star
85

yelp_cheetah

cheetah, hacked by yelpers
Python
20
star
86

logfeeder

Python
20
star
87

fido

Asynchronous HTTP client built on top of Crochet and Twisted
Python
20
star
88

swagger-spec-compatibility

Python library to check Swagger Spec backward compatibility
Python
20
star
89

pyramid-hypernova

A Python client for Airbnb's Hypernova server, for use with the Pyramid web framework.
Python
19
star
90

mr3po

protocols for use with mrjob
Python
16
star
91

YPFastDateParser

A class for parsing strings into NSDate instances, several times faster than NSDateFormatter
Objective-C
15
star
92

yelp_uri

Utilities for dealing with URIs, invented and maintained by Yelp.
Python
14
star
93

pysensu-yelp

A Python library to emit Sensu events that the Yelp Sensu Handlers can understand for Self-Service Sensu Monitoring
Python
14
star
94

terraform-provider-cloudhealth

Terraform provider for Cloudhealth
Go
14
star
95

yelp-rails-example

An example Rails application that uses the Yelp gem to integrate with the API
Ruby
13
star
96

named_decorator

Dynamically name wrappers based on their callees to untangle profiles of large python codebases
Python
12
star
97

pt-online-schema-change-plugins

Perl
11
star
98

environment_tools

Tools for programmatically describing Yelp's different environments (prod, dev, stage)
Python
11
star
99

puppet-cron

A super great cron Puppet module with timeouts, locking, monitoring, and more!
Ruby
11
star
100

pyswf

Python
10
star