• This repository has been archived on 24/Feb/2023
  • Language
    Go
  • License
    Apache License 2.0
  • Created over 10 years ago
  • Updated over 7 years ago

Wrapper for "docker run" to handle systemd quirks

systemd-docker

This is a wrapper for docker run so that you can sanely run Docker containers under systemd. The key thing this wrapper does is move the container process from the cgroups set up by Docker to the service unit's cgroup. It also handles a bunch of other quirks, so please read through the documentation to understand all the implications of running Docker under systemd.

Using this wrapper you can manage containers through systemctl or the docker CLI and everything should just stay in sync. Additionally you can leverage all the cgroup functionality of systemd and systemd-notify.

Why I wrote this

The full context is in Docker Issue #6791 and this mailing list thread. The short of it is that systemd does not actually supervise the Docker container but instead the Docker client. This makes systemd incapable of reliably managing Docker containers without hitting a bunch of really odd situations.

Installation

Copy systemd-docker to /opt/bin (or really anywhere you want). You can download and compile it the normal way with go get github.com/ibuildthecloud/systemd-docker.

Quick Usage

Basically, in your unit file use systemd-docker run instead of docker run. Here's an example unit file that runs nginx.

[Unit]
Description=Nginx
After=docker.service
Requires=docker.service

[Service]
ExecStart=/opt/bin/systemd-docker run --rm --name %n nginx
Restart=always
RestartSec=10s
Type=notify
NotifyAccess=all
TimeoutStartSec=120
TimeoutStopSec=15

[Install]
WantedBy=multi-user.target

If you are writing your own unit file, Type=notify and NotifyAccess=all are really important.

Special Note About Named Containers

In short, it's best to always have --name %n --rm in your unit file's ExecStart.

The best way I've found to run containers under systemd is to always assign the container a name. Even better is to put --name %n in your unit file and then the name of the container will match the name of the service unit.

If you don't name your container, you will essentially be creating a new container on every start, and each one will get orphaned. You're probably clever and thinking you can just add --rm and that will take care of the orphans. The problem is that --rm is not super reliable: the remove is done from the client side, so if the client dies, the container is not deleted. By naming your container, systemd-docker will take extra care to keep the systemd unit and the container in sync. For example, if you do --name %n --rm, systemd-docker will ensure that the container is really deleted each time.

If you do --name %n --rm, systemd-docker on start will look for the named container. If it exists and is stopped, it will be deleted. This is really important if you ever change your unit file. If you change your ExecStart command, and it is a named container, the old values will be saved in the stopped container. By ensuring the container is always deleted, you ensure the args in ExecStart are always in sync.

Options

Logging

By default the container's stdout/stderr will be piped to the journal. If you do not want to use the journal, add --logs=false to the beginning of the command. For example:

ExecStart=/opt/bin/systemd-docker --logs=false run --rm --name %n nginx

Environment Variables

Using Environment= and EnvironmentFile=, systemd can set up environment variables for you, but then unfortunately you have to do run -e ABC=${ABC} -e XYZ=${XYZ} in your unit file. You can have the systemd environment variables automatically transferred to your Docker container by adding --env. This will essentially read all the current environment variables and add the appropriate -e ... flags to your docker run command. For example:

EnvironmentFile=/etc/environment
ExecStart=/opt/bin/systemd-docker --env run --rm --name %n nginx

The contents of /etc/environment will be added to your docker run command.

Cgroups

The main magic of how this works is that the container processes are moved from the Docker cgroups to the systemd unit's cgroups. By default all application cgroups will be moved. This means by default you can't use --cpuset or -m in Docker. If you don't want to use the systemd cgroups, but instead use the Docker cgroups, you can control which cgroups are transferred using the --cgroups option. Minimally you must set name=systemd; otherwise, systemd will lose track of the container. For example:

ExecStart=/opt/bin/systemd-docker --cgroups name=systemd --cgroups=cpu run --rm --name %n nginx

The above command will use the name=systemd and cpu cgroups of systemd but then use Docker's cgroups for all the others, like the freezer cgroup.

Pid File

If for whatever reason you want to create a pid file for the container PID, you can. Just add --pid-file as below:

ExecStart=/opt/bin/systemd-docker --pid-file=/var/run/%n.pid --env run --rm --name %n nginx

systemd-notify support

By default systemd-docker will send READY=1 to the systemd notification socket. You can instead delegate the READY=1 call to the container itself. This is done by adding --notify. For example:

ExecStart=/opt/bin/systemd-docker --notify run --rm --name %n nginx

What this will do is set up a bind mount for the notification socket and then set the NOTIFY_SOCKET environment variable. If you are going to use this feature of systemd, take some time to understand the quirks of it. More info in this mailing list thread. In short, systemd-notify is not reliable because often the child dies before systemd has time to determine which cgroup it is a member of.

Detaching the client

The -d argument to docker has no effect under systemd-docker. To cause the systemd-docker client to detach after the container is running, simply use --logs=false --rm=false. If either --logs or --rm is true, the systemd-docker client will stay alive until it is killed or the container exits.

Running on CoreOS

If you are running on CoreOS, it may be more problematic to install systemd-docker to /opt/bin. To make this easier add the following line to your unit file.

ExecStartPre=/usr/bin/docker run --rm -v /opt/bin:/opt/bin ibuildthecloud/systemd-docker

That command will install systemd-docker to /opt/bin. The full nginx example that is above would now be as below.

[Unit]
Description=Nginx
After=docker.service
Requires=docker.service

[Service]
ExecStartPre=/usr/bin/docker run --rm -v /opt/bin:/opt/bin ibuildthecloud/systemd-docker
ExecStart=/opt/bin/systemd-docker run --rm --name %n nginx
Restart=always
RestartSec=10s
Type=notify
NotifyAccess=all
TimeoutStartSec=120
TimeoutStopSec=15

[Install]
WantedBy=multi-user.target

Known issues

Inconsistent cgroup

CentOS 7 is inconsistent in the way it handles some cgroups. It has 3:cpuacct,cpu:/user.slice in /proc/[pid]/cgroup, which is inconsistent with the cgroup path /sys/fs/cgroup/cpu,cpuacct/ that systemd-docker is trying to move pids to.

This will cause systemd-docker to fail unless it is run with systemd-docker --cgroups name=systemd run.

See #15 for details.
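
To show why the ordering matters, here is a tiny Go sketch comparing the two controller lists. A plain string comparison fails, while comparing them with the names sorted succeeds. The normalize function is hypothetical and is not how systemd-docker deals with this; the workaround above is still the documented fix.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// normalize is a hypothetical helper that sorts a comma-separated
// controller list so that "cpuacct,cpu" and "cpu,cpuacct" compare equal.
func normalize(controllers string) string {
	parts := strings.Split(controllers, ",")
	sort.Strings(parts)
	return strings.Join(parts, ",")
}

func main() {
	fromProc := "cpuacct,cpu" // order seen in the /proc entry on CentOS 7
	mountDir := "cpu,cpuacct" // directory name under /sys/fs/cgroup

	fmt.Println(fromProc == mountDir)                       // false
	fmt.Println(normalize(fromProc) == normalize(mountDir)) // true
}
```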

License

Apache License, Version 2.0
