Docker Issues and Tips (aufs/overlay/btrfs..)
Picked up and categorized subjectively from https://github.com/docker/docker/issues. Comments and pull requests are welcome.
⬜ = Open (may be out of date; please check the link yourself)
🔳 = Mostly resolved (ditto, plus subjective)
✅ = Resolved
Storage Drivers
AUFS
Issue | Abstract | Impact | Reproducibility | Cause | Solution | Notes |
---|---|---|---|---|---|---|
✅ #783 | Cannot access a directory due to a permission error | Medium | Easy | Expected AUFS behavior. The dirperm1 mount option fixes this issue. | Update the kernel (AUFS >= 2008xxxx?) and Docker daemon (>= 1.7) | Confirm: `docker info \| …` |
✅ #18180 | A process becomes a zombie and hangs | 😱 High | 😱 Hard (multiprocessor), Easy (uniprocessor) | Compatibility between the kernel and AUFS | Update the kernel (AUFS >= 20160111) | Java apps and MongoDB are known to be affected |
✅ #20199 | fcntl(F_SETFL, O_APPEND) is ignored and hence data can be corrupted | 😱 High | Easy | AUFS bug | Update the kernel (AUFS >= 20160301) | Dovecot is known to be affected |
✅ #20240 | Weird permission even though dirperm1 is enabled | Medium | 😱 Hard | AUFS bug | Update the kernel (AUFS >= 20160905) | |
⬜ AUFS ML 2016-03-08 | Hang related to O_DIRECT | 😱 High | Easy | Unanalyzed | None | Percona is known to be affected |
⬜ #24309 | Unable to remove files previously committed | 😱 High | Easy | Unanalyzed | This article seems related, but perhaps slightly different (Japanese) | |
🔳 #34361 | AUFS + XFS hangs | 😱 High | Easy | AUFS bug | Update AUFS | |
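The O_APPEND problem in #20199 can be probed from inside a container with a few system calls. The following is a minimal sketch, not the reproducer from the issue: the function name `append_flag_is_honored` is made up for illustration, and it only checks the one behavior named in the abstract (a write after `fcntl(F_SETFL, O_APPEND)` must land at the end of the file).

```python
import fcntl
import os
import tempfile


def append_flag_is_honored(directory: str) -> bool:
    """Return True if O_APPEND set via fcntl(F_SETFL) makes writes append.

    The check creates a scratch file in `directory`, so point it at a path
    that lives on the storage driver you want to test.
    """
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        os.write(fd, b"first\n")
        # Turn on O_APPEND after the file has already been opened.
        flags = fcntl.fcntl(fd, fcntl.F_GETFL)
        fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_APPEND)
        # Rewind; with O_APPEND honored the next write still goes to the end.
        os.lseek(fd, 0, os.SEEK_SET)
        os.write(fd, b"second\n")
        with open(path, "rb") as f:
            return f.read() == b"first\nsecond\n"
    finally:
        os.close(fd)
        os.unlink(path)
```

If the flag is ignored, the second write overwrites the first line and the function returns False. A True result only says that this particular pattern works; it does not prove the kernel carries the fix.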
Non-bug issues:
- AUFS is not available in the mainline kernel. Only a few distros (Ubuntu, Boot2Docker, ..) support AUFS, and even for Ubuntu, Canonical says "AUFS will disappear".
- No support for extended attributes ("xattrs"), and it might never get support (#1070, #8460).
- rename(2) is not fully supported (see also "AUFS / Overlay common" below)
Overlay
Issue | Abstract | Impact | Reproducibility | Cause | Solution | Notes |
---|---|---|---|---|---|---|
✅ #10180 | RPMDB corruption | 😱 High | Medium | Expected overlay behavior | Use yum-{utils,plugins-ovl}-1.1.31-33.el7 (included in RHEL 7.2) or later. A kernel patch is also available. | Linux 4.6 or later prints a human-friendly dmesg message |
✅ #12080 | Cannot use UNIX domain sockets | Medium | Easy | Overlay bug | Use Linux 4.7-rc4 or later | |
✅ #12327 | pip fails | 😱 High | Easy | Overlay bug | Use Linux 4.5 or later | |
✅ #19082 | Weird behavior after removing the current directory | Low | Easy | Overlay bug | Use Linux 4.5 or later | |
🔳 #19647, coreos/bugs#1095 | Untar fails intermittently | 😱 High | 😱 Hard | Overlay bug | Use Linux 4.13 with OVERLAY_FS_INDEX=y | Analysis is in progress in coreos/bugs#1095 |
⬜ #20640 | Container cannot be started | Medium | 😱 Hard | Unanalyzed | None | Possibly identical to #16902 |
✅ #20950 | /dev/console: operation not permitted | 😱 High | Easy | Kernel bug | Use recent Linux kernels | |
✅ #21555 | docker build fails intermittently (overlay1) | 😱 High | 😱 Hard | DiffDriver bug | Use Docker 1.13 or later | Overlay2 doesn't have this issue by design |
✅ #24913 | Permissions broken after chown | Medium | Easy | Overlay bug | Use Linux 4.6 or later | The overlay2 issue #28391 is due to the identical bug |
✅ #25244 | Opaque flag not reset after directory copy-up | Medium | Easy | Overlay bug | Resolved in Linux 4.8 and backported to 4.4.21 and 4.7.4 | npm is known to be affected |
✅ machine#3327 | chmod fails with EPERM | Low | Easy | Overlay bug | Use Linux 4.5 or later | |
✅ #27358 | File removal behaves weirdly on overlay + XFS (ftype=0) | 😱 High | Easy | Expected behavior | Format XFS with ftype=1 | |
✅ #34320 | docker build produces weird images with CONFIG_OVERLAY_FS_REDIRECT_DIR=y | 😱 High | Easy | DiffDriver issue | Apply #34342 (Docker 17.08?) | |
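For #27358, the ftype setting of an existing XFS filesystem can be read from the output of `xfs_info`. The helper below is a rough sketch that only parses text: the name `xfs_ftype` and the sample line are illustrative, and the exact column layout printed by `xfs_info` may differ between xfsprogs versions.

```python
import re
from typing import Optional


def xfs_ftype(xfs_info_output: str) -> Optional[int]:
    """Extract the ftype value from xfs_info output, or None if it is absent."""
    match = re.search(r"\bftype=(\d)", xfs_info_output)
    return int(match.group(1)) if match else None


# Illustrative sample of the "naming" line; real output has more lines.
SAMPLE_NAMING_LINE = "naming   =version 2              bsize=4096   ascii-ci=0 ftype=1"

if __name__ == "__main__":
    print(xfs_ftype(SAMPLE_NAMING_LINE))
```

A result of 0 means the filesystem has to be recreated before overlay is used on it, since ftype is chosen at format time (for example with `mkfs.xfs -n ftype=1`).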
Non-bug issues:
- 😱 High inode usage (resolved in overlay2, which is available since Docker 1.12)
- Red Hat says "OverlayFS remains a Technology Preview in Red Hat Enterprise Linux 7.3 under most circumstances"
- rename(2) is not fully supported (see also "AUFS / Overlay common" below)
- MySQL doesn't work without touch-ing files under /var/lib/mysql: docker/for-linux#72 (comment)
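The MySQL workaround above amounts to touching every file under the data directory before the server starts. A minimal sketch of that step follows; `touch_tree` is a hypothetical helper name, and running it from an entrypoint script before `mysqld` is an assumption, not something the linked comment prescribes.

```python
import os


def touch_tree(root: str) -> int:
    """Set atime/mtime to now on every regular file under root; return the count."""
    touched = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and special files; touch only regular files.
            if os.path.isfile(path) and not os.path.islink(path):
                os.utime(path, None)  # same timestamp update as touch(1)
                touched += 1
    return touched
```

For example, `touch_tree("/var/lib/mysql")` would cover the directory named in the report.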
AUFS / Overlay common
Non-bug issue: rename(2) is not fully supported (#25409)
Reports of incompatible rename(2) behavior from the real world:
Software | Report |
---|---|
Apache Kudu | https://issues.apache.org/jira/browse/KUDU-1419 |
CernVM-FS | https://sft.its.cern.ch/jira/browse/CVM-651 |
GPG | moby/moby#26317 |
NPM | npm/npm#9863 |
Samba | https://bugzilla.samba.org/show_bug.cgi?id=9966 |
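Applications hit by the rename(2) limitation usually need a copy-and-delete fallback. The sketch below assumes the failure surfaces as EXDEV, which is what the kernel's overlayfs documentation describes for directories that live on a lower layer; the reports above do not all state the errno, and `rename_with_fallback` is a made-up name.

```python
import errno
import os
import shutil


def rename_with_fallback(src: str, dst: str) -> str:
    """Rename src to dst; on EXDEV, copy and delete instead.

    Returns "renamed" or "copied" so callers can tell which path was taken.
    The fallback is NOT atomic, unlike a successful rename(2).
    """
    try:
        os.rename(src, dst)
        return "renamed"
    except OSError as exc:
        if exc.errno != errno.EXDEV:
            raise
    if os.path.isdir(src) and not os.path.islink(src):
        shutil.copytree(src, dst, symlinks=True)
        shutil.rmtree(src)
    else:
        shutil.copy2(src, dst)
        os.unlink(src)
    return "copied"
```

Software that relies on rename(2) being atomic cannot be fixed this way, which is why the limitation is listed as a non-bug issue rather than something with a workaround.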
BtrFS
Issue | Abstract | Impact | Reproducibility | Cause | Solution | Notes |
---|---|---|---|---|---|---|
✅ #19073 | sendfile(2) can be unkillable | Low | Easy | BtrFS bug | None | Not likely to happen in production, but needs consideration for public PaaS |
⬜ #20080 | cgroups kmem limit leads to crashes and data corruption | 😱 High | Easy? | Btrfs bug | Avoid kmem limit configuration? | |
Non-bug issues:
- Slow #10161
- No page sharing (e.g. same DLLs are loaded redundantly) http://comments.gmane.org/gmane.comp.sysutils.docker.devel/1384
- Docker says BtrFS is Experimental. Red Hat says BtrFS is Tech Preview.
ZFS
Issue | Abstract | Impact | Reproducibility | Cause | Solution | Notes |
---|---|---|---|---|---|---|
✅ #20153 | Some operations fail due to EBUSY | Medium | Medium | Daemon bug | Update Docker daemon | |
Non-bug issues:
- Docker says ZFS is not recommended for production.
DeviceMapper
Issue | Abstract | Impact | Reproducibility | Cause | Solution | Notes |
---|---|---|---|---|---|---|
✅ #4036 | Mount fails | 😱 High | Easy | udev sync disabled | Use a Docker daemon binary which supports udev sync | Confirm: `docker info \| …` |
⬜ #20401 | Infinite "mount/remount" loop, which makes the system unresponsive | 😱 High | 😱 High | Unanalyzed (perhaps related to XFS) | None | |
Non-bug issues:
- Slow #10161
- No page sharing (e.g. same DLLs are loaded redundantly) http://comments.gmane.org/gmane.comp.sysutils.docker.devel/1384
Storage driver test tool
- dmcgowan/dsdbench: Docker Storage Driver Benchmarks and Tests
So which storage driver should I use?
It totally depends on your workload, but Docker, Inc. says AUFS and Devicemapper (direct-lvm) are "production-ready".
Although not listed in the tables above, the VFS driver is also attractive for its robustness.
Links:
- https://jpetazzo.github.io/assets/2015-03-03-not-so-deep-dive-into-docker-storage-drivers.html#1
- http://www.projectatomic.io/docs/filesystems/
- https://blog.jessfraz.com/post/the-brutally-honest-guide-to-docker-graphdrivers/
Anyway...
You know, containers should be "immutable" and "disposable".
For persistent data and some special temporary data, consider using an external volume (docker run -v).
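As an illustration of the volume advice, the snippet below builds a `docker run -v` command line that bind-mounts a host directory into the container. It is only a sketch: the function name, the image name and both paths are hypothetical placeholders, and the command is returned as a list rather than executed.

```python
from typing import List, Sequence


def docker_run_with_volume(image: str, host_dir: str, container_dir: str,
                           command: Sequence[str] = ()) -> List[str]:
    """Build an argv list for `docker run -v HOST:CONTAINER IMAGE [COMMAND...]`."""
    return ["docker", "run", "-v", f"{host_dir}:{container_dir}", image, *command]


if __name__ == "__main__":
    # Hypothetical example: keep a database's data directory outside the container.
    argv = docker_run_with_volume("mysql:5.7", "/srv/mysql-data", "/var/lib/mysql")
    print(" ".join(argv))
    # To actually run it: subprocess.run(argv, check=True)
```

Data written under the mounted path bypasses the storage driver, so most of the driver issues listed above do not apply to it.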
Network
Issue | Abstract | Impact | Reproducibility | Cause | Solution | Notes |
---|---|---|---|---|---|---|
🔳 #5618 | Hang with unregister_netdevice: waiting for lo to become free | 😱 High | 😱 Hard | Kernel bug | Use Linux 4.8 or later | The patch will be backported to old kernels in major distros |
✅ #18776 | TCP checksums are ignored | 😱 High | 😱 Hard | Kernel bug | Use Linux 4.4 or later | blog |
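Many solutions in these tables take the form "Use Linux X.Y or later". A quick way to compare the running kernel against such a minimum is sketched below. The function names are illustrative, and the comparison looks only at the major and minor numbers, so it cannot see fixes that a distro has backported to an older kernel.

```python
import platform
import re
from typing import Optional, Tuple


def parse_kernel_version(release: str) -> Tuple[int, int]:
    """Parse (major, minor) from a release string such as '4.4.0-21-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if not match:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_at_least(minimum: Tuple[int, int], release: Optional[str] = None) -> bool:
    """Return True if the kernel release is at or above `minimum`.

    With no `release` argument the running kernel (platform.release()) is used.
    """
    if release is None:
        release = platform.release()
    return parse_kernel_version(release) >= minimum
```

For example, `kernel_at_least((4, 8))` corresponds to the check for #5618, and `kernel_at_least((4, 4))` to the one for #18776.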
Logging
Issue | Abstract | Impact | Reproducibility | Cause | Solution | Notes |
---|---|---|---|---|---|---|
✅ #19209 | GELF driver saturates CPU | 😱 High | Easy | Compression | Disable compression | |
✅ #18057, #20600 | cat /dev/zero leads to out of memory | 😱 High | Easy | logger's stdio handling issue | Use Docker 1.13 or later (or just disable the logging) | Related: #21181 |
⬜ #22497 | Container cannot be stopped if many logs are being printed | 😱 High | 😱 Hard | logger's stdio handling issue | | |
✅ #22502 | Logging blocks the container | 😱 High | Easy | logger's stdio handling issue | Use Docker 1.11 or later | Affected versions: 1.10.0 |
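For #19209, "Disable compression" can be expressed through the GELF driver's `gelf-compression-type` log option. The sketch below only assembles the relevant `docker run` flags: `gelf_log_args` is a hypothetical helper, the address is a placeholder, and whether the option is available depends on the Docker version in use.

```python
from typing import List


def gelf_log_args(address: str, compression: str = "none") -> List[str]:
    """Build docker run flags for the GELF log driver with a given compression type."""
    return [
        "--log-driver", "gelf",
        "--log-opt", f"gelf-address={address}",
        "--log-opt", f"gelf-compression-type={compression}",
    ]


if __name__ == "__main__":
    # Placeholder endpoint; substitute the address of a real GELF collector.
    print(" ".join(["docker", "run", *gelf_log_args("udp://localhost:12201"), "busybox"]))
```

Turning compression off trades CPU time for network bandwidth, so it is worth checking the log volume before applying it everywhere.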
Others
Issue | Abstract | Impact | Reproducibility | Cause | Solution | Notes |
---|---|---|---|---|---|---|
✅ #17720 | Docker daemon 1.9 serious performance issue | 😱 High | 😱 Hard | ? | Use Docker 1.10 | |
⬜ #19758 | Soft lockup related to show_mountinfo() after frequent docker run | 😱 High | 😱 Hard | Unanalyzed (kernel bug related to the number of processors?) | None | |
✅ #20670 | /dev/pts is unmounted on the HOST when you use -v /dev:/dev (after that, you can no longer open SSH sessions or xterm) | 😱 High | Easy | Daemon bug related to mount namespace | Use Docker 1.11.1 (or spawn the Docker daemon from systemd, or do not use -v /dev:/dev) | |
✅ #20836 | Daemon hangs after frequent docker run | 😱 High | 😱 Hard | Daemon bug | Use Docker 1.11.1 | |
✅ #28936 | Strange permission issues with named containers on 1.12.3 | 😱 High | Easy | Daemon bug (related to SELinux) | Use Docker 1.12.4 | |
✅ Ubuntu linux-azure #1719045 | fatal error: unaligned sysUnused on Azure | 😱 High | ? | Ubuntu linux-azure kernel bug | Use linux-azure 4.11.0-1013.13 or later | |
Non-bug issues: