• Stars
    star
    100
  • Rank 340,703 (Top 7 %)
  • Language
    Go
  • License
    Apache License 2.0
  • Created almost 6 years ago
  • Updated about 1 year ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Kubernetes controller that restarts Deployments when referenced ConfigMaps or Secrets change.

Kubernetes Deployment Restart Controller

This Kubernetes controller watches ConfigMaps and Secrets referenced by Deployments and StatefulSets and triggers restarts as soon as configuration or secret values change.

Installation

The k8s-manifests folder contains the necessary configuration files. Adjust them to your taste and apply in the given order:

kubectl apply -f k8s-manifests/rbac.yaml -f k8s-manifests/deployment.yaml

Optionally, create the metrics Service:

kubectl apply -f k8s-manifests/metrics-service.yaml

Configuration

Automatic restart functionality is enabled on per-Deployment (StatefulSet) basis.

The only thing you need to do is set the com.xing.deployment-restart annotation on the desired Deployment (or StatefulSet) to enabled:

apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    com.xing.deployment-restart: enabled
  # the rest of the deployment manifest

Controller monitors deployment manifests and automatically watches or stops watching relevant ConfigMaps and Secrets. It also stops restarting a deployment as soon as annotation is removed or changed to anything else than enabled.

Implementation Details

Kubernetes exhibits several constraints that shaped the implementation of the controller a great deal.

Even though Kubernetes assigns a version to every resource deployed to the cluster, it is impossible to figure out which version of a particular ConfigMap or Secret was used when a Pod was started. Because of that, the controller maintains its own dataset of configuration object versions applied to Deployments and StatefulSets. Kubernetes resource versions get incremented when any part of the resource definition is changed, not just the configuration data. To avoid unnecessary restarts, the controller calculates a configuration data checksum for every ConfigMap and Secret and uses it instead of resource version to detect changes.

Another constraint is related to resource discovery. Kubernetes Informers are asynchronous and do not guarantee that the client gets the most recent cluster changes immediately. The the order in which the client gets the resource definitions is also not fixed. This makes it necessary for the controller to always operate on a potentially incomplete view of the cluster state. For example, the controller can become aware of a new Deployment resource recently created in the cluster, but yet have no information on a ConfigMap referenced in the Deployment manifest.

How It Works

Every deployment configured for automatic restarts eventually gets updated with a checksums annotation. The annotation contains names and checksums of all the configs referenced by the deployment. The controller maintains this information and uses it to detect config changes.

The annotation itself is a JSON object:

metadata:
  annotations:
    com.xing.deployment-restart.applied-config-checksums: '{"configmap/namespace-one/config-one":"189832cc316e7594","secret/namespace-one/secret-two":"6e79832c18c31594"}'

The controller watches all ConfigMap, Secret, Deployment and StatefulSet resources in the cluster and builds a catalog of deployments and related configs in memory as resources become known to it.

Catalog entries for config resources can represent either actual resources in the cluster and have checksums associated with them, or they can be dummies merely stating the fact that some of the known deployment resources reference a particular config name.

Deployment entries in the catalog always represent real deployment objects in the cluster.

When receiving the information about another config or deployment, controller can instantiate a change object. There are several situations when it happens:

  • New config is added to the catalog.
  • New deployment is added to the catalog.
  • Known deployment references a config that it was not referencing before.
  • Known deployment does not reference a config that it was referencing before.
  • Known deployment has its checksums annotation changed.
  • Known config has its checksum changed.

Every change is identified by the name of the resource it was initiated by and has a timestamp and a counter associated with it.

Instantiated changes are stored in the queue without any actions to them for the amount of time determined by RESTART_GRACE_PERIOD setting. If the resource is changed again during the grace period, the change counter gets incremented.

After the grace period is exhausted, the change gets processed:

  1. Using the catalog, controller discovers deployments that are potentially affected by the change. A deployment change can only affect the deployment itself, while a config change affects all the deployments that reference the config.

  2. Every potentially affected deployment has its checksums annotation compared to the current state of the catalog. Based on the comparison, controller decides if the annotation should be updated and if the deployment needs to be restarted.

  3. If an annotation update or a restart is necessary, controller patches the deployment. It always issues a single patch request to update the annotation and restart the deployment at the same time if necessary. A restart is triggered by setting the com.xing.deployment-restart.timestamp annotation in spec.template.metadata.annotations of the deployment.

There are two situations when a deployment restart is triggered by a config update:

  1. The deployment already has a checksum of the updated config in the checksums annotation, and the new checksum of the config is different. This is the "normal" situation when a config that deployment has already been referencing for some time gets updated.

  2. The deployment does not have a checksum of the updated config in the checksums annotation, but the corresponding change counter is greater than one. This is the situation when a config got added to the deployment, but before the checksum got saved in the deployment annotation, the config got updated again.

Caveats

  1. Combined with other automation tools, controller can cause deployments to be restarted more often they need to be. For example, if some deployment pipeline takes longer than 5 seconds (by default) to update two ConfigMaps that are referenced by the same deployment, the deployment will be restarted twice. This can be mitigated by either changing the pipeline or increasing the restart check period.

  2. When forcefully terminated, the controller might miss some restarts. Consider the situation: a new deployment is added to the cluster. Soon after that, a config referenced by that deployment is updated, and while that change is being on hold for the grace period, controller gets forcefully terminated. The fact that there was a config change observed would not be stored anywhere, and another instance of the controller will just mark the config as already applied to the deployment. The chances of this happening are rather low since controller needs to be killed exactly during the grace period of a deployment change followed by a config change.

Runtime Metrics

The controller exposes several metrics at 0.0.0.0:10254/metrics endpoint in Prometheus format. These metrics can be used to monitor the controller status and observe actions that it takes.

Metric Type Description
deployment_restart_controller_resource_versions_total counter The number of distinct resource versions observed.
deployment_restart_controller_configs_total gauge The number of tracked configs.
deployment_restart_controller_deployments_total gauge The number of tracked deployments.
deployment_restart_controller_deployment_annotation_updates_total counter The number of deployment annotation updates.
deployment_restart_controller_deployment_restarts_total counter The number of deployment restarts triggered.
deployment_restart_controller_changes_processed_total counter The number of resource changes processed.
deployment_restart_controller_changes_waiting_total gauge The number of changes waiting in the queue.

Command Line Arguments

Usage:
  kubernetes-deployment-restart-controller [OPTIONS]

Application Options:
  -c, --restart-check-period= Time interval to check for pending restarts in milliseconds
                              (default: 500) [$RESTART_CHECK_PERIOD]
  -r, --restart-grace-period= Time interval to compact restarts in seconds (default: 5)
                              [$RESTART_GRACE_PERIOD]
      --ignored-errors=       List of error patterns to just warn of, instead of exiting the controller.
                              Useful if previously legal objects are not valid anymore but have not yet
                              been updated, e.g. on admission control changes. ENV var splits on ;
                              (semicolon). [$IGNORED_ERRORS]
  -v, --verbose=              Be verbose [$VERBOSE]
      --version               Print version information and exit

Help Options:
  -h, --help                  Show this help message

Development

This project uses go modules introduced by go 1.14. Please put the project somewhere outside of your GOPATH to make go automatically recogninze this.

All build and install steps are managed in the Makefile. make test will fetch external dependencies, compile the code and run the tests. If all goes well, hack along and submit a pull request. You might need to run the go mod tidy after updating dependencies.

Releases

Releases are a two-step process, beginning with a manual step:

The Travis CI run will then realize that the current tag refers to the current master commit and will tag the built docker image accordingly.

More Repositories

1

hops

Universal Development Environment
JavaScript
168
star
2

beetle

High Availability AMQP Messaging With Redundant Queues
Ruby
161
star
3

kubernetes-oom-event-generator

Generate a Kubernetes Event when a Pod's container has been OOMKilled
Go
161
star
4

rails_cursor_pagination

Add cursor pagination to your ActiveRecord backed application
Ruby
112
star
5

XNGMarkdownParser

A Markdown NSAttributedString parser.
Mercury
103
star
6

paperclip-storage-ftp

Allow Paperclip attachments to be stored on FTP servers
Ruby
53
star
7

mango

A fastlane plugin that lets you execute tasks in a managed docker container
Ruby
51
star
8

XNGPodsSynchronizer

Mirrors the CocoaPods you need to GitHub Enterprise
Ruby
40
star
9

kubernetes-event-forwarder-gelf

Forward Kubernetes Events using the Graylog Extended Log Format
Go
36
star
10

xing-android-sdk

The Official XING API client for Java/Android
Java
34
star
11

racktables_api

REST access to racktables objects
Ruby
33
star
12

calabash-launcher

iOS Calabash Launcher is a macOS app that helps you run and manage Calabash tests on your Mac.
Swift
30
star
13

XNGMacBuildNodeSetup

These are the ansible playbooks we use to configure our Mac machines for usage with Jenkins CI.
Shell
25
star
14

xing-api-samples

Code samples to get started with the XING API
HTML
24
star
15

pandan

Retrieve Xcode dependency information with ease
Ruby
24
star
16

jungle

Complexity metrics for Cocoapods and SwiftPM
Swift
21
star
17

cloudinit-vmware-guestinfo

Python
19
star
18

xing_api

Offical ruby client for the XING-API, providing easy access to all API endpoints and simplifies response parsing, error handling and tries to ease the oauth pain.
Ruby
19
star
19

selector-specificity

Mean selector specificity of a CSS style sheet
Ruby
18
star
20

fpm-fry

deep-fried package builder
Ruby
18
star
21

jira-bookmarklets

A collection of bookmarklets that helps you to print your JIRA tickets
JavaScript
16
star
22

vpim

a clone of the vpim project on rubyforge (patched geolocation support)
Ruby
16
star
23

israkel

Collection of common rake tasks for the iPhone Simulator.
Ruby
15
star
24

oembed_provider_engine

Mountable engine to turn your Rails 3 app into an oembed provider
Ruby
14
star
25

recrep

Generate crash reports with ease 👌
Rust
13
star
26

passenger-prometheus-exporter-app

Passenger application that exports Passenger metrics in a Prometheus format
HTML
11
star
27

mire

A Ruby method usage analyzer
Ruby
10
star
28

pdf_cover

Generate PDF cover images with Paperclip and Carrierwave
Ruby
8
star
29

cross-application-csrf-prevention

Cross-site request forgery prevention done right
8
star
30

jQuery.autocompletr

A new, better and more extensible jQuery plugin for autocompletion and (ajax) suggesting
JavaScript
8
star
31

perl-beetle

High availability AMQP messaging with redundant queues
Perl
8
star
32

UnifiedActionHandling

A sample project for the CocoaHeads HH presentation in January 2018. It shows our approach to handle actions like cell and button taps in a unified way.
Swift
8
star
33

amiando

Amiando Api Ruby implementation
Ruby
8
star
34

rbenv-patchsets

A collection of definitions for ruby-build to install rubies patched from https://github.com/skaes/rvm-patchsets for rbenv
Ruby
7
star
35

slouch

Backbone Scaffolding
Ruby
7
star
36

canarist

TypeScript
7
star
37

gem_version_check

Check gem dependencies of ruby apps and generate report
Ruby
7
star
38

java-beetle

Java version of the Beetle client library
Java
6
star
39

curl-useragent

UserAgent based on libcurl
Perl
6
star
40

rlsr

create npm releses and changelogs from a mono repo
TypeScript
5
star
41

test-spec-script

A script that converts tests from test/spec to Rails 2.3+ standard test syntax
4
star
42

elixir_logjam_agent

Logjam agent implementation for Elixir / Phoenix
Elixir
3
star
43

gearman-server

Gearman C Server
C
3
star
44

XNGOAuth1Client

Objective-C
3
star
45

logjam-agent-go

Golang logjam agent
Go
3
star
46

game_of_life_web

Web visualisation for Game of Life
Elixir
2
star
47

scratcher

Backbone lightweight framework giving simple structure to make life a little easier
JavaScript
2
star
48

XNGAPIClientTester

The client tester for the XINGAPIClient
Objective-C
2
star
49

cookiebara

session access for capybara
Ruby
2
star
50

xing_ad_tracker

Official XNGAdTracker plugin for iOS/Android
Kotlin
2
star
51

actionlint

Actionlint as wasm
JavaScript
1
star
52

rbenv-mygemfile

Tiny rbenv plugin which sets `BUNDLE_GEMFILE` to `Mygemfile` if file exists
Shell
1
star
53

unified_csrf_prevention

Self-healing, stateless and programming language agnostic CSRF prevention library for Rails
Ruby
1
star
54

msteams_hermes

Send messages to any Microsoft Teams' channel as easy as possible
Ruby
1
star
55

icu4r

Ruby Bindings for icu4c (derived from an installed version of icu4r)
C
1
star
56

btn

Bitrise Teams Notifier - Send out messages to MS Teams if Bitrise builds failed, succeeded, were aborted.
Ruby
1
star