Opinionated StackSet resource for managing application life cycle and traffic switching in Kubernetes

Kubernetes StackSet Controller


The Kubernetes StackSet Controller is a concept (along with an implementation) for easing and automating application life cycle for certain types of applications running on Kubernetes.

It is not meant to be a generic solution for all types of applications; it explicitly focuses on "web applications", that is, applications which receive HTTP traffic and are continuously deployed with new versions. New versions should receive traffic either instantly or by gradually shifting traffic from one version of the application to the next. Think of Blue/Green deployments as one example.

By default, Kubernetes offers the Deployment resource type which, combined with a Service, can provide some level of application life cycle management in the form of rolling updates. While rolling updates are a powerful concept, they have limitations for certain use cases:

  • Switching traffic in a Blue/Green style is not possible with rolling updates.
  • Splitting traffic between versions of the application can only be done by scaling the number of Pods. E.g. if you want to give 1% of traffic to a new version, you need at least 100 Pods.
  • Impossible to run smoke tests against a new version of the application before it gets traffic.

To work around these limitations we propose a different kind of resource called a StackSet, which has the concept of Stacks.

The StackSet is a declarative way of describing the application stack as a whole, and the Stacks describe individual versions of the application. The StackSet also allows defining a "global" load balancer spanning all stacks of the stackset which makes it possible to switch traffic to different stacks at the load balancer (for example Ingress) level.

                                 +-----------------------+
                                 |                       |
                                 |     Load Balancer     |
                                 | (for example Ingress) |
                                 |                       |
                                 +--+--------+--------+--+
                                    | 0%     | 20%    | 80%
                      +-------------+        |        +------------+
                      |                      |                     |
            +---------v---------+  +---------v---------+  +--------v----------+
            |                   |  |                   |  |                   |
            |       Stack       |  |       Stack       |  |      Stack        |
            |     Version 1     |  |     Version 2     |  |     Version 3     |
            |                   |  |                   |  |                   |
            +-------------------+  +-------------------+  +-------------------+

The StackSet and Stack resources are implemented as CRDs. A StackSet looks like this:

apiVersion: zalando.org/v1
kind: StackSet
metadata:
  name: my-app
spec:
  # optional Ingress definition.
  ingress:
    hosts: [my-app.example.org, alt.name.org]
    backendPort: 80
  # optional desired traffic weights defined by stack
  traffic:
  - stackName: mystack-v1
    weight: 80
  - stackName: mystack-v2
    weight: 20
  # optional percentage of required Replicas ready to allow traffic switch
  # if none specified, defaults to 100
  minReadyPercent: 90
  stackLifecycle:
    scaledownTTLSeconds: 300
    limit: 5 # maximum number of scaled down stacks to keep.
             # If there are more than `limit` stacks, the oldest stacks which are scaled down
             # will be deleted.
  stackTemplate:
    spec:
      version: v1 # version of the Stack.
      replicas: 3
      # optional autoscaler definition (will create an HPA for the stack).
      autoscaler:
        minReplicas: 3
        maxReplicas: 10
        metrics:
        - type: cpu
          averageUtilization: 50
      # full Pod template.
      podTemplate:
        spec:
          containers:
          - name: skipper
            image: ghcr.io/zalando/skipper:latest
            args:
            - skipper
            - -inline-routes
            - '* -> inlineContent("OK") -> <shunt>'
            - -address=:80
            ports:
            - containerPort: 80
              name: ingress
            resources:
              limits:
                cpu: 10m
                memory: 50Mi
              requests:
                cpu: 10m
                memory: 50Mi

The above StackSet would generate a Stack that looks like this:

apiVersion: zalando.org/v1
kind: Stack
metadata:
  name: my-app-v1
  labels:
    stackset: my-app
    stackset-version: v1
spec:
  replicas: 3
  autoscaler:
    minReplicas: 3
    maxReplicas: 10
    metrics:
    - type: cpu
      averageUtilization: 50
  podTemplate:
    spec:
      containers:
      - name: skipper
        image: ghcr.io/zalando/skipper:latest
        args:
        - skipper
        - -inline-routes
        - '* -> inlineContent("OK") -> <shunt>'
        - -address=:80
        ports:
        - containerPort: 80
          name: ingress
        resources:
          limits:
            cpu: 10m
            memory: 50Mi
          requests:
            cpu: 10m
            memory: 50Mi

For each Stack a Service and Deployment resource will be created automatically with the right labels. The service will also be attached to the "global" Ingress if the stack is configured to get traffic. An optional HorizontalPodAutoscaler resource can also be created per stack for horizontally scaling the deployment.

For the most part the Stacks will be dynamically managed by the system, and users don't have to touch them. You can think of this as similar to the relationship between Deployments and ReplicaSets.

If a Stack is deleted, the related resources like the Service and Deployment will be automatically cleaned up.
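To illustrate the wiring, a per-Stack Service generated by the controller might look roughly like the following. This is a sketch, not the controller's verbatim output: the labels, selector, and port names are assumptions based on the Stack example above.

```yaml
# Illustrative sketch of a per-Stack Service. The selector matching the
# stackset/stackset-version labels is an assumption based on the Stack
# metadata shown earlier.
apiVersion: v1
kind: Service
metadata:
  name: my-app-v1
  labels:
    stackset: my-app
    stackset-version: v1
spec:
  selector:
    stackset: my-app
    stackset-version: v1
  ports:
  - port: 80
    targetPort: ingress
```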

The stackLifecycle lets you configure two settings that change the cleanup behavior for the StackSet:

  • scaledownTTLSeconds defines for how many seconds a stack should not receive traffic before it is scaled down.
  • limit defines the total number of stacks to keep. That is, if you have a limit of 5 and currently have 6 stacks for the StackSet, the oldest stack which is NOT getting traffic will be cleaned up. The limit is not enforced if enforcing it would mean deleting a stack with traffic: if you set a limit of 1 and have two stacks with 50% traffic each, neither of them is deleted. However, if you switch to 100% traffic for one of the stacks, the other will be deleted after it has not received traffic for scaledownTTLSeconds.

Features

  • Automatically create new Stacks when the StackSet is updated with a new version in the stackTemplate.
  • Do traffic switching between Stacks at the Ingress layer, if you have an ingress definition in the spec. Ingress resources are automatically updated when new stacks are created. This requires that your ingress controller either implements the annotation zalando.org/backend-weights: {"my-app-1": 80, "my-app-2": 20} (for example, use skipper for Ingress) or reads the information from the stackset's status.traffic.
  • Safely switch traffic to scaled down stacks. If a stack is scaled down, it will be scaled up automatically before traffic is directed to it.
  • Dynamically provision Ingresses per stack, with per-stack host names, e.g. my-app.example.org, my-app-v1.example.org, my-app-v2.example.org.
  • Automatically scale down stacks when they don't get traffic for a specified period.
  • Automatically delete stacks that have been scaled down and have not been getting any traffic for a longer time.
  • Automatically clean up all dependent resources when a StackSet or Stack resource is deleted. This includes Service, Deployment, Ingress and optionally HorizontalPodAutoscaler.
  • Command line utility (traffic) for showing and switching traffic between stacks.
  • You can opt out of global Ingress creation with an externalIngress: spec, so that external controllers can manage the Ingress or CRD creation that configures the routing into the cluster.
  • You can use skipper's RouteGroups to configure more complex routing rules.
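For ingress controllers that implement the weights annotation mentioned above, the controller maintains it on the "global" Ingress. A sketch of what that can look like follows; the weights, names, and overall Ingress shape are illustrative, not the controller's exact output.

```yaml
# Sketch of the "global" Ingress with the traffic-weight annotation.
# Values are illustrative assumptions.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app
  annotations:
    zalando.org/backend-weights: '{"my-app-v1": 80, "my-app-v2": 20}'
spec:
  rules:
  - host: my-app.example.org
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: my-app-v1
            port:
              number: 80
```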

Docs

Kubernetes Compatibility

The StackSet controller works with Kubernetes >=v1.23.

How it works

The controller watches for StackSet resources and creates Stack resources whenever the version is updated in the StackSet stackTemplate. For each StackSet it will create an optional "main" Ingress resource and keep it up to date when new Stacks are created for the StackSet. For each Stack it will create a Deployment, a Service and optionally a HorizontalPodAutoscaler for the Deployment. These resources are all owned by the Stack and will be cleaned up if the stack is deleted.

Setup

Use an existing cluster or create a test cluster with kind

kind create cluster --name testcluster001

The stackset-controller can be run as a deployment in the cluster. See deployment.yaml.

The controller depends on the StackSet and Stack CRDs. You must install these into your cluster before running the controller:

$ kubectl apply -f docs/stackset_crd.yaml -f docs/stack_crd.yaml

After the CRDs are installed the controller can be deployed:

Please adjust the controller version and cluster-domain to your environment:

$ kubectl apply -f docs/rbac.yaml -f docs/deployment.yaml

Custom configuration

controller-id

There are cases where it might be desirable to run multiple instances of the stackset-controller in the same cluster, e.g. for development.

To prevent the controllers from fighting over the same StackSet resources, they can be configured with the flag --controller-id=<some-id>, which indicates that the controller should only manage StackSets that have the annotation stackset-controller.zalando.org/controller=<some-id>. If no controller-id is configured, the controller will manage all StackSets that do not have the annotation.
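For example, a controller instance started with --controller-id=dev-1 would then only reconcile StackSets carrying the matching annotation. A sketch, where the id dev-1 is an illustrative assumption:

```yaml
# StackSet claimed by the controller running with --controller-id=dev-1.
apiVersion: zalando.org/v1
kind: StackSet
metadata:
  name: my-app
  annotations:
    stackset-controller.zalando.org/controller: dev-1
spec:
  # ... rest of the StackSet spec as usual
```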

Quick intro

Once you have deployed the controller you can create your first StackSet resource:

$ kubectl apply -f docs/stackset.yaml
stackset.zalando.org/my-app created

This will create the stackset in the cluster:

$ kubectl get stacksets
NAME          CREATED AT
my-app        21s

And soon after you will see the first Stack of the my-app stackset:

$ kubectl get stacks
NAME                  CREATED AT
my-app-v1             30s

It will also create Ingress, Service, Deployment and HorizontalPodAutoscaler resources:

$ kubectl get ingress,service,deployment.apps,hpa -l stackset=my-app
NAME                           HOSTS                   ADDRESS                                  PORTS     AGE
ingress.extensions/my-app      my-app.example.org      kube-ing-lb-3es9a....elb.amazonaws.com   80        7m
ingress.extensions/my-app-v1   my-app-v1.example.org   kube-ing-lb-3es9a....elb.amazonaws.com   80        7m

NAME                TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)  AGE
service/my-app-v1   ClusterIP   10.3.204.136   <none>        80/TCP   7m

NAME                        DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/my-app-v1   1         1         1            1           7m

NAME                                            REFERENCE              TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
horizontalpodautoscaler.autoscaling/my-app-v1   Deployment/my-app-v1   <unknown>/50%   3         10        0          20s

Imagine you want to roll out a new version of your stackset. You can do this by changing the StackSet resource, e.g. by bumping the version:

$ kubectl patch stackset my-app --type='json' -p='[{"op": "replace", "path": "/spec/stackTemplate/spec/version", "value": "v2"}]'
stackset.zalando.org/my-app patched

Soon after, we will see a new stack:

$ kubectl get stacks -l stackset=my-app
NAME        CREATED AT
my-app-v1   14m
my-app-v2   46s

And using the traffic tool we can see how the traffic is distributed (see below for how to build the tool):

./build/traffic my-app
STACK          TRAFFIC WEIGHT
my-app-v1      100.0%
my-app-v2      0.0%

If we want to switch 100% traffic to the new stack we can do it like this:

# traffic <stackset> <stack> <traffic>
./build/traffic my-app my-app-v2 100
STACK          TRAFFIC WEIGHT
my-app-v1      0.0%
my-app-v2      100.0%

Since the my-app-v1 stack is no longer getting traffic it will be scaled down after some time and eventually deleted.
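If you prefer not to use the traffic tool, the same switch can be expressed by writing the desired weights to spec.traffic directly, for example with kubectl patch. This is a sketch based on the spec.traffic field shown earlier; the stack names depend on your setup.

```shell
# Replace the desired traffic weights on the StackSet.
# Note: a merge patch replaces the whole spec.traffic list,
# so list every stack whose weight you want to set.
$ kubectl patch stackset my-app --type='merge' \
    -p '{"spec":{"traffic":[{"stackName":"my-app-v1","weight":0},{"stackName":"my-app-v2","weight":100}]}}'
```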

If you want to delete it manually, you can simply do:

$ kubectl delete appstack my-app-v1
stacksetstack.zalando.org "my-app-v1" deleted

And all the related resources will be gone shortly after:

$ kubectl get ingress,service,deployment.apps,hpa -l stackset=my-app,stackset-version=v1
No resources found.

Building

This project uses Go modules as introduced in Go 1.11, so you need Go >=1.11 installed in order to build. If using Go 1.11 you also need to activate Module support.

Assuming Go has been setup with module support it can be built simply by running:

$ export GO111MODULE=on # needed if the project is checked out in your $GOPATH.
$ make

Note that the Go client interface for talking to the custom StackSet and Stack CRDs is generated code living in pkg/client/ and pkg/apis/zalando.org/v1/zz_generated_deepcopy.go. If you make changes to pkg/apis/* then you must run make clean && make to regenerate the code.

To understand how this works see the upstream example for generating client interface code for CRDs.

Upgrade

<= v1.0.0 to >= v1.1.0

Clients that write the desired traffic switching value have to move from ingress annotation zalando.org/stack-traffic-weights: '{"mystack-v1":80, "mystack-v2": 20}' to stackset spec.traffic:

spec:
  traffic:
  - stackName: mystack-v1
    weight: 80
  - stackName: mystack-v2
    weight: 20
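One way to apply such a migration is to write spec.traffic and then drop the legacy annotation. A sketch; the resource names are illustrative:

```shell
# Set the desired weights on the StackSet spec (replaces spec.traffic).
$ kubectl patch stackset my-app --type='merge' \
    -p '{"spec":{"traffic":[{"stackName":"mystack-v1","weight":80},{"stackName":"mystack-v2","weight":20}]}}'

# Remove the legacy annotation from the Ingress (the trailing '-' deletes it).
$ kubectl annotate ingress my-app zalando.org/stack-traffic-weights-
```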
