
In your Kubernetes, upgrading your nodes

System Upgrade Controller

Introduction

This project aims to provide a general-purpose, Kubernetes-native upgrade controller (for nodes). It introduces a new CRD, the Plan, for defining any and all of your upgrade policies/requirements. A Plan is an outstanding intent to mutate nodes in your cluster. For up-to-date details on defining a plan please review v1/types.go.

diagram

Presentations and Recordings

April 14, 2020

CNCF Member Webinar: Declarative Host Upgrades From Within Kubernetes

March 4, 2020

Rancher Online Meetup: Automating K3s Cluster Upgrades

Considerations

Purporting to support general-purpose node upgrades (essentially, arbitrary mutations), this controller attempts minimal imposition of opinion. Our design constraints, such as they are:

  • content delivery via container image a.k.a. container command pattern
  • operator-overridable command(s)
  • a very privileged job/pod/container:
    • host IPC, NET, and PID
    • CAP_SYS_BOOT
    • host root file-system mounted at /host (read/write)
  • optional opt-in/opt-out via node labels
  • optional cordon/drain a la kubectl

Additionally, take care when defining upgrades to ensure that they are idempotent: there be dragons.

Deploying

The most up-to-date manifest is usually manifests/system-upgrade-controller.yaml, but since release v0.4.0 a release-specific manifest has also been uploaded to the release artifacts page. See releases/download/v0.4.0/system-upgrade-controller.yaml.

But in the time-honored tradition of curl ${script} | sudo sh - here is a nice one-liner:

# Y.O.L.O.
kustomize build github.com/rancher/system-upgrade-controller | kubectl apply -f - 

Example Plans

Below is an example Plan developed for k3OS that implements something like an rsync of content from the container image to the host, preceded by a remount if necessary, immediately followed by a reboot.

---
apiVersion: upgrade.cattle.io/v1
kind: Plan

metadata:
  # This `name` should be short but descriptive.
  name: k3os-latest

  # The same `namespace` as is used for the system-upgrade-controller Deployment.
  namespace: k3os-system

spec:
  # The maximum number of concurrent nodes to apply this update on.
  concurrency: 1

  # The value for `channel` is assumed to be a URL that returns HTTP 302 with the last path element of the value
  # returned in the Location header assumed to be an image tag (after munging "+" to "-").
  channel: https://github.com/rancher/k3os/releases/latest

  # Providing a value for `version` will prevent polling/resolution of the `channel` if specified.
  version: v0.10.0

  # Select which nodes this plan can be applied to.
  nodeSelector:
    matchExpressions:
      # This limits application of this upgrade only to nodes that have opted in by applying this label.
      # Additionally, a value of `disabled` for this label on a node will cause the controller to skip over the node.
      # NOTICE THAT THE NAME PORTION OF THIS LABEL MATCHES THE PLAN NAME. This is related to the fact that the
      # system-upgrade-controller will tag the node with this very label having the value of the applied plan.status.latestHash.
      - {key: plan.upgrade.cattle.io/k3os-latest, operator: Exists}
      # This label is set by k3OS, therefore a node without it should not apply this upgrade.
      - {key: k3os.io/mode, operator: Exists}
      # Additionally, do not attempt to upgrade nodes booted from "live" CDROM.
      - {key: k3os.io/mode, operator: NotIn, values: ["live"]}

  # The service account for the pod to use. As with normal pods, if not specified the `default` service account from the namespace will be assigned.
  # See https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/
  serviceAccountName: k3os-upgrade

  # Specify which node taints should be tolerated by pods applying the upgrade.
  # Anything specified here is appended to the default of:
  # - {key: node.kubernetes.io/unschedulable, effect: NoSchedule, operator: Exists}
  tolerations:
  - {key: kubernetes.io/arch, effect: NoSchedule, operator: Equal, value: amd64}
  - {key: kubernetes.io/arch, effect: NoSchedule, operator: Equal, value: arm64}
  - {key: kubernetes.io/arch, effect: NoSchedule, operator: Equal, value: s390x}

  # The prepare init container, if specified, is run before cordon/drain which is run before the upgrade container.
  # Shares the same format as the `upgrade` container.
  prepare:
    # If not present, the tag portion of the image will be the value from `.status.latestVersion` a.k.a. the resolved version for this plan.
    image: alpine:3.18
    command: [sh, -c]
    args: ["echo '### ENV ###'; env | sort; echo '### RUN ###'; find /run/system-upgrade | sort"]

  # If left unspecified, no drain will be performed.
  # See:
  # - https://kubernetes.io/docs/tasks/administer-cluster/safely-drain-node/
  # - https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#drain
  drain:
    # deleteLocalData: true  # default
    # ignoreDaemonSets: true # default
    force: true
    # Use `disableEviction == true` and/or `skipWaitForDeleteTimeout > 0` to prevent upgrades from hanging on small clusters.
    # disableEviction: false # default, only available with kubectl >= 1.18
    # skipWaitForDeleteTimeout: 0 # default, only available with kubectl >= 1.18

  # If `drain` is specified, the value for `cordon` is ignored.
  # If neither `drain` nor `cordon` are specified and the node is marked as `schedulable=false` it will not be marked as `schedulable=true` when the apply job completes.
  cordon: true

  upgrade:
    # If not present, the tag portion of the image will be the value from `.status.latestVersion` a.k.a. the resolved version for this plan.
    image: rancher/k3os
    command: [k3os, --debug]
    # It is safe to specify `--kernel` on overlay installations as the destination path will not exist and so the
    # upgrade of the kernel component will be skipped (with a warning in the log).
    args:
      - upgrade
      - --kernel
      - --rootfs
      - --remount
      - --sync
      - --reboot
      - --lock-file=/host/run/k3os/upgrade.lock
      - --source=/k3os/system
      - --destination=/host/k3os/system
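The `channel` comment in the Plan above says that the last path element of the redirect's Location header is taken as the image tag, with "+" munged to "-". The sketch below shows only that string handling, under stated assumptions: the function name tagFromLocation and the example URLs are hypothetical, the HTTP request itself is omitted, and this is not the controller's actual code.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"strings"
)

// tagFromLocation (hypothetical helper) derives an image tag from a Location
// header value: it takes the last path element and replaces "+" with "-",
// since "+" is not a valid character in an image tag.
func tagFromLocation(location string) (string, error) {
	u, err := url.Parse(location)
	if err != nil {
		return "", err
	}
	last := path.Base(u.Path)
	if last == "." || last == "/" {
		return "", fmt.Errorf("no path element in %q", location)
	}
	return strings.ReplaceAll(last, "+", "-"), nil
}

func main() {
	// Illustrative Location values; real ones come from the 302 response.
	for _, loc := range []string{
		"https://github.com/rancher/k3os/releases/tag/v0.10.0",
		"https://example.com/releases/tag/v1.2.3+k3s1",
	} {
		tag, err := tagFromLocation(loc)
		if err != nil {
			panic(err)
		}
		fmt.Println(tag)
	}
}
```

With these inputs the program prints v0.10.0 and v1.2.3-k3s1. Setting `version` in the Plan bypasses this resolution entirely, as the Plan's comments note.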

Building

make

Running

Use ./bin/system-upgrade-controller.

Also see manifests/system-upgrade-controller.yaml, which spells out what a "typical" deployment might look like, with default environment variables that parameterize various operational aspects of the controller and the resources it spawns.

Testing

Integration tests are bundled as a Sonobuoy plugin that expects to be run within a pod. To verify locally:

make e2e

This will, via Dapper, stand up a local cluster (using docker-compose) and then run the Sonobuoy plugin against/within it. The Sonobuoy results are parsed and a Status: passed results in a clean exit, whereas Status: failed exits non-zero.

Alternatively, if you have a working cluster and Sonobuoy installation, provided you've pushed the images (consider building with something like make REPO=dweomer TAG=dev), then you can run the e2e tests thusly:

sonobuoy run --plugin dist/artifacts/system-upgrade-controller-e2e-tests.yaml --wait
sonobuoy results $(sonobuoy retrieve)

License

Copyright (c) 2019-2022 Rancher Labs, Inc.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
