  • Stars: 519
  • Rank: 85,261 (Top 2%)
  • Language: Go
  • License: Apache License 2.0
  • Created: over 8 years ago
  • Updated: about 1 year ago


Repository Details

Kubernetes Cluster Proportional Autoscaler Container

Horizontal cluster-proportional-autoscaler container


Overview

This container image watches over the number of schedulable nodes and cores of the cluster and resizes the number of replicas for the required resource. This functionality may be desirable for applications that need to be autoscaled with the size of the cluster, such as DNS and other services that scale with the number of nodes/pods in the cluster.

Usage of cluster-proportional-autoscaler:

      --alsologtostderr[=false]: log to standard error as well as files
      --configmap="": ConfigMap containing our scaling parameters.
      --default-params=map[]: Default parameters (JSON format) for auto-scaling. Will create/re-create a ConfigMap with these default parameters if the ConfigMap is not present.
      --log-backtrace-at=:0: when logging hits line file:N, emit a stack trace
      --log-dir="": If non-empty, write log files in this directory
      --logtostderr[=false]: log to standard error instead of files
      --namespace="": Namespace for all operations, falling back to the namespace of this autoscaler (through the MY_POD_NAMESPACE env var) if not specified.
      --poll-period-seconds=10: The interval, in seconds, between checks of cluster status to perform autoscaling.
      --stderrthreshold=2: logs at or above this threshold go to stderr
      --target="": Target to scale. In format: deployment/*, replicationcontroller/* or replicaset/* (not case sensitive).
      --v=0: log level for V logs
      --version[=false]: Print the version and exit.
      --vmodule=: comma-separated list of pattern=N settings for file-filtered logging
      --nodelabels=: NodeLabels for filtering the search of nodes and their CPUs by label selectors. Input format is a comma-separated list of keyN=valueN label selectors. Usage example: --nodelabels=label1=value1,label2=value2.
      --max-sync-failures=0: Number of consecutive polling failures before exiting. The default value of 0 allows unlimited retries.

Installation with Helm

Add the cluster-proportional-autoscaler Helm repository:

helm repo add cluster-proportional-autoscaler https://kubernetes-sigs.github.io/cluster-proportional-autoscaler
helm repo update

Then install a release using the chart. The chart's default values file provides some commented-out examples for setting values. Several values are required; Helm will fail with messages indicating which value is missing.

helm upgrade --install cluster-proportional-autoscaler \
    cluster-proportional-autoscaler/cluster-proportional-autoscaler --values <<name_of_your_values_file>>.yaml

Examples

Please try out the examples in the examples folder.

Implementation Details

The code in this module is a Kubernetes Golang API client that, using the default service account credentials available to Golang clients running inside pods, connects to the API server and polls for the number of nodes and cores in the cluster.

The scaling parameters and data points are provided via a ConfigMap, and the autoscaler refreshes its parameters table every poll interval to stay up to date with the latest desired scaling parameters.

Calculation of number of replicas

The desired number of replicas is computed by using the number of cores and nodes as input of the chosen controller.

This may later be extended to more complex interpolation or exponential scaling schemes, but it currently supports only the linear and ladder modes.

Control patterns and ConfigMap formats

The ConfigMap provides the configuration parameters, allowing on-the-fly changes (including the control mode) without rebuilding or restarting the autoscaler containers/pods.

Currently the two supported ConfigMap keys are linear and ladder, corresponding to the two supported control modes.

Linear Mode

Parameters in the ConfigMap must be JSON and use linear as the key. The supported sub-keys are shown below:

data:
  linear: |-
    {
      "coresPerReplica": 2,
      "nodesPerReplica": 1,
      "min": 1,
      "max": 100,
      "preventSinglePointFailure": true,
      "includeUnschedulableNodes": true
    }

The equations for the linear control mode are:

replicas = max( ceil( cores * 1/coresPerReplica ) , ceil( nodes * 1/nodesPerReplica ) )
replicas = min(replicas, max)
replicas = max(replicas, min)

When preventSinglePointFailure is set to true, the controller ensures at least 2 replicas if there is more than one node.

For instance, given a cluster with 4 nodes and 13 cores and the above parameters: each replica can take care of 1 node, so we need ceil(4 / 1) = 4 replicas to cover all 4 nodes; each replica can also take care of 2 cores, so we need ceil(13 / 2) = 7 replicas to cover all 13 cores. The controller chooses the greater of the two, which is 7 here, as the result.

When includeUnschedulableNodes is set to true, the replicas scale based on the total number of nodes. Otherwise, the replicas scale based only on the number of schedulable nodes (i.e., cordoned and draining nodes are excluded).

Either coresPerReplica or nodesPerReplica may be omitted. All of min, max, preventSinglePointFailure and includeUnschedulableNodes are optional. If not set, min defaults to 1, and preventSinglePointFailure and includeUnschedulableNodes both default to false.

Side notes:

  • Both coresPerReplica and nodesPerReplica are floats.
  • When min is less than 1, the replica count is still floored at 1.

Ladder Mode

Parameters in the ConfigMap must be JSON and use ladder as the key. The supported sub-keys are shown below:

data:
  ladder: |-
    {
      "coresToReplicas":
      [
        [ 1, 1 ],
        [ 64, 3 ],
        [ 512, 5 ],
        [ 1024, 7 ],
        [ 2048, 10 ],
        [ 4096, 15 ]
      ],
      "nodesToReplicas":
      [
        [ 1, 1 ],
        [ 2, 2 ]
      ]
    }

The ladder controller produces the desired replica count using a step function. The step ladder function uses the core and node scaling data points from the ConfigMap. The lookup that yields the higher number of replicas is used as the target scaling number.

For instance, given a cluster with 100 nodes and 400 cores and the above ConfigMap: the replicas derived from coresToReplicas would be 3 (because 64 ≤ 400 < 512), and the replicas derived from nodesToReplicas would be 2 (because 100 ≥ 2). The larger value, 3, is chosen.

Either coresToReplicas or nodesToReplicas may be omitted. All elements in them must be integers.

Replicas can be set to 0 (unlike in linear mode).

Scaling to 0 replicas can be used to enable optional features as a cluster grows. For example, the following ladder creates a single replica once the cluster reaches six nodes:

data:
  ladder: |-
    {
      "nodesToReplicas":
      [
        [ 0, 0 ],
        [ 6, 1 ]
      ]
    }

Comparisons to the Horizontal Pod Autoscaler feature

The Horizontal Pod Autoscaler is a top-level Kubernetes API resource. It is a closed-feedback-loop autoscaler that monitors the CPU utilization of pods and scales the number of replicas automatically. It requires CPU resources to be defined for all containers in the target pods, and it requires Heapster to be running to provide CPU utilization metrics.

This horizontal cluster-proportional-autoscaler is a DIY container (not a Kubernetes API resource) that provides a simple control loop that watches the cluster size and scales the target controller. The actual CPU or memory utilization of the target controller's pods is not an input to the control loop; the sole inputs are the number of schedulable cores and nodes in the cluster. There is no requirement to run Heapster or to provide CPU resource limits, as there is with HPAs.

The ConfigMap provides the operator with the ability to tune the replica scaling explicitly.

Using NodeLabels

--nodelabels is an optional parameter that counts only the nodes (and their CPUs) matching the given label selectors. This is useful when a nodeSelector is used on the target controller's pods, since only the nodes tagged with those labels should be counted when calculating the total replicas to scale. When the parameter is omitted, the cluster-proportional-autoscaler counts all schedulable nodes and their CPUs.
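The expected flag format can be illustrated with a small Go helper. This is a hypothetical sketch of parsing the comma-separated key=value list, not the autoscaler's actual flag-handling code:

```go
package main

import (
	"fmt"
	"strings"
)

// parseNodeLabels parses a --nodelabels value, a comma-separated list of
// key=value label selectors, into a map.
func parseNodeLabels(s string) (map[string]string, error) {
	labels := map[string]string{}
	if s == "" {
		return labels, nil
	}
	for _, pair := range strings.Split(s, ",") {
		// SplitN keeps any '=' inside the value intact.
		kv := strings.SplitN(pair, "=", 2)
		if len(kv) != 2 || kv[0] == "" {
			return nil, fmt.Errorf("invalid label selector %q", pair)
		}
		labels[kv[0]] = kv[1]
	}
	return labels, nil
}

func main() {
	labels, err := parseNodeLabels("label1=value1,label2=value2")
	if err != nil {
		panic(err)
	}
	fmt.Println(labels["label1"], labels["label2"])
}
```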
