  • Stars: 560
  • Rank: 76,413 (Top 2%)
  • Language: Makefile
  • License: MIT License
  • Created: over 6 years ago
  • Updated: almost 3 years ago

Repository Details

Kubernetes Horizontal Pod Autoscaler with Prometheus custom metrics

k8s-prom-hpa

Autoscaling is an approach to automatically scale workloads up or down based on resource usage. Autoscaling in Kubernetes has two dimensions: the Cluster Autoscaler, which deals with node scaling operations, and the Horizontal Pod Autoscaler (HPA), which automatically scales the number of pods in a deployment or replica set. The Cluster Autoscaler together with the Horizontal Pod Autoscaler can be used to dynamically adjust the computing power as well as the level of parallelism that your system needs to meet SLAs. While the Cluster Autoscaler is highly dependent on the underlying capabilities of the cloud provider that's hosting your cluster, the HPA can operate independently of your IaaS/PaaS provider.

The Horizontal Pod Autoscaler feature was first introduced in Kubernetes v1.1 and has evolved a lot since then. Version 1 of the HPA scaled pods based on observed CPU utilization and, later on, memory usage. Kubernetes 1.6 introduced a new API, the Custom Metrics API, that gives the HPA access to arbitrary metrics, and Kubernetes 1.7 introduced the aggregation layer that allows third-party applications to extend the Kubernetes API by registering themselves as API add-ons. The Custom Metrics API along with the aggregation layer made it possible for monitoring systems like Prometheus to expose application-specific metrics to the HPA controller.

The Horizontal Pod Autoscaler is implemented as a control loop that periodically queries the Resource Metrics API for core metrics like CPU/memory and the Custom Metrics API for application-specific metrics.
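You can check which of these metrics APIs are registered in your cluster before going further. A quick sanity check, assuming both the Metrics Server and a custom metrics adapter end up installed as described below:

kubectl api-versions | grep metrics

# expected once both components are running:
# custom.metrics.k8s.io/v1beta1
# metrics.k8s.io/v1beta1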

Overview

What follows is a step-by-step guide on configuring HPA v2 for Kubernetes 1.9 or later. You will install the Metrics Server add-on that supplies the core metrics, and then you'll use a demo app to showcase pod autoscaling based on CPU and memory usage. In the second part of the guide you will deploy Prometheus and a custom API server. You will register the custom API server with the aggregation layer and then configure the HPA with custom metrics supplied by the demo application.

Before you begin you need to install Go 1.8 or later and clone the k8s-prom-hpa repo in your GOPATH:

cd $GOPATH
git clone https://github.com/stefanprodan/k8s-prom-hpa

Setting up the Metrics Server

The Kubernetes Metrics Server is a cluster-wide aggregator of resource usage data and the successor of Heapster. The Metrics Server collects CPU and memory usage for nodes and pods by pulling data from the kubernetes.summary_api. The summary API is a memory-efficient API for passing data from the Kubelet/cAdvisor to the Metrics Server.
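If you want to peek at the raw data the Metrics Server consumes, you can proxy a request to a node's Kubelet through the API server. This is only a sanity check; <NODE_NAME> is a placeholder for one of your node names:

kubectl get --raw "/api/v1/nodes/<NODE_NAME>/proxy/stats/summary" | jq .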

[Diagram: Metrics Server]

While the first version of the HPA needed Heapster to provide CPU and memory metrics, HPA v2 on Kubernetes 1.8 requires only the Metrics Server, with the horizontal-pod-autoscaler-use-rest-clients flag switched on. The HPA REST client is enabled by default in Kubernetes 1.9. GKE 1.9 comes with the Metrics Server pre-installed.
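On self-hosted clusters that flag is set on the kube-controller-manager. A minimal sketch of the relevant excerpt, assuming a static pod manifest (the file path and surrounding fields depend on your distribution):

# /etc/kubernetes/manifests/kube-controller-manager.yaml (excerpt)
spec:
  containers:
  - command:
    - kube-controller-manager
    - --horizontal-pod-autoscaler-use-rest-clients=true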

Deploy the Metrics Server in the kube-system namespace:

kubectl create -f ./metrics-server

After one minute the metrics-server starts reporting CPU and memory usage for nodes and pods.

View node metrics:

kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes" | jq .

View pod metrics:

kubectl get --raw "/apis/metrics.k8s.io/v1beta1/pods" | jq .
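The same data backs kubectl top, which is a quicker way to confirm the Metrics Server is healthy:

kubectl top nodes
kubectl top pods --all-namespaces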

Auto Scaling based on CPU and memory usage

You will use a small Golang-based web app to test the Horizontal Pod Autoscaler (HPA).

Deploy podinfo to the default namespace:

kubectl create -f ./podinfo/podinfo-svc.yaml,./podinfo/podinfo-dep.yaml

Access podinfo with the NodePort service at http://<K8S_PUBLIC_IP>:31198.
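A quick smoke test with curl should return the app's info payload (assuming the NodePort from podinfo-svc.yaml is reachable from your machine):

curl http://<K8S_PUBLIC_IP>:31198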

Next, define an HPA that maintains a minimum of two replicas and scales up to ten if the CPU average goes over 80% or if the memory usage goes over 200Mi:

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: podinfo
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: podinfo
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80
  - type: Resource
    resource:
      name: memory
      target:
        type: AverageValue
        averageValue: 200Mi
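Note that a CPU target of type Utilization is expressed as a percentage of the pods' CPU requests, so the podinfo deployment must declare resource requests for this HPA to work. A sketch of the kind of requests block the container spec needs; the values here are illustrative, the actual ones live in podinfo-dep.yaml:

    resources:
      requests:
        cpu: 100m
        memory: 64Mi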

Create the HPA:

kubectl create -f ./podinfo/podinfo-hpa.yaml

After a couple of seconds the HPA controller contacts the metrics server and then fetches the CPU and memory usage:

kubectl get hpa

NAME      REFERENCE            TARGETS                      MINPODS   MAXPODS   REPLICAS   AGE
podinfo   Deployment/podinfo   2826240 / 200Mi, 15% / 80%   2         10        2          5m

In order to increase the CPU usage, run a load test with rakyll/hey:

# install hey
go get -u github.com/rakyll/hey

# do 10K requests
hey -n 10000 -q 10 -c 5 http://<K8S_PUBLIC_IP>:31198/

You can monitor the HPA events with:

kubectl describe hpa

Events:
  Type    Reason             Age   From                       Message
  ----    ------             ----  ----                       -------
  Normal  SuccessfulRescale  7m    horizontal-pod-autoscaler  New size: 4; reason: cpu resource utilization (percentage of request) above target
  Normal  SuccessfulRescale  3m    horizontal-pod-autoscaler  New size: 8; reason: cpu resource utilization (percentage of request) above target

Remove podinfo for the moment. You will deploy it again later on in this tutorial:

kubectl delete -f ./podinfo/podinfo-hpa.yaml,./podinfo/podinfo-dep.yaml,./podinfo/podinfo-svc.yaml

Setting up a Custom Metrics Server

In order to scale based on custom metrics you need two components: one that collects metrics from your applications and stores them in the Prometheus time series database, and one that extends the Kubernetes Custom Metrics API with the metrics supplied by the collector, the k8s-prometheus-adapter.

[Diagram: Custom Metrics Server]

You will deploy Prometheus and the adapter in a dedicated namespace.
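Under the hood the adapter is wired into the aggregation layer through an APIService object for the custom.metrics.k8s.io group. A minimal sketch of that registration, assuming the adapter's Service is named custom-metrics-apiserver (the manifests in ./custom-metrics-api contain the real object):

apiVersion: apiregistration.k8s.io/v1beta1
kind: APIService
metadata:
  name: v1beta1.custom.metrics.k8s.io
spec:
  service:
    name: custom-metrics-apiserver
    namespace: monitoring
  group: custom.metrics.k8s.io
  version: v1beta1
  groupPriorityMinimum: 100
  versionPriority: 100
  # caBundle omitted; the repo's manifests wire in the certs generated below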

Create the monitoring namespace:

kubectl create -f ./namespaces.yaml

Deploy Prometheus v2 in the monitoring namespace:

If you are deploying to GKE you might get an error saying: Error from server (Forbidden): error when creating. This will help you resolve the issue: RBAC on GKE.

kubectl create -f ./prometheus

Generate the TLS certificates needed by the Prometheus adapter:

touch metrics-ca.key metrics-ca.crt metrics-ca-config.json
make certs

Deploy the Prometheus custom metrics API adapter:

kubectl create -f ./custom-metrics-api
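Check that the aggregated API is registered and reports as available:

kubectl get apiservice v1beta1.custom.metrics.k8s.io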

List the custom metrics provided by Prometheus:

kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq .

Get the filesystem usage for all the pods in the monitoring namespace:

kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/monitoring/pods/*/kubelet_container_log_filesystem_used_bytes" | jq .

Auto Scaling based on custom metrics

Create podinfo NodePort service and deployment in the default namespace:

kubectl create -f ./podinfo/podinfo-svc.yaml,./podinfo/podinfo-dep.yaml

The podinfo app exposes a custom metric named http_requests_total. The Prometheus adapter removes the _total suffix and marks the metric as a counter metric.
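The renaming and the rate calculation are driven by the adapter's discovery rules. A hedged sketch of the kind of rule involved, in the k8s-prometheus-adapter config format; the exact rules and label names (pod vs pod_name) depend on the adapter version shipped in ./custom-metrics-api:

rules:
- seriesQuery: 'http_requests_total{namespace!="",pod!=""}'
  resources:
    overrides:
      namespace: {resource: "namespace"}
      pod: {resource: "pod"}
  # strip the _total suffix: http_requests_total -> http_requests
  name:
    matches: "^(.*)_total$"
    as: "$1"
  # expose the counter as a per-pod rate averaged over a time window
  metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'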

Get the total requests per second from the custom metrics API:

kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/http_requests" | jq .
{
  "kind": "MetricValueList",
  "apiVersion": "custom.metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/%2A/http_requests"
  },
  "items": [
    {
      "describedObject": {
        "kind": "Pod",
        "namespace": "default",
        "name": "podinfo-6b86c8ccc9-kv5g9",
        "apiVersion": "/__internal"
      },
      "metricName": "http_requests",
      "timestamp": "2018-01-10T16:49:07Z",
      "value": "901m"
    },
    {
      "describedObject": {
        "kind": "Pod",
        "namespace": "default",
        "name": "podinfo-6b86c8ccc9-nm7bl",
        "apiVersion": "/__internal"
      },
      "metricName": "http_requests",
      "timestamp": "2018-01-10T16:49:07Z",
      "value": "898m"
    }
  ]
}

The m represents milli-units, so, for example, 901m means 901 milli-requests (about 0.9 requests per second).

Create an HPA that will scale up the podinfo deployment if the number of requests goes over 10 per second:

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: podinfo
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: podinfo
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_requests
      target:
        type: AverageValue
        averageValue: 10

Deploy the podinfo HPA in the default namespace:

kubectl create -f ./podinfo/podinfo-hpa-custom.yaml

After a couple of seconds the HPA fetches the http_requests value from the metrics API:

kubectl get hpa

NAME      REFERENCE            TARGETS     MINPODS   MAXPODS   REPLICAS   AGE
podinfo   Deployment/podinfo   899m / 10   2         10        2          1m

Apply some load on the podinfo service with 25 requests per second:

# install hey
go get -u github.com/rakyll/hey

# do 10K requests rate limited at 25 QPS
hey -n 10000 -q 5 -c 5 http://<K8S-IP>:31198/healthz

After a few minutes the HPA begins to scale up the deployment:

kubectl describe hpa

Name:                       podinfo
Namespace:                  default
Reference:                  Deployment/podinfo
Metrics:                    ( current / target )
  "http_requests" on pods:  9059m / 10
Min replicas:               2
Max replicas:               10

Events:
  Type    Reason             Age   From                       Message
  ----    ------             ----  ----                       -------
  Normal  SuccessfulRescale  2m    horizontal-pod-autoscaler  New size: 3; reason: pods metric http_requests above target

At the current rate of requests per second the deployment will never reach the maximum of 10 pods. Three replicas are enough to keep the RPS under 10 per pod.

After the load test finishes, the HPA scales the deployment down to its initial number of replicas:

Events:
  Type    Reason             Age   From                       Message
  ----    ------             ----  ----                       -------
  Normal  SuccessfulRescale  5m    horizontal-pod-autoscaler  New size: 3; reason: pods metric http_requests above target
  Normal  SuccessfulRescale  21s   horizontal-pod-autoscaler  New size: 2; reason: All metrics below target

You may have noticed that the autoscaler doesn't react immediately to usage spikes. By default the metrics sync happens once every 30 seconds and scaling up/down can only happen if there was no rescaling within the last 3-5 minutes. In this way, the HPA prevents rapid execution of conflicting decisions and gives time for the Cluster Autoscaler to kick in.
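These intervals are kube-controller-manager settings. The flags below match the Kubernetes 1.9 defaults; later releases replaced the delays with a downscale stabilization window:

kube-controller-manager ... \
  --horizontal-pod-autoscaler-sync-period=30s \
  --horizontal-pod-autoscaler-upscale-delay=3m0s \
  --horizontal-pod-autoscaler-downscale-delay=5m0s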

Conclusions

Not all systems can meet their SLAs by relying on CPU/memory usage metrics alone; most web and mobile backends require autoscaling based on requests per second to handle traffic bursts. For ETL apps, autoscaling could be triggered by the job queue length exceeding some threshold, and so on. By instrumenting your applications with Prometheus and exposing the right metrics for autoscaling, you can fine-tune your apps to better handle bursts and ensure high availability.
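For reference, a minimal sketch of such instrumentation in Go with prometheus/client_golang; this is not podinfo's actual code, just the shape of it, and the port and label are illustrative:

package main

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// httpRequestsTotal is the counter that ends up as the http_requests
// custom metric once the adapter strips the _total suffix.
var httpRequestsTotal = promauto.NewCounterVec(
	prometheus.CounterOpts{
		Name: "http_requests_total",
		Help: "Total number of HTTP requests.",
	},
	[]string{"path"},
)

func main() {
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		httpRequestsTotal.WithLabelValues(r.URL.Path).Inc()
		w.Write([]byte("OK"))
	})
	// Prometheus scrapes this endpoint to collect the counter.
	http.Handle("/metrics", promhttp.Handler())
	http.ListenAndServe(":9898", nil)
}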

More Repositories

1. dockprom - Docker hosts and containers monitoring with Prometheus, Grafana, cAdvisor, NodeExporter and AlertManager (5,809 stars)
2. podinfo - Go microservice template for Kubernetes (Go, 5,097 stars)
3. AspNetCoreRateLimit - ASP.NET Core rate limiting middleware (C#, 3,043 stars)
4. swarmprom - Docker Swarm instrumentation with Prometheus, Grafana, cAdvisor, Node Exporter and Alert Manager (Shell, 1,862 stars)
5. timoni - A package manager for Kubernetes, powered by CUE and inspired by Helm (Go, 1,306 stars)
6. WebApiThrottle - ASP.NET Web API rate limiter for IIS and Owin hosting (C#, 1,284 stars)
7. mgob - MongoDB dockerized backup agent. Runs scheduled backups with retention, S3 & SFTP upload, notifications, instrumentation with Prometheus and more (Go, 770 stars)
8. gitops-istio - A GitOps recipe for Progressive Delivery with Flux v2, Flagger and Istio (644 stars)
9. kustomizer - An experimental package manager for distributing Kubernetes configuration as OCI artifacts (Go, 279 stars)
10. MvcThrottle - ASP.NET MVC Throttling filter (C#, 226 stars)
11. istio-gke - Istio service mesh walkthrough (GKE, CloudDNS, Flagger, OpenFaaS) (217 stars)
12. kube-tools - Kubernetes tools for GitHub Actions CI (Shell, 190 stars)
13. k8s-scw-baremetal - Kubernetes installer for Scaleway bare-metal AMD64 and ARMv7 (HCL, 177 stars)
14. flux-local-dev - Flux local dev environment with Docker and Kubernetes KIND (CUE, 144 stars)
15. mongo-swarm - Bootstrapping MongoDB sharded clusters on Docker Swarm (Go, 126 stars)
16. aspnetcore-dockerswarm - ASP.NET Core orchestration scenarios with Docker (C#, 119 stars)
17. istio-hpa - Configure horizontal pod autoscaling with Istio metrics and Prometheus (Dockerfile, 106 stars)
18. helm-gh-pages - A GitHub Action for publishing Helm charts to GitHub Pages (Shell, 102 stars)
19. flux-aio - Flux All-In-One distribution made with Timoni (CUE, 97 stars)
20. openfaas-flux - OpenFaaS Kubernetes cluster state management with FluxCD (HTML, 79 stars)
21. dockerdash - Docker dashboard built with ASP.NET Core, Docker.DotNet, SignalR and Vuejs (JavaScript, 69 stars)
22. gitops-linkerd - Progressive Delivery workshop with Linkerd, Flagger and Flux (Shell, 64 stars)
23. gitops-helm-workshop - Progressive Delivery for Kubernetes with Flux, Helm, Linkerd and Flagger (Smarty, 61 stars)
24. hrval-action - Flux Helm Release validation GitHub action (Shell, 59 stars)
25. scaleway-swarm-terraform - Setup a Docker Swarm Cluster on Scaleway with Terraform (HCL, 46 stars)
26. dockes - Elasticsearch cluster with Docker (Shell, 45 stars)
27. flagger-appmesh-gateway - A Kubernetes API Gateway for AWS App Mesh powered by Envoy (Go, 44 stars)
28. kjob - Kubernetes job runner (Go, 42 stars)
29. gitops-progressive-delivery - Progressive delivery with Istio, Weave Flux and Flagger (41 stars)
30. gh-actions-demo - GitOps pipeline with GitHub actions and Weave Cloud (Go, 38 stars)
31. eks-hpa-profile - An eksctl gitops profile for autoscaling with Prometheus metrics on Amazon EKS on AWS Fargate (35 stars)
32. faas-grafana - OpenFaaS Grafana (Shell, 35 stars)
33. dockelk - ELK log transport and aggregation at scale (Shell, 32 stars)
34. openfaas-gke - Running OpenFaaS on Google Kubernetes Engine (Shell, 30 stars)
35. prometheus.aspnetcore - Prometheus instrumentation for .NET Core (C#, 29 stars)
36. syros - DevOps tool for managing microservices (Go, 28 stars)
37. gh-actions - GitHub actions for Kubernetes and Helm workflows (Dockerfile, 27 stars)
38. gitops-app-distribution - GitOps workflow for managing app delivery on multiple clusters (Shell, 22 stars)
39. gitops-kyverno - Kubernetes policy managed with Flux and Kyverno (Shell, 21 stars)
40. gitops-appmesh - Progressive Delivery on EKS with AppMesh, Flagger and Flux v2 (Shell, 19 stars)
41. flux-workshop-2023 - Flux Workshop 2023-08-10 (18 stars)
42. dockerd-exporter - Prometheus Docker daemon metrics exporter (Dockerfile, 17 stars)
43. caddy-builder - Build Caddy with plugins as an Ingress/Proxy for OpenFaaS (Go, 16 stars)
44. jenkins - Continuous integration with disposable containers (Shell, 16 stars)
45. swarm-gcp-faas - Setup OpenFaaS on Google Cloud with Terraform, Docker Swarm and Weave (HCL, 16 stars)
46. nexus - A Sonatype Nexus Repository Manager Docker image based on Alpine with OpenJDK 8 (16 stars)
47. podinfo-deploy - A GitOps workflow for multi-env deployments (14 stars)
48. es-curator-cron - Docker Alpine image with Elasticsearch Curator cron job (Shell, 13 stars)
49. caddy-dockerd - Caddy reverse proxy for Docker Remote API with IP filter (Shell, 12 stars)
50. openfaas-certinfo - OpenFaaS function that returns SSL/TLS certificate information for a given URL (Go, 12 stars)
51. eks-contour-ingress - Securing EKS Ingress with Contour and Let's Encrypt the GitOps way (Shell, 11 stars)
52. gomicro - golang microservice prototype (Go, 10 stars)
53. ngrok - ngrok docker image (10 stars)
54. rancher-swarm-weave - Rancher + Docker Swarm + Weave Cloud Scope integration (HCL, 9 stars)
55. kubernetes-cue-schema - CUE schema of the Kubernetes API (CUE, 9 stars)
56. swarm-logspout-logstash - Logspout adapter to send Docker Swarm logs to Logstash (Go, 9 stars)
57. openfaas-promq - OpenFaaS function that executes Prometheus queries (Go, 8 stars)
58. appmesh-eks - AWS App Mesh installer for EKS (Smarty, 6 stars)
59. fninfo - OpenFaaS Kubernetes info function (Go, 5 stars)
60. alpine-base - Alpine Linux base image (Dockerfile, 4 stars)
61. gloo-flagger-demo - GitOps Progressive Delivery demo with Gloo, Flagger and Flux (4 stars)
62. k8s-grafana - Kubernetes Grafana v5.0 dashboards (Smarty, 4 stars)
63. stefanprodan - My open source portfolio and tech blog (4 stars)
64. appmesh-dev - Testing eks-appmesh-profile (Shell, 3 stars)
65. AndroidDevLab - Android developer laboratory setup (Batchfile, 2 stars)
66. my-k8s-fleet (2 stars)
67. RequireJsNet.Samples - RequireJS.NET samples (JavaScript, 2 stars)
68. BFormsTemplates - BForms Visual Studio Project Template (JavaScript, 2 stars)
69. EsnServiceBus - Service Bus and Service Registry implementation based on RabbitMQ (C#, 2 stars)
70. klog (Go, 2 stars)
71. homebrew-tap - Homebrew formulas (Ruby, 1 star)
72. goc-proxy - A dynamic reverse proxy backed by Consul (Go, 1 star)
73. evomon (Go, 1 star)
74. loadtest - Hey load test container (1 star)
75. openfaas-colorisebot-gke-weave - OpenFaaS colorisebot on GKE and Weave Cloud (Shell, 1 star)
76. xmicro - microservice HA prototype (Go, 1 star)