  • Stars: 5,313
  • Rank: 7,774 (Top 0.2%)
  • Language: Go
  • License: Apache License 2.0
  • Created: over 8 years ago
  • Updated: 4 months ago

Repository Details

Add-on agent to generate and expose cluster-level metrics.

Overview

kube-state-metrics (KSM) is a simple service that listens to the Kubernetes API server and generates metrics about the state of the objects. (See examples in the Metrics section below.) It is not focused on the health of the individual Kubernetes components, but rather on the health of the various objects inside, such as deployments, nodes and pods.

kube-state-metrics is about generating metrics from Kubernetes API objects without modification. This ensures that features provided by kube-state-metrics have the same grade of stability as the Kubernetes API objects themselves. In turn, this means that in certain situations kube-state-metrics may not show the exact same values as kubectl, as kubectl applies certain heuristics to display comprehensible messages. kube-state-metrics exposes raw data unmodified from the Kubernetes API; this way users have all the data they require and can apply heuristics as they see fit.

The metrics are exported on the HTTP endpoint /metrics on the listening port (default 8080). They are served as plaintext. They are designed to be consumed either by Prometheus itself or by a scraper that is compatible with scraping a Prometheus client endpoint. You can also open /metrics in a browser to see the raw metrics. Note that the metrics exposed on the /metrics endpoint reflect the current state of the Kubernetes cluster. When Kubernetes objects are deleted they are no longer visible on the /metrics endpoint.

Versioning

Kubernetes Version

kube-state-metrics uses client-go to talk with Kubernetes clusters. The supported Kubernetes cluster version is determined by client-go. The compatibility matrix for client-go and Kubernetes cluster can be found here. All additional compatibility is only best effort, or happens to still/already be supported.

Compatibility matrix

At most, the latest 5 kube-state-metrics releases and 5 Kubernetes releases are recorded below. Generally, it is recommended to use the latest release of kube-state-metrics. If you run a very recent version of Kubernetes, you might want to use an unreleased version to have the full range of supported resources. If you run an older version of Kubernetes, you might need to run an older version of kube-state-metrics in order to have full support for all resources. Be aware that the maintainers will only support the latest release. Older versions might be supported by interested users of the community.

kube-state-metrics | Kubernetes client-go Version
------------------ | ----------------------------
v2.5.0             | v1.24
v2.6.0             | v1.24
v2.7.0             | v1.25
v2.8.2             | v1.26
v2.9.2             | v1.26
main               | v1.27

Resource group version compatibility

Resources in Kubernetes can evolve, i.e., the group version for a resource may change from alpha to beta and finally GA in different Kubernetes versions. For now, kube-state-metrics will only use the oldest API available in the latest release.

Container Image

The latest container image can be found at:

  • registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.9.2 (arch: amd64, arm, arm64, ppc64le and s390x)
  • View all multi-architecture images here

Metrics Documentation

Any resources and metrics based on alpha Kubernetes APIs are excluded from any stability guarantee and may change in any given release.

See the docs directory for more information on the exposed metrics.

Conflict resolution in label names

The *_labels family of metrics exposes Kubernetes labels as Prometheus labels. As Kubernetes is more liberal than Prometheus in terms of allowed characters in label names, we automatically convert unsupported characters to underscores. For example, app.kubernetes.io/name becomes label_app_kubernetes_io_name.

This conversion can create conflicts when multiple Kubernetes labels like foo-bar and foo_bar would be converted to the same Prometheus label label_foo_bar.

Kube-state-metrics automatically adds a suffix _conflictN to resolve this conflict, so it converts the above labels to label_foo_bar_conflict1 and label_foo_bar_conflict2.

If you'd like to have more control over how this conflict is resolved, you might want to consider addressing this issue on a different level of the stack, e.g. by standardizing Kubernetes labels using an Admission Webhook that ensures that there are no possible conflicts.

Kube-state-metrics self metrics

kube-state-metrics exposes its own general process metrics under --telemetry-host and --telemetry-port (default 8081).

kube-state-metrics also exposes list and watch success and error metrics. These can be used to calculate the error rate of list or watch operations per resource. If you encounter those errors in the metrics, it is most likely a configuration or permission issue, and the next step is to look at the kube-state-metrics logs.

Example of the above mentioned metrics:

kube_state_metrics_list_total{resource="*v1.Node",result="success"} 1
kube_state_metrics_list_total{resource="*v1.Node",result="error"} 52
kube_state_metrics_watch_total{resource="*v1beta1.Ingress",result="success"} 1
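
For instance, a Prometheus recording rule along the following lines could turn these counters into an error ratio; the group and record names are illustrative and are not shipped with kube-state-metrics:

groups:
  - name: kube-state-metrics-self  # illustrative group name
    rules:
      - record: kube_state_metrics:list_error_ratio  # illustrative record name
        expr: |
          sum(rate(kube_state_metrics_list_total{result="error"}[5m]))
          /
          sum(rate(kube_state_metrics_list_total[5m]))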

kube-state-metrics also exposes some HTTP request metrics, for example:

http_request_duration_seconds_bucket{handler="metrics",method="get",le="2.5"} 30
http_request_duration_seconds_bucket{handler="metrics",method="get",le="5"} 30
http_request_duration_seconds_bucket{handler="metrics",method="get",le="10"} 30
http_request_duration_seconds_bucket{handler="metrics",method="get",le="+Inf"} 30
http_request_duration_seconds_sum{handler="metrics",method="get"} 0.021113919999999998
http_request_duration_seconds_count{handler="metrics",method="get"} 30

kube-state-metrics also exposes build and configuration metrics:

kube_state_metrics_build_info{branch="main",goversion="go1.15.3",revision="6c9d775d",version="v2.0.0-beta"} 1
kube_state_metrics_shard_ordinal{shard_ordinal="0"} 0
kube_state_metrics_total_shards 1

kube_state_metrics_build_info is used to expose version and other build information. For more on the info pattern, please check the blog post here. The sharding metrics expose the --shard and --total-shards flags and can be used to validate the run-time configuration; see /examples/prometheus-alerting-rules.
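
As a rough sketch of such a check (the alert name, threshold, and wait duration below are illustrative; the maintained rules in /examples/prometheus-alerting-rules may differ):

groups:
  - name: kube-state-metrics-sharding  # illustrative group name
    rules:
      - alert: KubeStateMetricsShardsMissing  # illustrative alert name
        expr: |
          max(kube_state_metrics_total_shards)
          !=
          count(count by (shard_ordinal) (kube_state_metrics_shard_ordinal))
        for: 15m
        annotations:
          description: Not all configured kube-state-metrics shards are reporting metrics.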

kube-state-metrics also exposes metrics about its config file and the Custom Resource State config file:

kube_state_metrics_config_hash{filename="crs.yml",type="customresourceconfig"} 2.38272279311849e+14
kube_state_metrics_config_hash{filename="config.yml",type="config"} 2.65285922340846e+14
kube_state_metrics_last_config_reload_success_timestamp_seconds{filename="crs.yml",type="customresourceconfig"} 1.6704882592037103e+09
kube_state_metrics_last_config_reload_success_timestamp_seconds{filename="config.yml",type="config"} 1.6704882592035313e+09
kube_state_metrics_last_config_reload_successful{filename="crs.yml",type="customresourceconfig"} 1
kube_state_metrics_last_config_reload_successful{filename="config.yml",type="config"} 1

Scaling kube-state-metrics

Resource recommendation

Resource usage for kube-state-metrics changes with the number of Kubernetes objects (Pods/Nodes/Deployments/Secrets, etc.) in the cluster. To some extent, the number of Kubernetes objects in a cluster is in direct proportion to the number of nodes in the cluster.

As a general rule, you should allocate:

  • 250MiB memory
  • 0.1 cores

Note that if CPU limits are set too low, kube-state-metrics' internal queues cannot be worked off quickly enough, resulting in increased memory consumption as the queue length grows. If you experience problems resulting from high memory allocation or CPU throttling, try increasing the CPU limits.
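
As a concrete starting point, the numbers above translate into a container resources stanza roughly like the following (the values are a baseline to tune for your cluster, not a fixed requirement):

resources:
  requests:
    cpu: 100m       # 0.1 cores
    memory: 250Mi
  limits:
    memory: 250Mi   # keep the CPU limit generous or unset to avoid throttling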

Latency

In a 100 node cluster scaling test the latency numbers were as follows:

Perc50: 259615384 ns
Perc90: 475000000 ns
Perc99: 906666666 ns

A note on costing

By default, kube-state-metrics exposes several metrics for events across your cluster. If you have a large number of frequently-updating resources on your cluster, you may find that a lot of data is ingested into these metrics. This can incur high costs on some cloud providers. Please take a moment to configure what metrics you'd like to expose, as well as consult the documentation for your Kubernetes environment in order to avoid unexpectedly high costs.

kube-state-metrics vs. metrics-server

The metrics-server is a project that has been inspired by Heapster and is implemented to serve the goals of core metrics pipelines in the Kubernetes monitoring architecture. It is a cluster-level component which periodically scrapes metrics from all Kubernetes nodes served by the Kubelet through the Metrics API. The metrics are aggregated, stored in memory and served in the Metrics API format. The metrics-server stores the latest values only and is not responsible for forwarding metrics to third-party destinations.

kube-state-metrics is focused on generating completely new metrics from Kubernetes' object state (e.g. metrics based on deployments, replica sets, etc.). It holds an entire snapshot of the Kubernetes state in memory and continuously generates new metrics based on it. Just like the metrics-server, it is not responsible for exporting its metrics anywhere.

Having kube-state-metrics as a separate project also enables access to these metrics from monitoring systems such as Prometheus.

Horizontal sharding

In order to shard kube-state-metrics horizontally, some automated sharding capabilities have been implemented. They are configured with the following flags:

  • --shard (zero indexed)
  • --total-shards

Sharding is done by taking an md5 sum of the Kubernetes Object's UID and performing a modulo operation on it with the total number of shards. Each shard decides whether the object is handled by the respective instance of kube-state-metrics or not. Note that this means all instances of kube-state-metrics, even if sharded, will have the network traffic and the resource consumption for unmarshaling objects for all objects, not just the ones they are responsible for. To optimize this further, the Kubernetes API would need to support sharded list/watch capabilities. In the optimal case, memory consumption for each shard will be 1/n compared to an unsharded setup.

Typically, kube-state-metrics needs to be memory and latency optimized in order for it to return its metrics rather quickly to Prometheus. One way to reduce the latency between kube-state-metrics and the kube-apiserver is to run KSM with the --use-apiserver-cache flag. In addition to reducing the latency, this option will also lead to a reduction in the load on etcd.
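
For example, the first of two shards could be configured roughly as follows (a sketch of one container fragment only; the second instance would set --shard=1):

containers:
- name: kube-state-metrics
  image: registry.k8s.io/kube-state-metrics/kube-state-metrics:IMAGE_TAG
  args:
  - --shard=0          # zero-indexed shard ID of this instance
  - --total-shards=2   # total number of kube-state-metrics shards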

Sharding should be used carefully, and additional monitoring should be set up in order to ensure that sharding is configured and functioning as expected (e.g. that an instance for each shard out of the total shards is running).

Automated sharding

Automated sharding allows each shard to discover its nominal position when deployed in a StatefulSet, which is useful for automatically configuring sharding. This is an experimental feature and may be broken or removed without notice.

To enable automated sharding, kube-state-metrics must be run by a StatefulSet and the pod name and namespace must be handed to the kube-state-metrics process via the --pod and --pod-namespace flags. Example manifests demonstrating the autosharding functionality can be found in /examples/autosharding.
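
A minimal sketch of the relevant StatefulSet fragment follows (the complete, maintained manifests are in /examples/autosharding; only the sharding-related fields are shown here):

apiVersion: apps/v1
kind: StatefulSet
spec:
  template:
    spec:
      containers:
      - name: kube-state-metrics
        image: registry.k8s.io/kube-state-metrics/kube-state-metrics:IMAGE_TAG
        args:
        - --pod=$(POD_NAME)                # pod name via the downward API
        - --pod-namespace=$(POD_NAMESPACE)
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace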

This way of deploying shards is useful when you want to manage KSM shards through a single Kubernetes resource (a single StatefulSet in this case) instead of having one Deployment per shard. The advantage can be especially significant when deploying a high number of shards.

The downside of using an auto-sharded setup comes from the rollout strategy supported by StatefulSets. When managed by a StatefulSet, pods are replaced one at a time with each pod first getting terminated and then recreated. Besides such rollouts being slower, they will also lead to short downtime for each shard. If a Prometheus scrape happens during a rollout, it can miss some of the metrics exported by kube-state-metrics.

Daemonset sharding for pod metrics

Pod metrics can be sharded per node with the following flag:

  • --node

Each kube-state-metrics pod uses a FieldSelector (spec.nodeName) to watch/list only the pods on its own node.

A DaemonSet kube-state-metrics example:

apiVersion: apps/v1
kind: DaemonSet
spec:
  template:
    spec:
      containers:
      - image: registry.k8s.io/kube-state-metrics/kube-state-metrics:IMAGE_TAG
        name: kube-state-metrics
        args:
        - --resource=pods
        - --node=$(NODE_NAME)
        env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.nodeName

Other metrics can be sharded via Horizontal sharding.

Setup

Install this project to your $GOPATH using go get:

go get k8s.io/kube-state-metrics

Building the Docker container

Simply run the following command in this root folder, which will create a self-contained, statically-linked binary and build a Docker image:

make container

Usage

Simply build and run kube-state-metrics inside a Kubernetes pod which has a service account token that has read-only access to the Kubernetes cluster.

For users of prometheus-operator/kube-prometheus stack

The kube-prometheus stack installs kube-state-metrics as one of its components; you do not need to install kube-state-metrics separately if you're using the kube-prometheus stack.

If you want to revise the default configuration for kube-prometheus, for example to enable non-default metrics, have a look at Customizing Kube-Prometheus.

Kubernetes Deployment

To deploy this project, you can simply run kubectl apply -f examples/standard and a Kubernetes service and deployment will be created. (Note: Adjust the apiVersion of some resources if your Kubernetes cluster's version is not 1.8+; check the YAML files for more information.)

To have Prometheus discover kube-state-metrics instances, it is advised to create a specific Prometheus scrape config for kube-state-metrics that picks up both metrics endpoints. Annotation-based discovery is discouraged, as only one of the endpoints could be selected; in addition, kube-state-metrics in most cases has special authentication and authorization requirements, since it essentially grants read access through the metrics endpoint to most of the information available to it.
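
A sketch of such a scrape configuration is shown below; the job names are illustrative, and the port names (http-metrics, telemetry) assume the Service from the example manifests, so adjust the relabeling and any authentication settings to your environment:

scrape_configs:
- job_name: kube-state-metrics            # main /metrics endpoint (default port 8080)
  kubernetes_sd_configs:
  - role: endpoints
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
    regex: kube-state-metrics;http-metrics
    action: keep
- job_name: kube-state-metrics-telemetry  # self-metrics endpoint (default port 8081)
  kubernetes_sd_configs:
  - role: endpoints
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
    regex: kube-state-metrics;telemetry
    action: keep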

Note: Google Kubernetes Engine (GKE) Users - GKE has strict role permissions that will prevent the kube-state-metrics roles and role bindings from being created. To work around this, you can give your GCP identity the cluster-admin role by running the following one-liner:

kubectl create clusterrolebinding cluster-admin-binding --clusterrole=cluster-admin --user=$(gcloud info --format='value(config.account)')

Note that your GCP identity is case sensitive but gcloud info as of Google Cloud SDK 221.0.0 is not. This means that if your IAM member contains capital letters, the above one-liner may not work for you. If you have 403 forbidden responses after running the above command and kubectl apply -f examples/standard, check the IAM member associated with your account at https://console.cloud.google.com/iam-admin/iam?project=PROJECT_ID. If it contains capital letters, you may need to set the --user flag in the command above to the case-sensitive role listed at https://console.cloud.google.com/iam-admin/iam?project=PROJECT_ID.

After running the above, if you see Clusterrolebinding "cluster-admin-binding" created, then you are able to continue with the setup of this service.

Limited privileges environment

If you want to run kube-state-metrics in an environment where you don't have cluster-reader role, you can:

  • create a serviceaccount
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kube-state-metrics
  namespace: your-namespace-where-kube-state-metrics-will-deployed

  • give it view privileges on specific namespaces (using a RoleBinding) (note: you can add this RoleBinding to all the namespaces you want your serviceaccount to access)
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: kube-state-metrics
  namespace: project1
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: view
subjects:
  - kind: ServiceAccount
    name: kube-state-metrics
    namespace: your-namespace-where-kube-state-metrics-will-deployed

  • then specify a set of namespaces (using the --namespaces option) and a set of Kubernetes objects (using the --resources option) that your serviceaccount has access to in the kube-state-metrics deployment configuration
spec:
  template:
    spec:
      containers:
      - name: kube-state-metrics
        args:
          - '--resources=pods'
          - '--namespaces=project1'

For the full list of arguments available, see the documentation in docs/cli-arguments.md

Helm Chart

Starting from the kube-state-metrics chart v2.13.3 (kube-state-metrics image v1.9.8), the official Helm chart is maintained in prometheus-community/helm-charts. Starting from kube-state-metrics chart v3.0.0, only kube-state-metrics images of v2.0.0+ are supported.

Development

When developing, test a metric dump against your local Kubernetes cluster by running:

Users can override the apiserver address in the KUBE-CONFIG file with the --apiserver command-line flag.

go install
kube-state-metrics --port=8080 --telemetry-port=8081 --kubeconfig=<KUBE-CONFIG> --apiserver=<APISERVER>

Then curl the metrics endpoint:

curl localhost:8080/metrics

To run the e2e tests locally see the documentation in tests/README.md.

Developer Contributions

When developing, there are certain code patterns to follow to improve your contributing experience and the likelihood of e2e and other CI tests passing. To learn more about them, see the documentation in docs/developer/guide.md.

More Repositories

1. kubernetes (Go, 109,583 stars): Production-Grade Container Scheduling and Management
2. minikube (Go, 29,215 stars): Run Kubernetes locally
3. ingress-nginx (Go, 17,204 stars): Ingress-NGINX Controller for Kubernetes
4. kops (Go, 15,806 stars): Kubernetes Operations (kOps) - Production Grade k8s Installation, Upgrades and Management
5. dashboard (Go, 14,250 stars): General-purpose web UI for Kubernetes clusters
6. community (Jupyter Notebook, 11,899 stars): Kubernetes community content
7. kompose (Go, 9,453 stars): Convert Compose to Kubernetes
8. client-go (Go, 8,908 stars): Go client for Kubernetes.
9. autoscaler (Go, 8,043 stars): Autoscaling components for Kubernetes
10. examples (Shell, 6,148 stars): Kubernetes application example tutorials
11. website (HTML, 4,437 stars): Kubernetes website and documentation repo
12. test-infra (Go, 3,817 stars): Test infrastructure for the Kubernetes project.
13. kubeadm (Go, 3,728 stars): Aggregator for issues filed against kubeadm
14. enhancements (Go, 3,380 stars): Enhancements tracking repo for Kubernetes
15. sample-controller (Go, 3,129 stars): Repository for sample controller. Complements sample-apiserver
16. node-problem-detector (Go, 2,892 stars): This is a place for various problem detectors running on the Kubernetes nodes.
17. kubectl (Go, 2,811 stars): Issue tracker and mirror of kubectl code
18. git-sync (Shell, 2,209 stars): A sidecar app which clones a git repo and keeps it in sync with the upstream.
19. code-generator (Go, 1,692 stars): Generators for kube-like API types
20. ingress-gce (Go, 1,269 stars): Ingress controller for Google Cloud
21. dns (Go, 911 stars): Kubernetes DNS service
22. perf-tests (Go, 883 stars): Performance tests and benchmarks
23. apimachinery (Go, 817 stars)
24. k8s.io (HCL, 701 stars): Code and configuration to manage Kubernetes project infrastructure, including various *.k8s.io sites
25. api (Go, 647 stars): The canonical location of the Kubernetes API definition.
26. apiserver (Go, 644 stars): Library for writing a Kubernetes-style API server.
27. cloud-provider-openstack (Go, 612 stars)
28. gengo (Go, 548 stars): gengo library for code generation.
29. sig-release (Shell, 534 stars): Repo for SIG release
30. sample-apiserver (Go, 527 stars): Reference implementation of an apiserver for a custom Kubernetes API.
31. metrics (Go, 489 stars): Kubernetes metrics-related API types and clients
32. release (Go, 484 stars): Release infrastructure for Kubernetes and related components
33. design-proposals-archive (Makefile, 478 stars): Archive of Kubernetes Design Proposals
34. registry.k8s.io (Go, 385 stars): This project is the repo for registry.k8s.io, the production OCI registry service for Kubernetes' container image artifacts
35. cloud-provider-aws (Go, 382 stars): Cloud provider for AWS
36. cri-api (Go, 376 stars): Container Runtime Interface (CRI) – a plugin interface which enables kubelet to use a wide variety of container runtimes.
37. cloud-provider-alibaba-cloud (Go, 358 stars): CloudProvider for Alibaba Cloud
38. utils (Go, 326 stars): Non-Kubernetes-specific utility libraries which are consumed by multiple projects.
39. kube-openapi (Go, 315 stars): Kubernetes OpenAPI spec generation & serving
40. kubelet (Go, 307 stars): kubelet component configs
41. sample-cli-plugin (Go, 285 stars): Sample kubectl plugin
42. cli-runtime (Go, 282 stars): Set of helpers for creating kubectl commands and plugins.
43. kube-aggregator (Go, 249 stars): Aggregator for Kubernetes-style API servers: dynamic registration, discovery summarization, secure proxy
44. cloud-provider (Go, 243 stars): cloud-provider defines the shared interfaces which Kubernetes cloud providers implement. These interfaces allow various controllers to integrate with any cloud provider in a pluggable fashion. Also serves as an issue tracker for SIG Cloud Provider.
45. org (Go, 242 stars): Meta configuration for Kubernetes Github Org
46. cloud-provider-vsphere (Go, 238 stars): Kubernetes Cloud Provider for vSphere https://cloud-provider-vsphere.sigs.k8s.io
47. apiextensions-apiserver (Go, 231 stars): API server for API extensions like CustomResourceDefinitions
48. kubernetes-template-project (188 stars): A template for starting new projects on the github.com/kubernetes organization
49. kube-proxy (Go, 178 stars): kube-proxy component configs
50. sig-security (Python, 166 stars): Process documentation, non-code deliverables, and miscellaneous artifacts of Kubernetes SIG Security
51. committee-security-response (Python, 163 stars): Kubernetes Security Process and Security Committee docs
52. kube-scheduler (Go, 162 stars): kube-scheduler component configs
53. cloud-provider-gcp (Go, 115 stars): cloud-provider-gcp contains several projects used to run Kubernetes in Google Cloud
54. component-base (Go, 106 stars): Shared code for kubernetes core components
55. repo-infra (Starlark, 97 stars): Kubernetes repository infrastructure tools
56. pod-security-admission (Go, 97 stars): Kubernetes Pod Security Standards implementation - https://github.com/kubernetes/enhancements/blob/master/keps/sig-auth/2579-psp-replacement/README.md
57. kube-controller-manager (Go, 88 stars): kube-controller-manager component configs
58. steering (83 stars): The Kubernetes Steering Committee
59. publishing-bot (Go, 82 stars): Code behind the robot to publish from staging to real repositories.
60. controller-manager (Go, 68 stars): This repo is intended to contain common public library code for kube-controller-manager, cloud-controller-manager as well as any other controller managers which people build.
61. contributor-site (HTML, 66 stars): Code for kubernetes.dev
62. mount-utils (Go, 56 stars): Package mount defines an interface to mounting filesystems.
63. legacy-cloud-providers (Go, 51 stars): This repository hosts the legacy in-tree cloud providers. Out-of-tree cloud providers can consume packages in this repo to support legacy implementations of their Kubernetes cloud provider.
64. system-validators (Go, 34 stars): A set of system-oriented validators for kubeadm preflight checks.
65. cluster-bootstrap (Go, 31 stars)
66. dynamic-resource-allocation (Go, 23 stars)
67. cloud-provider-sample (21 stars): Sample of how to build a cloud provider repo. This will build a Kubernetes image which deploys on bare metal. It uses the fake cloud provider. It consumes the K8s/K8s build artifact and adds to it the Cloud Controller Manager and CSI Daemon Set.
68. kms (Go, 18 stars): Kubernetes KMS implementation
69. node-api (Go, 14 stars)
70. component-helpers (Go, 13 stars): High-level helpers for Kubernetes components
71. csi-translation-lib (Go, 12 stars): Staging repo for CSI Migration/Translation libraries
72. cel-admission-webhook (Go, 11 stars)
73. endpointslice (Go, 6 stars)
74. sig-testing (6 stars): Home for SIG Testing discussion and documents.
75. cri-client (Go, 3 stars): Container Runtime Interface client implementation
76. .github (1 star): Default files for all repos in the Kubernetes GitHub org