  • Language: Go
  • License: Apache License 2.0

This is a place for various problem detectors running on the Kubernetes nodes.

node-problem-detector

node-problem-detector aims to make various node problems visible to the upstream layers in the cluster management stack. It is a daemon that runs on each node, detects node problems and reports them to apiserver. node-problem-detector can run either as a DaemonSet or standalone. It currently runs as a Kubernetes Addon enabled by default in GKE clusters. It is also enabled by default in AKS as part of the AKS Linux Extension.

Background

Many node problems can affect the pods running on a node, such as:

  • Infrastructure daemon issues: ntp service down;
  • Hardware issues: Bad CPU, memory or disk;
  • Kernel issues: Kernel deadlock, corrupted file system;
  • Container runtime issues: Unresponsive runtime daemon;
  • ...

Currently, these problems are invisible to the upstream layers in the cluster management stack, so Kubernetes will continue scheduling pods to the bad nodes.

To solve this problem, we introduced a new daemon, node-problem-detector, to collect node problems from various daemons and make them visible to the upstream layers. Once upstream layers have visibility into those problems, we can discuss the remedy system.

Problem API

node-problem-detector uses Event and NodeCondition to report problems to apiserver.

  • NodeCondition: A permanent problem that makes the node unavailable for pods should be reported as a NodeCondition.
  • Event: A temporary problem that has limited impact on pods but is informative should be reported as an Event.
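
The split between the two report types can be sketched in a few lines of Go. This is only an illustration of the rule stated above: the Problem type, its fields, and reportKind are hypothetical names invented for this example and are not part of node-problem-detector's code or the Kubernetes API.

```go
package main

import "fmt"

// Problem is a hypothetical, simplified description of a detected node problem.
// It is not a type from node-problem-detector.
type Problem struct {
	Reason    string
	Message   string
	Permanent bool // true if the node is unavailable for pods until the problem is fixed
}

// reportKind returns the kind of API object the problem would be reported as:
// permanent problems become a NodeCondition, temporary ones become an Event.
func reportKind(p Problem) string {
	if p.Permanent {
		return "NodeCondition"
	}
	return "Event"
}

func main() {
	problems := []Problem{
		{Reason: "KernelOops", Message: "kernel NULL pointer dereference", Permanent: false},
		{Reason: "DockerHung", Message: "task docker blocked", Permanent: true},
	}
	for _, p := range problems {
		fmt.Printf("%s -> %s\n", p.Reason, reportKind(p))
	}
}
```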

Problem Daemon

A problem daemon is a sub-daemon of node-problem-detector. It monitors specific kinds of node problems and reports them to node-problem-detector.

A problem daemon could be:

  • A tiny daemon designed for dedicated Kubernetes use-cases.
  • An existing node health monitoring daemon integrated with node-problem-detector.

Currently, a problem daemon runs as a goroutine in the node-problem-detector binary. In the future, we'll separate node-problem-detector and problem daemons into different containers, and compose them with a pod specification.

Each category of problem daemon can be disabled at compilation time by setting corresponding build tags. If they are disabled at compilation time, then all their build dependencies, global variables and background goroutines will be trimmed out of the compiled executable.

List of supported problem daemon types:

  • SystemLogMonitor
    • NodeCondition: KernelDeadlock, ReadonlyFilesystem, FrequentKubeletRestart, FrequentDockerRestart, FrequentContainerdRestart
    • Description: A system log monitor monitors system log and reports problems and metrics according to predefined rules.
    • Configs: filelog, kmsg, kernel, abrt, systemd
    • Disabling Build Tag: disable_system_log_monitor
  • SystemStatsMonitor
    • NodeCondition: None (could be added in the future)
    • Description: A system stats monitor for node-problem-detector to collect various health-related system stats as metrics. See the proposal here.
    • Disabling Build Tag: disable_system_stats_monitor
  • CustomPluginMonitor
    • NodeCondition: On-demand (according to the user's configuration), existing example: NTPProblem
    • Description: A custom plugin monitor for node-problem-detector to invoke and check various node problems with user-defined check scripts. See the proposal here.
    • Configs: example
    • Disabling Build Tag: disable_custom_plugin_monitor
  • HealthChecker
    • NodeCondition: KubeletUnhealthy, ContainerRuntimeUnhealthy
    • Description: A health checker for node-problem-detector to check kubelet and container runtime health.
    • Configs: kubelet, docker

Exporter

An exporter is a component of node-problem-detector. It reports node problems and/or metrics to certain backends. Some of them can be disabled at compile-time using a build tag. List of supported exporters:

  • Kubernetes exporter
    • Description: Kubernetes exporter reports node problems to Kubernetes API server: temporary problems get reported as Events, and permanent problems get reported as Node Conditions.
  • Prometheus exporter
    • Description: Prometheus exporter reports node problems and metrics locally as Prometheus metrics.
  • Stackdriver exporter
    • Description: Stackdriver exporter reports node problems and metrics to Stackdriver Monitoring API.
    • Disabling Build Tag: disable_stackdriver_exporter

Usage

Flags

  • --version: Print current version of node-problem-detector.
  • --hostname-override: A customized node name used for node-problem-detector to update conditions and emit events. node-problem-detector gets the node name first from hostname-override, then from the NODE_NAME environment variable, and finally falls back to os.Hostname.
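
The lookup order for the node name can be sketched as follows. This is a minimal illustration of the documented precedence, not node-problem-detector's actual implementation: resolveNodeName is a hypothetical helper, and the environment and hostname lookups are passed in as functions only to make the order easy to exercise in tests.

```go
package main

import (
	"fmt"
	"os"
)

// resolveNodeName follows the documented precedence:
// the --hostname-override value, then the NODE_NAME environment variable,
// then the operating system hostname.
// getenv and hostname are injected (hypothetical design) so the order can be tested.
func resolveNodeName(override string, getenv func(string) string, hostname func() (string, error)) (string, error) {
	if override != "" {
		return override, nil
	}
	if name := getenv("NODE_NAME"); name != "" {
		return name, nil
	}
	return hostname()
}

func main() {
	// An empty override stands in for an unset --hostname-override flag.
	name, err := resolveNodeName("", os.Getenv, os.Hostname)
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot determine node name:", err)
		os.Exit(1)
	}
	fmt.Println("node name:", name)
}
```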

For System Log Monitor

  • --config.system-log-monitor: List of paths to system log monitor configuration files, comma-separated, e.g. config/kernel-monitor.json. Node problem detector will start a separate log monitor for each configuration. You can use different log monitors to monitor different system logs.

For System Stats Monitor

  • --config.system-stats-monitor: List of paths to system stats monitor config files, comma-separated, e.g. config/system-stats-monitor.json. Node problem detector will start a separate system stats monitor for each configuration. You can use different system stats monitors to monitor different problem-related system stats.

For Custom Plugin Monitor

  • --config.custom-plugin-monitor: List of paths to custom plugin monitor config files, comma-separated, e.g. config/custom-plugin-monitor.json. Node problem detector will start a separate custom plugin monitor for each configuration. You can use different custom plugin monitors to monitor different node problems.

For Kubernetes exporter

  • --enable-k8s-exporter: Enables reporting to Kubernetes API server, defaults to true.
  • --apiserver-override: A URI parameter used to customize how node-problem-detector connects to the apiserver. This is ignored if --enable-k8s-exporter is false. The format is the same as the source flag of Heapster. For example, to run without auth, use the following config:
    http://APISERVER_IP:APISERVER_PORT?inClusterConfig=false
    
    Refer to heapster docs for a complete list of available options.
  • --address: The address to bind the node problem detector server.
  • --port: The port to bind the node problem detector server. Use 0 to disable.

For Prometheus exporter

  • --prometheus-address: The address to bind the Prometheus scrape endpoint, defaults to 127.0.0.1.
  • --prometheus-port: The port to bind the Prometheus scrape endpoint, defaults to 20257. Use 0 to disable.

For Stackdriver exporter

Deprecated Flags

  • --system-log-monitors: List of paths to system log monitor config files, comma-separated. This option is deprecated, replaced by --config.system-log-monitor, and will be removed. NPD will panic if both --system-log-monitors and --config.system-log-monitor are set.

  • --custom-plugin-monitors: List of paths to custom plugin monitor config files, comma-separated. This option is deprecated, replaced by --config.custom-plugin-monitor, and will be removed. NPD will panic if both --custom-plugin-monitors and --config.custom-plugin-monitor are set.
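
The handling of a comma-separated config list together with its deprecated counterpart can be sketched like this. The helper names splitPaths and resolveConfigPaths are hypothetical, trimming whitespace around entries is an assumption, and the sketch returns an error where NPD itself is documented to panic. The second path in main is a made-up file name used only for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// splitPaths turns a comma-separated flag value into a list of paths.
// Dropping empty entries and surrounding whitespace is an assumption of this sketch.
func splitPaths(value string) []string {
	var paths []string
	for _, p := range strings.Split(value, ",") {
		if p = strings.TrimSpace(p); p != "" {
			paths = append(paths, p)
		}
	}
	return paths
}

// resolveConfigPaths returns the paths from whichever flag is set.
// Setting both the deprecated flag and its replacement is rejected;
// NPD is documented to panic in that case, this sketch returns an error instead.
func resolveConfigPaths(deprecated, current string) ([]string, error) {
	oldPaths, newPaths := splitPaths(deprecated), splitPaths(current)
	if len(oldPaths) > 0 && len(newPaths) > 0 {
		return nil, fmt.Errorf("deprecated flag and its replacement are both set")
	}
	if len(newPaths) > 0 {
		return newPaths, nil
	}
	return oldPaths, nil
}

func main() {
	// config/example-monitor.json is a hypothetical file name.
	paths, err := resolveConfigPaths("", "config/kernel-monitor.json,config/example-monitor.json")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// One monitor would be started per configuration file.
	for _, p := range paths {
		fmt.Println("would start a monitor for", p)
	}
}
```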

Build Image

  • Install development dependencies for libsystemd and the ARM GCC toolchain

    • Debian/Ubuntu: apt install libsystemd-dev gcc-aarch64-linux-gnu
  • git clone git@github.com:kubernetes/node-problem-detector.git

  • Run make in the top directory. It will:

    • Build the binary.
    • Build the docker image. The binary and config/ are copied into the docker image.

If you do not need certain categories of problem daemons, you can disable them at compilation time. This is the best way to keep your node-problem-detector runtime compact, without unnecessary code (e.g. global variables, goroutines). You can do so by setting the BUILD_TAGS environment variable before running make. For example:

BUILD_TAGS="disable_custom_plugin_monitor disable_system_stats_monitor" make

The above command will compile node-problem-detector without Custom Plugin Monitor and System Stats Monitor. Check out the Problem Daemon section to see how to disable each problem daemon at compilation time.

Push Image

make push uploads the docker image to a registry. By default, the image will be uploaded to staging-k8s.gcr.io. It's easy to modify the Makefile to push the image to another registry.

Installation

The easiest way to install node-problem-detector into your cluster is to use the Helm chart:

helm repo add deliveryhero https://charts.deliveryhero.io/
helm install --generate-name deliveryhero/node-problem-detector

Alternatively, to install node-problem-detector manually:

  1. Edit node-problem-detector.yaml to fit your environment. Set log volume to your system log directory (used by SystemLogMonitor). You can use a ConfigMap to overwrite the config directory inside the pod.

  2. Edit node-problem-detector-config.yaml to configure node-problem-detector.

  3. Edit rbac.yaml to fit your environment.

  4. Create the ServiceAccount and ClusterRoleBinding with kubectl create -f rbac.yaml.

  5. Create the ConfigMap with kubectl create -f node-problem-detector-config.yaml.

  6. Create the DaemonSet with kubectl create -f node-problem-detector.yaml.

Start Standalone

To run node-problem-detector standalone, you should set inClusterConfig to false and teach node-problem-detector how to access apiserver with apiserver-override.

To run node-problem-detector standalone with an insecure apiserver connection:

node-problem-detector --apiserver-override=http://APISERVER_IP:APISERVER_INSECURE_PORT?inClusterConfig=false

For more scenarios, see here
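
If the --apiserver-override value is assembled programmatically, for example by a wrapper script or launcher, a sketch using Go's standard net/url package might look like the following. The overrideURL helper is hypothetical, and only the inClusterConfig parameter shown above is used.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
	"strconv"
)

// overrideURL builds a value for --apiserver-override that points at an
// insecure (plain HTTP) apiserver endpoint and disables in-cluster config.
// It is a hypothetical helper, not part of node-problem-detector.
func overrideURL(host string, port int) string {
	u := url.URL{
		Scheme:   "http",
		Host:     net.JoinHostPort(host, strconv.Itoa(port)),
		RawQuery: url.Values{"inClusterConfig": {"false"}}.Encode(),
	}
	return u.String()
}

func main() {
	value := overrideURL("127.0.0.1", 8080)
	fmt.Println("node-problem-detector --apiserver-override=" + value)

	// Parsing the value back shows the query parameter is intact.
	parsed, err := url.Parse(value)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("inClusterConfig =", parsed.Query().Get("inClusterConfig"))
}
```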

Windows

Node Problem Detector has preliminary support for Windows. Most of the functionality has not been tested, but the filelog plugin works.

Follow Issue #461 for development status of Windows support.

Development

To develop NPD on Windows you'll need to set up your Windows machine for Go development. Install the following tools:

# Run these commands in the node-problem-detector directory.

# Build in MINGW64 Window
make clean ENABLE_JOURNALD=0 build-binaries

# Test in MINGW64 Window
make test

# Run with containerd log monitoring enabled in Command Prompt. (Assumes containerd is installed.)
%CD%\output\windows_amd64\bin\node-problem-detector.exe --logtostderr --enable-k8s-exporter=false --config.system-log-monitor=%CD%\config\windows-containerd-monitor-filelog.json --config.system-stats-monitor=config\windows-system-stats-monitor.json

# Configure NPD to run as a Windows Service
sc.exe create NodeProblemDetector binpath= "%CD%\node-problem-detector.exe [FLAGS]" start= demand 
sc.exe failure NodeProblemDetector reset= 0 actions= restart/10000
sc.exe start NodeProblemDetector

Try It Out

You can try node-problem-detector in a running cluster by injecting messages into the logs that node-problem-detector is watching. For example, let's assume node-problem-detector is using KernelMonitor. On your workstation, run kubectl get events -w. On the node, run sudo sh -c "echo 'kernel: BUG: unable to handle kernel NULL pointer dereference at TESTING' >> /dev/kmsg". You should then see the KernelOops event.

When adding new rules or developing node-problem-detector, it is probably easier to test it on the local workstation in the standalone mode. For the API server, an easy way is to use kubectl proxy to make a running cluster's API server available locally. You will get some errors because your local workstation is not recognized by the API server. But you should still be able to test your new rules regardless.

For example, to test KernelMonitor rules:

  1. make (build node-problem-detector locally)
  2. kubectl proxy --port=8080 (make a running cluster's API server available locally)
  3. Update KernelMonitor's logPath to your local kernel log directory. For example, on some Linux systems, it is /run/log/journal instead of /var/log/journal.
  4. ./bin/node-problem-detector --logtostderr --apiserver-override=http://127.0.0.1:8080?inClusterConfig=false --config.system-log-monitor=config/kernel-monitor.json --config.system-stats-monitor=config/system-stats-monitor.json --port=20256 --prometheus-port=20257 (or point to any API server address:port and Prometheus port)
  5. sudo sh -c "echo 'kernel: BUG: unable to handle kernel NULL pointer dereference at TESTING' >> /dev/kmsg"
  6. You can see the KernelOops event in the node-problem-detector log.
  7. sudo sh -c "echo 'kernel: INFO: task docker:20744 blocked for more than 120 seconds.' >> /dev/kmsg"
  8. You can see the DockerHung event and condition in the node-problem-detector log.
  9. You can see the DockerHung condition at http://127.0.0.1:20256/conditions.
  10. You can see disk-related system metrics in Prometheus format at http://127.0.0.1:20257/metrics.

Note:

  • You can see more rule examples under test/kernel_log_generator/problems.
  • For KernelMonitor message injection, all messages should have the kernel: prefix (note that there is a space after the colon); or use generator.sh.
  • To inject other logs into journald like systemd logs, use echo 'Some systemd message' | systemd-cat -t systemd.
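
To reason about which injected message triggers which problem, rule matching can be sketched as a list of regular expressions checked against each log line. This is an illustration only: the rule type and match function are hypothetical, the two patterns are modeled on the example messages in this section rather than copied from config/kernel-monitor.json, and ignoring lines without the kernel: prefix is an assumption of the sketch.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// rule is a hypothetical, simplified log monitor rule.
type rule struct {
	reason    string
	permanent bool // permanent rules set a condition, temporary ones only emit an event
	pattern   *regexp.Regexp
}

// Illustrative patterns modeled on the messages injected in the steps above.
var rules = []rule{
	{"KernelOops", false, regexp.MustCompile(`BUG: unable to handle kernel NULL pointer dereference at .*`)},
	{"DockerHung", true, regexp.MustCompile(`task docker:\w+ blocked for more than \w+ seconds\.`)},
}

// match strips the "kernel: " prefix and returns the first rule that matches.
// Lines without the prefix are ignored (an assumption of this sketch).
func match(line string) (reason string, permanent bool, ok bool) {
	const prefix = "kernel: "
	if !strings.HasPrefix(line, prefix) {
		return "", false, false
	}
	msg := strings.TrimPrefix(line, prefix)
	for _, r := range rules {
		if r.pattern.MatchString(msg) {
			return r.reason, r.permanent, true
		}
	}
	return "", false, false
}

func main() {
	lines := []string{
		"kernel: BUG: unable to handle kernel NULL pointer dereference at TESTING",
		"kernel: INFO: task docker:20744 blocked for more than 120 seconds.",
		"BUG: message without the kernel prefix",
	}
	for _, l := range lines {
		reason, permanent, ok := match(l)
		fmt.Printf("matched=%v reason=%q permanent=%v\n", ok, reason, permanent)
	}
}
```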

Dependency Management

node-problem-detector uses Go modules to manage dependencies. Therefore, building node-problem-detector requires Go 1.11+. It still uses vendoring. See the Kubernetes go modules KEP for the design decisions. To add a new dependency, update go.mod and run GO111MODULE=on go mod vendor.

Remedy Systems

A remedy system is a process or processes designed to attempt to remedy problems detected by the node-problem-detector. Remedy systems observe events and/or node conditions emitted by the node-problem-detector and take action to return the Kubernetes cluster to a healthy state. The following remedy systems exist:

  • Draino automatically drains Kubernetes nodes based on labels and node conditions. Nodes that match all of the supplied labels and any of the supplied node conditions will be prevented from accepting new pods (aka 'cordoned') immediately, and drained after a configurable time. Draino can be used in conjunction with the Cluster Autoscaler to automatically terminate drained nodes. Refer to this issue for an example production use case for Draino.
  • Descheduler strategy RemovePodsViolatingNodeTaints evicts pods violating NoSchedule taints on nodes. The k8s scheduler's TaintNodesByCondition feature must be enabled. The Cluster Autoscaler can be used to automatically terminate drained nodes.
  • mediK8S is an umbrella project for an automatic remediation system built on Node Health Check Operator (NHC), which monitors node conditions and delegates remediation to external remediators using the Remediation API. Poison-Pill is a remediator that will reboot the node and make sure all stateful workloads are rescheduled. NHC supports conditionally remediating if the cluster has enough healthy capacity, or manually pausing any action to minimize cluster disruption.
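
The selection rule described for Draino above (a node must match all of the supplied labels and any of the supplied node conditions) can be sketched as a small predicate. This is not Draino's code: shouldCordon and its parameters are hypothetical, conditions are reduced to plain names, and the label used in main is made up for the example.

```go
package main

import "fmt"

// shouldCordon reports whether a node is selected for cordoning and draining:
// it must carry every label in wantLabels and have at least one of the
// conditions in wantConditions currently active. Hypothetical helper.
func shouldCordon(nodeLabels map[string]string, activeConditions []string, wantLabels map[string]string, wantConditions []string) bool {
	// All supplied labels must match.
	for k, v := range wantLabels {
		if got, ok := nodeLabels[k]; !ok || got != v {
			return false
		}
	}
	// Any one supplied condition is enough.
	active := make(map[string]bool, len(activeConditions))
	for _, c := range activeConditions {
		active[c] = true
	}
	for _, c := range wantConditions {
		if active[c] {
			return true
		}
	}
	return false
}

func main() {
	labels := map[string]string{"pool": "workers"} // hypothetical node label
	conditions := []string{"KernelDeadlock"}
	selected := shouldCordon(labels, conditions,
		map[string]string{"pool": "workers"},
		[]string{"KernelDeadlock", "ReadonlyFilesystem"})
	fmt.Println("cordon and drain:", selected)
}
```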

Testing

NPD is tested via unit tests, NPD e2e tests, Kubernetes e2e tests and Kubernetes nodes e2e tests. Prow handles the pre-submit tests and CI tests.

CI test results can be found below:

  1. Unit tests
  2. NPD e2e tests
  3. Kubernetes e2e tests
  4. Kubernetes nodes e2e tests

Running tests

Unit tests are run via make test.

See NPD e2e test documentation for how to set up and run NPD e2e tests.

Problem Maker

Problem maker is a program used in NPD e2e tests to generate/simulate node problems. It is ONLY intended to be used by NPD e2e tests. Please do NOT run it on your workstation, as it could cause real node problems.
