• Stars
    star
    624
  • Rank 71,995 (Top 2 %)
  • Language
    Go
  • License
    MIT License
  • Created about 8 years ago
  • Updated 2 months ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Deploying Kubernetes on AWS with CloudFormation and Ubuntu

Kubernetes on AWS

WORK IN PROGRESS

This repo contains configuration templates to provision Kubernetes clusters on AWS using Cloud Formation and Ubuntu Linux.

Many values are parameterized and values are not always visible. We're focusing on solving our own, specific/Zalando use case. However, we are open to ideas from the community at large about potentially turning this idea into a project that provides universal/general value to others. Please contact us via our Issues Tracker with your thoughts and suggestions.

Configuration in this repository initially was based on kube-aws, but now depends on four components which aren't all yet open sourced:

  • Cluster Registry to keep desired cluster states (e.g. used config channel and version)
  • Cluster Lifecycle Manager to provision the cluster's Cloud Formation stack and apply Kubernetes manifests for system components
  • Cluster Lifecycle Controller that handles rolling updates from inside the cluster, for example node termination
  • Authnz Webhook to validate OAuth tokens and authorize access

Lean more about Zalando's cloud native journey by reading the Zalando Case Study on kubernetes.io. See our Running Kubernetes in Production on AWS document for details on the setup.

Features

  • Highly available master nodes (ASG) behind ELB
  • Worker Auto Scaling Group with node pools support
  • Flannel overlay networking
  • Cluster autoscaling (using cluster-autoscaler)
  • Kubernetes DNS with node-local dnsmasq as daemonset and CoreDNS resolver for cluster.local domain running in the same pod.
  • Route53 DNS integration via External DNS
  • AWS IAM integration via kube2iam, AWS OIDC IAM
  • Standard components are installed: dashboard, node exporter, kube-state-metrics, see also cluster/manifests directory
  • Webhook authentication and authorization (roles "ReadOnly", "PowerUser", "Manual", "Emergency", "Administrator")
  • Emergency Access via internal emergency-access-service, that grant roles "Manual" and "Emergency" with 4 eyes principle and audit logging
  • Log shipping via Scalyr
  • Full Ingress support with ALB/NLB and TLS integration via kube-ingress-aws-controller and HTTP routing via skipper
  • Enhanced usability with managed stacks and blue green deployments via stackset-controller and skipper
  • Fabric API Gateway, which can be used in combination with stackset-controller
  • Static Egress IPs to route through NAT Gateways with Elastic IPs via kube-static-egress-controller
  • Horizontal Pod Autoscaling with scaling by request per second, SQS queue size or others via kube-metrics-adapter
  • Vertical Pod Autoscaling to scale for example Prometheus
  • EFS support
  • GPU support
  • ETCD backup via Kubernetes cronjob and etcdctl snapshot and upload to S3
  • Monitoring via Prometheus and OpenTracing
  • Fully automated cluster updates via Cluster Lifecycle Manager
  • Automated downscaling for test clusters with kube-downscaler
  • Fallback node pools
  • Spot node pool integration
  • automated PDB creation with pdb-controller

Notes

  • Node and user authentication is done via tokens (using the webhook feature)
  • SSL client-cert authentication is disabled
  • Many values are hardcoded
  • Secrets (e.g. shared token) are not KMS-encrypted in the cluster

Assumptions

  • The AWS account has one or more hosted zones in Route53 including a proper SSL cert (you can use the free ACM service)
  • The VPC has at least one public subnet per AZ (either AWS default VPC setup or public subnet named "dmz-<REGION>-<AZ>")
  • The VPC is in region eu-central-1 or eu-west-1
  • etcd cluster is available via DNS discovery (SRV records) at etcd.<YOUR-HOSTED-ZONE>
  • OAuth Token Info is available to validate user tokens

Directory Structure

  • cluster/cluster.yaml: Cloud Formation template files for the cluster (will be applied by Cluster Lifecycle Manager)
  • cluster/config-defaults.yaml: Default values for different kind of use that can be overridden by values from our cluster-registry (will be applied by Cluster Lifecycle Manager)
  • cluster/etcd-cluster.yaml: Senza Cloud Formation to deploy ETCD
  • cluster/manifests: Kubernetes manifests for system components (will be applied by Cluster Lifecycle Manager)
  • cluster/node-pools: Cloud Formation template files and userdata (cloud-init) for ContainerLinux node-pools (will be applied by Cluster Lifecycle Manager)
  • docs: extracts from internal Zalando documentation.

More Repositories

1

graphql-jit

GraphQL execution using a JIT compiler
TypeScript
1,048
star
2

kopf

A Python framework to write Kubernetes operators in just few lines of code.
Python
969
star
3

kube-metrics-adapter

General purpose metrics adapter for Kubernetes HPA metrics
Go
521
star
4

kube-ingress-aws-controller

Configures AWS Load Balancers according to Kubernetes Ingress resources
Go
375
star
5

es-operator

Kubernetes Operator for Elasticsearch
Go
352
star
6

hexo-theme-doc

A documentation theme for the Hexo blog framework
JavaScript
247
star
7

cluster-lifecycle-manager

Cluster Lifecycle Manager (CLM) to provision and update multiple Kubernetes clusters
Go
230
star
8

docker-locust

Docker image for the Locust.io open source load testing tool
Python
205
star
9

remora

Kafka consumer lag-checking application for monitoring, written in Scala and Akka HTTP; a wrap around the Kafka consumer group command. Integrations with Cloudwatch and Datadog. Authentication recently added
Scala
197
star
10

stackset-controller

Opinionated StackSet resource for managing application life cycle and traffic switching in Kubernetes
Go
171
star
11

kube-aws-iam-controller

Distribute different AWS IAM credentials to different pods in Kubernetes via secrets.
Go
157
star
12

tessellate

Server-side React render service.
JavaScript
151
star
13

transformer

A tool to transform/convert web browser sessions (HAR files) into Locust load testing scenarios (locustfile).
Python
99
star
14

bro-q

Chrome Extension for JSON formatting and jq filtering in your browser.
TypeScript
83
star
15

spark-json-schema

JSON schema parser for Apache Spark
Scala
81
star
16

catwatch

A metrics dashboard for GitHub organizations, with results accessible via REST API
Java
59
star
17

authmosphere

A library to support OAuth2 workflows in JavaScript projects
TypeScript
54
star
18

banknote

A simple JavaScript libary for formatting currency amounts according to Unicode CLDR standards
JavaScript
46
star
19

flatjson

A fast JSON parser (and builder)
Java
45
star
20

perron

A sane node.js client for web services
JavaScript
44
star
21

zelt

A command-line tool for orchestrating the deployment of Locust in Kubernetes.
Python
36
star
22

hexo-theme-doc-seed

skeleton structure for a documentation website using Hexo and the hexo-doc-theme
29
star
23

kubernetes-log-watcher

Kubernetes log watcher for Scalyr and AppDynamics
Python
27
star
24

new-project

Template to use when creating a new open source project. It comes with all the standard files which there is expected to be in an open source project on Github.
24
star
25

darty

Data dependency manager
Python
22
star
26

chisel

โš’๏ธ collection of awesome practices for putting things on pedestal
Clojure
20
star
27

fabric-gateway

An API Gateway built on the Skipper Ingress Controller https://github.com/zalando/skipper
Scala
17
star
28

roadblock

A node.js application for pulling github organisation statistics into a database.
JavaScript
16
star
29

ember-dressy-table

An ember addon for dynamic tables
JavaScript
10
star
30

zalando.github.io-dev

The zalando.github.io open-source metrics dashboard
JavaScript
10
star
31

atlas-js-core

JavaScript SDK Core for Zalando Checkout, Guest Checkout, and Catalog APIs
JavaScript
9
star
32

opentracing-sqs-java

An attempt at a simple SQS helper library for OpenTracing support.
Java
8
star
33

clin

Cli for Nakadi for event types and subscriptions management
Python
7
star
34

play-etcd-watcher

Instantaneous etcd directory listener for Scala Play
Scala
6
star
35

Zincr

Zincr is a Github bot built with Probot to enforce approvals, specification and licensing checks
TypeScript
5
star
36

jzon

Apis for working with json
Java
5
star
37

Trafficlight

Node.js CLI for creating and migrating Github projects, ensuring that it follows a consistent model for permissions, teams and boilerplate files.
JavaScript
1
star