  • Stars: 4,878
  • Rank: 8,608 (Top 0.2%)
  • Language: Go
  • License: Apache License 2.0
  • Created: about 6 years ago
  • Updated: about 1 month ago

Repository Details

Progressive delivery Kubernetes operator (Canary, A/B Testing and Blue/Green deployments)

flagger

Flagger is a progressive delivery tool that automates the release process for applications running on Kubernetes. It reduces the risk of introducing a new software version in production by gradually shifting traffic to the new version while measuring metrics and running conformance tests.

[Flagger overview diagram]

Flagger implements several deployment strategies (Canary releases, A/B testing, Blue/Green mirroring) and integrates with various Kubernetes ingress controllers, service meshes, and monitoring solutions.

Flagger is a Cloud Native Computing Foundation project and part of the Flux family of GitOps tools.

Documentation

Flagger documentation can be found at fluxcd.io/flagger.

Adopters

Our list of production users has moved to https://fluxcd.io/adopters/#flagger.

If you are using Flagger, please submit a PR to add your organization to the list!

Canary CRD

Flagger takes a Kubernetes deployment and optionally a horizontal pod autoscaler (HPA), then creates a series of objects (Kubernetes deployments, ClusterIP services, and service mesh or ingress routes). These objects expose the application on the mesh and drive the canary analysis and promotion.
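
For reference, the workload a Canary targets is a plain Kubernetes Deployment, optionally paired with an HPA. The manifests below are a minimal sketch of such a pair for the podinfo demo app referenced by the Canary further down; the image tag, replica bounds, and CPU target are illustrative values, not taken from this repository:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: podinfo
  namespace: test
spec:
  selector:
    matchLabels:
      app: podinfo
  template:
    metadata:
      labels:
        app: podinfo
    spec:
      containers:
        - name: podinfo
          # illustrative image tag
          image: ghcr.io/stefanprodan/podinfo:6.0.0
          ports:
            # matches the Canary's service.targetPort and portName
            - name: http
              containerPort: 9898
---
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: podinfo
  namespace: test
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: podinfo
  # illustrative replica bounds and CPU target
  minReplicas: 2
  maxReplicas: 4
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 99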

Flagger keeps track of ConfigMaps and Secrets referenced by a Kubernetes Deployment and triggers a canary analysis if any of those objects change. When promoting a workload to production, both code (container images) and configuration (ConfigMaps and Secrets) are synchronized.
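
For example, if the podinfo pod template loaded its configuration from a ConfigMap like the one sketched below (the ConfigMap name and keys are hypothetical), editing that ConfigMap would be detected by Flagger and would trigger a new canary analysis:

apiVersion: v1
kind: ConfigMap
metadata:
  # hypothetical name
  name: podinfo-config
  namespace: test
data:
  # hypothetical key/value
  LOG_LEVEL: info

The pod template would reference it through envFrom/configMapRef, environment variables, or a volume mount; Flagger watches those references and promotes the configuration together with the container image.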

For a deployment named podinfo, a canary promotion can be defined using Flagger's custom resource:

apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: podinfo
  namespace: test
spec:
  # service mesh provider (optional)
  # can be: kubernetes, istio, linkerd, appmesh, nginx, skipper, contour, gloo, supergloo, traefik, osm
  # for SMI TrafficSplit can be: smi:v1alpha1, smi:v1alpha2, smi:v1alpha3
  provider: istio
  # deployment reference
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: podinfo
  # the maximum time in seconds for the canary deployment
  # to make progress before it is rolled back (default 600s)
  progressDeadlineSeconds: 60
  # HPA reference (optional)
  autoscalerRef:
    apiVersion: autoscaling/v2beta2
    kind: HorizontalPodAutoscaler
    name: podinfo
  service:
    # service name (defaults to targetRef.name)
    name: podinfo
    # ClusterIP port number
    port: 9898
    # container port name or number (optional)
    targetPort: 9898
    # port name can be http or grpc (default http)
    portName: http
    # add all the other container ports
    # to the ClusterIP services (default false)
    portDiscovery: true
    # HTTP match conditions (optional)
    match:
      - uri:
          prefix: /
    # HTTP rewrite (optional)
    rewrite:
      uri: /
    # request timeout (optional)
    timeout: 5s
  # promote the canary without analysing it (default false)
  skipAnalysis: false
  # define the canary analysis timing and KPIs
  analysis:
    # schedule interval (default 60s)
    interval: 1m
    # max number of failed metric checks before rollback
    threshold: 10
    # max traffic percentage routed to canary
    # percentage (0-100)
    maxWeight: 50
    # canary increment step
    # percentage (0-100)
    stepWeight: 5
    # validation (optional)
    metrics:
    - name: request-success-rate
      # builtin Prometheus check
      # minimum req success rate (non 5xx responses)
      # percentage (0-100)
      thresholdRange:
        min: 99
      interval: 1m
    - name: request-duration
      # builtin Prometheus check
      # maximum req duration P99
      # milliseconds
      thresholdRange:
        max: 500
      interval: 30s
    - name: "database connections"
      # custom metric check
      templateRef:
        name: db-connections
      thresholdRange:
        min: 2
        max: 100
      interval: 1m
    # testing (optional)
    webhooks:
      - name: "conformance test"
        type: pre-rollout
        url: http://flagger-helmtester.test/
        timeout: 5m
        metadata:
          type: "helmv3"
          cmd: "test run podinfo -n test"
      - name: "load test"
        type: rollout
        url: http://flagger-loadtester.test/
        metadata:
          cmd: "hey -z 1m -q 10 -c 2 http://podinfo.test:9898/"
    # alerting (optional)
    alerts:
      - name: "dev team Slack"
        severity: error
        providerRef:
          name: dev-slack
          namespace: flagger
      - name: "qa team Discord"
        severity: warn
        providerRef:
          name: qa-discord
      - name: "on-call MS Teams"
        severity: info
        providerRef:
          name: on-call-msteams

For more details on how the canary analysis and promotion works, please read the docs.
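
The custom metric check above points at a MetricTemplate named db-connections. A minimal sketch of such a template is shown below, assuming a Prometheus metrics provider; the Prometheus address, the metric name, and the query itself are illustrative and would need to match your monitoring setup:

apiVersion: flagger.app/v1beta1
kind: MetricTemplate
metadata:
  name: db-connections
  namespace: test
spec:
  provider:
    type: prometheus
    # illustrative in-cluster Prometheus address
    address: http://prometheus.monitoring:9090
  # illustrative query returning the number of open database connections
  # for the canary workload; {{ namespace }} and {{ target }} are Flagger template variables
  query: |
    sum(
      my_app_db_connections{
        namespace="{{ namespace }}",
        pod=~"{{ target }}-[0-9a-zA-Z]+(-[0-9a-zA-Z]+)"
      }
    )

A Canary then references such a template through metrics[].templateRef, as in the "database connections" check above.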

Features

Service Mesh

Feature                                    | App Mesh | Istio | Linkerd | Kuma | OSM | Kubernetes CNI
Canary deployments (weighted traffic)      | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ➖
A/B testing (headers and cookies routing)  | ✔️ | ✔️ | ➖ | ➖ | ➖ | ➖
Blue/Green deployments (traffic switch)    | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️
Blue/Green deployments (traffic mirroring) | ➖ | ✔️ | ➖ | ➖ | ➖ | ➖
Webhooks (acceptance/load testing)         | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️
Manual gating (approve/pause/resume)       | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️
Request success rate check (L7 metric)     | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ➖
Request duration check (L7 metric)         | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ➖
Custom metric checks                       | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️

Ingress

Feature                                    | Contour | Gloo | NGINX | Skipper | Traefik | Apache APISIX
Canary deployments (weighted traffic)      | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️
A/B testing (headers and cookies routing)  | ✔️ | ✔️ | ✔️ | ➖ | ➖ | ➖
Blue/Green deployments (traffic switch)    | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️
Webhooks (acceptance/load testing)         | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️
Manual gating (approve/pause/resume)       | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️
Request success rate check (L7 metric)     | ✔️ | ✔️ | ➖ | ✔️ | ✔️ | ✔️
Request duration check (L7 metric)         | ✔️ | ✔️ | ➖ | ✔️ | ✔️ | ✔️
Custom metric checks                       | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️

Networking Interface

Feature                                    | Gateway API | SMI
Canary deployments (weighted traffic)      | ✔️ | ✔️
A/B testing (headers and cookies routing)  | ✔️ | ➖
Blue/Green deployments (traffic switch)    | ✔️ | ✔️
Blue/Green deployments (traffic mirroring) | ➖ | ➖
Webhooks (acceptance/load testing)         | ✔️ | ✔️
Manual gating (approve/pause/resume)       | ✔️ | ✔️
Request success rate check (L7 metric)     | ➖ | ➖
Request duration check (L7 metric)         | ➖ | ➖
Custom metric checks                       | ✔️ | ✔️

For Gateway API implementations (such as Contour or Istio) and SMI-compatible service meshes (such as NGINX Service Mesh), Prometheus MetricTemplates can be used to implement the request success rate and request duration checks.
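
As a rough sketch of how such a check could be wired up, assuming request metrics are already scraped into Prometheus under a counter named http_requests_total with a status label (the metric name, labels, and address are illustrative and depend on your mesh or ingress):

apiVersion: flagger.app/v1beta1
kind: MetricTemplate
metadata:
  # illustrative name
  name: success-rate
  namespace: flagger
spec:
  provider:
    type: prometheus
    # illustrative Prometheus address
    address: http://prometheus.monitoring:9090
  # percentage of non-5xx responses for the canary pods over the check interval
  query: |
    100 - sum(
      rate(
        http_requests_total{
          namespace="{{ namespace }}",
          pod=~"{{ target }}-[0-9a-zA-Z]+(-[0-9a-zA-Z]+)",
          status=~"5.*"
        }[{{ interval }}]
      )
    )
    /
    sum(
      rate(
        http_requests_total{
          namespace="{{ namespace }}",
          pod=~"{{ target }}-[0-9a-zA-Z]+(-[0-9a-zA-Z]+)"
        }[{{ interval }}]
      )
    )
    * 100

A Canary would then point at this template via metrics[].templateRef and set a thresholdRange (for example min: 99), analogous to the builtin request-success-rate check shown earlier.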

Roadmap

GitOps Toolkit compatibility

  • Migrate Flagger to Kubernetes controller-runtime and kubebuilder
  • Make the Canary status compatible with kstatus
  • Make Flagger emit Kubernetes events compatible with Flux v2 notification API
  • Integrate Flagger into Flux v2 as the progressive delivery component

Integrations

  • Add support for ingress controllers like HAProxy, ALB, and Apache APISIX
  • Add support for Knative Serving

Contributing

Flagger is Apache 2.0 licensed and accepts contributions via GitHub pull requests. To start contributing, please read the development guide.

When submitting bug reports please include as many details as possible:

  • which Flagger version
  • which Kubernetes version
  • what configuration (canary, ingress and workloads definitions)
  • what happened (Flagger and Proxy logs)

Communication

Here is a list of good entry points into our community, how we stay in touch, and how you can meet the team.

  • Slack: Join in and talk to us in the #flagger channel on CNCF Slack.
  • Public meetings: We run weekly meetings - join one of the upcoming dev meetings from the Flux calendar.
  • Blog: Stay up to date with the latest news on the Flux blog.
  • Mailing list: To be updated on Flux and Flagger progress regularly, please join the flux-dev mailing list.

Subscribing to the flux-dev calendar

To add the meetings to your own calendar (e.g. Google Calendar):

  1. visit the Flux calendar
  2. click on "Subscribe to Calendar" at the very bottom of the page
  3. copy the iCalendar URL
  4. open your calendar application
  5. find the "add calendar" option
  6. choose "add by URL"
  7. paste the iCalendar URL (ends with .ics)
  8. done

More Repositories

  1. flux - Successor: https://github.com/fluxcd/flux2 (Go, 6,897 stars)
  2. flux2 - Open and extensible continuous delivery solution for Kubernetes. Powered by GitOps Toolkit. (Go, 5,085 stars)
  3. flux2-kustomize-helm-example - A GitOps workflow example for multi-env deployments with Flux, Kustomize and Helm. (Shell, 694 stars)
  4. helm-operator - Successor: https://github.com/fluxcd/helm-controller. The Flux Helm Operator, once upon a time a solution for declarative Helming. (Go, 649 stars)
  5. helm-operator-get-started - Managing Helm releases with Flux Helm Operator (HTML, 455 stars)
  6. flux2-multi-tenancy - Manage multi-tenant clusters with Flux (Shell, 418 stars)
  7. helm-controller - The GitOps Toolkit Helm reconciler, for declarative Helming (Go, 371 stars)
  8. terraform-provider-flux - Terraform and OpenTofu provider for bootstrapping Flux (Go, 358 stars)
  9. source-controller - The GitOps Toolkit source management component (Go, 237 stars)
  10. kustomize-controller - The GitOps Toolkit Kustomize reconciler (Go, 216 stars)
  11. multi-tenancy - Flux v1: Manage a multi-tenant cluster with Flux and Kustomize (Open Policy Agent, 181 stars)
  12. flux-get-started - Flux v1: Getting started with Flux and the Helm Operator (HTML, 154 stars)
  13. notification-controller - The GitOps Toolkit event forwarder and notification dispatcher (Go, 150 stars)
  14. webui (TypeScript, 149 stars)
  15. image-automation-controller - GitOps Toolkit controller that patches container image tags in Git (Go, 131 stars)
  16. image-reflector-controller - GitOps Toolkit controller that scans container registries (Go, 85 stars)
  17. flux-kustomize-example - Flux v1: Example of Flux using manifest generation with Kustomize (74 stars)
  18. go-git-providers - Git provider client for Go (Go, 72 stars)
  19. flux-recv - Webhook receiver for Flux v1 (Go, 64 stars)
  20. cues - Experimental CUE packages for generating Flux configurations (CUE, 51 stars)
  21. website - The Flux website and user documentation (HTML, 50 stars)
  22. pkg - Toolkit common packages (Go, 48 stars)
  23. community - Flux community content (Python, 36 stars)
  24. source-watcher - Example consumer of the GitOps Toolkit Source APIs (Go, 30 stars)
  25. flux-benchmark - Mean Time To Production benchmark for Flux (CUE, 22 stars)
  26. charts - Helm repository for Flux and Helm Operator charts (15 stars)
  27. fluxctl-action - A GitHub Action to run fluxctl commands (Shell, 15 stars)
  28. gitsrv - Alpine git server used for Flux and Helm Operator end-to-end testing (Shell, 12 stars)
  29. multi-tenancy-team1 - Tenant example repository (Open Policy Agent, 10 stars)
  30. flux2-monitoring-example - Prometheus monitoring for the Flux control plane (Shell, 8 stars)
  31. homebrew-tap - Homebrew formulas (Ruby, 6 stars)
  32. .github (5 stars)
  33. golang-with-libgit2 - Golang builder image, but with libgit2 included (Go, 4 stars)
  34. stats - Flux project usage statistics (2 stars)
  35. test-infra - Test infrastructure for the Flux project (Go, 1 star)