page_type | languages | products | name | description | urlFragment |
---|---|---|---|---|---|
sample | | | Create a private Azure Kubernetes Service cluster using Terraform and Azure DevOps | This sample shows how to create a private AKS cluster using Terraform and Azure DevOps in a hub and spoke network topology with Azure Firewall. | private-aks-cluster-terraform-devops |
Create a private Azure Kubernetes Service cluster using Terraform and Azure DevOps
This sample shows how to create a private AKS cluster using:
- Terraform as an infrastructure as code (IaC) tool to build, change, and version the infrastructure on Azure in a safe, repeatable, and efficient way.
- Azure DevOps Pipelines to automate the deployment and undeployment of the entire infrastructure on multiple environments on the Azure platform.
In a private AKS cluster, the API server endpoint is not exposed via a public IP address. Hence, to manage the API server, you will need to use a virtual machine that has access to the AKS cluster's Azure Virtual Network (VNet). This sample deploys a jumpbox virtual machine in the hub virtual network peered with the virtual network that hosts the private AKS cluster. There are several options for establishing network connectivity to the private cluster.
- Create a virtual machine in the same Azure Virtual Network (VNet) as the AKS cluster.
- Use a virtual machine in a separate network and set up virtual network peering. See the section below for more information on this option.
- Use an ExpressRoute or VPN connection.
Creating a virtual machine in the same virtual network as the AKS cluster or in a peered virtual network is the easiest option. ExpressRoute and VPNs add costs and require additional networking complexity. Virtual network peering requires you to plan your network CIDR ranges to ensure there are no overlapping ranges. For more information, see Create a private Azure Kubernetes Service cluster. For more information on Azure Private Link, see What is Azure Private Link?
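The CIDR planning required by virtual network peering can be checked before deployment. The following is a minimal Python sketch, not part of the sample, that uses the standard ipaddress module to detect overlapping ranges. The network names and address spaces are hypothetical and are not taken from the sample's Terraform variables.

```python
import ipaddress
from itertools import combinations


def find_overlaps(address_spaces):
    """Return the pairs of network names whose CIDR ranges overlap."""
    networks = {
        name: ipaddress.ip_network(cidr) for name, cidr in address_spaces.items()
    }
    return [
        (first, second)
        for first, second in combinations(sorted(networks), 2)
        if networks[first].overlaps(networks[second])
    ]


# Hypothetical address spaces used only for illustration.
planned = {
    "hub": "10.1.0.0/16",
    "spoke": "10.0.0.0/16",
    "legacy": "10.0.128.0/17",
}

overlaps = find_overlaps(planned)
print(overlaps)
```

In this example the hypothetical legacy range falls inside the spoke range, so that pair is reported and could not be peered as planned, while the hub and spoke ranges are disjoint.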
In addition, the sample creates a private endpoint to access all the managed services deployed by the Terraform modules via a private IP address:
- Azure Container Registry
- Azure Storage Account
- Azure Key Vault
NOTE
If you want to deploy a private AKS cluster using a public DNS zone to simplify the DNS resolution of the API Server to the private IP address of the private endpoint, you can use this project under my GitHub account or on Azure Quickstart Templates.
Architecture
The following picture shows the high-level architecture created by the Terraform modules included in this sample:
The following picture provides a more detailed view of the infrastructure on Azure.
The architecture is composed of the following elements:
- A hub virtual network with two subnets:
- AzureBastionSubnet used by Azure Bastion
- AzureFirewallSubnet used by Azure Firewall
- A new virtual network with three subnets:
- SystemSubnet used by the AKS system node pool
- UserSubnet used by the AKS user node pool
- VmSubnet used by the jumpbox virtual machine and private endpoints
- The private AKS cluster uses a user-defined managed identity to create additional resources like load balancers and managed disks in Azure.
- The private AKS cluster is composed of a:
- System node pool hosting only critical system pods and services. The worker nodes have a node taint that prevents application pods from being scheduled on this node pool.
- User node pool hosting user workloads and artifacts.
- An Azure Firewall used to control the egress traffic from the private AKS cluster. For more information on how to lock down your private AKS cluster and filter outbound traffic, see:
- An AKS cluster with a private endpoint to the API server hosted by an AKS-managed Azure subscription. The cluster can communicate with the API server exposed via a Private Link Service using a private endpoint.
- An Azure Bastion resource that provides secure and seamless SSH connectivity to the Vm virtual machine directly in the Azure portal over SSL
- An Azure Container Registry (ACR) to build, store, and manage container images and artifacts in a private registry for all types of container deployments.
- When the ACR SKU is equal to Premium, a Private Endpoint is created to allow the private AKS cluster to access ACR via a private IP address. For more information, see Connect privately to an Azure container registry using Azure Private Link.
- A jumpbox virtual machine used to manage the Azure Kubernetes Service cluster
- A Private DNS Zone for the name resolution of each private endpoint.
- A Virtual Network Link between each Private DNS Zone and both the hub and spoke virtual networks
- A Log Analytics workspace to collect the diagnostics logs and metrics of both the AKS cluster and Vm virtual machine.
Limitations
A private AKS cluster has the following limitations:
- IP authorized ranges can't be applied to the private API server endpoint. They only apply to the public API server.
- Azure Private Link service limitations apply to private AKS clusters.
- No support for Azure DevOps Microsoft-hosted agents with private clusters. Consider using self-hosted agents.
- For customers that need to enable Azure Container Registry to work with private AKS cluster, the Container Registry virtual network must be peered with the agent cluster virtual network.
- No support for converting existing AKS clusters into private clusters
- Deleting or modifying the private endpoint in the customer subnet will cause the cluster to stop functioning.
Requirements
There are some requirements you need to complete before you can deploy Terraform modules using Azure DevOps.
- Store the Terraform state file in an Azure storage account. For more information on how to use a storage account to store remote Terraform state, state locking, and encryption at rest, see Store Terraform state in Azure Storage.
- Create an Azure DevOps Project. For more information, see Create a project in Azure DevOps.
- Create an Azure DevOps Service Connection to your Azure subscription. Whether you use Service Principal Authentication (SPA) or an Azure-Managed Service Identity when creating the service connection, make sure that the service principal or managed identity used by Azure DevOps to connect to your Azure subscription is assigned the Owner role on the entire subscription.
Fix the routing issue
When you deploy an Azure Firewall in a hub virtual network and your private AKS cluster in a spoke virtual network, and you use the Azure Firewall to control egress traffic with network and application rule collections, you also need to make sure that ingress traffic to any public endpoint exposed by a service running on AKS enters the system via one of the public IP addresses used by the Azure Firewall. To route the traffic of your AKS workloads to the Azure Firewall in the hub virtual network, create a route table, associate it with each subnet hosting the worker nodes of your cluster, and add a user-defined route that forwards traffic for the 0.0.0.0/0 CIDR to the private IP address of the Azure Firewall with Virtual appliance as the next hop type. For more information, see Tutorial: Deploy and configure Azure Firewall using the Azure portal.
When you introduce an Azure Firewall to control the egress traffic from your private AKS cluster, you need to configure internet traffic to go through one of the public IP addresses associated with the Azure Firewall in front of the public Standard Load Balancer used by your AKS cluster. This is where the problem occurs. Packets arrive on the firewall's public IP address, but return to the firewall via the private IP address (using the default route). To avoid this asymmetric routing problem, create an additional user-defined route for the firewall's public IP address as shown in the picture below. Packets going to the firewall's public IP address are routed via the Internet. This avoids taking the default route to the firewall's private IP address.
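The effect of the additional route can be illustrated with a small model of longest-prefix-match route selection, which is how the more specific route takes precedence over the 0.0.0.0/0 route. This is a simplified Python sketch, not the sample's code: the route names and both IP addresses are hypothetical, and the dictionary layout is only an illustration of a route table.

```python
import ipaddress


def select_route(routes, destination):
    """Pick the route whose address prefix is the longest match for destination."""
    dest = ipaddress.ip_address(destination)
    candidates = [r for r in routes if dest in ipaddress.ip_network(r["prefix"])]
    if not candidates:
        return None
    return max(candidates, key=lambda r: ipaddress.ip_network(r["prefix"]).prefixlen)


# Hypothetical addresses used only for illustration.
FIREWALL_PUBLIC_IP = "203.0.113.10"
FIREWALL_PRIVATE_IP = "10.1.0.4"

route_table = [
    # Default route: all egress traffic goes to the firewall's private IP.
    {"name": "default", "prefix": "0.0.0.0/0",
     "next_hop_type": "VirtualAppliance", "next_hop_ip": FIREWALL_PRIVATE_IP},
    # Additional route: traffic to the firewall's public IP goes via the Internet.
    {"name": "firewall-public-ip", "prefix": f"{FIREWALL_PUBLIC_IP}/32",
     "next_hop_type": "Internet", "next_hop_ip": None},
]

to_firewall = select_route(route_table, FIREWALL_PUBLIC_IP)
to_elsewhere = select_route(route_table, "198.51.100.20")
print(to_firewall["next_hop_type"], to_elsewhere["next_hop_type"])
```

Because the /32 prefix is more specific than 0.0.0.0/0, packets addressed to the firewall's public IP use the Internet next hop, while every other destination still follows the default route to the firewall's private IP address.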
For more information, see:
- Restrict egress traffic from an AKS cluster using Azure firewall
- Integrate Azure Firewall with Azure Standard Load Balancer
Terraform State
To deploy Terraform modules to Azure, you can use Azure DevOps CI/CD pipelines. Azure DevOps provides developer services that allow teams to plan work, collaborate on code development, and build and deploy applications and infrastructure components using IaC technologies such as ARM Templates, Bicep, and Terraform.
Terraform stores state about your managed infrastructure and configuration in a special file called the state file. Terraform uses this state to map real-world resources to your configuration, keep track of metadata, and improve performance for large infrastructures. Terraform state is used to reconcile deployed resources with Terraform configurations. When using Terraform to deploy Azure resources, the state allows Terraform to know what Azure resources to add, update, or delete. By default, Terraform state is stored in a local file named "terraform.tfstate", but it can also be stored remotely, which works better in a team environment. Storing the state in a local file isn't ideal for the following reasons:
- Storing the Terraform state in a local file doesn't work well in a team or collaborative environment.
- Terraform state can include sensitive information.
- Storing state locally increases the chance of inadvertent deletion.
Each Terraform configuration can specify a backend, which defines where and how operations are performed and where state snapshots are stored. The Azure Provider, or azurerm, can be used to configure infrastructure in Microsoft Azure using the Azure Resource Manager APIs. Terraform provides a backend for the Azure Provider that allows you to store the state as a Blob with the given Key within a given Blob Container inside a Blob Storage Account. This backend also supports state locking and consistency checking via native capabilities of Azure Blob Storage. When using Azure DevOps to deploy services to a cloud environment, you should use this backend to store the state in a remote storage account. For more information on how to use a storage account to store remote Terraform state, state locking, and encryption at rest, see Store Terraform state in Azure Storage. Under the storage-account folder in this sample, you can find a Terraform module and bash script to deploy an Azure storage account where you can persist the Terraform state as a blob.
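As an illustration of how the azurerm backend settings fit together, the following Python sketch builds the argument list for a terraform init call that supplies the backend settings on the command line with -backend-config. It assumes the Terraform configuration declares an empty azurerm backend block (a partial configuration). The helper name and the example values are hypothetical, and the pipelines in this sample pass these settings through the Terraform extension tasks rather than through a script like this.

```python
def terraform_init_args(resource_group, storage_account, container, key,
                        access_key=None):
    """Build a `terraform init` command line for the azurerm backend.

    Each setting is passed as -backend-config=NAME=VALUE, where NAME is
    one of the azurerm backend settings.
    """
    settings = {
        "resource_group_name": resource_group,
        "storage_account_name": storage_account,
        "container_name": container,
        "key": key,  # name of the blob that holds the state
    }
    if access_key is not None:
        # Storage account key; treat it as a secret and never log it.
        settings["access_key"] = access_key
    args = ["terraform", "init"]
    args += [f"-backend-config={name}={value}" for name, value in settings.items()]
    return args


# Hypothetical values used only for illustration.
init_args = terraform_init_args(
    resource_group="StorageAccountsRG",
    storage_account="examplestate",
    container="tfstate",
    key="aks.tfstate",
)
print(" ".join(init_args))
```

The four required values correspond to the terraformBackend* secrets that the sample stores in Key Vault, with the blob name supplied separately.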
Azure DevOps Self-Hosted Agent
If you plan to use Azure DevOps, you can't use Microsoft-hosted agents to deploy your workloads to a private AKS cluster because they don't have access to its API server. To deploy workloads to your private AKS cluster, you need to provision and use an Azure DevOps self-hosted agent in the same virtual network as your private AKS cluster or in a peered virtual network. In the latter case, make sure to create a virtual network link between the Private DNS Zone of the AKS cluster in the node resource group and the virtual network that hosts the Azure DevOps self-hosted agent. You can deploy a single Windows or Linux Azure DevOps agent using a virtual machine, or use a virtual machine scale set (VMSS). For more information, see Azure virtual machine scale set agents.
As an alternative, you can set up a self-hosted agent in Azure Pipelines to run inside a Windows Server Core container (for Windows hosts) or an Ubuntu container (for Linux hosts) with Docker, and deploy it as a pod with one or multiple replicas in your private AKS cluster. If the subnets hosting the node pools of your private AKS cluster are configured to route the egress traffic to an Azure Firewall via a route table and user-defined route, make sure to create the proper application and network rules to allow the agent to access external sites to download and install tools like Docker, kubectl, Azure CLI, and Helm to the agent virtual machine. For more information, see Run a self-hosted agent in Docker and Build and deploy Azure DevOps Pipeline Agent on AKS.
The cd-self-hosted-agent pipeline in this sample deploys a self-hosted Linux agent as an Ubuntu Linux virtual machine in the same virtual network hosting the private AKS cluster. The pipeline uses a Terraform module under the agent folder to deploy the virtual machine. Make sure to specify values for the variables in the cd-self-hosted-agent pipeline and in the agent.tfvars file. The following picture represents the network topology of Azure DevOps and self-hosted agent.
Variable Groups
The key-vault folder contains a bash script that uses Azure CLI to store the following data to an Azure Key Vault. This sensitive data is used by Azure DevOps CD pipelines via variable groups. Variable groups store values and secrets that you want to pass into a YAML pipeline or make available across multiple pipelines. You can share and use variable groups in multiple pipelines in the same project. You can link an existing Azure Key Vault to a variable group and select which secrets you want to expose as variables in the variable group. For more information, see Link secrets from an Azure Key Vault.
The YAML pipelines in this sample use a variable group shown in the following picture:
The variable group is configured to use the following secrets from an existing Key Vault:
Variable | Description |
---|---|
terraformBackendContainerName | Name of the blob container holding the Terraform remote state |
terraformBackendResourceGroupName | Resource group name of the storage account that contains the Terraform remote state |
terraformBackendStorageAccountKey | Key of the storage account that contains the Terraform remote state |
terraformBackendStorageAccountName | Name of the storage account that contains the Terraform remote state |
sshPublicKey | Key used by Terraform to configure the SSH public key for the administrator user of the virtual machine and AKS worker nodes |
azureDevOpsUrl | URL of your Azure DevOps Organization (e.g. https://dev.azure.com/contoso) |
azureDevOpsPat | Personal access token used by an Azure DevOps self-hosted agent |
azureDevOpsAgentPoolName | Name of the agent pool of the Azure DevOps self-hosted agent |
Azure DevOps Pipelines
You can use Azure DevOps YAML pipelines to deploy resources to the target environment. Pipelines are part of the same Git repo that contains artifacts such as Terraform modules and scripts, so they can be versioned like any other file in the Git repository. You can follow a pull-request process to ensure changes are verified and approved before being merged. The following picture shows the key concepts of an Azure DevOps pipeline.
- A trigger tells a pipeline to run.
- A pipeline is made up of one or more stages. A pipeline can deploy to one or more environments.
- A stage is a way of organizing jobs in a pipeline and each stage can have one or more jobs.
- Each job runs on one agent. A job can also be agentless.
- Each agent runs a job that contains one or more steps.
- A step can be a task or script and is the smallest building block of a pipeline.
- A task is a pre-packaged script that performs an action, such as invoking a REST API or publishing a build artifact.
- An artifact is a collection of files or packages published by a run.
For more information on Azure DevOps pipelines, see:
This sample provides three pipelines to deploy the infrastructure using Terraform modules, and one to undeploy the infrastructure.
Pipeline Name | Description |
---|---|
cd-validate-plan-apply-one-stage-tfvars | In Terraform, to set a large number of variables, you can specify their values in a variable definitions file (with a filename ending in either .tfvars or .tfvars.json) and then specify that file on the command line with a -var-file parameter. For more information, see Input Variables. The sample contains three different .tfvars files under the tfvars folder. Each file contains a different value for each variable and can be used to deploy the same infrastructure to three distinct environments: production, staging, and test. |
cd-validate-plan-apply-one-stage-vars | This pipeline specifies variable values for Terraform plan and apply commands with the -var command line option. For more information, see Input Variables. |
cd-validate-plan-apply-separate-stages.yml | This pipeline is composed of three distinct stages for validate, plan, and apply. Each stage can be run separately. |
destroy-aks-deployment | This pipeline uses the destroy command to fully remove the resource group and all the Azure resources. |
cd-self-hosted-agent | This pipeline can be used to deploy an Azure DevOps self-hosted agent as an Ubuntu virtual machine in the same subnet as the jumpbox virtual machine. This deployment requires you to pass the following information as parameters: |
cd-redmine-via-helm | This pipeline can be used to deploy the Bitnami redmine project management web application using a Helm chart from ArtifactHub. This pipeline creates all the necessary Azure resources to front the Public IP of the Standard Load Balancer used by the service with the Azure Firewall in the Hub virtual network and expose the service with a hostname defined in an Azure public DNS zone. For more information, see: |
destroy-self-hosted-agent | This pipeline can be used to destroy the Azure DevOps self-hosted agent. |
destroy-redmine-via-helm | This pipeline can be used to uninstall the Bitnami redmine project management web application using a Helm chart and destroy all the Azure resources used to expose the service via the Azure Firewall and the AKS cluster Standard Load Balancer. |
ci-test-web-app | This pipeline can be used to build the container image of the test web application and store it to an Azure Container Registry. In addition, the pipeline stores the Helm chart to another repository inside the registry. |
cd-test-web-app | This pipeline can be used to deploy the test web application using a Helm chart. This pipeline creates all the necessary Azure resources to front the Public IP of the Standard Load Balancer used by the service with the Azure Firewall in the Hub virtual network and expose the service with a hostname defined in an Azure public DNS zone. For more information, see: |
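The two ways of passing input variables described in the table above, -var-file and -var, can be sketched as follows. This Python example only builds the terraform plan argument lists. The file naming scheme (one file per environment named after it) and the variable names are assumptions made for illustration, since the actual file names under the tfvars folder are not listed here.

```python
ENVIRONMENTS = ("production", "staging", "test")


def plan_with_var_file(environment, tfvars_dir="tfvars"):
    """Build `terraform plan` arguments that load variables from a .tfvars file."""
    if environment not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {environment}")
    # Assumed naming scheme: tfvars/<environment>.tfvars (hypothetical).
    return ["terraform", "plan", f"-var-file={tfvars_dir}/{environment}.tfvars"]


def plan_with_vars(variables):
    """Build `terraform plan` arguments that set each variable with -var."""
    args = ["terraform", "plan"]
    for name, value in variables.items():
        args.append(f"-var={name}={value}")
    return args


staging_plan = plan_with_var_file("staging")
# Hypothetical variable names used only for illustration.
inline_plan = plan_with_vars({"location": "westeurope", "aks_cluster_name": "demo"})
print(staging_plan)
print(inline_plan)
```

A variable definitions file keeps the command line short when there are many values, while -var is convenient for a handful of values supplied by pipeline variables.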
Terraform Extension for Azure DevOps
All the pipelines make use of the tasks of the Terraform extension. This extension provides the following components:
- A service connection for connecting to an Amazon Web Services (AWS) account
- A service connection for connecting to a Google Cloud Platform (GCP) account
- A task for installing a specific version of Terraform, if not already installed, on the agent
- A task for executing the core Terraform commands
The Terraform tool installer task acquires a specified version of Terraform from the Internet or the tools cache and prepends it to the PATH of the Azure Pipelines Agent (hosted or private). This task can be used to change the version of Terraform used in subsequent tasks. Adding this task before the Terraform task in a build definition ensures you are using that task with the right Terraform version.
The Terraform task enables running Terraform commands as part of Azure Build and Release Pipelines providing support for the following Terraform commands
This extension is intended to run on Windows, Linux, and macOS agents. As an alternative, you can use the [Bash Task](https://docs.microsoft.com/en-us/azure/devops/pipelines/tasks/utility/bash?view=azure-devops) or PowerShell Task to install Terraform on the agent and run Terraform commands.
Azure Resources
The following picture shows the resources deployed by the Terraform modules in the target resource group using one of the Azure DevOps pipelines in this repository.
The following picture shows the resources deployed by the Terraform modules in the MC resource group associated with the AKS cluster:
Use Azure Firewall in front of the Public Standard Load Balancer of the AKS cluster
Resource definitions in the Terraform modules make use of the lifecycle meta-argument to customize the actions taken when Azure resources are changed outside of Terraform control. The ignore_changes argument is used to instruct Terraform to ignore updates to given resource properties such as tags. The Azure Firewall Policy resource definition contains a lifecycle block to prevent Terraform from reverting the resource when a rule collection or a single rule gets created, updated, or deleted. Likewise, the Azure Route Table contains a lifecycle block to prevent Terraform from reverting the resource when a user-defined route gets created, deleted, or updated. This allows you to manage the DNAT, Application, and Network rules of an Azure Firewall Policy and the user-defined routes of an Azure Route Table outside of Terraform control.
The cd-redmine-via-helm pipeline shows how you can deploy a workload to a private AKS cluster using an Azure DevOps pipeline that runs on a self-hosted agent. The sample deploys the Bitnami redmine project management web application using a public Helm chart. The following diagram shows the network topology of the sample:
The message flow can be described as follows:
- A request for the AKS-hosted web application is sent to a public IP exposed by the Azure Firewall via a public IP configuration. Both the public IP and public IP configuration are dedicated to this workload.
- An Azure Firewall DNAT rule is used to translate the Azure Firewall public IP address and port to the public IP and port used by the workload in the kubernetes public Standard Load Balancer of the AKS cluster in the node resource group.
- The request is sent by the load balancer to one of the Kubernetes service pods running on one of the agent nodes of the AKS cluster.
- The response message is sent back to the original caller via a user-defined route with the Azure Firewall public IP as address prefix and Internet as next hop type.
- Any workload-initiated outbound call is routed to the private IP address of the Azure Firewall by the default user-defined route with 0.0.0.0/0 as address prefix and virtual appliance as next hop type.
The cd-redmine-via-helm pipeline performs the following steps:
- The Helm Installer task installs Helm on the self-hosted agent. This step is not necessary if Helm is already installed on the agent.
- The Kubectl Installer task installs kubectl on the self-hosted agent. This step is also not necessary if kubectl is already installed on the agent.
- Uses the Azure CLI task to run the following steps:
- Gets the AKS cluster credentials using the az aks get-credentials command
- Uses the Helm CLI to add the repo for the Bitnami redmine project management web application
- Uses Helm CLI to check if the Helm chart is already deployed:
- If yes, it upgrades the current release.
- If not, it installs a new release.
- Uses kubectl to retrieve the Public IP of the redmine service.
- Uses the az network public-ip show command to check if an Azure Public IP resource called AksName_HelmReleaseNamespace_ServiceName already exists in a given resource group.
- If yes, it retrieves its public IP address.
- If not, it creates a new Azure Public IP resource using the az network public-ip create command and retrieves its public IP address.
- Uses the az network firewall ip-config show command to check if an Azure Firewall IP configuration named AksName_HelmReleaseNamespace_ServiceName already exists. If not, it creates a new Azure Firewall IP configuration using the az network firewall ip-config create command.
- Uses the az network firewall policy rule-collection-group collection list command to check if a DNAT rule collection named DnatRules already exists in the Azure Firewall Policy. If not, it creates a new DNAT rule collection named DnatRules under the DefaultDnatRuleCollectionGroup rule collection group using the az network firewall policy rule-collection-group collection add-filter-collection command.
- Uses the az network firewall policy rule-collection-group collection rule add command to add a DNAT rule to the Azure Firewall Policy that maps port 80 of the AksName_HelmReleaseNamespace_ServiceName public IP address used by the AksName_HelmReleaseNamespace_ServiceName Azure Firewall IP configuration to port 80 of the public IP address exposed by the redmine service on the Standard Load Balancer of the private AKS cluster. In a production environment, you should use port 443 and HTTPS instead of port 80 and unsecured HTTP.
- Uses the az network route-table route show command to check if a user-defined route called AksName_HelmReleaseNamespace_ServiceName exists in the Azure Route Table associated with the subnets hosting the node pools of the AKS cluster. If not, it creates a new user-defined route using the az network route-table route create command that routes the traffic destined for the public IP address named AksName_HelmReleaseNamespace_ServiceName directly to the internet. This route is more specific than the user-defined route with CIDR 0.0.0.0/0 that routes the traffic from the subnets hosting AKS node pools to the private IP address of the Azure Firewall. This user-defined route allows response messages to be properly sent back to the public IP address of the Azure Firewall IP configuration used to expose the redmine Kubernetes service.
- Uses the az network dns record-set a list command to check if an A record exists with the given subdomain for the AKS-hosted web application. If not, the pipeline uses the az network dns record-set a add-record command to add an A record that resolves the service hostname to the public IP address of the Azure Firewall public IP.
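Most of the steps above follow the same check-then-create pattern, which makes the pipeline safe to run repeatedly. The following Python sketch shows that pattern for the public IP step only. It is an illustration rather than the pipeline's actual script: the helper names are hypothetical, and the Standard SKU and static allocation flags are assumptions based on what Azure Firewall requires of its public IP addresses. The command runner is injectable so that the logic can be exercised without calling Azure.

```python
import subprocess


def run_cli(args):
    """Run a command and return (exit_code, stripped stdout)."""
    completed = subprocess.run(args, capture_output=True, text=True)
    return completed.returncode, completed.stdout.strip()


def ensure_public_ip(name, resource_group, run=run_cli):
    """Return the address of the named public IP, creating it if missing."""
    show = ["az", "network", "public-ip", "show",
            "--name", name, "--resource-group", resource_group,
            "--query", "ipAddress", "--output", "tsv"]
    code, address = run(show)
    if code == 0 and address:
        return address  # already exists: reuse it

    create = ["az", "network", "public-ip", "create",
              "--name", name, "--resource-group", resource_group,
              "--sku", "Standard", "--allocation-method", "Static"]
    code, _ = run(create)
    if code != 0:
        raise RuntimeError(f"could not create public IP {name}")

    # Read the address back with the same query used for the existence check.
    code, address = run(show)
    if code != 0 or not address:
        raise RuntimeError(f"could not read address of public IP {name}")
    return address
```

The same shape (show, then create only when show fails) applies to the firewall IP configuration, the DNAT rule collection, the user-defined route, and the DNS A record, each with its own pair of az commands.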
Likewise, the destroy-redmine-via-helm pipeline shows how you can undeploy a workload from a private AKS cluster using an Azure DevOps pipeline that runs on a self-hosted agent. The pipeline performs the following steps:
- The Helm Installer task installs Helm on the self-hosted agent. This step is not necessary if Helm is already installed on the agent.
- The Kubectl Installer task installs kubectl on the self-hosted agent. This step is also not necessary if kubectl is already installed on the agent.
- Uses the Azure CLI task to run the following steps:
- Gets the AKS cluster credentials using the az aks get-credentials command
- Uses Helm CLI to uninstall the redmine release.
- Uses kubectl to delete the Kubernetes namespace used by the release.
- Uses the az network firewall policy rule-collection-group collection rule remove command to remove the DNAT rule called AksName_HelmReleaseNamespace_ServiceName from the DnatRules rule collection of the Azure Firewall Policy.
- Uses the az network route-table route delete command to delete the user-defined route called AksName_HelmReleaseNamespace_ServiceName from the Azure Route Table associated with the subnets hosting the node pools of the AKS cluster.
- Uses the az network firewall ip-config delete command to delete the Azure Firewall IP configuration called AksName_HelmReleaseNamespace_ServiceName used to expose the redmine Kubernetes service.
- Uses the az network public-ip delete command to destroy the Azure Public IP called AksName_HelmReleaseNamespace_ServiceName used to expose the redmine Kubernetes service.
API Gateway
In a production environment where Azure Firewall is used to inspect, protect, and filter inbound internet traffic with Azure Firewall DNAT rules and Threat intelligence-based filtering, it's a good practice to use an API Gateway to expose web applications and REST APIs to the public internet.
Without an API gateway, client apps would have to send requests directly to the Kubernetes-hosted microservices, which raises the following problems:
- Coupling: client applications are coupled to internal microservices. Refactoring internal microservices can cause breaking changes to the client apps. Introducing a level of indirection via an API Gateway between client apps and a SaaS application allows you to start breaking down the monolith and gradually replace subsystems with microservices without violating the contract with client apps.
- Chattiness: if a single page/screen needs to retrieve data from multiple microservices, this can result in multiple calls to fine-grained microservices.
- Security issues: without an API Gateway, all the microservices are directly exposed to the public internet making the attack surface larger.
- Cross-cutting concerns: Each publicly exposed microservice must implement concerns such as authentication, authorization, SSL termination, client rate limiting, response caching, etc.
When running applications on AKS, you can use one of the following API Gateways:
- Reverse Proxy Server: Nginx, HAProxy, and Traefik are popular reverse proxy servers that support features such as load balancing, SSL termination, and layer 7 routing. They can run on dedicated virtual machines or as ingress controllers on a Kubernetes cluster.
- Service Mesh Ingress Controller: If you are using a service mesh such as Open Service Mesh, Linkerd, and Istio, consider the features that are provided by the ingress controller for that service mesh. For example, the Istio ingress controller supports layer 7 routing, HTTP redirects, retries, and other features.
- Azure Application Gateway: Azure Application Gateway is a regional, fully-managed load balancing service that can perform layer 7 routing and SSL termination. It also provides a Web Application Firewall and an ingress controller for Kubernetes. For more information, see Use Application Gateway Ingress Controller (AGIC) with a multi-tenant Azure Kubernetes Service.
- Azure Front Door: Azure Front Door is a global layer 7 load balancer that uses the Microsoft global edge network to create fast, secure, and widely scalable web applications. It supports features such as SSL termination, response caching, WAF at the edge, URL-based routing, and rewrites and redirections. It also supports multiple routing methods such as priority routing and latency-based routing.
- Azure API Management: API Management is a turnkey solution for publishing APIs to both external and internal customers. It provides features that are useful for managing a public-facing API, including rate limiting, IP restrictions, and authentication and authorization using Azure Active Directory or other identity providers.
Use Azure Firewall in front of an internal Standard Load Balancer
In this scenario, an ASP.NET Core application is hosted as a service by an Azure Kubernetes Service cluster and fronted by an NGINX ingress controller. The application code is available under the source folder, while the Helm chart is available in the chart folder. The NGINX ingress controller is exposed via an internal load balancer with a private IP address in the spoke virtual network that hosts the AKS cluster. For more information, see Create an ingress controller to an internal virtual network in Azure Kubernetes Service (AKS). When you deploy an NGINX ingress controller, or more generally a LoadBalancer or ClusterIP service with the service.beta.kubernetes.io/azure-load-balancer-internal: "true" annotation in the metadata section, an internal standard load balancer called kubernetes-internal gets created under the node resource group. For more information, see Use an internal load balancer with Azure Kubernetes Service (AKS). As shown in the picture below, the test web application is exposed via the Azure Firewall using a dedicated Azure public IP.
The message flow can be described as follows:
- A request for the AKS-hosted test web application is sent to a public IP exposed by the Azure Firewall via a public IP configuration. Both the public IP and public IP configuration are dedicated to this workload.
- An Azure Firewall DNAT rule translates the Azure Firewall public IP address and port to the private IP address and port used by the NGINX ingress controller in the internal Standard Load Balancer of the AKS cluster in the node resource group.
- The request is sent by the internal load balancer to one of the Kubernetes service pods running on one of the agent nodes of the AKS cluster.
- The response message is sent back to the original caller via a user-defined route with `0.0.0.0/0` as address prefix and virtual appliance as next hop type.
- Any workload-initiated outbound call is routed to the private IP address of the Azure Firewall, which is the next hop of the user-defined route.
The ci-test-web-app pipeline performs the following steps:
- Uses the az acr login command to log in to Azure Container Registry through the Docker CLI.
- Uses the `docker build` and `docker push` commands to build and publish the container image to Azure Container Registry.
- Uses `helm registry login` to log in to Azure Container Registry via Helm.
- Uses the `helm push` command to push the Helm chart as an Open Container Initiative (OCI) artifact.
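The CI steps above can be sketched as a single shell function. This is not the pipeline's actual script: it assumes the registry uses the default `<name>.azurecr.io` login server, that the chart is already packaged as a `.tgz` file, that the hypothetical `ACR_USERNAME` and `ACR_PASSWORD` variables hold registry credentials, and that charts go to a hypothetical `helm` repository path.

```shell
# Sketch of the CI steps. Arguments: ACR name, image name, image tag, and
# the path to an already packaged chart (.tgz).
publish_image_and_chart() {
  local acr_name="$1" image="$2" tag="$3" chart_package="$4"
  local login_server="${acr_name}.azurecr.io"

  # Authenticate the Docker CLI against the registry.
  az acr login --name "$acr_name" || return 1

  # Build and publish the container image.
  docker build --tag "${login_server}/${image}:${tag}" . || return 1
  docker push "${login_server}/${image}:${tag}" || return 1

  # Authenticate Helm, then push the packaged chart as an OCI artifact.
  printf '%s' "${ACR_PASSWORD:?set ACR_PASSWORD}" |
    helm registry login "$login_server" \
      --username "${ACR_USERNAME:?set ACR_USERNAME}" --password-stdin || return 1
  helm push "$chart_package" "oci://${login_server}/helm"
}
```

Passing the password on standard input keeps it out of the process argument list, which is why `--password-stdin` is used here instead of `--password`.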
The cd-test-web-app pipeline performs the following steps:
- The Helm Installer task installs Helm on the self-hosted agent. This step is not necessary if Helm is already installed on the agent.
- The Kubectl Installer task installs kubectl on the self-hosted agent. This step is also unnecessary if kubectl is already installed on the agent.
- Uses the Azure CLI task to run the following steps:
- Gets the AKS cluster credentials using the az aks get-credentials command
- Uses the Helm CLI to add the repo for the Bitnami redmine project management web application
- Uses Helm CLI to check if the Helm chart is already deployed:
- If yes, it upgrades the current release.
- If not, it installs a new release.
- Deploys the cert-manager via Helm chart. Cert-manager adds certificates and certificate issuers as resource types in Kubernetes clusters, and simplifies the process of obtaining, renewing and using those certificates. In this sample, cert-manager issues a certificate from Let's Encrypt used by the NGINX ingress controller for SSL termination.
- Deploys the NGINX ingress controller via Helm chart. The NGINX ingress controller is exposed via an internal load balancer with a private IP address in the spoke virtual network that hosts the AKS cluster. For more information, see Create an ingress controller to an internal virtual network in Azure Kubernetes Service (AKS). When you deploy an NGINX ingress controller or, more generally, any `LoadBalancer` service with the `service.beta.kubernetes.io/azure-load-balancer-internal: "true"` annotation in the metadata section, an internal standard load balancer called `kubernetes-internal` gets created under the node resource group. For more information, see Use an internal load balancer with Azure Kubernetes Service (AKS).
- Uses kubectl to retrieve the external, private IP of the NGINX ingress controller.
- Uses the az network public-ip show command to check if an Azure Public IP resource called `AksName_HelmReleaseNamespace_ServiceName` already exists in a given resource group.
- If yes, it retrieves its public IP address.
- If not, it creates a new Azure Public IP resource using the az network public-ip create command and retrieves its public IP address.
- Uses the az network firewall ip-config show command to check if an Azure Firewall IP configuration named `AksName_HelmReleaseNamespace_ServiceName` already exists. If not, it creates a new Azure Firewall IP configuration using the az network firewall ip-config create command.
- Uses the az network firewall policy rule-collection-group collection list command to check if a DNAT rule collection named `DnatRules` already exists in the Azure Firewall Policy. If not, it creates a new DNAT rule collection named `DnatRules` under the `DefaultDnatRuleCollectionGroup` rule collection group using the az network firewall policy rule-collection-group collection add-filter-collection command.
- Uses the az network firewall policy rule-collection-group collection rule add command to add a DNAT rule to the Azure Firewall Policy that maps port 80 of the `AksName_HelmReleaseNamespace_ServiceName` public IP address used by the `AksName_HelmReleaseNamespace_ServiceName` Azure Firewall IP configuration to port 80 of the private IP address exposed by the NGINX ingress controller on the internal load balancer of the private AKS cluster. This rule is necessary to let Let's Encrypt verify that you are the owner of the domain specified in the ingress of your service when cert-manager issues a certificate for SSL termination.
- Uses the az network firewall policy rule-collection-group collection rule add command to add a DNAT rule to the Azure Firewall Policy that maps port 443 of the `AksName_HelmReleaseNamespace_ServiceName` public IP address used by the `AksName_HelmReleaseNamespace_ServiceName` Azure Firewall IP configuration to port 443 of the private IP address exposed by the NGINX ingress controller on the internal load balancer of the private AKS cluster. This rule is used to translate and send incoming requests to the NGINX ingress controller.
- Uses the az network dns record-set a list command to check if an A record exists with the given subdomain for the AKS-hosted web application. If not, the pipeline uses the az network dns record-set a add-record command to add an A record that resolves the service hostname to the Azure Firewall public IP address.
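The check-then-create pattern used by the steps above can be sketched as follows. This is an illustration, not the pipeline's actual script: the function names are hypothetical, the Standard SKU and static allocation are assumptions, and the `rule add` flags are written as I understand them from the azure-firewall CLI extension, so verify them with `--help` before relying on them.

```shell
# Composes the AksName_HelmReleaseNamespace_ServiceName resource name.
resource_name() { printf '%s_%s_%s' "$1" "$2" "$3"; }

# Prints the address of the named public IP, creating it first if missing.
ensure_public_ip() {
  local resource_group="$1" name="$2" ip
  ip="$(az network public-ip show --resource-group "$resource_group" \
    --name "$name" --query ipAddress --output tsv 2>/dev/null)" || ip=""
  if [ -z "$ip" ]; then
    az network public-ip create --resource-group "$resource_group" \
      --name "$name" --sku Standard --allocation-method Static --output none
    ip="$(az network public-ip show --resource-group "$resource_group" \
      --name "$name" --query ipAddress --output tsv)"
  fi
  printf '%s\n' "$ip"
}

# Adds a DNAT rule that maps public_ip:port to private_ip:port.
add_dnat_rule() {
  local resource_group="$1" policy="$2" rule="$3"
  local public_ip="$4" private_ip="$5" port="$6"
  az network firewall policy rule-collection-group collection rule add \
    --resource-group "$resource_group" --policy-name "$policy" \
    --rcg-name DefaultDnatRuleCollectionGroup --collection-name DnatRules \
    --name "${rule}_${port}" --rule-type NatRule --ip-protocols TCP \
    --source-addresses '*' --dest-addr "$public_ip" \
    --destination-ports "$port" \
    --translated-address "$private_ip" --translated-port "$port"
}
```

Because `ensure_public_ip` only creates the resource when the lookup returns nothing, running it again leaves an existing public IP untouched, which is what allows the pipeline to be rerun safely.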
Considerations
In a production environment, the endpoints publicly exposed by Kubernetes services running in a private AKS cluster should be exposed using an ingress controller such as NGINX Ingress Controller or Application Gateway Ingress Controller that provides advanced functionalities such as path-based routing, load balancing, SSL termination, and web application firewall. For more information, see the following articles:
Azure Kubernetes Service
- Create a private Azure Kubernetes Service cluster
- Best practices for multitenancy and cluster isolation
- Best practices for basic scheduler features in Azure Kubernetes Service (AKS)
- Best practices for advanced scheduler features
- Best practices for authentication and authorization
- Best practices for cluster security and upgrades in Azure Kubernetes Service (AKS)
- Best practices for container image management and security in Azure Kubernetes Service (AKS)
- Best practices for network connectivity and security in Azure Kubernetes Service (AKS)
- Best practices for storage and backups in Azure Kubernetes Service (AKS)
- Best practices for business continuity and disaster recovery in Azure Kubernetes Service (AKS)
- Azure Kubernetes Services (AKS) day-2 operations guide
Azure Application Gateway
Azure Application Gateway Ingress Controller
- Use Application Gateway Ingress Controller (AGIC) with a multi-tenant Azure Kubernetes Service (AKS)
- What is Application Gateway Ingress Controller?
- Documentation for Application Gateway Ingress Controller
- Annotations for Application Gateway Ingress Controller
- Certificate issuance with LetsEncrypt.org
- Tutorial: Enable the Ingress Controller add-on (preview) for a new AKS cluster with a new Application Gateway instance
- Tutorial: Enable Application Gateway Ingress Controller add-on for an existing AKS cluster with an existing Application Gateway through Azure CLI (Preview)
- Difference between Helm deployment and AKS Add-On
NGINX Ingress Controller
- NGINX Ingress Controller documentation
- Enabling ModSecurity in the Kubernetes NGINX Ingress Controller
- Create an HTTPS ingress controller on Azure Kubernetes Service (AKS)
- Create an NGINX ingress controller that uses an internal, private network and IP address
- Create an NGINX ingress controller that uses your own TLS certificates
- Create an ingress controller that uses Let's Encrypt to automatically generate TLS certificates with a static public IP address
Azure Application Gateway WAF
- What is Azure Web Application Firewall on Azure Application Gateway?
- Web Application Firewall CRS rule groups and rules
- Custom rules for Web Application Firewall v2 on Azure Application Gateway
- Quickstart: Create an Azure WAF v2 on Application Gateway using an ARM template
- Microsoft.Network/ApplicationGatewayWebApplicationFirewallPolicies Resource Type
- Create and use Web Application Firewall v2 custom rules on Application Gateway
- az network application-gateway waf-policy Azure CLI commands
- Enable Web Application Firewall using the Azure CLI
- Configure per-site WAF policies using Azure PowerShell
- Create Web Application Firewall policies for Application Gateway
Related resources
Architectural guidance
- Azure Kubernetes Service (AKS) solution journey
- AKS cluster best practices
- Azure Kubernetes Services (AKS) day-2 operations guide
- Choosing a Kubernetes at the edge compute option
Reference architectures
- Baseline architecture for an Azure Kubernetes Service (AKS) cluster
- Microservices architecture on Azure Kubernetes Service (AKS)
- Advanced Azure Kubernetes Service (AKS) microservices architecture
- CI/CD pipeline for container-based workloads
- Building a telehealth system on Azure
Visio
In the visio folder you can find the Visio document which contains the above diagrams.
Test access to your private AKS cluster
If you open an SSH session to the Linux virtual machine via Azure Bastion and manually run the nslookup command using the fully qualified domain name (FQDN) of the API server as a parameter, you should see an output like the following:
NOTE: the Terraform module runs an Azure Custom Script Extension that installs kubectl and the Azure CLI on the jumpbox virtual machine.