  • Stars: 204
  • Rank: 190,926 (Top 4%)
  • Language: HTML
  • License: MIT License
  • Created over 4 years ago
  • Updated about 2 years ago

Repository Details

Airflow Deployment on AWS ECS Fargate Using CloudFormation

Airflow Autoscaling ECS

This setup runs Airflow on AWS ECS (Elastic Container Service) Fargate with autoscaling enabled for all services. All infrastructure is created with CloudFormation, and secrets are managed by AWS Secrets Manager.

CloudFormation Resources

Requirements

  • Create an AWS IAM user with admin permissions for the infrastructure deployment
  • Install the AWS CLI by running pip install awscli
  • Install Docker
  • Set up your IAM user credentials inside ~/.aws/config
    [profile my_aws_profile]
    aws_access_key_id = <my_access_key_id>
    aws_secret_access_key = <my_secret_access_key>
    region = us-east-1
  • Create a virtual environment
  • Set the following environment variables in your .zshrc or .bashrc, or in the terminal session that you are going to use:
	export AWS_REGION=us-east-1;
	export AWS_PROFILE=my_aws_profile;
	export ENVIRONMENT=dev;
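
Before running any make target, it can help to confirm that the three variables above are actually exported in the current session. The following is a minimal Python sketch that assumes only the variable names listed in the requirements; the helper name missing_env_vars is hypothetical and is not part of the repository.

```python
import os

# Variable names taken from the requirements list above.
REQUIRED_VARS = ("AWS_REGION", "AWS_PROFILE", "ENVIRONMENT")


def missing_env_vars(env=None):
    """Return the required variables that are unset or empty.

    `env` defaults to the process environment; passing a dict makes
    the check easy to try out without touching the real environment.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

Running the script in the same terminal session you plan to deploy from shows whether the exports took effect.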

Deploy Airflow Locally

make airflow-local

Deploy Airflow on AWS ECS

To deploy or update your stack run the following command:

make airflow-deploy

To rebuild the Airflow Docker image and push it to ECR (without infrastructure changes), run:

make airflow-push-image

To destroy your stack run the following command:

make airflow-destroy

Update a DAG on AWS

After creating or updating a DAG, you need to rebuild the Airflow image, push it to ECR, and then restart the Airflow service. A single command does all of that:

make airflow-push-image
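
How the make target restarts the service is not shown here. One common way to restart an ECS service so that its tasks pull a freshly pushed image is to force a new deployment through the ECS UpdateService API. The sketch below illustrates that approach under stated assumptions: the function name, cluster name, and service name are hypothetical, and it only picks up the new image if the task definition references a mutable tag.

```python
def force_new_deployment(ecs_client, cluster, service):
    """Ask ECS to replace the running tasks of a service.

    `ecs_client` is expected to behave like boto3.client("ecs").
    Returns the name of the service that was redeployed.
    """
    response = ecs_client.update_service(
        cluster=cluster,
        service=service,
        forceNewDeployment=True,  # start new tasks even if nothing else changed
    )
    return response["service"]["serviceName"]


# Example usage (requires boto3 and valid credentials; names are hypothetical):
#
#   import boto3
#   session = boto3.Session(profile_name="my_aws_profile", region_name="us-east-1")
#   force_new_deployment(session.client("ecs"), "airflow-dev", "airflow-dev-webserver")
```

Passing the client in as an argument keeps the function easy to exercise with a stand-in object before pointing it at a real account.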

Features

  • Control all Airflow infrastructure from a single service.yml file.
  • Metadata DB passwords managed with AWS Secrets Manager.
  • Autoscaling enabled and configurable for all Airflow sub-services (workers, flower, webserver, scheduler)
  • TODO: Continuous Integration using AWS CodePipeline
  • TODO: Create isolated DAGs using docker_operator

Adjust many infrastructure settings directly in service.yml:

  workers:
    port: 8793
    cpu: 1024
    memory: 2048
    desiredCount: 2
    autoscaling:
      maxCapacity: 8
      minCapacity: 2
      cpu:
        target: 70
        scaleInCooldown: 60
        scaleOutCooldown: 120
      memory:
        target: 70
        scaleInCooldown: 60
        scaleOutCooldown: 120
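
To get a feel for how the target, minCapacity and maxCapacity values above interact, here is a simplified Python model of target-tracking scaling. It is a sketch and not AWS's actual algorithm: it assumes utilization scales inversely with task count and ignores the cooldown periods. The function name is hypothetical.

```python
import math


def estimate_desired_count(current_count, utilization, target, min_capacity, max_capacity):
    """Estimate the task count that brings average utilization back to the target.

    Simplified proportional model: if N tasks run at U % and the target
    is T %, roughly N * U / T tasks are needed, clamped to the limits.
    """
    if current_count <= 0 or target <= 0:
        raise ValueError("current_count and target must be positive")
    proportional = math.ceil(current_count * utilization / target)
    return max(min_capacity, min(max_capacity, proportional))


# Using the worker settings above (target 70, minCapacity 2, maxCapacity 8):
print(estimate_desired_count(2, 95, 70, 2, 8))   # 3: scale out by one worker
print(estimate_desired_count(2, 30, 70, 2, 8))   # 2: held at minCapacity
print(estimate_desired_count(4, 200, 70, 2, 8))  # 8: capped at maxCapacity
```

In this model, a lower target leaves more headroom per worker at the cost of running more tasks, while maxCapacity bounds the cost of a load spike.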

Access to Airflow UI:

Look for AirflowWebServerEndpoint in the outputs logged to your terminal.

    "cfn-airflow-webserver": [
        {
            "OutputKey": "AirflowWebServerEndpoint",
            "OutputValue": "airflow-dev-webserver-alb-1234567890.us-east-1.elb.amazonaws.com"
        }
    ],
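
If the terminal output is long, the endpoint can be pulled out programmatically. The sketch below assumes the logged outputs form a JSON object that maps stack names to lists of OutputKey/OutputValue pairs, as the fragment above suggests; the enclosing braces and the function name find_output are assumptions for illustration.

```python
import json


def find_output(outputs_json, output_key):
    """Return the OutputValue for output_key, or None if it is absent."""
    stacks = json.loads(outputs_json)
    for outputs in stacks.values():
        for item in outputs:
            if item.get("OutputKey") == output_key:
                return item.get("OutputValue")
    return None


# Sample built from the fragment above, wrapped in braces to make it valid JSON.
sample = """
{
    "cfn-airflow-webserver": [
        {
            "OutputKey": "AirflowWebServerEndpoint",
            "OutputValue": "airflow-dev-webserver-alb-1234567890.us-east-1.elb.amazonaws.com"
        }
    ]
}
"""

print(find_output(sample, "AirflowWebServerEndpoint"))
```

The same lookup applies to AirflowFlowerEndpoint for the Flower UI.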

Access to Flower UI:

Look for AirflowFlowerEndpoint in the outputs logged to your terminal.

    "cfn-airflow-flower": [
        {
            "OutputKey": "AirflowFlowerEndpoint",
            "OutputValue": "airflow-dev-flower-alb-1234567890.us-east-1.elb.amazonaws.com"
        }
    ],

Inspired by the work done by Nicor88
