
Airflow REST API Plugin

Description

A plugin for Apache Airflow that exposes REST endpoints for the Command Line Interfaces listed in the Airflow documentation:

http://airflow.apache.org/cli.html

The plugin also includes other custom REST APIs.

System Requirements

  • Airflow Versions
    • 1.X

Dependencies

  • If you are using the latest plugin release (v1.0.8) with Airflow v1.10.3 or lower, you need to manually install the flask_jwt_extended module.

Deployment Instructions

  1. Create the plugins folder if it doesn't exist.

    • The plugins folder is usually located at {AIRFLOW_HOME}/plugins. The exact location is set in your airflow.cfg file:

      plugins_folder = /home/{USER_NAME}/airflow/plugins

  2. Download the release code you want to deploy

    • Releases Available:

      • v0.0.1
      • v0.0.2
      • v1.0.0
      • v1.0.1
      • v1.0.2
      • v1.0.3
      • v1.0.4
      • v1.0.5
      • v1.0.6
      • v1.0.7
      • v1.0.8
      • v1.0.9
    • Branches Available:

      • master
      • v0.0.2-branch
      • v1.0.0-branch
      • v1.0.1-branch
      • v1.0.2-branch
      • v1.0.3-branch
      • v1.0.4-branch
      • v1.0.5-branch
      • v1.0.6-branch
      • v1.0.7-branch
      • v1.0.8-branch
      • v1.0.9-branch
    • URL to Download From:

      https://github.com/teamclairvoyant/airflow-rest-api-plugin/archive/{RELEASE_VERSION_OR_BRANCH_NAME}.zip

    • Note: Each release/branch has its own README.md file that describes the specific options and steps you should take to deploy and configure. Verify the options available in each release/branch after you download it.

  3. Unzip the file and move the contents of the plugins folder into your Airflow plugins directory

     unzip airflow-rest-api-plugin-{RELEASE_VERSION_OR_BRANCH_NAME}.zip
    
     cp -r airflow-rest-api-plugin-{RELEASE_VERSION_OR_BRANCH_NAME}/plugins/* {AIRFLOW_PLUGINS_FOLDER} 
    
  4. Update the base_url configuration in the airflow.cfg file to the public host or IP of the machine the Airflow instance is running on (Optional)

    • This is required for the API admin page to display the correct host
  5. (Optional) Append the following content to the end of the {AIRFLOW_HOME}/airflow.cfg file to give you control over execution:

     [rest_api_plugin]
     
     # Logs global variables used in the REST API plugin when the plugin is loaded. Set to False by default to avoid too many logging messages.
     # DEFAULT: False
     log_loading = False
     
     # Filters out loading messages from the standard out 
     # DEFAULT: True
     filter_loading_messages_in_cli_response = True
     
     # HTTP Header Name to be used for authenticating REST calls for the REST API Plugin
     # DEFAULT: 'rest_api_plugin_http_token'
     #rest_api_plugin_http_token_header_name = rest_api_plugin_http_token
        
     # HTTP Token  to be used for authenticating REST calls for the REST API Plugin
     # DEFAULT: None
     # Comment this out to disable Authentication
     #rest_api_plugin_expected_http_token = changeme
    
  6. (Optional) Setup Authentication for Security

    a. Note: Requires that step #5 above be completed.

    b. Follow the "Enabling Authentication" section below.

  7. Replace the CSRF_ENABLED attribute with WTF_CSRF_ENABLED. This change is required to support the POST method when RBAC is enabled with JWT. Add the following property to the {AIRFLOW_HOME}/webserver_config.py file:

     # Flask-WTF flag for CSRF
     WTF_CSRF_ENABLED = False
    
  8. Restart the Airflow Web Server

Enabling Authentication

The REST API client supports a simple token-based authentication mechanism where you can require users to pass in a specific HTTP header to authenticate. By default this authentication mechanism is disabled but can be enabled with the "Setup" steps below.

Setup

  1. Edit your {AIRFLOW_HOME}/airflow.cfg file

    a. Under the [rest_api_plugin] section you added in step #5 of the "Deployment Instructions", uncomment the following configs:

     rest_api_plugin_http_token_header_name = rest_api_plugin_http_token
    
     rest_api_plugin_expected_http_token = changeme
    
  2. (Optional) Update the 'rest_api_plugin_http_token_header_name' to the desired value

  3. Change the value of 'rest_api_plugin_expected_http_token' to the desired token people should pass

  4. Restart the Airflow Web Server

Authenticating

Once the steps above have been followed to enable authentication, users will need to pass a specific header along with their request to properly call the REST API. The header name is: rest_api_plugin_http_token

Example CURL Command:

curl --header "rest_api_plugin_http_token: changeme" http://{HOST}:{PORT}/admin/rest_api/api?api=version
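The same call can be made from Python using only the standard library. This is a minimal sketch; the `build_api_request` helper name is illustrative, not part of the plugin, and it assumes the default header name and token shown above:

```python
import urllib.request

def build_api_request(host, port, api, token,
                      token_header="rest_api_plugin_http_token"):
    # Build an authenticated Request carrying the plugin's token header.
    url = "http://{}:{}/admin/rest_api/api?api={}".format(host, port, api)
    return urllib.request.Request(url, headers={token_header: token})

# To actually send it (requires a running webserver):
# with urllib.request.urlopen(
#         build_api_request("localhost", 8080, "version", "changeme")) as resp:
#     print(resp.read())
```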

What happens when you fail to Authenticate?

If you have authentication enabled and the user calling the REST endpoint doesn't include the header, you will get the following response:

{
  "call_time": "{TIMESTAMP}",
  "http_response_code": 403,
  "output": "Token Authentication Failed",
  "response_time": "{TIMESTAMP}",
  "status": "ERROR"
}

RBAC Support

The plugin supports the RBAC feature for Airflow versions 1.10.4 or higher.

  • Set rbac = True in the airflow.cfg config file under the webserver category.
  • Run the airflow initdb command, which generates a new config file webserver_config.py under the AIRFLOW_HOME.
  • This file contains the default settings for working with RBAC.
  • You should see a new Security menu in the Airflow UI.

Note: Once you enable RBAC support, it will create a new set of database tables, so you need to manually migrate your existing users from the users table to the new ab_users table. While migrating, you need to specify the role for each user. Please check Create User for more details.

Enabling the plugin UI support under RBAC

The plugin works out of the box. If you need to access the plugin UI, follow the steps below.

  • After installing the plugin you will see a few dynamic permissions generated by RBAC.
  • Go to the List Roles menu item under the Security menu.
  • Create a new role or edit an existing role, and add the menu access on Rest API Plugin and menu access on Admin permissions.
  • Add user(s) to that role.
  • Restart the Airflow webserver.
  • Log in to Airflow; you should now see the REST API Plugin link under the Admin menu.

Working with JWT Auth tokens

The plugin enables JWT token-based authentication for Airflow versions 1.10.4 or higher when RBAC support is enabled.

Generating the JWT access token

curl -XPOST http://{AIRFLOW_HOST}:{AIRFLOW_PORT}/api/v1/security/login -H "Content-Type: application/json" \
-d '{"username":"username", "password":"password", "refresh":true, "provider": "db"}'

Sample response, which includes an access_token and a refresh_token:

{
    "access_token": "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpYXQiOjE1Njc3NzMxODcsIm5iZiI6MTU2Nzc3MzE4NywianRpIjoiYzFiOWRkNTgtNGY5OC00MTI5LTk1OTctMDc4YWJkOGM4ZjdkIiwiZXhwIjoxNTY3NzczNDg3LCJpZGVudGl0eSI6NiwiZnJlc2giOnRydWUsInR5cGUiOiJhY2Nlc3MifQ.pD1rCXt2NmjRlHxuoiLcZJ1RNUuHS_hV2Dudsqw6wCY",
    "refresh_token": "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpYXQiOjE1Njc3NzMxODcsIm5iZiI6MTU2Nzc3MzE4NywianRpIjoiNmZlZTQ1MmItYzBkYy00NTc4LWFmZWItNTc3OTAwODAzN2FiIiwiZXhwIjoxNTcwMzY1MTg3LCJpZGVudGl0eSI6NiwidHlwZSI6InJlZnJlc2gifQ.Gjcr08qrDM4wCmKIJS8muJN_5AkGyYZfOQKH_HUKPqE"
}

By default, the JWT access token is valid for 15 minutes and the refresh token is valid for 30 days. You can renew the access token with the refresh token as shown below.

Renewing the Access Token

curl -XPOST "http://{AIRFLOW_HOST}:{AIRFLOW_PORT}/api/v1/security/refresh" -H 'Authorization: Bearer <refresh_token>'

Sample response returning the renewed access token:

{
    "refresh_token": "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpYXQiOjE1Njc3NzI4NzUsIm5iZiI6MTU2Nzc3Mjg3NSwianRpIjoiMjE3Y2FhYzItYTQ1NS00MWMxLWI5OGUtOTRhYWU0OTcyZTljIiwiZXhwIjoxNTY3NzczMTc1LCJpZGVudGl0eSI6NSwiZnJlc2giOmZhbHNlLCJ0eXBlIjoiYWNjZXNzIn0.NoHaqc9caqk2B_cfutGO3U3Ih95lF0m9hvTPs5B_sE0"
}

Working with the rest_api_plugin and JWT Auth tokens

Pass the additional Authorization: Bearer <access_token> header in the REST API request:

curl -H 'Authorization:Bearer <access_token>' 'http://{AIRFLOW_HOST}:{AIRFLOW_PORT}/admin/rest_api/api?api=version'  

If you have rest_api_plugin_http_token_header authentication enabled, then you need to pass both headers as shown below.

curl -H 'Authorization:Bearer <access_token>'  -H 'rest_api_plugin_http_token: changeme' http://{AIRFLOW_HOST}:{AIRFLOW_PORT}/admin/rest_api/api?api=version
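In Python the two headers can be assembled with a small helper (a sketch; `auth_headers` is an illustrative name, not plugin API):

```python
def auth_headers(access_token, plugin_token=None,
                 plugin_header="rest_api_plugin_http_token"):
    # The JWT header is always required under RBAC; the plugin token
    # header is only needed when token authentication is also enabled.
    headers = {"Authorization": "Bearer {}".format(access_token)}
    if plugin_token is not None:
        headers[plugin_header] = plugin_token
    return headers
```

Pass the resulting dict as the headers of whatever HTTP client you use.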

If the JWT Authorization header is missing from the request, you will get the following response (status code 401):

{ "msg": "Missing Authorization Header" }

If the JWT access token has expired, you will get the following response (status code 401):

{ "msg": "Token has expired" }

If the JWT access token is invalid, you will get the following response (status code 422):

{ "msg": "Invalid payload padding" }

Please refer to the Flask-JWT-Extended module documentation for more details.

Using the REST API

Once you deploy the plugin and restart the web server, you can start to use the REST API. Below you will see the endpoints that are supported. In addition, you can also interact with the REST API from the Airflow Web Server. When you reload the page, you will see a link under the Admin tab called "REST API". Clicking on the link will navigate you to the following URL:

http://{AIRFLOW_HOST}:{AIRFLOW_PORT}/admin/rest_api/

This web page will show the Endpoints supported and provide a form for you to test submitting to them.

version

Gets the version of Airflow currently running. Supports both http GET and POST methods.

Available in Airflow Version: 1.0.0 or greater

GET - http://{HOST}:{PORT}/admin/rest_api/api?api=version

Query Arguments:

None

Examples:

http://{HOST}:{PORT}/admin/rest_api/api?api=version

rest_api_plugin_version

Displays the version of this REST API Plugin you're using. Supports both http GET and POST methods.

Available in Airflow Version: None - Custom API

GET - http://{HOST}:{PORT}/admin/rest_api/api?api=rest_api_plugin_version

Query Arguments:

None

Examples:

http://{HOST}:{PORT}/admin/rest_api/api?api=rest_api_plugin_version

render

Render a task instance's template(s). Supports both http GET and POST methods.

Available in Airflow Version: 1.7.0 or greater

GET - http://{HOST}:{PORT}/admin/rest_api/api?api=render

Query Arguments:

* dag_id - string - The id of the dag
     
* task_id - string - The id of the task

* execution_date - string - The execution date of the DAG (Example: 2017-01-02T03:04:05)

* subdir (optional) - string - File location or directory from which to look for the dag

Examples:

http://{HOST}:{PORT}/admin/rest_api/api?api=render

http://{HOST}:{PORT}/admin/rest_api/api?api=render&dag_id=value&task_id=value&execution_date=2017-01-02T03:04:05&subdir=value

variables

CRUD operations on variables. Supports both http GET and POST methods.

Available in Airflow Version: 1.7.1 or greater

GET - http://{HOST}:{PORT}/admin/rest_api/api?api=variables

Query Arguments:

* set (optional) - Sets a variable. This requires passing the `cmd`, `key` and `value` parameters. Please see the example below.
    * cmd - string - Only allowed value is `cmd=set`
    * key - string - name of the variable
    * value - string - value of the variable
    
* get (optional) - string - Get value of a variable
     
* json (optional) - boolean - Deserialize JSON variable

* default (optional) - string - Default value returned if variable does not exist

* import (optional) - string - Import variables from a JSON file

* export (optional) - string - Export variables to a JSON file

* delete (optional) - string - Delete a variable

Examples:

http://{HOST}:{PORT}/admin/rest_api/api?api=variables

http://{HOST}:{PORT}/admin/rest_api/api?api=variables&cmd=set&key=value&value=value&get=value&json&default=value&import=value&export=value&delete=value

For setting a variable like myVar1=myValue1 use

http://{HOST}:{PORT}/admin/rest_api/api?api=variables&cmd=set&key=myVar1&value=myValue1
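When building this URL programmatically, it is safer to let the standard library escape the key and value. A sketch (the `set_variable_url` helper is illustrative):

```python
import urllib.parse

def set_variable_url(host, port, key, value):
    # urlencode escapes any characters in the key/value that are
    # not safe to place in a query string.
    query = urllib.parse.urlencode(
        {"api": "variables", "cmd": "set", "key": key, "value": value})
    return "http://{}:{}/admin/rest_api/api?{}".format(host, port, query)
```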

connections

List/Add/Delete connections. Supports both http GET and POST methods.

Available in Airflow Version: 1.8.0 or greater

GET - http://{HOST}:{PORT}/admin/rest_api/api?api=connections

Query Arguments:

* list (optional) - boolean - List all connections

* add (optional) - string - Add a connection

* delete (optional) - boolean - Delete a connection

* conn_id (optional) - string - Connection id, required to add/delete a connection

* conn_uri (optional) - string - Connection URI, required to add a connection

* conn_type (optional) - string - Connection type, required to add a connection without conn_uri

* conn_extra (optional) - string - Connection 'Extra' field, optional when adding a connection

Examples:

http://{HOST}:{PORT}/admin/rest_api/api?api=connections

pause

Pauses a DAG. Supports both http GET and POST methods.

Available in Airflow Version: 1.7.0 or greater

GET - http://{HOST}:{PORT}/admin/rest_api/api?api=pause

Query Arguments:

* dag_id - string - The id of the dag

* subdir (optional) - string - File location or directory from which to look for the dag

Examples:

http://{HOST}:{PORT}/admin/rest_api/api?api=pause&dag_id=test_id

unpause

Resume a paused DAG. Supports both http GET and POST methods.

Available in Airflow Version: 1.7.0 or greater

GET - http://{HOST}:{PORT}/admin/rest_api/api?api=unpause

Query Arguments:

* dag_id - string - The id of the dag

* subdir (optional) - string - File location or directory from which to look for the dag

Examples:

http://{HOST}:{PORT}/admin/rest_api/api?api=unpause&dag_id=test_id

task_failed_deps

Returns the unmet dependencies for a task instance from the perspective of the scheduler (in other words, why a task instance doesn't get scheduled, queued by the scheduler, and then run by an executor). Supports both http GET and POST methods.

Available in Airflow Version: 1.8.0 or greater

GET - http://{HOST}:{PORT}/admin/rest_api/api?api=task_failed_deps

Query Arguments:

* dag_id - string - The id of the dag
     
* task_id - string - The id of the task

* execution_date - string - The execution date of the DAG (Example: 2017-01-02T03:04:05)

* subdir (optional) - string - File location or directory from which to look for the dag

Examples:

http://{HOST}:{PORT}/admin/rest_api/api?api=task_failed_deps&dag_id=value&task_id=value&execution_date=2017-01-02T03:04:05

trigger_dag

Triggers a Dag to Run. Supports both http GET and POST methods.

Available in Airflow Version: 1.6.0 or greater

GET - http://{HOST}:{PORT}/admin/rest_api/api?api=trigger_dag

Query Arguments:

* dag_id - The DAG ID of the DAG you want to trigger
     
* run_id (Optional) - The RUN ID to use for the DAG run

* conf (Optional) - Some configuration to pass to the DAG you trigger - (URL Encoded JSON)

Examples:

http://{HOST}:{PORT}/admin/rest_api/api?api=trigger_dag&dag_id=test_id

http://{HOST}:{PORT}/admin/rest_api/api?api=trigger_dag&dag_id=test_id&run_id=run_id_2016_01_01&conf=%7B%22key%22%3A%22value%22%7D
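Since conf must be URL-encoded JSON, it is easiest to build the query string in code. A sketch using only the standard library (the `trigger_dag_url` helper is illustrative):

```python
import json
import urllib.parse

def trigger_dag_url(host, port, dag_id, run_id=None, conf=None):
    # json.dumps serializes the conf dict; urlencode then URL-encodes it.
    params = {"api": "trigger_dag", "dag_id": dag_id}
    if run_id is not None:
        params["run_id"] = run_id
    if conf is not None:
        params["conf"] = json.dumps(conf, separators=(",", ":"))
    return "http://{}:{}/admin/rest_api/api?{}".format(
        host, port, urllib.parse.urlencode(params))
```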

test

Test a task instance. This will run a task without checking for dependencies or recording its state in the database.

Available in Airflow Version: 0.1 or greater

GET - http://{HOST}:{PORT}/admin/rest_api/api?api=test

Query Arguments:

* dag_id - string - The id of the dag
     
* task_id - string - The id of the task

* execution_date - string - The execution date of the DAG (Example: 2017-01-02T03:04:05)

* subdir (optional) - string - File location or directory from which to look for the dag

* dryrun (optional) - boolean - Perform a dry run

* task_params (optional) - string - Sends a JSON params dict to the task

Examples:

http://{HOST}:{PORT}/admin/rest_api/api?api=test&dag_id=value&task_id=value&execution_date=2017-01-02T03:04:05

dag_state

Get the status of a dag run. Supports both http GET and POST methods.

Available in Airflow Version: 1.8.0 or greater

GET - http://{HOST}:{PORT}/admin/rest_api/api?api=dag_state

Query Arguments:

* dag_id - string - The id of the dag

* execution_date - string - The execution date of the DAG (Example: 2017-01-02T03:04:05)

* subdir (optional) - string - File location or directory from which to look for the dag

Examples:

http://{HOST}:{PORT}/admin/rest_api/api?api=dag_state&dag_id=test_id&execution_date=2017-01-02T03:04:05

run

Run a single task instance. Supports both http GET and POST methods.

Available in Airflow Version: 1.0.0 or greater

GET - http://{HOST}:{PORT}/admin/rest_api/api?api=run

Query Arguments:

* dag_id - string - The id of the dag
     
* task_id - string - The id of the task

* execution_date - string - The execution date of the DAG (Example: 2017-01-02T03:04:05)

* subdir (optional) - string - File location or directory from which to look for the dag

* mark_success (optional) - boolean - Mark jobs as succeeded without running them

* force (optional) - boolean - Ignore previous task instance state; rerun regardless of whether the task already succeeded

* pool (optional) - string - Resource pool to use

* cfg_path (optional) - string - Path to config file to use instead of airflow.cfg

* local (optional) - boolean - Run the task using the LocalExecutor

* ignore_all_dependencies (optional) - boolean - Ignores all non-critical dependencies, including ignore_ti_state and ignore_task_deps

* ignore_dependencies (optional) - boolean - Ignore task-specific dependencies, e.g. upstream, depends_on_past, and retry delay dependencies

* ignore_depends_on_past (optional) - boolean - Ignore depends_on_past dependencies (but respect upstream dependencies)

* ship_dag (optional) - boolean - Pickles (serializes) the DAG and ships it to the worker

* pickle (optional) - string - Serialized pickle object of the entire dag (used internally)

Examples:

http://{HOST}:{PORT}/admin/rest_api/api?api=run&dag_id=value&task_id=value&execution_date=2017-01-02T03:04:05

list_tasks

List the tasks within a DAG. Supports both http GET and POST methods.

Available in Airflow Version: 0.1 or greater

GET - http://{HOST}:{PORT}/admin/rest_api/api?api=list_tasks

Query Arguments:

* dag_id - string - The id of the dag
     
* tree (optional) - boolean - Tree view

* subdir (optional) - string - File location or directory from which to look for the dag

Examples:

http://{HOST}:{PORT}/admin/rest_api/api?api=list_tasks&dag_id=test_id

backfill

Run subsections of a DAG for a specified date range. Supports both http GET and POST methods.

Available in Airflow Version: 0.1 or greater

GET - http://{HOST}:{PORT}/admin/rest_api/api?api=backfill

Query Arguments:

* dag_id - string - The id of the dag

* task_regex (optional) - string - The regex to filter specific task_ids to backfill

* start_date (optional) - string - Override start_date YYYY-MM-DD. Either this or the end_date needs to be provided.

* end_date (optional) - string - Override end_date YYYY-MM-DD. Either this or the start_date needs to be provided.

* mark_success (optional) - boolean - Mark jobs as succeeded without running them

* local (optional) - boolean - Run the task using the LocalExecutor

* donot_pickle (optional) - boolean - Do not attempt to pickle the DAG object to send over to the workers, just tell the workers to run their version of the code.

* include_adhoc (optional) - boolean - Include dags with the adhoc argument.

* ignore_dependencies (optional) - boolean - Ignore task-specific dependencies, e.g. upstream, depends_on_past, and retry delay dependencies

* ignore_first_depends_on_past (optional) - boolean - Ignores depends_on_past dependencies for the first set of tasks only (subsequent executions in the backfill DO respect depends_on_past)

* subdir (optional) - string - File location or directory from which to look for the dag

* pool (optional) - string - Resource pool to use

* dry_run (optional) - boolean - Perform a dry run

Examples:

http://{HOST}:{PORT}/admin/rest_api/api?api=backfill&dag_id=test_id

list_dags

List all the DAGs. Supports both http GET and POST methods.

Available in Airflow Version: 0.1 or greater

GET - http://{HOST}:{PORT}/admin/rest_api/api?api=list_dags

Query Arguments:

* subdir (optional) - string - File location or directory from which to look for the dag

* report (optional) - boolean - Show DagBag loading report

Examples:

http://{HOST}:{PORT}/admin/rest_api/api?api=list_dags

kerberos

Start a kerberos ticket renewer. Supports both http GET and POST methods.

Available in Airflow Version: 1.6.0 or greater

GET - http://{HOST}:{PORT}/admin/rest_api/api?api=kerberos

Query Arguments:

* principal (optional) - string - kerberos principal

* keytab (optional) - string - keytab

* pid (optional) - string - PID file location

* daemon (optional) - boolean - Daemonize instead of running in the foreground

* stdout (optional) - string - Redirect stdout to this file

* stderr (optional) - string - Redirect stderr to this file

* log-file (optional) - string - Location of the log file

Examples:

http://{HOST}:{PORT}/admin/rest_api/api?api=kerberos

worker

Start a Celery worker node. Supports both http GET and POST methods.

Available in Airflow Version: 0.1 or greater

GET - http://{HOST}:{PORT}/admin/rest_api/api?api=worker

Query Arguments:

* do_pickle (optional) - boolean - Attempt to pickle the DAG object to send over to the workers, instead of letting workers run their version of the code.

* queues (optional) - string - Comma delimited list of queues to serve

* concurrency (optional) - string - The number of worker processes

* pid (optional) - string - PID file location

* daemon (optional) - boolean - Daemonize instead of running in the foreground

* stdout (optional) - string - Redirect stdout to this file

* stderr (optional) - string - Redirect stderr to this file

* log-file (optional) - string - Location of the log file

Examples:

http://{HOST}:{PORT}/admin/rest_api/api?api=worker

flower

Start Flower, the web UI for monitoring Celery workers. Supports both http GET and POST methods.

Available in Airflow Version: 1.0.0 or greater

GET - http://{HOST}:{PORT}/admin/rest_api/api?api=flower

Query Arguments:

* hostname (optional) - string - Set the hostname on which to run the server

* port (optional) - string - The port on which to run the server

* flower_conf (optional) - string - Configuration file for flower

* broker_api (optional) - string - Broker api

* pid (optional) - string - PID file location

* daemon (optional) - boolean - Daemonize instead of running in the foreground

* stdout (optional) - string - Redirect stdout to this file

* stderr (optional) - string - Redirect stderr to this file

* log-file (optional) - string - Location of the log file

Examples:

http://{HOST}:{PORT}/admin/rest_api/api?api=flower

scheduler

Start a scheduler instance. Supports both http GET and POST methods.

Available in Airflow Version: 1.0.0 or greater

GET - http://{HOST}:{PORT}/admin/rest_api/api?api=scheduler

Query Arguments:

* dag_id (optional) - string - The id of the dag

* subdir (optional) - string - File location or directory from which to look for the dag

* run-duration (optional) - string - Set number of seconds to execute before exiting  

* num_runs (optional) - string -  Set the number of runs to execute before exiting

* do_pickle (optional) - boolean - Attempt to pickle the DAG object to send over to the workers, instead of letting workers run their version of the code.

* pid (optional) - string - PID file location

* daemon (optional) - boolean - Daemonize instead of running in the foreground

* stdout (optional) - string - Redirect stdout to this file

* stderr (optional) - string - Redirect stderr to this file

* log-file (optional) - string - Location of the log file

Examples:

http://{HOST}:{PORT}/admin/rest_api/api?api=scheduler

task_state

Get the status of a task instance. Supports both http GET and POST methods.

Available in Airflow Version: 1.0.0 or greater

GET - http://{HOST}:{PORT}/admin/rest_api/api?api=task_state

Query Arguments:

* dag_id - string - The id of the dag
     
* task_id - string - The id of the task

* execution_date - string - The execution date of the DAG (Example: 2017-01-02T03:04:05)

* subdir (optional) - string - File location or directory from which to look for the dag

Examples:

http://{HOST}:{PORT}/admin/rest_api/api?api=task_state&dag_id=value&task_id=value&execution_date=2017-01-02T03:04:05

pool

CRUD operations on pools. Supports both http GET and POST methods.

Available in Airflow Version: 1.8.0 or greater

GET - http://{HOST}:{PORT}/admin/rest_api/api?api=pool

Query Arguments:

* set (optional) - Set pool name, slot count and description respectively. This requires passing the `cmd`, `pool_name`, `slot_count` and `description` parameters. Please see the example below.
    * cmd - string - Only allowed value is `cmd=set`
    * pool_name - string - name of the Pool
    * slot_count - number - no.of slots in the Pool
    * description - string - description of the Pool

* get (optional) - string - Get pool info

* delete (optional) - string - Delete a pool

Examples:

http://{HOST}:{PORT}/admin/rest_api/api?api=pool

http://{HOST}:{PORT}/admin/rest_api/api?api=pool&cmd=set&pool_name=value&slot_count=value&description=value&get=value&delete=value

For setting a myTestpool with a slot count of 10 and with myTestpoolDescription description.

http://{HOST}:{PORT}/admin/rest_api/api?api=pool&cmd=set&pool_name=myTestpool&slot_count=10&description=myTestpoolDescription

serve_logs

Serve logs generated by the workers. Supports both http GET and POST methods.

Available in Airflow Version: 0.1 or greater

GET - http://{HOST}:{PORT}/admin/rest_api/api?api=serve_logs

Query Arguments:

None

Examples:

http://{HOST}:{PORT}/admin/rest_api/api?api=serve_logs

clear

Clear a set of task instances, as if they never ran. Supports both http GET and POST methods.

Available in Airflow Version: 0.1 or greater

GET - http://{HOST}:{PORT}/admin/rest_api/api?api=clear

Query Arguments:

* dag_id - string - The id of the dag

* task_regex (optional) - string - The regex to filter specific task_ids to clear

* start_date (optional) - string - Override start_date YYYY-MM-DD

* end_date (optional) - string - Override end_date YYYY-MM-DD

* subdir (optional) - string - File location or directory from which to look for the dag

* upstream (optional) - boolean - Include upstream tasks

* downstream (optional) - boolean - Include downstream tasks

* no_confirm (optional) - boolean - Do not request confirmation

* only_failed (optional) - boolean - Only failed jobs

* only_running (optional) - boolean - Only running jobs

* exclude_subdags (optional) - boolean - Exclude subdags

Examples:

http://{HOST}:{PORT}/admin/rest_api/api?api=clear

deploy_dag

Deploy a new DAG. Supports both http GET and POST methods.

Available in Airflow Version: None - Custom API

POST - http://{HOST}:{PORT}/admin/rest_api/api?api=deploy_dag

POST Body Arguments:

  • dag_file - file - Python file to upload and deploy

  • force (optional) - boolean - Whether to forcefully upload the file even if it already exists

  • pause (optional) - boolean - The DAG will be forced to be paused when created, overriding the 'dags_are_paused_at_creation' config

  • unpause (optional) - boolean - The DAG will be forced to be unpaused when created, overriding the 'dags_are_paused_at_creation' config

Examples:

Header: multipart/form-data

URL: http://{HOST}:{PORT}/admin/rest_api/api?api=deploy_dag

Body: dag_file=path_to_file&force=on

CURL Example:

curl -X POST -H 'Content-Type: multipart/form-data' -F 'dag_file=@/path/to/dag.py' -F 'force=on' http://{HOST}:{PORT}/admin/rest_api/api?api=deploy_dag
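A rough Python equivalent that hand-rolls the multipart body with only the standard library (a sketch; in practice most clients use the requests library for multipart uploads, and the `deploy_dag_request` helper name is illustrative):

```python
import urllib.request
import uuid

def deploy_dag_request(host, port, dag_bytes, filename, force=False):
    # Build a multipart/form-data POST for the deploy_dag endpoint.
    boundary = uuid.uuid4().hex
    parts = []
    if force:
        parts.append(("--{}\r\nContent-Disposition: form-data; "
                      'name="force"\r\n\r\non\r\n').format(boundary).encode())
    parts.append((("--{}\r\nContent-Disposition: form-data; "
                   'name="dag_file"; filename="{}"\r\n'
                   "Content-Type: text/x-python\r\n\r\n")
                  .format(boundary, filename)).encode()
                 + dag_bytes + b"\r\n")
    parts.append("--{}--\r\n".format(boundary).encode())
    url = "http://{}:{}/admin/rest_api/api?api=deploy_dag".format(host, port)
    return urllib.request.Request(
        url, data=b"".join(parts),
        headers={"Content-Type":
                 "multipart/form-data; boundary={}".format(boundary)})
```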

refresh_dag

Refresh a DAG. Supports both http GET and POST methods.

Available in Airflow Version: None - Custom API

GET - http://{HOST}:{PORT}/admin/rest_api/api?api=refresh_dag

Query Arguments:

* dag_id - string - The id of the dag

Examples:

http://{HOST}:{PORT}/admin/rest_api/api?api=refresh_dag&dag_id=test_id

refresh_all_dags

Refresh all DAGs. Supports both http GET and POST methods.

Available in Airflow Version: None - Custom API

GET - http://{HOST}:{PORT}/admin/rest_api/api?api=refresh_all_dags

Query Arguments:

None

Examples:

http://{HOST}:{PORT}/admin/rest_api/api?api=refresh_all_dags

API Response

The APIs all return a common response object. It is a JSON object with the following entries in it:

  • airflow_cmd - String - Airflow CLI command being run on the local machine
  • arguments - Dict - Dictionary with the arguments you passed in and their values
  • post_arguments - Dict - Dictionary with the post body arguments you passed in and their values
  • call_time - Timestamp - Time in which the request was received by the server
  • output - String - Text output from calling the CLI function
  • response_time - Timestamp - Time in which the response was sent back by the server
  • status - String - Response Status of the call. (possible values: OK, ERROR)
  • warning - String - A Warning message that's sent back from the API
  • http_response_code - Integer - HTTP Response code

Sample (result of calling the version endpoint):

{
  "airflow_cmd": "airflow version",
  "arguments": {},
  "call_time": "Tue, 29 Nov 2016 14:22:26 GMT",
  "http_response_code": 200,
  "output": "1.7.0",
  "response_time": "Tue, 29 Nov 2016 14:27:59 GMT",
  "status": "OK"
}
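Because every endpooint returns this same envelope, callers can check it uniformly. A sketch (the `check_response` helper is illustrative):

```python
import json

def check_response(body):
    # Parse the common response envelope and fail loudly on ERROR status.
    data = json.loads(body)
    if data.get("status") != "OK":
        raise RuntimeError("API call failed: {}".format(data.get("output")))
    return data["output"]
```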
