ARCHIVED
As of November 3rd, 2020, this repository is archived. Please consult the Dask Cloud docs page for more information on deploying Dask with cloud resources.
Dask EC2
Easily launch a cluster on Amazon EC2 configured with dask.distributed, Jupyter Notebooks, and Anaconda.
Installation
You can install dask-ec2 using pip:
$ pip install dask-ec2
You can also install dask-ec2 and its dependencies from the conda-forge repository using conda:
$ conda install dask-ec2 -c conda-forge
Usage
Note: dask-ec2 uses boto3 to interact with Amazon EC2. You can configure your AWS credentials using environment variables or configuration files.
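For the environment-variable route, a minimal sketch follows. The variable names AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are the standard ones boto3 reads; the values shown are placeholders, not working credentials, and should be replaced with your own.

```shell
# Placeholder values (assumptions for illustration); substitute your own keys.
export AWS_ACCESS_KEY_ID="YOUR_ACCESS_KEY_ID"
export AWS_SECRET_ACCESS_KEY="YOUR_SECRET_ACCESS_KEY"
```

Exported variables only last for the current shell session. The configuration-file route usually keeps the same two values in ~/.aws/credentials instead, which persists across sessions.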
The dask-ec2 up command can be used to create and provision a cluster on Amazon EC2:
$ dask-ec2 up --help
Usage: dask-ec2 up [OPTIONS]

Options:
  --keyname TEXT                Keyname on EC2 console [required]
  --keypair PATH                Path to the keypair that matches the keyname
                                [required]
  --name TEXT                   Tag name on EC2
  --tags TEXT                   Additional EC2 tags. Comma separated K:V
                                pairs: K1:V1,K2:V2
  --region-name TEXT            AWS region [default: us-east-1]
  --vpc-id TEXT                 EC2 VPC ID
  --subnet-id TEXT              EC2 Subnet ID on the VPC
  --iaminstance-name TEXT       IAM Instance Name
  --ami TEXT                    EC2 AMI [default: ami-d05e75b8]
  --username TEXT               User to SSH to the AMI [default: ubuntu]
  --type TEXT                   EC2 Instance Type [default: m3.2xlarge]
  --count INTEGER               Number of nodes [default: 4]
  --security-group TEXT         Security Group Name [default: dask-ec2-default]
  --security-group-id TEXT      Security Group ID (overwrites Security Group
                                Name)
  --volume-type TEXT            Root volume type [default: gp2]
  --volume-size INTEGER         Root volume size (GB) [default: 500]
  --file PATH                   File to save the metadata [default:
                                cluster.yaml]
  --provision / --no-provision  Provision salt on the nodes [default: True]
  --anaconda / --no-anaconda    Bootstrap anaconda [default: True]
  --dask / --no-dask            Install Dask.Distributed in the cluster
                                [default: True]
  --notebook / --no-notebook    Start a Jupyter Notebook in the head node
                                [default: True]
  --nprocs INTEGER              Number of processes per worker [default: 1]
  --source / --no-source        Install Dask/Distributed from git master
                                [default: False]
  -h, --help                    Show this message and exit.
The minimal required arguments for the dask-ec2 up command are:
$ dask-ec2 up --keyname my_aws_key --keypair ~/.ssh/my_aws_key.pem
This will create a cluster.yaml file in the directory where the command was executed. The other commands in the CLI require this file.
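Beyond the two required arguments, the options listed in the help output can be combined on one command line. The sketch below uses only flags shown above; the key name, key path, cluster name, tags, and node count are illustrative assumptions, and the run helper is a hypothetical wrapper that prints the command instead of launching instances unless DRY_RUN=0 is set.

```shell
# Illustrative values only; substitute your own key name, key file, and tags.
KEYNAME="my_aws_key"
KEYPAIR="$HOME/.ssh/my_aws_key.pem"
TAGS="project:demo,owner:me"   # comma-separated K:V pairs, as --tags expects

# Hypothetical helper: print the command by default so it can be reviewed;
# set DRY_RUN=0 to actually execute it (which launches billable instances).
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

run dask-ec2 up \
  --keyname "$KEYNAME" \
  --keypair "$KEYPAIR" \
  --name demo-cluster \
  --tags "$TAGS" \
  --count 8 \
  --no-notebook
```

Printing first is a deliberate choice here: dask-ec2 up starts real EC2 instances, so reviewing the assembled command before running it with DRY_RUN=0 avoids launching a cluster with the wrong size or tags.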
The dask-ec2 command provides subcommands to create or destroy a cluster, SSH into nodes, and run the individual provisioning steps:
$ dask-ec2
Usage: dask-ec2 [OPTIONS] COMMAND [ARGS]...

Options:
  --version   Show the version and exit.
  -h, --help  Show this message and exit.

Commands:
  anaconda          Provision anaconda
  dask-distributed  dask.distributed option
  destroy           Destroy cluster
  notebook          Provision the Jupyter notebook
  provision         Provision salt instances
  ssh               SSH to one of the node. 0-index
  up                Launch instances
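Because the commands other than up depend on the cluster.yaml metadata file, it can help to check for it before invoking them. The following is a minimal sketch, assuming the default file name cluster.yaml was used with dask-ec2 up; dask_ec2_guarded is a hypothetical helper, not part of dask-ec2.

```shell
# Hypothetical helper: refuse to call a dask-ec2 subcommand when
# cluster.yaml is missing from the current directory.
dask_ec2_guarded() {
  if [ ! -f cluster.yaml ]; then
    echo "cluster.yaml not found; run 'dask-ec2 up' in this directory first" >&2
    return 1
  fi
  dask-ec2 "$@"
}
```

For example, dask_ec2_guarded destroy would only call dask-ec2 destroy when the metadata file is present in the current directory.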