  • Language: Python
  • License: MIT License
  • Created almost 5 years ago
  • Updated 11 months ago

Repository Details

Collection of AWS SSM Documents to perform Chaos Engineering experiments

Chaos Injection for AWS resources using Amazon SSM Run Command and Automation

Collection of SSM Documents.

These documents let you perform chaos engineering experiments on resources (applications, network, and infrastructure) in the AWS Cloud.

SSM Automation documents:

To use SSM Automation, see the AWS Systems Manager Automation documentation.

  • Support for (randomly) stopping EC2 instances via API
  • Support for (randomly) stopping EC2 instances via AWS Lambda
  • Support for (randomly) terminating EC2 instances via API
  • Support for detaching EBS volumes from EC2 instances via API (ec2, ebs)
  • Support for rebooting RDS instance with proper tags via API
  • Support for CPU stress scenario via Run Command

Upload an SSM Automation document:

aws ssm create-document --name "StopRandomInstances-API" --content file://stop-random-instance-api.yml --document-type "Automation" --document-format YAML
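The same upload can be scripted from Python. The sketch below is one possible way to do it, assuming a boto3-style SSM client is passed in (for example `boto3.client("ssm")`). The helper names are hypothetical. The keyword arguments mirror the CLI flags shown above.

```python
from pathlib import Path


def build_create_document_args(path, name, document_type):
    """Build keyword arguments for SSM create_document from a local YAML file."""
    # Only the two document types used in this README are accepted here.
    if document_type not in ("Automation", "Command"):
        raise ValueError(f"unsupported document type: {document_type}")
    return {
        "Content": Path(path).read_text(encoding="utf-8"),
        "Name": name,
        "DocumentType": document_type,
        "DocumentFormat": "YAML",
    }


def upload_document(ssm_client, path, name, document_type):
    """Create the document using an injected client, e.g. boto3.client("ssm")."""
    return ssm_client.create_document(
        **build_create_document_args(path, name, document_type)
    )
```

With boto3 installed and credentials configured, `upload_document(boto3.client("ssm"), "stop-random-instance-api.yml", "StopRandomInstances-API", "Automation")` would correspond to the CLI call above.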

Upload all of the SSM Documents to the AWS region of your choice

cd chaos-ssm-documents/run-command

./upload-document.sh -r eu-west-1   # or another region of your choice

SSM Run Command documents:

To use SSM Run Command, see the AWS Systems Manager Run Command documentation.

The documents below support canceling and rollback (10s max):

  • Support for killing a process by name using kill-process.yml
  • Support for CPU stress using cpu-stress.yml
  • Support for IO stress using io-stress.yml
  • Support for memory stress using memory-stress.yml
  • Support for diskspace stress using diskspace-stress.yml
  • Support for latency injection to network traffic on a particular network interface using latency-stress.yml
  • Support for latency injection with jitter to outgoing or incoming traffic from a configurable list of sources (Supported: IPv4, IPv4/CIDR, Domain name, DYNAMODB|S3) using latency-stress-sources.yml
  • Support for packet loss injection to network traffic on a particular network interface using network-loss-stress.yml
  • Support for packet loss injection to outgoing or incoming traffic from a configurable list of sources (Supported: IPv4, IPv4/CIDR, Domain name, DYNAMODB|S3) using network-loss-sources.yml
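Once a Run Command document is uploaded, an experiment is started by sending the command to target instances, and it can be stopped early by canceling it. The following is a minimal sketch assuming a boto3-style SSM client is injected (for example `boto3.client("ssm")`). The helper names are hypothetical, and any parameter names you pass (such as `duration`) are assumptions: check each document's `parameters` section for the real ones.

```python
def build_send_command_args(document_name, instance_ids, parameters=None):
    """Build keyword arguments for SSM send_command."""
    args = {"DocumentName": document_name, "InstanceIds": list(instance_ids)}
    if parameters:
        # SSM expects every parameter value as a list of strings.
        args["Parameters"] = {key: [str(value)] for key, value in parameters.items()}
    return args


def run_experiment(ssm_client, document_name, instance_ids, parameters=None):
    """Send the command and return its CommandId for later tracking."""
    response = ssm_client.send_command(
        **build_send_command_args(document_name, instance_ids, parameters)
    )
    return response["Command"]["CommandId"]


def cancel_experiment(ssm_client, command_id):
    """Ask SSM to cancel a running command (the documents handle rollback)."""
    ssm_client.cancel_command(CommandId=command_id)
```

Keeping the returned command ID around is what makes cancellation possible if the experiment needs to be aborted.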

Experimental

  • Support for blackhole S3 stress using blackhole-s3-stress.yml
  • Support for blackhole DynamoDB stress using blackhole-dynamo-stress.yml
  • Support for blackhole EC2 stress using blackhole-ec2-stress.yml

Prerequisites

Upload one document at a time

cd chaos-ssm-documents/run-command

aws ssm create-document --content file://cpu-stress.yml --name "cpu-stress" --document-type "Command" --document-format YAML

Upload all of the SSM Documents to the AWS region of your choice

cd chaos-ssm-documents/run-command

./upload-document.sh -r eu-west-1   # or another region of your choice
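For readers who prefer Python over the shell script, a bulk upload could look roughly like the sketch below. It is not a port of upload-document.sh, whose exact behaviour is not shown here. It assumes each `.yml` file in the directory is one document, that the document name is the file name without its extension (as with `cpu-stress.yml` above), and that a boto3-style SSM client for the chosen region is passed in.

```python
from pathlib import Path


def plan_uploads(directory):
    """Return (document_name, path) pairs for every .yml file in the directory."""
    return [(path.stem, path) for path in sorted(Path(directory).glob("*.yml"))]


def upload_all(ssm_client, directory, document_type="Command"):
    """Create one SSM document per .yml file and return the names uploaded."""
    uploaded = []
    for name, path in plan_uploads(directory):
        ssm_client.create_document(
            Content=path.read_text(encoding="utf-8"),
            Name=name,
            DocumentType=document_type,
            DocumentFormat="YAML",
        )
        uploaded.append(name)
    return uploaded
```

The region is selected when the client is created, for example `boto3.client("ssm", region_name="eu-west-1")`.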

SOME WORDS OF CAUTION BEFORE YOU START BREAKING THINGS:

  • To begin with, DO NOT use these chaos injection commands in production blindly!!
  • Always review the SSM documents and the commands in them.
  • Make sure your first chaos injections are done in a test environment and on test instances where no real, paying customers can be affected.
  • Test, test, and test more. Remember that chaos engineering is about breaking things in a controlled environment, through well-planned experiments, to build confidence that your application (and your own tools) can withstand turbulent conditions.
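One way to make the "test environment first" advice harder to skip is a pre-flight check that refuses to run against instances not tagged as test targets. The sketch below is only an illustration: the `Environment` tag key, the allowed value `test`, and the function names are assumptions, and the input is the list of instance dictionaries as returned inside an EC2 `describe_instances` response.

```python
def non_test_instances(instances, tag_key="Environment", allowed_values=("test",)):
    """Return the IDs of instances that are not tagged as test targets."""
    flagged = []
    for instance in instances:
        tags = {tag["Key"]: tag["Value"] for tag in instance.get("Tags", [])}
        if tags.get(tag_key) not in allowed_values:
            flagged.append(instance["InstanceId"])
    return flagged


def assert_safe_targets(instances, tag_key="Environment", allowed_values=("test",)):
    """Raise before any chaos injection if a target is not a test instance."""
    flagged = non_test_instances(instances, tag_key, allowed_values)
    if flagged:
        raise RuntimeError(
            "refusing to inject chaos into non-test instances: " + ", ".join(flagged)
        )
```

Calling `assert_safe_targets(...)` before sending any command turns an accidental production run into an immediate, visible failure.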
