  • Language
    Kotlin
  • License
    Apache License 2.0

Amazon's light-weight library for chaos engineering on AWS. It can be used for EC2 and ECS (with EC2 launch type).

AWSSSMChaosRunner

AWSSSMChaosRunner is a library that simplifies failure injection testing and chaos engineering for EC2 and ECS (with EC2 launch type). It offers two options for failure injection, each covered in its own usage section below: AWS Systems Manager SendCommand and AWS FIS.

For an in-depth introduction to this library and how Prime Video uses it, see https://aws.amazon.com/blogs/opensource/building-resilient-services-at-prime-video-with-chaos-engineering/

Usage with AWS Systems Manager SendCommand

  1. Set up permissions for calling SSM from the tests package

    This can be done in many different ways. The approach described here generates temporary credentials for AWS SSM on each run of the tests. To enable this, the following are needed:

    • An IAM role with the following permissions. (JSON snippet)
      {
          "Version": "2012-10-17",
          "Statement": [
              {
                  "Action": [
                      "sts:AssumeRole",
                      "ssm:CancelCommand",
                      "ssm:CreateDocument",
                      "ssm:DeleteDocument",
                      "ssm:DescribeDocument",
                      "ssm:DescribeInstanceInformation",
                      "ssm:DescribeDocumentParameters",
                      "ssm:DescribeInstanceProperties",
                      "ssm:GetDocument",
                      "ssm:ListTagsForResource",
                      "ssm:ListDocuments",
                      "ssm:ListDocumentVersions",
                      "ssm:SendCommand"
                  ],
                  "Resource": [
                      "*"
                  ],
                  "Effect": "Allow"
              },
              {
                  "Action": [
                      "ec2:DescribeInstances",
                      "iam:PassRole",
                      "iam:ListRoles"
                  ],
                  "Resource": [
                      "*"
                  ],
                  "Effect": "Allow"
              },
              {
                  "Action": [
                      "ssm:StopAutomationExecution",
                      "ssm:StartAutomationExecution",
                      "ssm:DescribeAutomationExecutions",
                      "ssm:GetAutomationExecution"
                  ],
                  "Resource": [
                      "*"
                  ],
                  "Effect": "Allow"
              }
          ]
      }
    • An IAM user which can assume the above role.
  2. Add the AWSSSMChaosRunner Maven dependency to your tests package

    <dependency>
      <groupId>software.amazon.awsssmchaosrunner</groupId>
      <artifactId>awsssmchaosrunner</artifactId>
      <version>1.3.0</version>
    </dependency> 
    
  3. Initialise the SSM Client (Kotlin snippet)

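    import com.amazonaws.auth.AWSCredentialsProvider
    import com.amazonaws.auth.STSAssumeRoleSessionCredentialsProvider
    import com.amazonaws.services.securitytoken.AWSSecurityTokenService
    import com.amazonaws.services.securitytoken.AWSSecurityTokenServiceClientBuilder
    import com.amazonaws.services.simplesystemsmanagement.AWSSimpleSystemsManagement
    import com.amazonaws.services.simplesystemsmanagement.AWSSimpleSystemsManagementClientBuilder
    import org.springframework.context.annotation.Bean // assumes Spring provides @Bean
    ...

    // STS client used to assume the chaos runner role.
    @Bean
    open fun awsSecurityTokenService(
        credentialsProvider: AWSCredentialsProvider,
        awsRegion: String
    ): AWSSecurityTokenService {
        return AWSSecurityTokenServiceClientBuilder.standard()
            .withCredentials(credentialsProvider)
            .withRegion(awsRegion)
            .build()
    }

    // SSM client backed by temporary credentials from the assumed role.
    @Bean
    open fun awsSimpleSystemsManagement(
        securityTokenService: AWSSecurityTokenService,
        awsAccountId: String,
        chaosRunnerRoleName: String
    ): AWSSimpleSystemsManagement {
        val chaosRunnerRoleArn = "arn:aws:iam::$awsAccountId:role/$chaosRunnerRoleName"
        val credentialsProvider = STSAssumeRoleSessionCredentialsProvider
            .Builder(chaosRunnerRoleArn, "ChaosRunnerSession")
            .withStsClient(securityTokenService)
            .build()

        return AWSSimpleSystemsManagementClientBuilder.standard()
            .withCredentials(credentialsProvider)
            .build()
    }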
  4. Start the fault injection attack before starting the test and stop it after the test (Kotlin snippet)

    import software.amazon.awsssmchaosrunner.attacks.SSMAttack
    import software.amazon.awsssmchaosrunner.attacks.SSMAttack.Companion.getAttack
    ...
    
    @Before
    override fun initialise(args: Array<String>) {
        if (shouldExecuteChaosRunner()) {
            ssm = applicationContext.getBean(AWSSimpleSystemsManagement::class.java)
            ssmAttack = getAttack(ssm, attackConfiguration)
            command = ssmAttack.start()
        }
    }
    
    @After
    override fun destroy() {
        // Only stop the attack if initialise() actually started one.
        if (shouldExecuteChaosRunner()) {
            ssmAttack.stop(command)
        }
    }
  5. Run the test
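
The start/stop pairing in step 4 can also be expressed with try/finally, so that the attack is stopped even when the test body throws. The following is a minimal Java sketch of that pattern only. The Attack interface and the runWithAttack helper are hypothetical stand-ins written for this illustration; they are not the library's SSMAttack API.

```java
import java.util.ArrayList;
import java.util.List;

final class AttackLifecycle {

    /** Hypothetical stand-in for an attack handle; not the library's SSMAttack API. */
    interface Attack {
        String start();            // begins the attack and returns an identifier for it
        void stop(String handle);  // ends the attack identified by the handle
    }

    /** Runs the test body between start() and stop(); stop() runs even if the body throws. */
    static void runWithAttack(Attack attack, Runnable testBody) {
        String handle = attack.start();
        try {
            testBody.run();
        } finally {
            attack.stop(handle);
        }
    }

    public static void main(String[] args) {
        List<String> events = new ArrayList<>();
        // A recording fake that logs calls instead of touching AWS.
        Attack recording = new Attack() {
            public String start() { events.add("start"); return "cmd-1"; }
            public void stop(String handle) { events.add("stop " + handle); }
        };
        try {
            runWithAttack(recording, () -> {
                events.add("test");
                throw new IllegalStateException("test failed");
            });
        } catch (IllegalStateException expected) {
            events.add("failure propagated");
        }
        System.out.println(events); // [start, test, stop cmd-1, failure propagated]
    }
}
```

The same shape applies in Kotlin with try/finally around the test body; test-framework hooks such as @Before and @After achieve a similar effect when the framework guarantees that the teardown hook runs.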

Usage with AWS FIS

  1. Set up permissions for calling SSM from the tests package

    This can be done in many different ways. The approach described here generates temporary credentials for AWS SSM on each run of the tests. To enable this, the following are needed:

    • An IAM role with the following permissions. (JSON snippet)
      {
          "Version": "2012-10-17",
          "Statement": [
              {
                  "Action": [
                      "sts:AssumeRole",
                      "ec2:DescribeInstances",
                      "iam:ListRoles",
                      "ssm:ListCommands",
                      "ssm:SendCommand",
                      "ssm:CancelCommand",
                      "iam:PassRole",
                      "ec2:RebootInstances",
                      "ec2:StopInstances",
                      "ec2:StartInstances",
                      "ec2:TerminateInstances",
                      "fis:InjectApiInternalError",
                      "fis:InjectApiThrottleError",
                      "fis:InjectApiUnavailableError",
                      "fis:ListExperimentTemplates",
                      "fis:ListActions",
                      "fis:ListTargetResourceTypes",
                      "fis:ListExperiments",
                      "fis:GetTargetResourceType",
                      "fis:CreateExperimentTemplate",
                      "fis:DeleteExperimentTemplate",
                      "fis:StopExperiment",
                      "fis:StartExperiment"                    
                  ],
                  "Resource": [
                      "*"
                  ],
                  "Effect": "Allow"
              },
              {
                  "Action": [
                      "iam:CreateServiceLinkedRole"
                  ],
                  "Resource": [
                      "*"
                  ],
                  "Effect": "Allow",
                  "Condition": {
                      "StringEquals": {
                          "iam:AWSServiceName": "fis.amazonaws.com"
                      }
                  }
              }
          ]
      }
    • An IAM user which can assume the above role.
  2. Add the AWSSSMChaosRunner Maven dependency to your tests package

    <dependency>
      <groupId>software.amazon.awsssmchaosrunner</groupId>
      <artifactId>awsssmchaosrunner</artifactId>
      <version>1.3.0</version>
    </dependency> 
    
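  3. Initialise the FIS Client (Java snippet)

    // Java code snippet
    import static java.lang.System.getenv;

    import software.amazon.awssdk.regions.Region;
    import software.amazon.awssdk.services.fis.FisClient;
    import software.amazon.awssdk.services.sts.StsClient;
    import software.amazon.awssdk.services.sts.auth.StsAssumeRoleCredentialsProvider;
    import software.amazon.awssdk.services.sts.model.AssumeRoleRequest;
    ...

    String executionRoleArn = getenv("EXECUTION_ROLE_ARN");
    Region awsRegion = Region.of(getenv("AWS_REGION"));

    // Assume the execution role through STS to obtain temporary credentials.
    StsClient stsClient = StsClient.builder().build();
    StsAssumeRoleCredentialsProvider assumeRoleCredentialsProvider = StsAssumeRoleCredentialsProvider.builder()
            .refreshRequest(AssumeRoleRequest.builder()
                    .roleArn(executionRoleArn)
                    .roleSessionName("ChaosRunnerSession")
                    .build())
            .stsClient(stsClient)
            .build();
    FisClient fisClient = FisClient.builder()
            .credentialsProvider(assumeRoleCredentialsProvider)
            .region(awsRegion)
            .build();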
  4. Configure and execute the FIS failure injection

    // NOTE: `targets` (the resources to attack) is not defined in this snippet; supply it from your own test setup.
    String targetsSelectionMode = "ALL";
    String cloudWatchLogGroupArn = "";
    String stopConditionCloudWatchAlarmArn = "";
    String name = "IOStress"; // This failure injection consumes disk space
    String duration = "PT2M"; // ISO-8601 duration: 2 minutes
    Map<String, String> otherFailureInjectionParameters = Collections.emptyMap();

    FISAttack.Companion.AttackConfiguration attackConfiguration = new FISAttack.Companion.AttackConfiguration(targets,
            targetsSelectionMode,
            cloudWatchLogGroupArn,
            stopConditionCloudWatchAlarmArn,
            executionRoleArn);
    FISSendCommandAttack.Companion.ActionConfiguration actionConfiguration = new FISSendCommandAttack.Companion.ActionConfiguration(
            name,
            duration,
            awsRegion.toString(),
            otherFailureInjectionParameters
    );
    FISAttack fisAttack = FISSendCommandAttack.Companion.getAttack(fisClient, attackConfiguration, actionConfiguration);
    StartExperimentResponse experiment = fisAttack.start();
    ...
    ...
    boolean deleteExperimentTemplate = false;
    fisAttack.stop(experiment, deleteExperimentTemplate);
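
The duration value "PT2M" in step 4 has the shape of an ISO-8601 duration (2 minutes). Assuming that format, a test can validate the string locally before starting an experiment. The sketch below uses only java.time from the Java standard library. The FisDurationCheck class and its parsePositive helper are hypothetical names for this illustration and are not part of AWSSSMChaosRunner.

```java
import java.time.Duration;

final class FisDurationCheck {

    /** Parses an ISO-8601 duration such as "PT2M" and rejects zero or negative values. */
    static Duration parsePositive(String iso8601) {
        // Duration.parse throws DateTimeParseException on malformed input.
        Duration parsed = Duration.parse(iso8601);
        if (parsed.isZero() || parsed.isNegative()) {
            throw new IllegalArgumentException("duration must be positive: " + iso8601);
        }
        return parsed;
    }

    public static void main(String[] args) {
        Duration twoMinutes = parsePositive("PT2M");
        System.out.println(twoMinutes.getSeconds() + " seconds"); // 120 seconds

        // Going the other way: build the string from a Duration instead of hand-writing it.
        System.out.println(Duration.ofMinutes(5)); // PT5M
    }
}
```

Building the string with Duration.ofMinutes(...).toString() avoids typos such as "P2M", which parses as a period of months in ISO-8601 and is rejected by Duration.parse.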

FAQs

  • What about Chaos-SSM-Documents (GitHub repo) ?

    The idea for AWSSSMChaosRunner came from the Chaos-SSM-Documents repository and from a Medium post.

  • Why use AWS SSM ?

    In most cases, EC2 fleets already use the SSM Agent for OS patching. This library leverages that existing agent, which reduces the setup work needed for fault injection.

  • What failure injections are available ?

  • What about other failure injections ?

    You're welcome to send pull requests for other failure injections.

  • How is the failure injection rolled back ? / What if AWS SSM fails to stop the failure injection ?

    SSM is not actually used to stop or roll back the failure injection. The failure injection scripts first schedule the rollback (with the at command) and only then start the actual failure injection. This ensures that, barring special cases, the failure injection is rolled back at a specified time in the future, even if AWS SSM fails to stop it.

  • What languages does AWSSSMChaosRunner support ?

    AWSSSMChaosRunner can be used as a dependency from Kotlin, Java or Scala.

  • Is there a complete working demo of using this library ?

    A demonstration can be found in Demo.kt. To run the demo:

    • Clone this project.

    • Build the Gradle project successfully (via the Gradle CLI or an IDE).

    • Set the awsProfile value to the profile name for your AWS account.

    • Comment out the @Disabled() annotation.

    • Run the Gradle test target.

  • Can AWSSSMChaosRunner be used for Amazon Elastic Container Service (ECS) ?

    Yes. Follow the EC2 usage steps above after completing the SSM Agent setup described below.

    • ECS + EC2 launch type
      • SSM Agent setup

        The SSM Agent is required for using the SSM SendCommand API and thus for using AWSSSMChaosRunner. The base EC2 images include the SSM Agent, but the base ECS images do not. It can be installed directly at the host level, for example with the following CloudFormation snippet (YAML):

        # Adapted from https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/quickref-ecs.html
          LaunchConfiguration0:
            Type: AWS::AutoScaling::LaunchConfiguration
            Metadata:
              # This is processed by cfn-init in the Properties.UserData script below. It installs a
              # service that monitors for changes in the Metadata just below, causing a configuration
              # update.
              #
              # CloudFormation updates to the LaunchConfiguration's Properties won't take effect on
              # existing instances. Consequently, any CloudFormation field that could change should go in
              # the Metadata.
              AWS::CloudFormation::Init:
                config:
                  # https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-init.html
                  packages:
                    rpm:
                      # The SSM (AWS Systems Manager) agent is necessary to use `aws ssm send-command`
                      # or 'Run Command' in the AWS-EC2 console. It's also required by InfoSec for our
                      # exception. The base EC2 images include it, but the base ECS images do not.
                      # https://docs.aws.amazon.com/systems-manager/latest/userguide/sysman-install-startup-linux.html
                      amazon-ssm-agent: !Sub https://s3.${AWS::Region}.amazonaws.com/amazon-ssm-${AWS::Region}/latest/linux_amd64/amazon-ssm-agent.rpm
      • Possible failure injections

        The SSM SendCommand API runs the underlying failure injection commands directly on the EC2 host, which affects all tasks running on that host. The EC2 + ECS host does not impose any additional restrictions on which resources can be accessed. Thus, all AWSSSMChaosRunner attacks can be run on EC2 + ECS.

  • Can AWSSSMChaosRunner be used for AWS Lambda ?

    No.
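
Returning to the rollback question in the FAQ above: the ordering it describes (schedule the rollback with at first, then inject the failure) can be sketched as a small script builder. This is an illustration of the ordering only, not the library's actual scripts. The RollbackFirstScript class is hypothetical, and the tc commands, the eth0 interface name, and the 2-minute delay in the usage example are assumptions chosen for the example.

```java
import java.util.List;

final class RollbackFirstScript {

    /**
     * Returns shell lines in rollback-first order: the first line schedules the
     * rollback with `at`, the second line starts the failure injection.
     */
    static List<String> build(String injectCommand, String rollbackCommand, int delayMinutes) {
        if (delayMinutes <= 0) {
            throw new IllegalArgumentException("delayMinutes must be positive");
        }
        // Keep quoting trivial for this sketch: the rollback command is wrapped in single quotes.
        if (rollbackCommand.contains("'")) {
            throw new IllegalArgumentException("rollback command must not contain single quotes");
        }
        String scheduleRollback =
                "echo '" + rollbackCommand + "' | at now + " + delayMinutes + " minutes";
        return List.of(scheduleRollback, injectCommand);
    }

    public static void main(String[] args) {
        // Illustrative commands only: add network latency, and remove it again on rollback.
        List<String> script = build(
                "tc qdisc add dev eth0 root netem delay 200ms",
                "tc qdisc del dev eth0 root netem",
                2);
        script.forEach(System.out::println);
    }
}
```

Because the rollback is queued on the host before the injection starts, it does not depend on a later call from the test reaching the instance, which is the property the FAQ answer relies on.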
