Chaos QoaLa
ChaosQoaLa is a chaos engineering tool for injecting failure into JavaScript-backed GraphQL endpoints. It can add latency to a GraphQL server's responses, knock out specific data sections of those responses, or both. The extent of the chaos effect is controlled via a "blast radius" configuration parameter, so only a specified portion of responses is affected.
There are three fundamental components. The "Agent" is a piece of middleware you integrate into your GraphQL server implementation. It sits in the server and causes chaos on demand when it is sent instructions to do so. The "Controller" is a CLI tool that is installed on a test user's personal machine/VM and is used to configure, run, and stop a chaos experiment. The "Site" (www.chaosqoala.io) is used to upload results files generated by the Controller for visualization.
Supported GraphQL Servers
- Express GraphQL
- Apollo Server Express
Installation
Agent
# npm install chaosqoaloa-agent
Once installed in your server's codebase, the Agent can be integrated in three lines of code (between the koalas).
Express GraphQL example:
Apollo Server Express example:
Controller
Install on each test user's machine.
# npm install chaosqoaloa
Steady State Integration
A key metric in chaos engineering experiments is 'steady state', a measure of how well a system is performing. Different systems are measured by different metrics; for example, Netflix uses 'number of plays' to evaluate how well its service is performing. ChaosQoaLa requires access to an endpoint that can be called when the chaos experiment starts, and again when the experiment ends. The endpoint must be stateful and must respond to the stop invocation with the steady state data points for the duration of the test run. The results are expected to be a JSON array in the format:
[ { "timeOfResult": "2019-09-10T04:34:22.290Z", "result": 97 },
{ "timeOfResult": "2019-09-10T04:34:22.390Z", "result": 79 },
{.... ]
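As a rough illustration of the stateful behaviour described above, here is a minimal sketch of a steady state service built on Node's built-in `http` module. The names `createSteadyStateRecorder` and `createSteadyStateServer`, the `/steadystate/start` and `/steadystate/stop` paths, and the 204 reply to the start call are all assumptions made for this example; only the `timeOfResult`/`result` shape comes from the format shown above.

```javascript
const http = require('http');

// Hypothetical in-memory recorder: collects data points between start and stop.
function createSteadyStateRecorder() {
  let recording = false;
  let points = [];
  return {
    // Begin a new test run and discard points from any previous run.
    start() {
      recording = true;
      points = [];
    },
    // Call this from your own metrics code whenever a measurement is available.
    // Measurements taken while no run is active are ignored.
    record(result, time = new Date()) {
      if (recording) {
        points.push({ timeOfResult: time.toISOString(), result });
      }
    },
    // End the run and return every point collected during it.
    stop() {
      recording = false;
      return points;
    },
  };
}

// Wraps the recorder in an HTTP server exposing a start route and a stop route.
// The route paths and status codes are assumptions, not a ChaosQoaLa requirement.
function createSteadyStateServer(recorder) {
  return http.createServer((req, res) => {
    if (req.url === '/steadystate/start') {
      recorder.start();
      res.writeHead(204);
      res.end();
    } else if (req.url === '/steadystate/stop') {
      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify(recorder.stop()));
    } else {
      res.writeHead(404);
      res.end();
    }
  });
}

// Usage (port 3000 is an arbitrary choice):
// const recorder = createSteadyStateRecorder();
// createSteadyStateServer(recorder).listen(3000);
```

This sketch ignores the HTTP verb for brevity; a real service would probably check `req.method` against the verbs entered during configuration.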
Using ChaosQoaLa
Configuring a test run
To configure a test run, enter the following command on a user's machine with the Controller installed:
# chaosqoala configure
The CLI tool will display a series of questions:
Please enter the URI of the Socket.io port over which Chaos will be sent and received
By default the Agent listens on port 1025 of the GraphQL server in which it has been installed. Enter the fully qualified path of port 1025 on the GraphQL server, for example https://gql.example.com:1025
Please enter the URI of GraphQL service
Enter the fully qualified path of the GraphQL server, for example https://gql.example.com
Please enter your desired blast radius
Enter a number between 0.0 and 1.0 to represent the fraction of responses that will be affected by chaos. For example, a value of 0.33 results in 33% of responses from the server being subjected to failure injection.
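To make the blast radius concrete, the following sketch shows one way a per-response decision could be made. It is not taken from the Agent's source; `isInBlastRadius` and its injectable `random` parameter are hypothetical names used for illustration.

```javascript
// Decide whether a single response falls inside the blast radius.
// `random` is injectable so the decision can be tested deterministically;
// by default it is Math.random, which returns a value in [0, 1).
function isInBlastRadius(blastRadius, random = Math.random) {
  const valid =
    typeof blastRadius === 'number' &&
    !Number.isNaN(blastRadius) &&
    blastRadius >= 0 &&
    blastRadius <= 1;
  if (!valid) {
    throw new RangeError('blast radius must be a number between 0.0 and 1.0');
  }
  // A radius of 0 never matches and a radius of 1 always matches.
  return random() < blastRadius;
}
```

With a blast radius of 0.33, roughly one response in three would be selected over a long run, although any short sequence of responses can deviate from that ratio.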
Please enter the amount of time you would like your data to be delayed (in milliseconds)
Enter the number of milliseconds of latency to introduce to responses chosen for failure injection. Use a value of zero to switch off latency injection.
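The effect of the latency setting can be pictured with an Express-style middleware that waits before passing the request on. This is a sketch under assumptions, not the Agent's real implementation: `createLatencyMiddleware`, `delay`, and the `shouldAffect` callback are hypothetical names.

```javascript
// Resolve after the given number of milliseconds.
function delay(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Express-style middleware factory (hypothetical, not the Agent's API).
// `shouldAffect` decides per request whether latency is injected, for
// example by applying the blast radius; by default every request is affected.
function createLatencyMiddleware(latencyMs, shouldAffect = () => true) {
  return async function latencyMiddleware(req, res, next) {
    // A latency of zero switches injection off entirely.
    if (latencyMs > 0 && shouldAffect()) {
      await delay(latencyMs);
    }
    next();
  };
}
```

Delaying the call to `next()` postpones the handling of the request, so the client sees the response arrive later by about `latencyMs` milliseconds.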
Please enter how long you would like the ChaosQoala to run (in minutes).
Enter the duration of the test run in minutes. Test runs can be terminated at any point by pressing any key in the Controller CLI during test execution.
Please enter the steady state start URL
Enter the fully qualified URI for the steady state service you have implemented. This URI will be invoked when the chaos experiment is started. For example: https://api.example.com/steadystate/start
Please enter the steady state start HTTP verb
Enter the HTTP verb used to invoke the steady state start route, e.g. GET/POST/PUT etc.
Please enter the steady state stop URL
Enter the fully qualified URI for the steady state service you have implemented. This URI will be invoked when the chaos experiment is stopped. For example: https://api.example.com/steadystate/stop
Please enter the steady state stop HTTP verb
Enter the HTTP verb used to invoke the steady state stop route, e.g. GET/POST/PUT etc.
Configuring query knockout
Once configure has been run, the Controller writes the configuration to its package.json. The Controller also inspects the GraphQL endpoint entered and extracts a list of available queries. To knock out data for a GraphQL query during an experiment, toggle the booleans in the affectedQueries object of the package.json:
"chaosConfig": {
. . .
"affectedQueries": {
"dontKnockMeOut": false,
"knockMeOut": true
},
},
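To illustrate what knocking out a query might mean for a response, the sketch below replaces the data of every flagged top-level query with `null`. Whether the Agent nulls the field, removes it, or does something else is not stated here, so treat this behaviour and the `knockOutQueries` name as assumptions.

```javascript
// Return a copy of a GraphQL response in which every top-level field flagged
// `true` in `affectedQueries` has its data replaced by null.
// Nulling (rather than deleting) the field is an assumption of this sketch.
function knockOutQueries(response, affectedQueries) {
  if (!response || typeof response.data !== 'object' || response.data === null) {
    return response; // Nothing to knock out, e.g. an error-only response.
  }
  const data = {};
  for (const [field, value] of Object.entries(response.data)) {
    data[field] = affectedQueries[field] === true ? null : value;
  }
  return { ...response, data };
}

// Usage with the flags from the configuration fragment above:
// knockOutQueries(
//   { data: { knockMeOut: { id: 1 }, dontKnockMeOut: { id: 2 } } },
//   { dontKnockMeOut: false, knockMeOut: true }
// );
```

The sketch matches on response keys, so a query requested under an alias would not be matched by its original name.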
Running an experiment
Once you are happy with the configuration, the experiment can be started with the start command:
# chaosqoala start
When the experiment ends, a results file is generated on the machine the Controller is running on. This file (timestamp_results.json) can be visualized on the ChaosQoaLa site via the upload link.
License
This project is licensed under the MIT License.
Authors
- Jacob Ory: github.com/jakeory
- Simon Maharai: github.com/Simon-IM
- Nicolas Venegas Parker: github.com/nicvhub
- Samantha Wessel: github.com/sw8wm2013
Acknowledgements
Huge thanks to Natalie Klein (linkedin.com/in/nataliesklein) for all the input, advice, and mentorship.