Snapshot Tool for Amazon RDS

The Snapshot Tool for RDS automates the task of creating manual snapshots, copying them into a different account and a different region, and deleting them after a specified number of days. It also allows you to specify the backup schedule (at what times and how often) and a retention period in days. This version will work with all Amazon RDS instances except Amazon Aurora. For a version that works with Amazon Aurora, please visit the Snapshot Tool for Amazon Aurora.

IMPORTANT: Run the CloudFormation templates in the same region where your RDS instances run (both in the source and destination accounts). If that is not possible because AWS Step Functions is not available there, you will need to use the SourceRegionOverride parameter explained below.

Getting Started

Building From Source and Deploying

You will need to build from source and deploy to your own bucket in your own account. To build, you need to be on a Unix-like system (e.g., macOS or some flavour of Linux) with make and zip installed.

  1. Create an S3 bucket to hold the Lambda function zip files. The bucket must be in the same region where the Lambda functions will run, and the Lambda functions must run in the same region as the RDS instances.

  2. Clone the repository

  3. Edit the Makefile and set S3DEST to the name of the bucket where you want the functions to go. Set the AWSARGS, AWSCMD and ZIPCMD variables as well.

  4. Type make at the command line. It will call zip to make the zip files, and then it will call aws s3 cp to copy the zip files to the bucket you named.

  5. Be sure to use the correct bucket name in the CodeBucket parameter when launching the stack in both accounts.

To deploy to your accounts, you will need to use the CloudFormation templates provided.

  • Deploy snapshot_tool_rds_source.json in the source account (the account that runs the RDS instances)
  • Deploy snapshot_tool_rds_dest.json in the destination account (the account where you'd like to keep your snapshots)

Source Account

Components

The following components will be created in the source account:

  • 3 Lambda functions (TakeSnapshotsRDS, ShareSnapshotsRDS, DeleteOldSnapshotsRDS)
  • 3 State Machines (AWS Step Functions) to trigger execution of each Lambda function (stateMachineTakeSnapshotRDS, stateMachineShareSnapshotRDS, stateMachineDeleteOldSnapshotsRDS)
  • 3 CloudWatch Event Rules to trigger the state machines
  • 3 CloudWatch Alarms and associated SNS Topics to alert on State Machine failures
  • A CloudFormation stack containing all these resources

Installing in the source account

Run snapshot_tool_RDS_source.json in the CloudFormation console. You will need to specify the different parameters. The default values will back up all RDS instances in the region at 1 AM UTC, once a day. If your instances are encrypted, you will need to give the destination account access to the KMS key. You can read more on how to do that here: https://aws.amazon.com/premiumsupport/knowledge-center/share-cmk-account/

Here is a breakdown of each parameter for the source template:

  • BackupInterval - the number of hours between backups

  • BackupSchedule - at what times and how often to run backups, written as a cron expression. Set it in accordance with BackupInterval. For example, set BackupInterval to 8 hours and BackupSchedule to 0 0,8,16 * * ? * if you want backups to run at 0:00, 8:00 and 16:00 UTC. If the schedule fires more often than BackupInterval, snapshots will only be created when the latest snapshot is older than BackupInterval. If you set BackupInterval so that backups should run more than once a day, make sure to adjust BackupSchedule accordingly, or backups will only be taken at the times specified in the cron expression.

  • InstanceNamePattern - set to the names of the instances you want this tool to back up. You can use a Python regex that will be searched for in the instance identifier. For example, if your instances are named prod-01, prod-02, etc., you can set InstanceNamePattern to prod. The string you specify will match anywhere in the name unless you use an anchor such as ^ or $. In most cases, a simple name like "prod" or "dev" will suffice. More information on Python regular expressions here: https://docs.python.org/2/howto/regex.html

  • DestinationAccount - the account where you want snapshots to be copied to

  • LogLevel - the log level for the output of the Lambda functions. ERROR is usually enough. You can increase it to INFO or DEBUG.

  • RetentionDays - the number of days you want your snapshots to be kept. Snapshots created more than RetentionDays ago will be automatically deleted (only if they contain a tag with Key: CreatedBy, Value: Snapshot Tool for RDS)

  • ShareSnapshots - Set to TRUE if you are sharing snapshots with a different account. If you set it to FALSE, the State Machine, Lambda functions and associated CloudWatch Alarms related to sharing across accounts will not be created. This is useful if you only want to take backups and manage their retention, but do not need to copy them across accounts or regions.

  • SourceRegionOverride - if you are running RDS in a region where Step Functions is not available, this parameter allows you to override the source region. For example, at the time of this writing, you may be running RDS in Northern California (us-west-1) and would like to copy your snapshots to Montreal (ca-central-1). Neither region supports Step Functions at the time of this writing, so deploying this tool there will not work. The solution is to run this template in a region that supports Step Functions (such as Northern Virginia or Ohio) and set SourceRegionOverride to us-west-1. IMPORTANT: deploy to the closest regions for best results.

  • CodeBucket - this parameter specifies the bucket where the code for the Lambda functions is located. The Lambda function code is located in the lambda directory in zip format. These files need to be in the root of the bucket or the CloudFormation templates will fail. Please follow the instructions to build from source (earlier in this README).

  • DeleteOldSnapshots - Set to TRUE to enable functionality that will delete snapshots after RetentionDays. Set to FALSE if you want to disable this functionality completely (the associated Lambda and State Machine resources will not be created in the account). WARNING: If you decide to enable this functionality later on, bear in mind it will delete all snapshots older than RetentionDays that were created by this tool, not just the ones created after DeleteOldSnapshots is set to TRUE.

  • TaggedInstance - Set to TRUE to enable functionality that will only take snapshots for RDS Instances with tag CopyDBSnapshot set to True. The settings in InstanceNamePattern and TaggedInstance both need to evaluate successfully for a snapshot to be created (logical AND).
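The interaction between BackupInterval and BackupSchedule described above can be sketched in Python. This is a minimal illustration, not the tool's actual code: the function name snapshot_is_due and its arguments are hypothetical, and treating a snapshot exactly BackupInterval hours old as due is an assumption.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def snapshot_is_due(latest_snapshot_time: Optional[datetime],
                    backup_interval_hours: int,
                    now: datetime) -> bool:
    """Decide whether a scheduled run should create a new snapshot.

    Hypothetical helper: the schedule (BackupSchedule) decides when this
    check runs, and BackupInterval decides whether the run does anything.
    """
    if latest_snapshot_time is None:
        # No earlier snapshot exists for the instance, so one is due.
        return True
    age = now - latest_snapshot_time
    # Assumption: an age equal to the interval counts as due.
    return age >= timedelta(hours=backup_interval_hours)


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 16, 0, tzinfo=timezone.utc)
    old = datetime(2024, 1, 1, 7, 30, tzinfo=timezone.utc)     # 8.5 hours old
    recent = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)  # 4 hours old
    print(snapshot_is_due(old, 8, now))
    print(snapshot_is_due(recent, 8, now))
```

With a schedule that fires every hour and BackupInterval set to 8, most runs would find a recent snapshot and do nothing, which is the behaviour the BackupSchedule description refers to.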

Destination Account

Components

The following components will be created in the destination account:

  • 2 Lambda functions (CopySnapshotsDestRDS, DeleteOldSnapshotsDestRDS)
  • 2 State Machines (AWS Step Functions) to trigger execution of each Lambda function (stateMachineCopySnapshotsDestRDS, stateMachineDeleteOldSnapshotsDestRDS)
  • 2 CloudWatch Event Rules to trigger the state machines
  • 2 CloudWatch Alarms and associated SNS Topics to alert on State Machine failures
  • A CloudFormation stack containing all these resources

On your destination account, you will need to run snapshot_tool_RDS_dest.json in the CloudFormation console. As before, you will need to run it in a region where Step Functions is available. The following parameters are available:

  • DestinationRegion - the region where you want your snapshots to be copied. If you set it to the same as the source region, the snapshots will be copied from the source account but will be kept in the source region. This is useful if you would like to keep a copy of your snapshots in a different account but would prefer not to copy them to a different region.
  • SnapshotPattern - similar to InstanceNamePattern. See above.
  • DeleteOldSnapshots - Set to TRUE to enable functionality that will delete snapshots after RetentionDays. Set to FALSE if you want to disable this functionality completely (the associated Lambda and State Machine resources will not be created in the account). WARNING: If you decide to enable this functionality later on, bear in mind it will delete ALL SNAPSHOTS older than RetentionDays created by this tool, not just the ones created after DeleteOldSnapshots is set to TRUE.
  • CrossAccountCopy - if you only need to copy snapshots across regions and not to a different account, set this to FALSE. When set to FALSE, the no-x-account version of the Lambda functions will be deployed and will expect snapshots to be in the same account as they run.
  • KmsKeySource - KMS Key to be used for copying encrypted snapshots in the source region. If you are copying to a different region, you will also need to provide a second key in the destination region.
  • KmsKeyDestination - KMS Key to be used for copying encrypted snapshots to the destination region. If you are not copying to a different region, this parameter is not necessary.
  • RetentionDays - as in the source account, the number of days you want your snapshots to be kept. Do not set this parameter to a value lower than in the source account. Snapshots created more than RetentionDays ago will be automatically deleted (only if they contain a tag with Key: CopiedBy, Value: Snapshot Tool for RDS)
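If you launch the destination stack from a script rather than the console, the parameters above need to be expressed as a list of ParameterKey/ParameterValue pairs, which is the shape CloudFormation's CreateStack API expects. The sketch below only builds and checks that list. The helper name destination_parameters and every value shown (bucket name, regions, key ARNs) are placeholders, not defaults of the tool.

```python
from typing import Any, Dict, List


def destination_parameters(values: Dict[str, Any],
                           source_retention_days: int) -> List[Dict[str, str]]:
    """Turn a dict of template parameters into CloudFormation's list form.

    Hypothetical helper. It also enforces the README's advice that
    RetentionDays in the destination must not be lower than in the source.
    """
    if int(values["RetentionDays"]) < source_retention_days:
        raise ValueError(
            "RetentionDays in the destination account must not be lower "
            "than in the source account"
        )
    # CloudFormation parameter values are passed as strings.
    return [
        {"ParameterKey": key, "ParameterValue": str(value)}
        for key, value in values.items()
    ]


if __name__ == "__main__":
    params = destination_parameters(
        {
            "CodeBucket": "my-snapshot-tool-code",  # placeholder bucket name
            "DestinationRegion": "eu-west-1",       # placeholder region
            "SnapshotPattern": "prod",
            "DeleteOldSnapshots": "TRUE",
            "CrossAccountCopy": "TRUE",
            "KmsKeySource": "arn:aws:kms:us-east-1:111111111111:key/EXAMPLE",
            "KmsKeyDestination": "arn:aws:kms:eu-west-1:111111111111:key/EXAMPLE",
            "RetentionDays": 14,
        },
        source_retention_days=7,
    )
    for item in params:
        print(item)
```

The resulting list could then be passed as the Parameters argument of a CloudFormation create-stack call, together with the template body.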

How it Works

There are two sets of Step Functions state machines and Lambda functions that take regular snapshots and copy them across. Snapshots can take time, and they do not signal when they're complete. Snapshots are scheduled to begin at a certain time using CloudWatch Events. Then other state machines run periodically to look for new snapshots. When they find new snapshots, they share and copy them.

In the Source Account

A CloudWatch Event is scheduled to trigger the Step Functions state machine named stateMachineTakeSnapshotsRDS. That state machine invokes a function named lambdaTakeSnapshotsRDS. That function triggers a snapshot and applies some standard tags. It matches RDS instances using a regular expression on their names.
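The instance matching can be illustrated with Python's re.search, which finds the pattern anywhere in the identifier unless it is anchored. The sketch below combines it with the TaggedInstance rule (logical AND). The function name should_back_up is hypothetical, and comparing the tag value to the exact string "True" is an assumption.

```python
import re
from typing import Dict


def should_back_up(instance_id: str,
                   tags: Dict[str, str],
                   instance_name_pattern: str,
                   tagged_instance: bool) -> bool:
    """Return True when an instance qualifies for a snapshot.

    Hypothetical helper mirroring the InstanceNamePattern and
    TaggedInstance parameters described in this README.
    """
    # re.search matches anywhere in the identifier unless ^ or $ is used.
    name_matches = re.search(instance_name_pattern, instance_id) is not None
    if not tagged_instance:
        return name_matches
    # Assumption: the tag value must be exactly the string "True".
    return name_matches and tags.get("CopyDBSnapshot") == "True"


if __name__ == "__main__":
    print(should_back_up("prod-01", {}, "prod", False))
    print(should_back_up("my-prod-db", {}, "^prod", False))
    print(should_back_up("prod-01", {"CopyDBSnapshot": "True"}, "prod", True))
```

Note that the unanchored pattern prod would also select an instance called my-prod-db, so use ^prod if only identifiers that start with prod should be backed up.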

There are two other state machines and Lambda functions. The statemachineShareSnapshotsRDS looks for new snapshots created by the lambdaTakeSnapshotsRDS function. When it finds a new snapshot that is intended to be shared, it shares it with the destination account. This state machine is, by default, run every 10 minutes. (To change it, you need to change the ScheduleExpression property of the cwEventShareSnapshotsRDS resource in snapshots_tool_rds_source.json).
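Sharing a manual RDS snapshot with another account is done through the snapshot's restore attribute. The sketch below builds the keyword arguments for the RDS ModifyDBSnapshotAttribute call (modify_db_snapshot_attribute in boto3) without sending anything. The helper name build_share_request, the snapshot identifier and the account number are placeholders, and the tool's own code may be organised differently.

```python
from typing import Any, Dict


def build_share_request(snapshot_id: str,
                        destination_account: str) -> Dict[str, Any]:
    """Build the arguments that share a manual snapshot with an account.

    Hypothetical helper: adding an account ID to the "restore" attribute
    lets that account copy or restore the snapshot.
    """
    return {
        "DBSnapshotIdentifier": snapshot_id,
        "AttributeName": "restore",
        "ValuesToAdd": [destination_account],
    }


if __name__ == "__main__":
    request = build_share_request("prod-01-example-snapshot", "123456789012")
    print(request)
    # With boto3 installed and credentials configured, the request could
    # be sent from the source account as:
    #   boto3.client("rds").modify_db_snapshot_attribute(**request)
```

Keeping the request construction separate from the API call makes it easy to log or review what would be shared before anything changes.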

The other state machine is the statemachineDeleteOldSnapshotsRDS and it calls lambdaDeleteOldSnapshotsRDS to delete snapshots according to the RetentionDays parameter set when the stack is launched. This state machine is, by default, run once each hour. (To change it, you need to change the ScheduleExpression property of the cwEventDeleteOldSnapshotsRDS resource in snapshots_tool_rds_source.json). If it finds a snapshot that is older than the retention time, it deletes the snapshot.
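The retention rule can be sketched as a pure function: a snapshot is selected for deletion only if it carries the tool's tag and was created more than RetentionDays ago. The dictionary shape used for snapshots (id, created, tags) and the function name expired_snapshots are assumptions made for this illustration. Pass tag_key="CopiedBy" to model the destination account.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, List


def expired_snapshots(snapshots: List[Dict[str, Any]],
                      retention_days: int,
                      now: datetime,
                      tag_key: str = "CreatedBy",
                      tag_value: str = "Snapshot Tool for RDS") -> List[str]:
    """Return the ids of snapshots that the retention rule would delete."""
    cutoff = now - timedelta(days=retention_days)
    return [
        snap["id"]
        for snap in snapshots
        # Only snapshots tagged by the tool are ever candidates.
        if snap["tags"].get(tag_key) == tag_value
        # Created more than retention_days ago.
        and snap["created"] < cutoff
    ]


if __name__ == "__main__":
    utc = timezone.utc
    tool_tag = {"CreatedBy": "Snapshot Tool for RDS"}
    snapshots = [
        {"id": "old-tool", "created": datetime(2024, 1, 1, tzinfo=utc), "tags": tool_tag},
        {"id": "new-tool", "created": datetime(2024, 1, 30, tzinfo=utc), "tags": tool_tag},
        {"id": "old-manual", "created": datetime(2024, 1, 1, tzinfo=utc), "tags": {}},
    ]
    print(expired_snapshots(snapshots, 7, datetime(2024, 1, 31, tzinfo=utc)))
```

This also shows why enabling DeleteOldSnapshots later removes older snapshots too: the rule looks only at the tag and the creation time, not at when the feature was switched on.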

In the Destination Account

There are two state machines and corresponding lambda functions. The statemachineCopySnapshotsDestRDS looks for new snapshots that have been shared but have not yet been copied. When it finds them, it creates a copy in the destination account, encrypted with the KMS key that has been stipulated. This state machine is, by default, run every 10 minutes. (To change it, you need to change the ScheduleExpression property of the cwEventCopySnapshotsRDS resource in snapshots_tool_rds_dest.json).
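As a rough illustration of the copy step, the sketch below assembles keyword arguments for the RDS CopyDBSnapshot call (copy_db_snapshot in boto3) from the KmsKeySource, KmsKeyDestination and DestinationRegion parameters. It is a simplification under stated assumptions: the helper name build_copy_request is hypothetical, the choice of key per case is inferred from the parameter descriptions, and the tool itself may perform the copy in more than one step.

```python
from typing import Any, Dict, Optional


def build_copy_request(shared_snapshot_arn: str,
                       target_snapshot_id: str,
                       source_region: str,
                       destination_region: str,
                       kms_key_source: Optional[str] = None,
                       kms_key_destination: Optional[str] = None) -> Dict[str, Any]:
    """Build arguments for copying a shared snapshot into this account.

    Hypothetical helper. The request is meant to be issued by an RDS
    client created in destination_region.
    """
    cross_region = source_region != destination_region
    request: Dict[str, Any] = {
        "SourceDBSnapshotIdentifier": shared_snapshot_arn,
        "TargetDBSnapshotIdentifier": target_snapshot_id,
    }
    # Assumption: use the destination key when the copy crosses regions,
    # otherwise the source-region key.
    kms_key = kms_key_destination if cross_region else kms_key_source
    if kms_key:
        request["KmsKeyId"] = kms_key
    if cross_region:
        # boto3 uses SourceRegion to pre-sign cross-region copy requests.
        request["SourceRegion"] = source_region
    return request


if __name__ == "__main__":
    print(build_copy_request(
        "arn:aws:rds:us-east-1:111111111111:snapshot:prod-01-example",  # placeholder
        "prod-01-example",
        source_region="us-east-1",
        destination_region="eu-west-1",
        kms_key_source="arn:aws:kms:us-east-1:222222222222:key/EXAMPLE",
        kms_key_destination="arn:aws:kms:eu-west-1:222222222222:key/EXAMPLE",
    ))
```

A shared snapshot has to be referenced by its full ARN when it is copied from another account, which is why the first argument is an ARN rather than a plain identifier.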

The other state machine is just like the corresponding state machine and function in the source account. The state machine is stateMachineDeleteOldSnapshotsDestRDS and it calls the DeleteOldSnapshotsDestRDS function to delete snapshots according to the RetentionDays parameter set when the stack is launched. This state machine is, by default, run once each hour. (To change it, you need to change the ScheduleExpression property of the cwEventDeleteOldSnapshotsRDS resource in snapshots_tool_rds_dest.json). If it finds a snapshot that is older than the retention time, it deletes the snapshot.

Updating

This tool is fundamentally stateless. The state is mainly in the tags on the snapshots themselves and the parameters to the CloudFormation stack. If you make changes to the parameters or make changes to the Lambda function code, it is best to delete the stack and then launch the stack again.

Authors

License

This project is licensed under the Apache License. See the LICENSE.txt file for details.
