• Stars
    star
    439
  • Rank 95,381 (Top 2 %)
  • Language
    Python
  • License
    MIT License
  • Created over 6 years ago
  • Updated about 1 year ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

A security toolkit for Amazon S3

s3tk

A security toolkit for Amazon S3

Another day, another leaky Amazon S3 bucket

β€” The Register, 12 Jul 2017

Don’t be the... next... big... data... leak

Screenshot

🍊 Battle-tested at Instacart

Installation

Run:

pip install s3tk

You can use the AWS CLI or AWS Vault to set up your AWS credentials:

pip install awscli
aws configure

See IAM policies needed for each command.

Commands

Scan

Scan your buckets for:

  • ACL open to public
  • policy open to public
  • public access blocked
  • logging enabled
  • versioning enabled
  • default encryption enabled
s3tk scan

Only run on specific buckets

s3tk scan my-bucket my-bucket-2

Also works with wildcards

s3tk scan "my-bucket*"

Confirm correct log bucket(s) and prefix

s3tk scan --log-bucket my-s3-logs --log-bucket other-region-logs --log-prefix "{bucket}/"

Check CloudTrail object-level logging [experimental]

s3tk scan --object-level-logging

Skip logging, versioning, or default encryption

s3tk scan --skip-logging --skip-versioning --skip-default-encryption

Get email notifications of failures (via SNS)

s3tk scan --sns-topic arn:aws:sns:...

List Policy

List bucket policies

s3tk list-policy

Only run on specific buckets

s3tk list-policy my-bucket my-bucket-2

Show named statements

s3tk list-policy --named

Set Policy

Note: This replaces the previous policy

Only private uploads

s3tk set-policy my-bucket --no-object-acl

Delete Policy

Delete policy

s3tk delete-policy my-bucket

Block Public Access

Block public access on specific buckets

s3tk block-public-access my-bucket my-bucket-2

Use the --dry-run flag to test

Enable Logging

Enable logging on all buckets

s3tk enable-logging --log-bucket my-s3-logs

Only on specific buckets

s3tk enable-logging my-bucket my-bucket-2 --log-bucket my-s3-logs

Set log prefix ({bucket}/ by default)

s3tk enable-logging --log-bucket my-s3-logs --log-prefix "logs/{bucket}/"

Use the --dry-run flag to test

A few notes about logging:

  • buckets with logging already enabled are not updated at all
  • the log bucket must in the same region as the source bucket - run this command multiple times for different regions
  • it can take over an hour for logs to show up

Enable Versioning

Enable versioning on all buckets

s3tk enable-versioning

Only on specific buckets

s3tk enable-versioning my-bucket my-bucket-2

Use the --dry-run flag to test

Enable Default Encryption

Enable default encryption on all buckets

s3tk enable-default-encryption

Only on specific buckets

s3tk enable-default-encryption my-bucket my-bucket-2

This does not encrypt existing objects - use the encrypt command for this

Use the --dry-run flag to test

Scan Object ACL

Scan ACL on all objects in a bucket

s3tk scan-object-acl my-bucket

Only certain objects

s3tk scan-object-acl my-bucket --only "*.pdf"

Except certain objects

s3tk scan-object-acl my-bucket --except "*.jpg"

Reset Object ACL

Reset ACL on all objects in a bucket

s3tk reset-object-acl my-bucket

This makes all objects private. See bucket policies for how to enforce going forward.

Use the --dry-run flag to test

Specify certain objects the same way as scan-object-acl

Encrypt

Encrypt all objects in a bucket with server-side encryption

s3tk encrypt my-bucket

Use S3-managed keys by default. For KMS-managed keys, use:

s3tk encrypt my-bucket --kms-key-id arn:aws:kms:...

For customer-provided keys, use:

s3tk encrypt my-bucket --customer-key secret-key

Use the --dry-run flag to test

Specify certain objects the same way as scan-object-acl

Note: Objects will lose any custom ACL

Delete Unencrypted Versions

Delete all unencrypted versions of objects in a bucket

s3tk delete-unencrypted-versions my-bucket

For safety, this will not delete any current versions of objects

Use the --dry-run flag to test

Specify certain objects the same way as scan-object-acl

Scan DNS

Scan Route 53 for buckets to make sure you own them

s3tk scan-dns

Otherwise, you may be susceptible to subdomain takeover

Credentials

Credentials can be specified in ~/.aws/credentials or with environment variables. See this guide for an explanation of environment variables.

You can specify a profile to use with:

AWS_PROFILE=your-profile s3tk

IAM Policies

Here are the permissions needed for each command. Only include statements you need.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Scan",
            "Effect": "Allow",
            "Action": [
                "s3:ListAllMyBuckets",
                "s3:GetBucketAcl",
                "s3:GetBucketPolicy",
                "s3:GetBucketPublicAccessBlock",
                "s3:GetBucketLogging",
                "s3:GetBucketVersioning",
                "s3:GetEncryptionConfiguration"
            ],
            "Resource": "*"
        },
        {
            "Sid": "ScanObjectLevelLogging",
            "Effect": "Allow",
            "Action": [
                "cloudtrail:ListTrails",
                "cloudtrail:GetTrail",
                "cloudtrail:GetEventSelectors",
                "s3:GetBucketLocation"
            ],
            "Resource": "*"
        },
        {
            "Sid": "ScanDNS",
            "Effect": "Allow",
            "Action": [
                "s3:ListAllMyBuckets",
                "route53:ListHostedZones",
                "route53:ListResourceRecordSets"
            ],
            "Resource": "*"
        },
        {
            "Sid": "ListPolicy",
            "Effect": "Allow",
            "Action": [
                "s3:ListAllMyBuckets",
                "s3:GetBucketPolicy"
            ],
            "Resource": "*"
        },
        {
            "Sid": "SetPolicy",
            "Effect": "Allow",
            "Action": [
                "s3:PutBucketPolicy"
            ],
            "Resource": "*"
        },
        {
            "Sid": "DeletePolicy",
            "Effect": "Allow",
            "Action": [
                "s3:DeleteBucketPolicy"
            ],
            "Resource": "*"
        },
        {
            "Sid": "BlockPublicAccess",
            "Effect": "Allow",
            "Action": [
                "s3:ListAllMyBuckets",
                "s3:PutBucketPublicAccessBlock"
            ],
            "Resource": "*"
        },
        {
            "Sid": "EnableLogging",
            "Effect": "Allow",
            "Action": [
                "s3:ListAllMyBuckets",
                "s3:PutBucketLogging"
            ],
            "Resource": "*"
        },
        {
            "Sid": "EnableVersioning",
            "Effect": "Allow",
            "Action": [
                "s3:ListAllMyBuckets",
                "s3:PutBucketVersioning"
            ],
            "Resource": "*"
        },
        {
            "Sid": "EnableDefaultEncryption",
            "Effect": "Allow",
            "Action": [
                "s3:ListAllMyBuckets",
                "s3:PutEncryptionConfiguration"
            ],
            "Resource": "*"
        },
        {
            "Sid": "ResetObjectAcl",
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket",
                "s3:GetObjectAcl",
                "s3:PutObjectAcl"
            ],
            "Resource": [
                "arn:aws:s3:::my-bucket",
                "arn:aws:s3:::my-bucket/*"
            ]
        },
        {
            "Sid": "Encrypt",
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket",
                "s3:GetObject",
                "s3:PutObject"
            ],
            "Resource": [
                "arn:aws:s3:::my-bucket",
                "arn:aws:s3:::my-bucket/*"
            ]
        },
        {
            "Sid": "DeleteUnencryptedVersions",
            "Effect": "Allow",
            "Action": [
                "s3:ListBucketVersions",
                "s3:GetObjectVersion",
                "s3:DeleteObjectVersion"
            ],
            "Resource": [
                "arn:aws:s3:::my-bucket",
                "arn:aws:s3:::my-bucket/*"
            ]
        }
    ]
}

Access Logs

Amazon Athena is great for querying S3 logs. Create a table (thanks to this post for the table structure) with:

CREATE EXTERNAL TABLE my_bucket (
    bucket_owner string,
    bucket string,
    time string,
    remote_ip string,
    requester string,
    request_id string,
    operation string,
    key string,
    request_verb string,
    request_url string,
    request_proto string,
    status_code string,
    error_code string,
    bytes_sent string,
    object_size string,
    total_time string,
    turn_around_time string,
    referrer string,
    user_agent string,
    version_id string
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
    'serialization.format' = '1',
    'input.regex' = '([^ ]*) ([^ ]*) \\[(.*?)\\] ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) \\\"([^ ]*) ([^ ]*) (- |[^ ]*)\\\" (-|[0-9]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) (\"[^\"]*\\") ([^ ]*)$'
) LOCATION 's3://my-s3-logs/my-bucket/';

Change the last line to point to your log bucket (and prefix) and query away

SELECT
    date_parse(time, '%d/%b/%Y:%H:%i:%S +0000') AS time,
    request_url,
    remote_ip,
    user_agent
FROM
    my_bucket
WHERE
    requester = '-'
    AND status_code LIKE '2%'
    AND request_url LIKE '/some-keys%'
ORDER BY 1

CloudTrail Logs

Amazon Athena is also great for querying CloudTrail logs. Create a table (thanks to this post for the table structure) with:

CREATE EXTERNAL TABLE cloudtrail_logs (
    eventversion STRING,
    userIdentity STRUCT<
        type:STRING,
        principalid:STRING,
        arn:STRING,
        accountid:STRING,
        invokedby:STRING,
        accesskeyid:STRING,
        userName:String,
        sessioncontext:STRUCT<
            attributes:STRUCT<
                mfaauthenticated:STRING,
                creationdate:STRING>,
            sessionIssuer:STRUCT<
                type:STRING,
                principalId:STRING,
                arn:STRING,
                accountId:STRING,
                userName:STRING>>>,
    eventTime STRING,
    eventSource STRING,
    eventName STRING,
    awsRegion STRING,
    sourceIpAddress STRING,
    userAgent STRING,
    errorCode STRING,
    errorMessage STRING,
    requestId  STRING,
    eventId  STRING,
    resources ARRAY<STRUCT<
        ARN:STRING,
        accountId:STRING,
        type:STRING>>,
    eventType STRING,
    apiVersion  STRING,
    readOnly BOOLEAN,
    recipientAccountId STRING,
    sharedEventID STRING,
    vpcEndpointId STRING,
    requestParameters STRING,
    responseElements STRING,
    additionalEventData STRING,
    serviceEventDetails STRING
)
ROW FORMAT SERDE 'com.amazon.emr.hive.serde.CloudTrailSerde'
STORED  AS INPUTFORMAT 'com.amazon.emr.cloudtrail.CloudTrailInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION  's3://my-cloudtrail-logs/'

Change the last line to point to your CloudTrail log bucket and query away

SELECT
    eventTime,
    eventName,
    userIdentity.userName,
    requestParameters
FROM
    cloudtrail_logs
WHERE
    eventName LIKE '%Bucket%'
ORDER BY 1

Best Practices

Keep things simple and follow the principle of least privilege to reduce the chance of mistakes.

  • Strictly limit who can perform bucket-related operations
  • Avoid mixing objects with different permissions in the same bucket (use a bucket policy to enforce this)
  • Don’t specify public read permissions on a bucket level (no GetObject in bucket policy)
  • Monitor configuration frequently for changes

Bucket Policies

Only private uploads

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:PutObjectAcl",
            "Resource": "arn:aws:s3:::my-bucket/*"
        }
    ]
}

Performance

For commands that iterate over bucket objects (scan-object-acl, reset-object-acl, encrypt, and delete-unencrypted-versions), run s3tk on an EC2 server for minimum latency.

Notes

The set-policy, block-public-access, enable-logging, enable-versioning, and enable-default-encryption commands are provided for convenience. We recommend Terraform for managing your buckets.

resource "aws_s3_bucket" "my_bucket" {
  bucket = "my-bucket"
  acl    = "private"

  logging {
    target_bucket = "my-s3-logs"
    target_prefix = "my-bucket/"
  }

  versioning {
    enabled = true
  }
}

resource "aws_s3_bucket_public_access_block" "my_bucket" {
  bucket = "${aws_s3_bucket.my_bucket.id}"

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

Upgrading

Run:

pip install s3tk --upgrade

To use master, run:

pip install git+https://github.com/ankane/s3tk.git --upgrade

Docker

Run:

docker run -it ankane/s3tk aws configure

Commit your credentials:

docker commit $(docker ps -l -q) my-s3tk

And run:

docker run -it my-s3tk s3tk scan

History

View the changelog

Contributing

Everyone is encouraged to help improve this project. Here are a few ways you can help:

To get started with development:

git clone https://github.com/ankane/s3tk.git
cd s3tk
pip install -r requirements.txt

More Repositories

1

pghero

A performance dashboard for Postgres
Ruby
7,123
star
2

searchkick

Intelligent search made easy
Ruby
6,257
star
3

chartkick

Create beautiful JavaScript charts with one line of Ruby
Ruby
6,157
star
4

blazer

Business intelligence made simple
Ruby
4,351
star
5

ahoy

Simple, powerful, first-party analytics for Rails
Ruby
3,872
star
6

strong_migrations

Catch unsafe migrations in development
Ruby
3,662
star
7

groupdate

The simplest way to group temporal data
Ruby
3,617
star
8

pgsync

Sync data from one Postgres database to another
Ruby
2,787
star
9

the-ultimate-guide-to-ruby-timeouts

Timeouts for popular Ruby gems
Ruby
2,212
star
10

production_rails

Best practices for running Rails in production
1,975
star
11

dexter

The automatic indexer for Postgres
Ruby
1,491
star
12

lockbox

Modern encryption for Ruby and Rails
Ruby
1,290
star
13

chartkick.js

Create beautiful charts with one line of JavaScript
JavaScript
1,211
star
14

react-chartkick

Create beautiful JavaScript charts with one line of React
JavaScript
1,183
star
15

pretender

Log in as another user in Rails
Ruby
1,124
star
16

ahoy_email

First-party email analytics for Rails
Ruby
1,051
star
17

secure_rails

Rails security best practices
954
star
18

pgslice

Postgres partitioning as easy as pie
Ruby
953
star
19

mailkick

Email subscriptions for Rails
Ruby
847
star
20

vue-chartkick

Create beautiful JavaScript charts with one line of Vue
JavaScript
747
star
21

eps

Machine learning for Ruby
Ruby
609
star
22

awesome-legal

Awesome free legal documents for companies
589
star
23

searchjoy

Search analytics made easy
Ruby
579
star
24

polars-ruby

Blazingly fast DataFrames for Ruby
Ruby
563
star
25

torch.rb

Deep learning for Ruby, powered by LibTorch
Ruby
552
star
26

blind_index

Securely search encrypted database fields
Ruby
470
star
27

safely

Rescue and report exceptions in non-critical code
Ruby
470
star
28

authtrail

Track Devise login activity
Ruby
466
star
29

ahoy.js

Simple, powerful JavaScript analytics
JavaScript
463
star
30

multiverse

Multiple databases for Rails πŸŽ‰
Ruby
463
star
31

hightop

A nice shortcut for group count queries
Ruby
462
star
32

field_test

A/B testing for Rails
Ruby
460
star
33

disco

Recommendations for Ruby and Rails using collaborative filtering
Ruby
431
star
34

active_median

Median and percentile for Active Record, Mongoid, arrays, and hashes
Ruby
427
star
35

informers

State-of-the-art natural language processing for Ruby
Ruby
417
star
36

notable

Track notable requests and background jobs
Ruby
402
star
37

shorts

Short, random tutorials and posts
379
star
38

tensorflow-ruby

Deep learning for Ruby
Ruby
350
star
39

distribute_reads

Scale database reads to replicas in Rails
Ruby
328
star
40

slowpoke

Rack::Timeout enhancements for Rails
Ruby
327
star
41

prophet-ruby

Time series forecasting for Ruby
Ruby
321
star
42

rover

Simple, powerful data frames for Ruby
Ruby
311
star
43

groupdate.sql

The simplest way to group temporal data
PLpgSQL
280
star
44

kms_encrypted

Simple, secure key management for Lockbox and attr_encrypted
Ruby
235
star
45

jetpack

A friendly package manager for R
R
234
star
46

neighbor

Nearest neighbor search for Rails and Postgres
Ruby
230
star
47

rollup

Rollup time-series data in Rails
Ruby
230
star
48

hypershield

Shield sensitive data in Postgres and MySQL
Ruby
227
star
49

logstop

Keep personal data out of your logs
Ruby
218
star
50

pdscan

Scan your data stores for unencrypted personal data (PII)
Go
213
star
51

delete_in_batches

Fast batch deletes for Active Record and Postgres
Ruby
202
star
52

vega-ruby

Interactive charts for Ruby, powered by Vega and Vega-Lite
Ruby
192
star
53

mapkick

Create beautiful JavaScript maps with one line of Ruby
Ruby
173
star
54

dbx

A fast, easy-to-use database library for R
R
171
star
55

fastText-ruby

Efficient text classification and representation learning for Ruby
Ruby
162
star
56

autosuggest

Autocomplete suggestions based on what your users search
Ruby
162
star
57

swipeout

Swipe-to-delete goodness for the mobile web
JavaScript
159
star
58

pghero.sql

Postgres insights made easy
PLpgSQL
154
star
59

mainstreet

Address verification for Ruby and Rails
Ruby
149
star
60

or-tools-ruby

Operations research tools for Ruby
Ruby
139
star
61

mapkick.js

Create beautiful, interactive maps with one line of JavaScript
JavaScript
138
star
62

trend-ruby

Anomaly detection and forecasting for Ruby
Ruby
128
star
63

mitie-ruby

Named-entity recognition for Ruby
Ruby
122
star
64

barkick

Barcodes made easy
Ruby
120
star
65

ownership

Code ownership for Rails
Ruby
111
star
66

anomaly

Easy-to-use anomaly detection for Ruby
Ruby
98
star
67

errbase

Common exception reporting for a variety of services
Ruby
87
star
68

tokenizers-ruby

Fast state-of-the-art tokenizers for Ruby
Rust
81
star
69

ip_anonymizer

IP address anonymizer for Ruby and Rails
Ruby
79
star
70

str_enum

String enums for Rails
Ruby
75
star
71

faiss-ruby

Efficient similarity search and clustering for Ruby
C++
73
star
72

trend-api

Anomaly detection and forecasting API
R
71
star
73

archer

Rails console history for Heroku, Docker, and more
Ruby
70
star
74

onnxruntime-ruby

Run ONNX models in Ruby
Ruby
70
star
75

xgboost-ruby

High performance gradient boosting for Ruby
Ruby
69
star
76

secure-spreadsheet

Encrypt and password protect sensitive CSV and XLSX files
JavaScript
66
star
77

active_hll

HyperLogLog for Rails and Postgres
Ruby
66
star
78

guess

Statistical gender detection for Ruby
Ruby
60
star
79

morph

An encrypted, in-memory, key-value store
C++
59
star
80

lightgbm

High performance gradient boosting for Ruby
Ruby
56
star
81

midas-ruby

Edge stream anomaly detection for Ruby
Ruby
54
star
82

moves

Ruby client for Moves
Ruby
54
star
83

blingfire-ruby

High speed text tokenization for Ruby
Ruby
54
star
84

vowpalwabbit-ruby

Fast online machine learning for Ruby
Ruby
52
star
85

xlearn-ruby

High performance factorization machines for Ruby
Ruby
51
star
86

tomoto-ruby

High performance topic modeling for Ruby
C++
51
star
87

trove

Deploy machine learning models in Ruby (and Rails)
Ruby
50
star
88

ahoy_events

Simple, powerful event tracking for Rails
Ruby
42
star
89

mapkick-static

Create beautiful static maps with one line of Ruby
Ruby
42
star
90

practical-search

Let’s make search a better experience for our users
40
star
91

breakout-ruby

Breakout detection for Ruby
Ruby
40
star
92

plu

Price look-up codes made easy
Ruby
40
star
93

ngt-ruby

High-speed approximate nearest neighbors for Ruby
Ruby
39
star
94

gindex

Concurrent index migrations for Rails
Ruby
39
star
95

clockwork_web

A web interface for Clockwork
Ruby
38
star
96

ahoy_guide

A foundation of knowledge and libraries for solid analytics
38
star
97

notable_web

A web interface for Notable
HTML
36
star
98

AnomalyDetection.rb

Time series anomaly detection for Ruby
Ruby
34
star
99

khiva-ruby

High-performance time series algorithms for Ruby
Ruby
34
star
100

immudb-ruby

Ruby client for immudb, the immutable database
Ruby
34
star