  • Language
    Python
  • License
    BSD 3-Clause "New...
  • Created about 5 years ago
  • Updated over 1 year ago

Repository Details

This project sets up partitioned Athena tables for your CloudTrail logs and updates the partitions nightly. As new AWS accounts begin sending you logs or new AWS regions come online, your partitions will always be up-to-date. It is based on work by Alex Smolen in his post Partitioning CloudTrail Logs in Athena.

You can deploy the CDK app immediately, but I recommend running the partitioner manually first. Doing so confirms that everything is configured correctly, and a manual run will (by default) create 90 days of partitions, whereas the nightly Lambda deployed by the CDK will not run until 0600 UTC and will only create partitions for the current day and tomorrow.
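To make the difference between the two runs concrete, here is a minimal sketch of how the partition days could be enumerated. It is not the project's code: the function name partition_days and its parameters are hypothetical, and whether the 90-day window counts today is an assumption.

```python
from datetime import date, timedelta


def partition_days(today, days_back=0, days_ahead=1):
    """Return (year, month, day) string tuples for each day in the window.

    Hypothetical helper: the window runs from `days_back` days before
    `today` through `days_ahead` days after it, inclusive.
    """
    days = []
    for offset in range(-days_back, days_ahead + 1):
        d = today + timedelta(days=offset)
        # Partition values are zero-padded strings, matching the queries
        # in this README (year = '2019', month = '09', day = '30').
        days.append((f"{d.year:04d}", f"{d.month:02d}", f"{d.day:02d}"))
    return days


# Nightly run: the current day and tomorrow.
nightly = partition_days(date(2019, 9, 30))

# Manual run: assumed here to be the 90 days ending today.
manual = partition_days(date(2019, 9, 30), days_back=89, days_ahead=0)
```

With these inputs, nightly holds two entries (2019-09-30 and 2019-10-01) and manual holds 90.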

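A table is created for each account, named cloudtrail_ followed by the account ID (for example, cloudtrail_000000000000). A view that unions all of these tables is also created (it is queried as cloudtrail in the examples below).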
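As an illustration of the naming scheme and the union view, the sketch below builds the kind of statement that could define such a view. The helper names are hypothetical, the view name cloudtrail is taken from the query examples in this README, and the project's actual view definition may differ (for instance in the columns it selects).

```python
import re


def table_name(account_id):
    """Return the per-account table name, e.g. cloudtrail_000000000000."""
    # AWS account IDs are 12 digits; reject anything else so the value
    # is safe to place in a SQL identifier.
    if not re.fullmatch(r"[0-9]{12}", account_id):
        raise ValueError(f"not a 12-digit account ID: {account_id!r}")
    return f"cloudtrail_{account_id}"


def union_view_sql(account_ids, view_name="cloudtrail"):
    """Build a CREATE OR REPLACE VIEW statement that unions all account tables."""
    selects = [f"SELECT * FROM {table_name(a)}" for a in sorted(account_ids)]
    return f"CREATE OR REPLACE VIEW {view_name} AS\n" + "\nUNION ALL\n".join(selects)


print(union_view_sql(["111111111111", "000000000000"]))
```

UNION ALL is used rather than UNION because rows from different accounts do not need de-duplication.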

Related projects

This project was based on work by Alex Smolen. It works well for many users, but at enough scale (roughly 100GB of CloudTrail logs), the way this project uses Athena runs into problems. For this and other reasons, Alex released a new project, cloudtrail-parquet-glue, which is described in his post Use AWS Glue to make CloudTrail Parquet partitions and resolves issues #13 and #14 with this project.

Setup

Edit config/config.yaml to specify the S3 bucket containing your CloudTrail logs, the SNS topic to send alarms to (you must create one if you don't already have one), and any other configuration info.

Set up the initial tables and partitions for the past 90 days (it is OK if you don't have that many days of logs) by running:

cd resources/partitioner
pip3 install pyyaml boto3 -t .
python3 main.py

Then deploy the nightly Lambda from the root directory:

npm i
cdk deploy

If you haven't used the CDK before, you may need to run cdk bootstrap aws://000000000000/us-east-1 (substituting your own account ID and region) before running cdk deploy.

Using Athena

To query your tables, use the AWS Console to open the Athena service in the region where this was deployed. Here is an example query that returns all fields for a few events:

SELECT *
FROM cloudtrail_000000000000
WHERE region = 'us-east-1' AND year = '2019' AND month = '09' AND day = '30'
LIMIT 5;

That query filters on the partition columns to limit the data scanned to a specific region and day, and the table name limits it to a specific account.
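Because year, month, and day are separate string partition columns, a filter spanning several days needs one clause per day. The sketch below shows one way to generate such a filter; partition_predicate is a hypothetical helper, not part of this project.

```python
import re
from datetime import date, timedelta


def partition_predicate(start, end, region=None):
    """Build a WHERE-clause fragment covering the days start..end inclusive."""
    if end < start:
        raise ValueError("end must not be before start")
    clauses = []
    day = start
    while day <= end:
        # One clause per day, using zero-padded string values.
        clauses.append(
            f"(year = '{day:%Y}' AND month = '{day:%m}' AND day = '{day:%d}')"
        )
        day += timedelta(days=1)
    predicate = "(" + " OR ".join(clauses) + ")"
    if region is not None:
        # Only allow characters found in region names such as us-east-1,
        # since the value is interpolated into SQL.
        if not re.fullmatch(r"[a-z0-9-]+", region):
            raise ValueError(f"unexpected region name: {region!r}")
        predicate = f"region = '{region}' AND {predicate}"
    return predicate


where = partition_predicate(date(2019, 9, 30), date(2019, 10, 1), region="us-east-1")
print(f"SELECT * FROM cloudtrail_000000000000 WHERE {where} LIMIT 5;")
```

The resulting fragment can be dropped into any of the queries in this section in place of the hand-written year, month, and day conditions.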

This next query shows the most common errors by user (technically by ARN for the session).

SELECT 
  useridentity.arn, 
  errorcode, 
  count(*) AS count 
FROM cloudtrail_000000000000
WHERE year = '2019' AND month = '09' AND day = '30' 
  AND errorcode != '' 
GROUP BY errorcode, useridentity.arn 
ORDER BY count DESC 
LIMIT 50;

This next query shows the API calls made by a specific user.

SELECT
  eventname, count(*) AS COUNT
FROM cloudtrail_000000000000
WHERE year = '2019' AND month = '09' AND day = '30'
  AND useridentity.arn LIKE '%alice%'
GROUP BY eventname
ORDER BY COUNT DESC;

This next query runs against the cloudtrail view, which spans all accounts, to show which accounts have been accessed from a specific IP address.

SELECT 
  recipientaccountid, count(*) AS COUNT
FROM cloudtrail
WHERE year = '2019' AND month = '09'
  AND sourceipaddress = '1.2.3.4'
GROUP BY recipientaccountid 
ORDER BY COUNT DESC
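The queries above can also be submitted outside the console. The following is a minimal sketch using the boto3 Athena client, not something this project provides: run_athena_query is a hypothetical helper, and the database name and S3 output location passed to it are placeholders you would need to supply for your own setup.

```python
import time


def run_athena_query(sql, database, output_location, client=None, poll_seconds=1.0):
    """Run a query in Athena and return the first page of results as rows of strings."""
    if client is None:
        # boto3 is a third-party package; it is imported lazily so the
        # helper can also be exercised with a stub client.
        import boto3
        client = boto3.client("athena")

    started = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    # Poll until the query reaches a terminal state.
    while True:
        status = client.get_query_execution(QueryExecutionId=query_id)["QueryExecution"]["Status"]
        if status["State"] in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)

    if status["State"] != "SUCCEEDED":
        reason = status.get("StateChangeReason", "")
        raise RuntimeError(f"Query {query_id} ended in state {status['State']}: {reason}")

    # Only the first page is fetched here; for SELECT queries the first
    # row holds the column names.
    result = client.get_query_results(QueryExecutionId=query_id)
    return [
        [col.get("VarCharValue") for col in row["Data"]]
        for row in result["ResultSet"]["Rows"]
    ]
```

A call might look like run_athena_query(sql, "default", "s3://your-results-bucket/athena/"), where both the database and the bucket are assumptions. Larger result sets would need pagination with the NextToken returned by get_query_results.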

For more ideas of what to look for, see https://github.com/easttimor/aws-incident-response
