Framework for inspecting and editing traffic in AWS VPCs

flowdog

This is an application/framework for inspection and manipulation of network traffic in AWS VPCs. Packets routed to or from the Internet, between VPCs, or between subnets can all be programmatically inspected or modified in great detail.

This is achieved via AWS Gateway Load Balancers. GWLBs are a cloud-native alternative to NAT instances. They can auto-scale, they can be highly available across availability zones and they can even be provided as managed services from entirely separate AWS accounts.

But they're hard to use*. This project tries to make them easier. See further down for an explanation of the difficulty.

Example use cases

These are really just intended to demonstrate that anything is possible in the world of software-defined networking. Please ping me on Twitter with any cool ideas you have. Or any enhancements to the following ideas.

  • lambda_acceptor/lambda_acceptor.go takes the idea of AWS API Gateway Lambda authorizers and applies it to VPC flows. At the start of every new connection, a Lambda function is invoked and returns a decision about whether to allow or drop the connection. It's like security groups 2.0. Input/output looks like this:

    authorizer-io

  • cloudfront_functions/rick.js is an example of how the CloudFront Functions event model can be applied to rewriting HTTP(S) requests inside a VPC. In this particular example, we're ensuring that any AWS Workspaces users visiting YouTube can only watch one particular video.

  • flowdogshark/flowdogshark.go is an extcap plugin for Wireshark that allows you to live-attach Wireshark to flowdog and capture traffic flowing through your VPC. Given that flowdog does TLS interception (see later section in README), it can even use Wireshark's support for decoding TLS. Here's an example of intercepting the Amazon SSM agent:

    wireshark demo

  • account_id_emf/account_id_emf.go is an example of scanning all AWS API calls made within the VPC for SigV4 auth headers, extracting the AWS account ID and emitting it to CloudWatch via specially-formatted logs that are turned into metrics. This could be used to alert on newly-seen account IDs: a potential indicator of a compromised instance.

  • upsidedown/upsidedown.go is an implementation of the classic Upside-Down-Ternet. It blurs and rotates every image 180° when browsing the net.

    upside down

  • sts_rickroll/sts_rickroll.go is another silly example. Here we are modifying the response of the AWS API call for aws sts get-caller-identity to return something unexpected. You could equally use the same logic to return your favourite video on every seventh object downloaded through an S3 VPC gateway.

    sts-rickroll

  • gwlb/websocket.go is not an example, but I got lazy. Nick Frichette had the great suggestion of intercepting the SSM agent for shenanigans. This code will detect websockets and parse messages, but right now only passes them back and forth. Soon™.

  • TODO: You could save HAR archives of all web traffic to buckets in S3 for later perusal.
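To make the account_id_emf idea above a little more concrete, here is a minimal sketch of pulling an account ID out of a SigV4 Authorization header. It assumes the access key ID is read from the Credential= field and that the account ID is recovered with the publicly documented base32 decoding of the key ID, which reportedly only holds for newer access keys. The function names are hypothetical and this is not necessarily how account_id_emf.go does it.

```go
package main

import (
	"encoding/base32"
	"encoding/binary"
	"errors"
	"fmt"
	"strings"
)

// accessKeyIDFromAuthorization extracts the access key ID from a SigV4
// Authorization header value of the form:
//   AWS4-HMAC-SHA256 Credential=<key id>/<date>/<region>/<service>/aws4_request, SignedHeaders=..., Signature=...
func accessKeyIDFromAuthorization(header string) (string, error) {
	const scheme = "AWS4-HMAC-SHA256"
	if !strings.HasPrefix(header, scheme) {
		return "", errors.New("not a SigV4 Authorization header")
	}
	for _, part := range strings.Split(strings.TrimPrefix(header, scheme), ",") {
		part = strings.TrimSpace(part)
		if !strings.HasPrefix(part, "Credential=") {
			continue
		}
		keyID, _, found := strings.Cut(strings.TrimPrefix(part, "Credential="), "/")
		if !found || keyID == "" {
			return "", errors.New("malformed Credential field")
		}
		return keyID, nil
	}
	return "", errors.New("no Credential field")
}

// accountIDFromAccessKeyID decodes the 12-digit account ID embedded in an
// access key ID. Assumption: the key uses the newer encoding in which the
// characters after the 4-character prefix are base32 data.
func accountIDFromAccessKeyID(keyID string) (string, error) {
	if len(keyID) <= 4 {
		return "", errors.New("access key ID too short")
	}
	enc := base32.StdEncoding.WithPadding(base32.NoPadding)
	raw, err := enc.DecodeString(strings.ToUpper(keyID[4:]))
	if err != nil {
		return "", err
	}
	if len(raw) < 6 {
		return "", errors.New("access key ID too short")
	}
	// Read the first 48 bits as a big-endian integer.
	buf := make([]byte, 8)
	copy(buf[2:], raw[:6])
	n := binary.BigEndian.Uint64(buf)
	// Drop the top bit and the low 7 bits; what remains is the account ID.
	return fmt.Sprintf("%012d", (n&0x7fffffffff80)>>7), nil
}

func main() {
	// Example key ID taken from public write-ups of this technique.
	header := "AWS4-HMAC-SHA256 Credential=ASIAQNZGKIQY56JQ7WML/20210101/us-east-1/sts/aws4_request, " +
		"SignedHeaders=host;x-amz-date, Signature=abc123"
	keyID, err := accessKeyIDFromAuthorization(header)
	if err != nil {
		panic(err)
	}
	account, err := accountIDFromAccessKeyID(keyID)
	if err != nil {
		panic(err)
	}
	fmt.Println(account) // prints 029608264753
}
```

The resulting account ID could then be written out as a metric dimension and alerted on when a previously unseen value appears.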

What about TLS?

As great as GWLBs are, they're not magic. We haven't broken TLS. For this app, we create a custom root certificate authority and add it to the trust store on our EC2 instances. Rather than deal in sensitive private key material, we use AWS KMS' support for asymmetric keys for our private key. generate.go creates a certificate using that key. That certificate is then stored and trusted on the OS (e.g. in Amazon Linux 2 you would run cat $CERT >> /usr/share/pki/ca-trust-source/anchors/lol.pem && update-ca-trust).

Rather than invoking KMS on every TLS connection, on launch this app creates an ephemeral key pair and certificate in memory, asks KMS to sign it and then uses that as an intermediate certificate authority. This means we can have fast TLS de/re-encryption with no stored secrets.
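The ephemeral intermediate CA described above can be sketched with nothing but the Go standard library. This is a rough illustration, not the code in this repo: the root key here is a local in-memory ECDSA key standing in for the KMS key, and issueCA, newEphemeralIntermediate and demo are hypothetical names. In the real setup the root signer would be some crypto.Signer implementation that forwards signing requests to KMS, which is not shown.

```go
package main

import (
	"crypto"
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// issueCA creates a CA certificate for pub, signed by signer on behalf of
// parent. When parent is nil the certificate is self-signed.
func issueCA(cn string, pub crypto.PublicKey, parent *x509.Certificate, signer crypto.Signer, ttl time.Duration) (*x509.Certificate, error) {
	serial, err := rand.Int(rand.Reader, new(big.Int).Lsh(big.NewInt(1), 127))
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          serial,
		Subject:               pkix.Name{CommonName: cn},
		NotBefore:             time.Now().Add(-time.Minute),
		NotAfter:              time.Now().Add(ttl),
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign | x509.KeyUsageDigitalSignature,
	}
	if parent == nil {
		parent = tmpl
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, parent, pub, signer)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

// newEphemeralIntermediate generates a key pair that only ever lives in
// memory and asks the root signer to certify it as an intermediate CA.
func newEphemeralIntermediate(root *x509.Certificate, rootSigner crypto.Signer) (*x509.Certificate, *ecdsa.PrivateKey, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	cert, err := issueCA("ephemeral intermediate (sketch)", &key.PublicKey, root, rootSigner, 24*time.Hour)
	return cert, key, err
}

// demo builds a stand-in root (assumption: replaces the KMS-backed root) and
// one ephemeral intermediate signed by it.
func demo() (root, intermediate *x509.Certificate, err error) {
	rootKey, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	root, err = issueCA("stand-in root (sketch)", &rootKey.PublicKey, nil, rootKey, 365*24*time.Hour)
	if err != nil {
		return nil, nil, err
	}
	intermediate, _, err = newEphemeralIntermediate(root, rootKey)
	return root, intermediate, err
}

func main() {
	root, intermediate, err := demo()
	if err != nil {
		panic(err)
	}
	pool := x509.NewCertPool()
	pool.AddCert(root)
	_, err = intermediate.Verify(x509.VerifyOptions{
		Roots:     pool,
		KeyUsages: []x509.ExtKeyUsage{x509.ExtKeyUsageAny},
	})
	fmt.Println("intermediate chains to root:", err == nil)
}
```

Because the root is only consulted once per launch, every per-connection leaf certificate can be signed locally with the in-memory intermediate key.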

When Wireshark is attached, flowdog can stream TLS key logs in NSS Key Log Format. This allows the Wireshark user to view all decrypted TLS traffic without giving away either the KMS private key (impossible) or intermediate CA private key (very unwise).
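For a feel of what those key logs look like, Go's crypto/tls can write them itself through the KeyLogWriter field of tls.Config. The sketch below is only an illustration under a few assumptions: it runs a single handshake over an in-memory net.Pipe, uses a throwaway self-signed certificate, skips verification on the client, and pins TLS 1.2 so the log is a single CLIENT_RANDOM line. throwawayCert and handshakeKeyLog are hypothetical names, and how flowdog actually streams the log to Wireshark is not shown here.

```go
package main

import (
	"bytes"
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"net"
	"time"
)

// throwawayCert returns a self-signed certificate used only for this demo.
func throwawayCert() (tls.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return tls.Certificate{}, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "example.test"},
		DNSNames:     []string{"example.test"},
		NotBefore:    time.Now().Add(-time.Minute),
		NotAfter:     time.Now().Add(time.Hour),
		KeyUsage:     x509.KeyUsageDigitalSignature,
		ExtKeyUsage:  []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return tls.Certificate{}, err
	}
	return tls.Certificate{Certificate: [][]byte{der}, PrivateKey: key}, nil
}

// handshakeKeyLog runs one in-memory TLS handshake and returns whatever the
// server side wrote to its KeyLogWriter.
func handshakeKeyLog() (string, error) {
	cert, err := throwawayCert()
	if err != nil {
		return "", err
	}
	var keyLog bytes.Buffer
	serverCfg := &tls.Config{
		Certificates: []tls.Certificate{cert},
		MaxVersion:   tls.VersionTLS12, // pinned for the demo only
		KeyLogWriter: &keyLog,          // secrets are written here in NSS Key Log Format
	}
	clientCfg := &tls.Config{InsecureSkipVerify: true} // demo only, never in production

	clientRaw, serverRaw := net.Pipe()
	defer clientRaw.Close()
	defer serverRaw.Close()
	server := tls.Server(serverRaw, serverCfg)
	client := tls.Client(clientRaw, clientCfg)

	done := make(chan error, 1)
	go func() { done <- server.Handshake() }()
	if err := client.Handshake(); err != nil {
		return "", err
	}
	if err := <-done; err != nil {
		return "", err
	}
	return keyLog.String(), nil
}

func main() {
	keyLog, err := handshakeKeyLog()
	if err != nil {
		panic(err)
	}
	// Each line has the shape: <label> <client random in hex> <secret in hex>
	fmt.Print(keyLog)
}
```

The secrets in the log are per-session, which is why sharing them reveals the traffic of those sessions without exposing any CA private key.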

Why so hard?

(*) GWLBs aren't hard themselves. Look at this diagram (from Amazon's blog post):

amazon diagram

It's inspecting and modifying network traffic in general that is extremely difficult. Especially non-trivial modifications. Take the following diagram as an example. This is just one packet in a flow of packets between an EC2 instance and the Internet when curl https://google.com is run.

packet diagram

You can think of this packet as having many layers. Each layer "wraps" the layer below it. The bottom six layers were sent by the EC2 instance. The top three layers are GWLB-specific. They identify which VPC endpoint (e.g. customer) the packet came from and which "flow" of packets this particular packet belongs to.
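To give a feel for the GWLB-specific part, here is a small standard-library sketch that parses a Geneve header (the payload of the outer UDP datagram) and pulls out the endpoint ID and flow cookie. It assumes, based on my reading of the AWS documentation, that GWLB uses Geneve option class 0x0108 with option types 1 (endpoint ID), 2 (attachment ID) and 3 (flow cookie); treat those constants and the GeneveInfo/parseGeneve names as assumptions. flowdog itself uses gopacket for this rather than hand-rolled parsing.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Assumption: the option class AWS GWLB uses for its Geneve options.
const gwlbOptionClass = 0x0108

// GeneveInfo holds the fields this sketch cares about.
type GeneveInfo struct {
	ProtocolType uint16 // EtherType of the inner packet, e.g. 0x0800 for IPv4
	VNI          uint32
	EndpointID   uint64 // which VPC endpoint the packet arrived through
	AttachmentID uint64
	FlowCookie   uint32 // identifies the flow this packet belongs to
	Inner        []byte // the packet originally sent by the EC2 instance
}

// parseGeneve parses a Geneve header as laid out in RFC 8926.
func parseGeneve(b []byte) (*GeneveInfo, error) {
	if len(b) < 8 {
		return nil, errors.New("short Geneve header")
	}
	if ver := b[0] >> 6; ver != 0 {
		return nil, fmt.Errorf("unsupported Geneve version %d", ver)
	}
	optLen := int(b[0]&0x3f) * 4 // options length is counted in 4-byte words
	if len(b) < 8+optLen {
		return nil, errors.New("truncated Geneve options")
	}
	info := &GeneveInfo{
		ProtocolType: binary.BigEndian.Uint16(b[2:4]),
		VNI:          uint32(b[4])<<16 | uint32(b[5])<<8 | uint32(b[6]),
		Inner:        b[8+optLen:],
	}
	opts := b[8 : 8+optLen]
	for len(opts) >= 4 {
		class := binary.BigEndian.Uint16(opts[0:2])
		typ := opts[2]
		dataLen := int(opts[3]&0x1f) * 4
		if len(opts) < 4+dataLen {
			return nil, errors.New("truncated Geneve option")
		}
		data := opts[4 : 4+dataLen]
		if class == gwlbOptionClass {
			switch {
			case typ == 1 && dataLen == 8:
				info.EndpointID = binary.BigEndian.Uint64(data)
			case typ == 2 && dataLen == 8:
				info.AttachmentID = binary.BigEndian.Uint64(data)
			case typ == 3 && dataLen == 4:
				info.FlowCookie = binary.BigEndian.Uint32(data)
			}
		}
		opts = opts[4+dataLen:]
	}
	return info, nil
}

func main() {
	// Hand-built sample: 8-byte header, three options, two bytes of inner packet.
	pkt := []byte{
		0x08, 0x00, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00,
		0x01, 0x08, 0x01, 0x02, 0, 0, 0, 0, 0, 0, 0, 0x2a,
		0x01, 0x08, 0x02, 0x02, 0, 0, 0, 0, 0, 0, 0, 0x07,
		0x01, 0x08, 0x03, 0x01, 0xde, 0xad, 0xbe, 0xef,
		0x45, 0x00,
	}
	info, err := parseGeneve(pkt)
	if err != nil {
		panic(err)
	}
	fmt.Printf("endpoint=%#x cookie=%#x inner=%d bytes\n", info.EndpointID, info.FlowCookie, len(info.Inner))
}
```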

Say we want to change all web requests to google.com to have the User-Agent request header instead be lower-case, e.g. user-agent. This would require us to parse the formats for:

  • The inner IPv4 layer, to identify whether this is a TCP packet
  • The TCP layer, to identify if the destination port is 80 or 443
  • The TLS layer, to (magically, for now) decrypt the payload
  • The HTTP/2 layer, to inspect the multiplexed streams within
  • The frames in each HTTP/2 stream, to identify if they are a HEADERS frame.
  • The headers in the HTTP/2 frame, to see if the User-Agent header is present.

Finally we would have to edit the packet in memory at the right offset to change U to u and A to a, correct the checksums at every layer of the packet and re-encrypt the TLS payload. That's a lot of work.
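As a taste of just one of those chores, here is a sketch of recomputing an IPv4 header checksum after editing a header in place. It uses the standard Internet checksum (RFC 1071); the function names are mine, and the TCP checksum (which also covers a pseudo-header and the payload) is left out. The sample header is the widely used textbook example whose checksum is 0xb861.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// internetChecksum returns the RFC 1071 checksum of b: the one's complement
// of the one's-complement sum of its 16-bit big-endian words.
func internetChecksum(b []byte) uint16 {
	var sum uint32
	for i := 0; i+1 < len(b); i += 2 {
		sum += uint32(b[i])<<8 | uint32(b[i+1])
	}
	if len(b)%2 == 1 {
		sum += uint32(b[len(b)-1]) << 8 // pad the odd trailing byte with zero
	}
	for sum>>16 != 0 {
		sum = (sum & 0xffff) + (sum >> 16) // fold carries back in
	}
	return ^uint16(sum)
}

// fixIPv4Checksum rewrites the checksum field (bytes 10 and 11) of an IPv4
// header after some other field has been modified.
func fixIPv4Checksum(hdr []byte) error {
	if len(hdr) < 20 {
		return errors.New("short IPv4 header")
	}
	ihl := int(hdr[0]&0x0f) * 4
	if ihl < 20 || len(hdr) < ihl {
		return errors.New("bad IPv4 header length")
	}
	hdr[10], hdr[11] = 0, 0
	binary.BigEndian.PutUint16(hdr[10:12], internetChecksum(hdr[:ihl]))
	return nil
}

func main() {
	hdr := []byte{
		0x45, 0x00, 0x00, 0x73, 0x00, 0x00, 0x40, 0x00,
		0x40, 0x11, 0xb8, 0x61, 0xc0, 0xa8, 0x00, 0x01,
		0xc0, 0xa8, 0x00, 0xc7,
	}
	// A header with a valid checksum sums to zero.
	fmt.Println("valid before edit:", internetChecksum(hdr) == 0)

	hdr[8]-- // decrement the TTL, as a router would
	fmt.Println("valid after edit: ", internetChecksum(hdr) == 0)

	if err := fixIPv4Checksum(hdr); err != nil {
		panic(err)
	}
	fmt.Println("valid after fix:  ", internetChecksum(hdr) == 0)
}
```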

And that's a trivial change: the packet length hasn't changed. Imagine if we wanted to insert a few additional headers in that request. Maybe that would push the packet length over the typical 1500-byte limit for packets on the Internet. That increases the amount of work needed by orders of magnitude: now we need to reimplement the TCP state machine, because we'll now need two packets. And those packets each need sequence numbers. But the original EC2 instance will get a response from Google for sequence numbers it didn't expect, so the connection will fail. So what we need to do is instead terminate the TCP connection at the GWLB appliance and open a new connection to Google from the GWLB appliance. The app will need to juggle these two TCP connections and pass the underlying data to and from Google and the EC2 instance, all while keeping the two connections' different states in sync.
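Once each side has its own terminated connection, the "juggling" can be sketched as a plain byte relay, and length-changing edits stop being a problem because each connection keeps its own sequence numbers. The following is a simplified illustration using only the standard library and in-memory pipes. relay, copyWith, addHeader and demo are hypothetical names, the X-Example header is made up, and rewriting per read chunk is an assumption that would break if a request were split across reads (a real proxy would parse HTTP properly).

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net"
	"sync"
)

// copyWith reads chunks from src, passes each through rewrite and writes
// the result to dst until src is exhausted.
func copyWith(dst io.Writer, src io.Reader, rewrite func([]byte) []byte) error {
	buf := make([]byte, 32*1024)
	for {
		n, err := src.Read(buf)
		if n > 0 {
			if _, werr := dst.Write(rewrite(buf[:n])); werr != nil {
				return werr
			}
		}
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
	}
}

// relay splices two independently terminated connections together,
// rewriting only the client-to-upstream direction.
func relay(client, upstream net.Conn, rewriteRequest func([]byte) []byte) {
	identity := func(b []byte) []byte { return b }
	var wg sync.WaitGroup
	wg.Add(2)
	go func() {
		defer wg.Done()
		copyWith(upstream, client, rewriteRequest)
		upstream.Close()
	}()
	go func() {
		defer wg.Done()
		copyWith(client, upstream, identity)
		client.Close()
	}()
	wg.Wait()
}

// addHeader inserts an extra (made-up) header before the blank line that
// ends the request headers, which changes the length of the data.
func addHeader(b []byte) []byte {
	return bytes.Replace(b, []byte("\r\n\r\n"), []byte("\r\nX-Example: 1\r\n\r\n"), 1)
}

// demo sends request through the relay and returns what the upstream saw.
func demo(request string) (string, error) {
	clientApp, clientSide := net.Pipe()   // stands in for EC2 instance <-> appliance
	upstreamSide, server := net.Pipe()    // stands in for appliance <-> origin
	go relay(clientSide, upstreamSide, addHeader)
	go func() {
		clientApp.Write([]byte(request))
		clientApp.Close()
	}()
	got, err := io.ReadAll(server)
	return string(got), err
}

func main() {
	got, err := demo("GET / HTTP/1.1\r\nHost: example.test\r\n\r\n")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", got)
}
```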

That's so much work that it's no wonder that even after more than a year, only massive well-funded vendors have implemented this capability. And even then, it looks like they're limited to either read-only inspection or dropping suspicious packets.

vendors

So I built this thing. It uses a handful of packages to make traffic inspection and modification accessible to even developers like you or me. Those packages are:

  • inet.af/netstack: a reimplementation of the entire Linux TCP/IP stack in Go, extracted from the gVisor project.

  • github.com/google/gopacket to extract and parse the Geneve, IP, TCP, UDP, etc. layers from the raw packets delivered by the GWLB.

  • httputil in the Go stdlib, to reverse-proxy HTTP and HTTPS traffic and parse flows into individual request and response objects.

  • github.com/aws/aws-sdk-go to use AWS KMS asymmetric keys for the root certificate authority that can be installed on EC2 instances for transparent TLS decryption - without having to manage a highly-sensitive private key.

  • rogchap.com/v8go to embed the V8 JavaScript engine into Go, so that we can write scripts to modify traffic in JS, which is more familiar than Go to many developers.
