metadataproxy

A proxy for AWS's metadata service that gives out scoped IAM credentials from STS.

The metadataproxy allows containers to acquire IAM roles. By metadata we mean the EC2 instance metadata that is normally available to EC2 instances. The proxy exposes this metadata to containers running inside or outside of EC2 hosts, allowing you to provide scoped IAM roles to individual containers rather than giving them the full permissions of the host's IAM role or an IAM user.

Installation

From inside of the repo run the following commands:

mkdir -p /srv/metadataproxy
cd /srv/metadataproxy
virtualenv venv
source venv/bin/activate
pip install metadataproxy
deactivate

Configuration

Modes of operation

See the settings file for specific configuration options.

The metadataproxy has two basic modes of operation:

  1. Running in AWS where it simply proxies most routes to the real metadata service.
  2. Running outside of AWS where it mocks out most routes.

To enable mocking, use the environment variable:

export MOCK_API=true
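
As a quick check that mocking is active (a sketch; assumes the proxy is running locally on its default port of 8000 and MOCKED_INSTANCE_ID is left at its default):

# Should return the mocked instance id ("mockedid" by default)
curl http://localhost:8000/latest/meta-data/instance-id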

AWS credentials

metadataproxy relies on boto configuration for its AWS credentials. If instance metadata IAM credentials are available, it will use those. Otherwise, you'll need to provide IAM credentials via .aws/credentials, .boto, or environment variables before the service is started.
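
For example, when running outside of EC2 you might export credentials into the environment before starting the service (a sketch; the values shown are placeholders):

# Placeholder credentials picked up by boto from the environment
export AWS_ACCESS_KEY_ID=AKIAEXAMPLEKEYID
export AWS_SECRET_ACCESS_KEY=example-secret-access-key
export AWS_DEFAULT_REGION=us-east-1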

Role assumption

For IAM routes, the metadataproxy will use STS to assume roles for containers. To do so, it takes the source IP address of each incoming metadata request and finds the running docker container associated with that IP address. It uses the value of the container's IAM_ROLE environment variable as the role it will assume. It then assumes the role and returns STS credentials in the metadata response.

STS-attained credentials are cached and automatically rotated as they expire.
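
From a container's point of view, the credentials are fetched through the standard instance metadata IAM paths. A minimal illustration (a sketch; assumes metadata traffic from the container is already routed to the proxy as described below, and that the container was started with IAM_ROLE=my-role):

# List the role name the proxy resolved for this container
curl http://169.254.169.254/latest/meta-data/iam/security-credentials/

# Fetch the temporary credentials for that role; the response is JSON
# containing AccessKeyId, SecretAccessKey, Token and Expiration
curl http://169.254.169.254/latest/meta-data/iam/security-credentials/my-role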

Container-specific roles

To specify the role of a container, simply launch it with the IAM_ROLE environment variable set to the IAM role you wish the container to run with.

If the trust policy for the role requires an ExternalId, you can set this using the IAM_EXTERNAL_ID environment variable. This is most frequently used with cross-account role access scenarios. For more information on when you should use an External ID for your roles, see:

http://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_create_for-user_externalid.html

docker run -e IAM_ROLE=my-role ubuntu:14.04
docker run -e IAM_ROLE=their-role@another-account -e IAM_EXTERNAL_ID=random-unique-string ubuntu:14.04

Configurable Behavior

There are a number of environment variables that can be set to tune metadataproxy's behavior. They can either be exported by the start script or set via docker environment variables.

DEFAULT_ROLE (String): Role to use if IAM_ROLE is not set in a container's environment. If unset, the container will get no IAM credentials.
DEFAULT_ACCOUNT_ID (String): The default account ID to assume roles in when IAM_ROLE does not contain account information. If unset, metadataproxy will attempt to look up role ARNs using iam:GetRole.
ROLE_SESSION_KEY (String): Optional key in container labels or environment variables to use for the role session name. Prefix the key with Labels: or Env: to indicate where it should be found. Useful for passing through metadata such as a CI job ID or the launching user for audit purposes, since the role session name is included in the ARN that appears in access logs.
DEBUG (Boolean, default False): Enable debug mode. You should not do this in production, as it will leak IAM credentials into your logs.
DOCKER_URL (String, default unix://var/run/docker.sock): URL of the docker daemon. The default is to access docker via its socket.
METADATA_URL (String, default http://169.254.169.254): URL of the metadata service. The default is the normal location of the metadata service in AWS.
MOCK_API (Boolean, default False): Whether or not to mock all metadata endpoints. If True, mocked data is returned to callers. If False, all endpoints except the IAM endpoints are proxied through to the real metadata service.
MOCKED_INSTANCE_ID (String, default mockedid): When mocking the API, use this instance ID in returned data.
AWS_ACCOUNT_MAP (JSON String, default {}): A mapping of account names to account IDs. This allows you to use user-friendly names instead of account IDs in IAM_ROLE environment variable values.
AWS_REGION (String): AWS region for the STS endpoint; lets you call a regional STS endpoint instead of the global one.
ROLE_EXPIRATION_THRESHOLD (Integer, default 15): The threshold, in minutes before credentials expire, at which metadataproxy will attempt to load new credentials.
ROLE_MAPPING_FILE (Path String): A JSON file containing a dict mapping of IP addresses to role names. Can be used if docker networking has been disabled and you are managing IP addressing for containers through another process.
ROLE_REVERSE_LOOKUP (Boolean, default False): Perform a reverse lookup of incoming IP addresses to match containers by hostname. Useful if you've disabled networking in docker but set hostnames for containers in /etc/hosts or DNS.
HOSTNAME_MATCH_REGEX (Regex String, default ^.*$): Limit reverse-lookup container matching to hostnames that match the specified pattern.
PATCH_ECS_ALLOWED_HOSTS (String): Patch botocore's allowed hosts for ContainerMetadataFetcher to support aws-vault's --ecs-server option. This injects the provided host into the addresses botocore allows for the AWS_CONTAINER_CREDENTIALS_FULL_URI environment variable.
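
For example, a start script might export a few of these before launching the proxy (a sketch; the account ID, account name, role name, and environment variable key are placeholders):

# Map a friendly account name to an account ID, so containers can use
# IAM_ROLE=some-role@my-other-account
export AWS_ACCOUNT_MAP='{"my-other-account": "012345678901"}'

# Fall back to a low-privilege role for containers that set no IAM_ROLE
export DEFAULT_ROLE=ContainerDefaultRole

# Take the role session name from each container's CI_JOB_ID environment variable
export ROLE_SESSION_KEY=Env:CI_JOB_ID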

Default Roles

When no role is matched, metadataproxy will use the role specified in its DEFAULT_ROLE environment variable. If no DEFAULT_ROLE is specified as a fallback, containers launched without an IAM_ROLE environment variable will fail to retrieve credentials.

Role Formats

The following are all supported formats for specifying roles:

  • By Role:

    IAM_ROLE=my-role
  • By Role@AccountId:

    IAM_ROLE=my-role@012345678910
  • By ARN:

    IAM_ROLE=arn:aws:iam::012345678910:role/my-role

Role structure

A useful way to deploy this metadataproxy is with a two-tier role structure:

  1. The first tier is the EC2 service role for the instances running your containers. Call it DockerHostRole. Your instances must be launched with an instance profile that assigns this role.

  2. The second tier is the role that each container will use. These roles must trust your own account ("Role for Cross-Account Access" in AWS terms). Call it ContainerRole1.

  3. metadataproxy needs to be able to query and assume each container role, so the DockerHostRole policy must permit this for each container role. For example:

    "Statement": [ {
        "Effect": "Allow",
        "Action": [
            "iam:GetRole",
            "sts:AssumeRole"
        ],
        "Resource": [
            "arn:aws:iam::012345678901:role/ContainerRole1",
            "arn:aws:iam::012345678901:role/ContainerRole2"
        ]
    } ]
    
  4. Now customize ContainerRole1 and friends as you like.

Note: The ContainerRole1 role should have a trust relationship that allows it to be assumed by the principal associated with the host machine making the sts:AssumeRole call. An example trust relationship for ContainerRole1 may look like:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::012345678901:root",
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
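
To tie this together, the roles and policies above could be created with the AWS CLI along these lines (a sketch; assumes DockerHostRole already exists and that the policy documents shown above have been saved as container-trust.json and assume-container-roles.json):

# Create the per-container role with the trust relationship shown above
aws iam create-role \
  --role-name ContainerRole1 \
  --assume-role-policy-document file://container-trust.json

# Allow the docker host's role to look up and assume the container roles
aws iam put-role-policy \
  --role-name DockerHostRole \
  --policy-name assume-container-roles \
  --policy-document file://assume-container-roles.json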

Routing container traffic to metadataproxy

Using iptables, we can forward traffic meant for 169.254.169.254 from docker0 to the metadataproxy. The following example assumes the metadataproxy is run on the host, and not in a container:

/sbin/iptables \
  --append PREROUTING \
  --destination 169.254.169.254 \
  --protocol tcp \
  --dport 80 \
  --in-interface docker0 \
  --jump DNAT \
  --table nat \
  --to-destination 127.0.0.1:8000 \
  --wait

If you'd like to start the metadataproxy in a container, it's recommended to use host-only networking. Also, it's necessary to volume-mount the docker socket into the container, as metadataproxy must be able to interact with docker.

Be aware that non-host-mode containers will not be able to contact 127.0.0.1 in the host network stack. As an alternative, you can use the metadata service to find the host's local address. In this case, you probably want to restrict proxy access to the docker0 interface!

LOCAL_IPV4=$(curl http://169.254.169.254/latest/meta-data/local-ipv4)

/sbin/iptables \
  --append PREROUTING \
  --destination 169.254.169.254 \
  --protocol tcp \
  --dport 80 \
  --in-interface docker0 \
  --jump DNAT \
  --table nat \
  --to-destination $LOCAL_IPV4:8000 \
  --wait

/sbin/iptables \
  --wait \
  --insert INPUT 1 \
  --protocol tcp \
  --dport 80 \
  \! \
  --in-interface docker0 \
  --jump DROP
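
Before pointing containers at the proxy, you can verify that both rules are in place (a sketch; output formatting varies between iptables versions):

# List the NAT PREROUTING rules; the DNAT to the proxy should be present
/sbin/iptables --table nat --list PREROUTING --numeric

# List the INPUT rules; the DROP for traffic not arriving on docker0 should be present
/sbin/iptables --list INPUT --numeric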

Run metadataproxy without docker

In the following, we assume my_config is a bash file with exports for all of the necessary settings discussed in the configuration section.

source my_config
cd /srv/metadataproxy
source venv/bin/activate
gunicorn metadataproxy:app --workers=2 -k gevent

Run metadataproxy with docker

For production purposes, you'll want to run metadataproxy in a container. You can build an image with the included Dockerfile. To run it, do something like:

docker run --net=host \
    -v /var/run/docker.sock:/var/run/docker.sock \
    lyft/metadataproxy
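
Building the image and passing configuration through the environment might look like this (a sketch; the role name and account map are placeholders):

# Build the image from the repository's Dockerfile
docker build -t lyft/metadataproxy .

# Run with host networking, the docker socket mounted, and some configuration
docker run --net=host \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -e DEFAULT_ROLE=ContainerDefaultRole \
    -e AWS_ACCOUNT_MAP='{"my-other-account": "012345678901"}' \
    lyft/metadataproxy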

gunicorn settings

The following environment variables can be set to configure gunicorn (the examples below show the default values):

# Change the IP address the gunicorn worker is listening on. You likely want to
# leave this as the default
HOST=0.0.0.0

# Change the port the gunicorn worker is listening on.
PORT=8000

# Change the number of worker processes gunicorn will run with. The default is
# 1, which is likely enough since metadataproxy is using gevent and its work is
# completely IO bound. Increasing the number of workers will likely make your
# in-memory cache less efficient
WORKERS=1

# Enable debug mode (you should not do this in production as it will leak IAM
# credentials into your logs)
DEBUG=False

Contributing

Code of conduct

This project is governed by Lyft's code of conduct. All contributors and participants agree to abide by its terms.

Sign the Contributor License Agreement (CLA)

We require a CLA for code contributions, so before we can accept a pull request we need to have a signed CLA. Please visit our CLA service and follow the instructions to sign the CLA.

File issues in GitHub

In general, all enhancements or bugs should be tracked via GitHub issues before PRs are submitted. We don't require them, but they help us plan and track.

When submitting bugs through issues, please try to be as descriptive as possible. It'll make it easier and quicker for everyone if the developers can easily reproduce your bug.

Submit pull requests

Our only method of accepting code changes is through GitHub pull requests.
