  • Stars: 585
  • Rank: 76,419 (Top 2%)
  • Language: Go
  • License: Apache License 2.0
  • Created: almost 4 years ago
  • Updated: 8 months ago

Repository Details

llama -- A CLI for outsourcing computation to AWS Lambda

Llama is a tool for running UNIX commands inside of AWS Lambda. Its goal is to make it easy to outsource compute-heavy tasks to Lambda, with its enormous available parallelism, from your shell.

Most notably, llama includes llamacc, a drop-in replacement for gcc or clang which executes the compilation in the cloud, allowing for considerable speedups building large C or C++ software projects.

Lambda offers nearly-arbitrary parallelism and burst capacity for compute, making it, in principle, well-suited as a backend for interactive tasks that briefly require large amounts of compute. This idea has been explored in the ExCamera and gg papers, but is not widely accessible at present.

Performance numbers

Here are a few performance results from my testing, demonstrating the speedups currently achievable with llamacc:

Project               | Hardware                                | Local build | Local time | llamacc build | llamacc time | Approx. llamacc cost
Linux v5.10 defconfig | Desktop (24-thread Ryzen 9 3900)        | make -j30   | 1:06       | make -j100    | 0:42         | $0.15
Linux v5.10 defconfig | Simulated laptop (limited to 4 threads) | make -j8    | 4:56       | make -j100    | 1:26         | $0.15
clang+LLVM, -O0       | Desktop (24-thread Ryzen 9 3900)        | ninja -j30  | 5:33       | ninja -j400   | 1:24         | $0.49

As you can see, Llama achieves speedups for large builds even on my powerful desktop system, and the advantage is more pronounced on smaller workstations.

Getting started

Dependencies

  • A Linux x86_64 machine. Llama only supports that platform for now. Cross-compilation should in theory be possible but is not implemented.
  • The Go compiler. Llama is tested on v1.16 but older versions may work.
  • An AWS account
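
You can quickly sanity-check the first two requirements from your shell:

$ uname -sm    # expect: Linux x86_64
$ go version   # Llama is tested on v1.16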

Install llama

You'll need to install Llama from source. You can run

go install github.com/nelhage/llama/cmd/...@latest

or clone this repository and run

go install ./...

If you want to build C++, you'll want to symlink llamac++ to point at llamacc:

ln -nsf llamacc "$(dirname $(which llamacc))/llamac++"

Set up your AWS credentials

Llama needs access to your AWS credentials. You can provide them in the environment via AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY, but the recommended approach is to use the shared credentials file at ~/.aws/credentials, as used by the AWS CLI. Llama will read keys from either source.
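
If you're creating the credentials file by hand, the standard format looks like this (the values here are placeholders; substitute your own keys):

[default]
aws_access_key_id = AKIAXXXXXXXXXXXXXXXX
aws_secret_access_key = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx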

The account whose credentials you use must have sufficient permissions. The following AWS managed policies should suffice:

  • AmazonEC2ContainerRegistryFullAccess
  • AmazonS3FullAccess
  • AWSCloudFormationFullAccess
  • AWSLambdaFullAccess
  • IAMFullAccess
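
If you manage IAM from the command line, you can attach these policies with the AWS CLI (a sketch; llama-user is a hypothetical IAM user name, so substitute your own user or role):

$ for policy in AmazonEC2ContainerRegistryFullAccess AmazonS3FullAccess \
      AWSCloudFormationFullAccess AWSLambdaFullAccess IAMFullAccess; do
    aws iam attach-user-policy --user-name llama-user \
      --policy-arn "arn:aws:iam::aws:policy/$policy"
  done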

Configure llama's AWS resources

Llama includes a CloudFormation template and a command which uses it to bootstrap all required resources. You can read the template to see what it's going to do.

Once your AWS credentials are ready, run

$ llama bootstrap

to create the required AWS resources. By default, it will prompt you for an AWS region to use; you can avoid the prompt using (e.g.) llama -region us-west-2 bootstrap.

If you get an error like

Creating cloudformation stack...
Stack created. Polling until completion...
Stack is in rollback: ROLLBACK_IN_PROGRESS. Something went wrong.
Stack status reason: The following resource(s) failed to create: [Repository, Bucket]. Rollback requested by user.

then you can go to the AWS web console, and find the relevant CloudFormation stack. The event log should have more useful errors explaining what went wrong. You will then need to delete the stack before retrying the bootstrap.
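
The same investigation and cleanup can be done with the standard AWS CLI (a sketch; the stack name llama is an assumption here, so substitute whatever llama bootstrap created):

$ aws cloudformation describe-stack-events --stack-name llama
$ aws cloudformation delete-stack --stack-name llama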

Set up a GCC image

You'll need to build a container with an appropriate version of GCC for llamacc to use.

If you are running Debian or Ubuntu, you can use scripts/build-gcc-image to automatically build a Debian image and Lambda function matching your local system:

$ scripts/build-gcc-image

If you want more control or are running another distribution, you can look at images/gcc-focal for an example Dockerfile to build a compiler package. You can build that or a similar image into a Lambda function using llama update-function like so:

$ llama update-function --create --build=images/gcc-focal gcc

Using llamacc

To use llamacc, run a build using make or a similar build system with a much higher -j concurrency than you normally would (try 5-10x the number of local cores), using llamacc or llamac++ as your compiler. For example, you might invoke

$ make -j100 CC=llamacc CXX=llamac++
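
Before kicking off a full build, it can be useful to smoke-test a single compilation (an illustrative check; hello.c is any small source file, and LLAMACC_VERBOSE is described in the next section):

$ echo 'int main(void) { return 0; }' > hello.c
$ LLAMACC_VERBOSE=1 llamacc -c hello.c -o hello.o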

llamacc configuration

llamacc takes a number of configuration options from the environment, so that they're easy to pass through your build system. The currently supported options include:

Variable                 | Meaning
LLAMACC_VERBOSE          | Print commands executed by llamacc.
LLAMACC_LOCAL            | Run the compilation locally. Useful for e.g. CC=llamacc ./configure.
LLAMACC_REMOTE_ASSEMBLE  | Assemble .S or .s files remotely, as well as C/C++.
LLAMACC_FUNCTION         | Override the name of the Lambda function for the compiler.
LLAMACC_LOCAL_CC         | The C compiler to delegate to locally, instead of cc.
LLAMACC_LOCAL_CXX        | The C++ compiler to delegate to locally, instead of c++.
LLAMACC_LOCAL_PREPROCESS | Run the preprocessor locally and send preprocessed source text to the cloud, instead of individual headers. Uses less total compute but much more bandwidth; this can easily saturate your uplink on large builds.
LLAMACC_FULL_PREPROCESS  | Run the full preprocessor locally, not just #include processing. Disables use of GCC-specific -fdirectives-only.
LLAMACC_BUILD_ID         | Assigns an ID to the build. Used for Llama's internal tracing support.
LLAMACC_FILTER_WARNINGS  | Filters the given comma-separated list of warnings out of all compilations, e.g. LLAMACC_FILTER_WARNINGS=missing-include-dirs,packed-not-aligned.

It is strongly recommended that you use absolute paths if you set LLAMACC_LOCAL_CC and LLAMACC_LOCAL_CXX. Not all build systems will preserve $PATH all the way down to llamacc, so if you don't use absolute paths, you can get build failures that are difficult to diagnose.
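 
Putting a few of these together: configure scripts compile many tiny test programs, so a common pattern is to run configure locally and only farm out the real build (a sketch combining the variables above):

$ LLAMACC_LOCAL=1 CC=llamacc CXX=llamac++ ./configure
$ LLAMACC_LOCAL_CC=/usr/bin/gcc LLAMACC_LOCAL_CXX=/usr/bin/g++ make -j100 CC=llamacc CXX=llamac++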

Other features

llama invoke

You can use llama invoke to execute individual commands inside of Lambda. The syntax is llama invoke <function> <command> args.... <function> must be the name of a Lambda function using the Llama runtime. So, for instance, we can inspect the OS running inside our Lambda image:

$ llama invoke gcc uname -a
Linux 169.254.248.253 4.14.225-175.364.amzn2.x86_64 #1 SMP Mon Mar 22 22:06:01 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

If your command takes files as input or produces files as output, you can use the -f and -o options to specify files that should be passed between the local and remote nodes. For instance:

$ llama invoke -f README.md:INPUT -o OUTPUT gcc sh -c 'sha256sum INPUT > OUTPUT'; cat OUTPUT
16c399c108bb783fc5c4529df4fecd0decb81bc0707096ebd981ab2b669fae20  INPUT

Note the use of LOCAL:REMOTE syntax to optionally specify different paths between the local and remote ends.
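
For example, to upload a local file under a different name on the remote side (illustrative paths; sha256sum is available in the gcc image, as shown above):

$ llama invoke -f ./data/input.bin:/tmp/in gcc sha256sum /tmp/in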

llama xargs

llama xargs provides an xargs-like interface for running commands in parallel in Lambda. Here's an example:

The optipng command compresses PNG files and otherwise optimizes them to be as small as possible; it's typically used to save bandwidth and speed load times on image assets. optipng is somewhat computationally expensive, and compressing a large number of PNG files can be slow. With llama, we can optimize a large number of images by outsourcing the computation to Lambda.

I prepared a directory full of 151 PNG images of the original Pokémon, and benchmarked how long it took to optimize them using 8 concurrent processes on my desktop:

$ time ls -1 *.png | parallel -j 8 optipng {} -out optimized/{/}
[...]
real    0m45.090s
user    5m33.745s
sys     0m0.924s

Once we've prepared an optipng Lambda function (we'll talk about setup in a later section), we can use llama to run the same computation in AWS Lambda:

$ time ls -1 *.png | llama xargs -logs -j 151 optipng optipng '{{.I .Line}}' -out '{{.O (printf "optimized/%s" .Line)}}'
real    0m16.024s
user    0m2.013s
sys     0m0.569s

We use llama xargs, which works a bit like xargs(1), but runs each input line as a separate command in Lambda. It also uses the Go template language to provide flexibility in substitutions, and offers the special .Input and .Output methods (.I and .O for short) to mark files to be passed back and forth between the local environment and Lambda.
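
To make the template concrete, here is what each substitution in the invocation above does (an annotated reading, not new syntax):

# '{{.I .Line}}'
#     marks the file named by the input line as an input: llama uploads it
#     and substitutes its path on the remote side.
# '{{.O (printf "optimized/%s" .Line)}}'
#     marks optimized/<line> as an output: llama substitutes a remote path
#     and downloads the result back to optimized/<line> when the command exits.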

Lambda's CPUs are slower than my desktop's, and the network operations have overhead, so we don't see anywhere near the full 151/8 speedup. However, the additional parallelism still nets us nearly a 3x improvement in wall-clock time. Note also the vastly decreased user time, showing that the CPU-intensive work has been offloaded, freeing up local compute resources for interactive applications or other uses.

This operation consumed about 700 CPU-seconds in Lambda. I configured the optipng function with 1792MB of memory, which is the point at which Lambda allocates a full vCPU to the process. That comes out to about 1,254,400 MB-seconds of usage, or about $0.017, assuming I'm already out of the Lambda free tier.

Managing Llama functions

The llama runtime is designed to make it easy to bridge arbitrary images into Lambda. You can look at images/optipng/Dockerfile in this repository for a well-commented example explaining how you can wrap an arbitrary image inside of Lambda for use by Llama.

Once you have a Dockerfile or a Docker image, you can use llama update-function to upload it to ECR and manage the associated Lambda function. For instance, we could build optipng for the above example like so:

$ llama update-function --create --build=images/optipng optipng

When specifying the memory size for your functions, note that Lambda assigns CPU resources to functions based on their memory allocation. At 1,769 MB, your function will have the equivalent of one full core.

Other notes

Inspiration

Llama is in large part inspired by gg, a tool for outsourcing builds to Lambda. Llama is a much simpler tool but shares some of the same ideas and is inspired by a very similar vision of using Lambda as high-concurrency burst computation for interactive uses.

More Repositories

 1. reptyr -- Reparent a running program to a new terminal (C, 5,339 stars)
 2. gojit -- JIT code-generation in Go! (Go, 338 stars)
 3. rules_boost -- bazel build rules to use boost in bazel projects (C++, 287 stars)
 4. ministrace -- A minimal toy implementation of strace(1) (C, 161 stars)
 5. reverse-android -- Reverse-engineering tools for Android applications (Emacs Lisp, 55 stars)
 6. taktician -- An implementation of and AI for the game of Tak (Go, 55 stars)
 7. elisp -- nelhage's emacs configuration (Emacs Lisp, 47 stars)
 8. bemu -- A just-in-time compiler for MIT 6.004's "Beta" processor (C++, 36 stars)
 9. virtunoid -- My KVM breakout code from my DEFCON/Black Hat 2011 presentation (C, 35 stars)
10. transformer-rs -- A sketch of a Transformer in Rust for a blog post (Rust, 28 stars)
11. ultimattt -- A Rust implementation of Ultimate Tic Tac Toe (Rust, 23 stars)
12. crossme -- A collaborative crossword-puzzle solver written on Meteor (JavaScript, 23 stars)
13. util-scripts -- ~/bin/ (Shell, 14 stars)
14. pw -- My personal GPG-based password manager (Go, 13 stars)
15. nullderef -- A module for playing with kernel NULL pointer dereferences (C, 11 stars)
16. iron-blogger -- Code to run the Iron Blogger event (Python, 9 stars)
17. re2 -- A git mirror of Russ Cox's re2 regular-expression library (C++, 9 stars)
18. s3multiget -- Some benchmarks of S3 multigets (Go, 9 stars)
19. aoc2023 -- Advent of Code 2023 (C++, 9 stars)
20. x11-proxy (8 stars)
21. git-merge-rename -- A Dropbox-inspired git merge strategy that never fails (Shell, 8 stars)
22. jsbeta -- An emulator for 6.004's Beta processor for the browser (JavaScript, 8 stars)
23. plv8js -- Fork of http://code.google.com/p/plv8js (C++, 8 stars)
24. github-downloader -- Scripts to download all of github. Or at least much of it. (Python, 7 stars)
25. librstpreload -- A LD_PRELOAD library to make any app support Random Standard Time (C, 6 stars)
26. go.cli -- Some utilities for writing CLIs in go (Go, 6 stars)
27. ghostscript-afl -- fuzzing ghostscript with AFL (C, 5 stars)
28. kernel-workshop -- demo kernel modules for a workshop at recurse center (C, 5 stars)
29. blog.nelhage.com -- blog.nelhage.com (JavaScript, 5 stars)
30. tf-experiments -- some experiments with tensorflow and tf on ec2 (Python, 5 stars)
31. rules_fuzzer -- bazel+libfuzzer (Python, 4 stars)
32. barnowl-zstatus -- Z-Status module for BarnOwl (Perl, 4 stars)
33. check-plus -- A simple EDSL for testing C code using Check (C, 4 stars)
34. flnv -- nelhage's experimental toy scheme implementation (C, 3 stars)
35. chef-config-nelhage -- chef config for my personal infrastructure (Python, 3 stars)
36. barnowl-devutils -- DevUtils module for BarnOwl (Perl, 3 stars)
37. barnowl-alias -- Alias module for BarnOwl (Perl, 2 stars)
38. coop -- A Coq playground (Coq, 2 stars)
39. iron-gollum (Rust, 2 stars)
40. authmitedu -- Code for auth.mit.edu OpenID provider (Perl, 2 stars)
41. gollum -- A toy typed lambda-calculus (Go, 2 stars)
42. mitua (2 stars)
43. barnowl-modules -- Superrepo for my BarnOwl modules (2 stars)
44. data-sexpression -- A perl S-Expression parser (Perl, 2 stars)
45. snb -- Sexy Nerd Bot - A bot for the MIT prefrosh chat room (2 stars)
46. accidentallyquadratic -- I accidentally an n² (Python, 2 stars)
47. barnowl-vt-asedeno -- VT_ASedeno style for BarnOwl (Perl, 2 stars)
48. bazel_git_repositories -- Implementation of bazel `git_repository` remotes using Skylark (Python, 2 stars)
49. hszephyr -- Haskell libzephyr bindings (Haskell, 2 stars)
50. claude -- Scrappy CLI for Claude (Go, 2 stars)
51. duckdb-parquet-bugs -- Parquet files and generators for some DuckDB parquet bugs (Go, 2 stars)
52. HIBP-reader -- reader for HIBP sha files (Python, 1 star)
53. Ascension -- Jifty webapp to track progress in nethack (Perl, 1 star)
54. arduinoproject -- aggregator of random arduino projects (Assembly, 1 star)
55. tlaplus-sandbox -- toying with TLA+ (TLA, 1 star)
56. minimax.dev -- Hugo site for https://minimax.dev/ (HTML, 1 star)
57. brainfuck -- Don't you have your own repository of brainfuck programs? (Brainfuck, 1 star)
58. magit -- A magit fork with fixes/changes/hacks I've found useful (Emacs Lisp, 1 star)
59. iobench -- Some mostly-one-off IO benchmark scripts (C++, 1 star)
60. party-zephyr -- A partychat<->Zephyr bridge (Python, 1 star)
61. proc-maps-bench -- some benchmarking experiments related to /proc/$pid/maps (C++, 1 star)
62. godis86 -- golang wrapper for udis86 (C, 1 star)
63. mode13h -- Some code I wrote back in high school for doing graphics using VGA Mode 13h (C, 1 star)
64. mongod-tests -- Framework for writing mongod tests and reproducers (Python, 1 star)
65. haskell-hunt -- Various Haskell utilities to help with solving Mystery Hunt puzzles (Haskell, 1 star)
66. langer (C, 1 star)
67. xformer -- Some toy transformer implementations (Python, 1 star)
68. rubik -- Rubik's Cube implementation and experiments (C++, 1 star)
69. aoc2021 -- Advent of Code 2021 (Julia, 1 star)
70. asciisnowmanforyou.com -- http://asciisnowmanforyou.com (HTML, 1 star)
71. Text-Index-Database -- An old attempt at writing my own full-text indexing solution (Perl, 1 star)