If you want to contribute, see CONTRIBUTING.md.
This is the code that fuels https://wewerewondering.com/, a website aimed at facilitating live Q&A sessions. To use it, just go to that URL and click "Create Event". Then, click "Share Event" and share the URL that just got copied to your clipboard with anyone you want to be able to ask questions. You'll see their questions come in live in the host view. You can share the host view by copy-pasting the URL from your browser's address bar.
What it provides:
- Zero-hassle experience for you and your audience.
- Audience question voting.
- List of already-answered questions.
- Ability to hide questions.
What it doesn't provide:
- Protection against malicious double-voting.
- Live question feed for the audience (what the audience sees is ~10s out of date).
- Long-lived Q&A sessions -- questions go away after 30 days.
If you're curious about the technology behind the site, it's all run on AWS. Here's the rough architecture behind the scenes:
Account.
I've set up an AWS Organization for my personal AWS account. In that organization, I've created a dedicated AWS account that holds all the infrastructure for wewerewondering.com. That way, at least in theory, it's cleanly separated from everything else, and could even be handed off to someone else should that become relevant.
Domain.
The domain is registered with Hover, my registrar of choice for no particularly good reason. The nameservers are set to point at Route 53, which holds a single public hosted zone. That zone has MX and SPF records pointing at ImprovMX (which is great, btw), A and AAAA records that use "aliasing" to point at the CloudFront distribution for the site (see below), and finally a CNAME record used for domain verification with AWS Certificate Manager.
The process for setting up the cert was a little weird. First, the certificate must be in us-east-1 to work with CloudFront, for reasons. Second, the CNAME record for domain verification wasn't auto-added. Instead, I had to go into the Certificate Manager control panel for the domain and click a button named "Create records in Route 53". Not too bad, but it wasn't immediately obvious. Once I did that, though, verification went through just fine.
CDN.
The main entry point for the site is AWS CloudFront. I have a single "distribution", and the Route 53 A/AAAA entries are pointed at that one distribution's CloudFront domain name. The distribution also has wewerewondering.com configured as an alternate domain name, and is configured to use the Certificate Manager certificate from earlier and the most up-to-date TLS configuration. The distribution has "standard logging" (to S3) enabled for now, and has a "default root object" of index.html (more on that later).
CloudFront ties "behaviors" to "origins". Behaviors are ~= routes
and origins are ~= backends. There are two behaviors: the default route
and the /api
route. There are two origins: S3 and API Gateway.
You get three internet points if you can guess which behavior connects
to which origin.
Static components. The default route (behavior) is set up to send requests to the S3 origin, which in turn just points at an S3 bucket that holds the output of building the stuff in client/. The behavior redirects HTTP to HTTPS, only allows GET and HEAD requests, and uses the CachingOptimized caching policy, which basically means it has a long default TTL (1 day) and compression enabled. In S3, I've specifically overridden the "metadata" for index.html to set cache-control to max-age=300, since it gets updated in-place (the assets/ files have hashes in their names and can be cached forever). In addition, the behavior has the SecurityHeadersPolicy response header policy to set X-Frame-Options and friends.
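For the curious, here's roughly what applying that cache-control override could look like with the Rust S3 SDK. This is only a sketch (the bucket, key, and value above are real, but this isn't necessarily how it was actually set):

use aws_sdk_s3::types::MetadataDirective;
use aws_sdk_s3::{Client, Error};

// Sketch: copy the object onto itself with a REPLACE metadata directive,
// which is the usual way to change metadata on an existing S3 object.
async fn set_index_cache_control(s3: &Client) -> Result<(), Error> {
    s3.copy_object()
        .bucket("wewerewondering-static")
        .key("index.html")
        .copy_source("wewerewondering-static/index.html")
        .metadata_directive(MetadataDirective::Replace)
        .cache_control("max-age=300")
        .content_type("text/html")
        .send()
        .await?;
    Ok(())
}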
There's one non-obvious trick in use here to make the single-page app approach work with "pretty" URLs that don't involve #. Ultimately, we want URLs that the single-page app handles to all be routed to index.html rather than try to request, say, /event/foo from S3. There are multiple ways to achieve this. The one I went with was to define a CloudFront function that rewrites request URLs, and associate it with the "Viewer request" hook. It looks like this:
function handler(event) {
    var req = event.request;
    // Client-side routes (e.g., /event/<id>) should serve the SPA shell.
    if (req.uri.startsWith('/event/')) {
        req.uri = '/index.html';
    }
    return req;
}
I did it this way rather than using a custom error response because a custom error response would also rewrite 404 errors from the API origin, which I don't want; I also wanted unhandled URLs to still give 404s. And I didn't want to use S3 static website hosting (which allows you to set up conditional redirects) because then CloudFront can't access S3 "natively" and instead has to go through the bucket's website endpoint, which requires the bucket to be publicly accessible.
Another modification I made to the defaults was to slightly adjust the S3 bucket policy compared to the one CloudFront recommends, in order to allow LIST requests so that missing objects produce 404s instead of 403s. The relevant part of the policy I used was:
"Action": [
"s3:GetObject",
"s3:ListBucket"
],
"Resource": [
"arn:aws:s3:::wewerewondering-static",
"arn:aws:s3:::wewerewondering-static/*"
],
The /api endpoints. The behavior for the /api URLs is defined for the path /api/*, is configured to allow all HTTP methods but only HTTPS, and also uses the SecurityHeadersPolicy response header policy. For caching, I created my own policy that is basically CachingOptimized but has a default TTL of 1s, because if I fail to set a cache header on a response, I'd rather things mostly keep working than have everything look like it never updates.
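In other words, the Lambda is expected to set Cache-Control itself on anything that should be cached for longer. A minimal sketch of that pattern with an axum-style handler (the route, body, and max-age here are made up, not taken from server/):

use axum::http::header;
use axum::response::IntoResponse;
use axum::Json;
use serde_json::json;

// Sketch only, not the real handler in server/: return an explicit
// Cache-Control header alongside the body so CloudFront caches the
// response for longer than the 1s default TTL.
async fn event_meta() -> impl IntoResponse {
    (
        [(header::CACHE_CONTROL, "max-age=86400")],
        Json(json!({ "exists": true })),
    )
}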
The origin for /api
is a custom origin that holds the "Invoke URL"
of the API Gateway API (and requires HTTPS). Which brings us to:
The API.
As previously mentioned, the API is a single AWS Lambda backed by the
Lambda Rust Runtime (see server/
for more details). But it's hosted
through AWS' API Gateway service, mostly because it gives me
throttling, metrics, and logging out of the box. For more elaborate
services I'm sure the multi-stage and authorization bits come in handy
too, but I haven't made use of any of that. The site also uses the HTTP
API configuration because it's a) cheaper, b) simpler to set up, and c)
worked out of the box with the Lambda Rust Runtime, which the REST
API stuff didn't (for me at least). There are other differences, but
none that seemed compelling for this site's use-case.
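The general shape of that setup looks something like the following. This is a stripped-down sketch, not the actual code in server/, and the routes shown are purely illustrative:

use axum::routing::{get, post};
use axum::Router;
use lambda_http::{run, Error};

#[tokio::main]
async fn main() -> Result<(), Error> {
    // Every /api route registered in API Gateway points at this one Lambda;
    // per-route dispatch happens here, inside the handler.
    let app = Router::new()
        .route("/api/event", post(|| async { "create event" }))
        .route("/api/event/:eid", get(|| async { "event info" }));
    run(app).await
}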
All of the routes supported by the API implementation (in server/) are registered in API Gateway and pointed at the same Lambda. This has the nice benefit that unregistered routes won't even invoke the Lambda,
which (I assume) is cheaper. I've set up the $default
stage to have
fairly conservative throttling (for now) just to avoid any surprise
jumps in cost. It also has "Access logging" set up.
One thing worth noting about using API Gateway with the HTTP API is that the automatic dashboard it adds to CloudWatch doesn't work, because it expects the metrics from the REST API, which are named differently from the ones emitted by the HTTP API. The (annoying) fix was to copy the automatic dashboard over into a new (custom) dashboard and edit the source for every widget to replace
"ApiName", "wewerewondering"
with
"ApiId", "<the ID of the API Gateway API>"
and replace the metric names with the correct ones.
The Lambda itself is mostly just what cargo lambda deploy sets up, though I've specifically added RUST_LOG as an environment variable to get more verbose logs (for now). It's also set up to log to CloudWatch, which I think happened more or less automatically. Crucially though, the IAM role used to execute the Lambda is also granted read/write (but not delete/admin) access to the database, like so:
{
    "Sid": "VisualEditor0",
    "Effect": "Allow",
    "Action": [
        "dynamodb:BatchGetItem",
        "dynamodb:PutItem",
        "dynamodb:GetItem",
        "dynamodb:Scan",
        "dynamodb:Query",
        "dynamodb:UpdateItem"
    ],
    "Resource": [
        "arn:aws:dynamodb:*:<account id>:table/events",
        "arn:aws:dynamodb:*:<account id>:table/questions",
        "arn:aws:dynamodb:*:<account id>:table/questions/index/top"
    ]
}
The database.
The site uses DynamoDB as its storage backend, because frankly, that's
all it needs. And it's fairly fast and cheap if you can get away with
its limited feature set. There are two tables, events and questions, both of which are set up to use on-demand provisioning. events just holds the ULID of an event, which is also the partition key (DynamoDB doesn't have auto-increment integer primary keys because they don't scale), the event's secret key, and its creation and auto-deletion timestamps. questions has:
- the question ULID (as the partition key)
- the event ULID
- the question text
- the question author (if given)
- the number of votes
- whether the question is answered
- whether the question is hidden
- creation and auto-deletion timestamps
The ULIDs, the timestamps, and the question text + author never change. This is why the API for looking up event info and question texts/authors is separate from the one for looking up vote counts -- the former can have a much longer cache time.
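To make the schema concrete, here's a hypothetical sketch of inserting an event row using the aws-sdk-dynamodb and ulid crates. The attribute names are invented for illustration and need not match what server/ actually uses:

use aws_sdk_dynamodb::types::AttributeValue;
use aws_sdk_dynamodb::{Client, Error};
use std::time::{Duration, SystemTime, UNIX_EPOCH};

// Hypothetical sketch of creating an event row.
async fn create_event(db: &Client) -> Result<String, Error> {
    let id = ulid::Ulid::new().to_string();
    let secret = ulid::Ulid::new().to_string();
    let now = SystemTime::now().duration_since(UNIX_EPOCH).unwrap();
    let expire = now + Duration::from_secs(30 * 24 * 60 * 60);
    db.put_item()
        .table_name("events")
        .item("id", AttributeValue::S(id.clone()))
        .item("secret", AttributeValue::S(secret))
        .item("when", AttributeValue::N(now.as_secs().to_string()))
        // DynamoDB's TTL feature auto-deletes the row after this timestamp.
        .item("expire", AttributeValue::N(expire.as_secs().to_string()))
        .send()
        .await?;
    Ok(id)
}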
To allow querying the questions for a given event and receiving them in sorted order, questions also has a global secondary index called top, whose partition key is the event ULID and whose sort key is votes. That index also projects the "answered" and "hidden" fields, so a single query against it yields all the mutable state for an event's question list (and the list can thus be fetched with a single DynamoDB call by the Lambda).
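Roughly what that single call could look like from the Lambda (a sketch; the key and attribute names here are assumptions, not necessarily what server/ uses):

use aws_sdk_dynamodb::types::AttributeValue;
use aws_sdk_dynamodb::{Client, Error};

// Sketch: query the `top` index for one event, highest-voted first.
async fn top_questions(db: &Client, event_id: &str) -> Result<(), Error> {
    let resp = db
        .query()
        .table_name("questions")
        .index_name("top")
        .key_condition_expression("eid = :eid")
        .expression_attribute_values(":eid", AttributeValue::S(event_id.to_string()))
        .scan_index_forward(false) // descending by the sort key (votes)
        .send()
        .await?;
    for q in resp.items() {
        // each projected item includes votes, answered, and hidden
        println!("{:?}", q);
    }
    Ok(())
}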
Metrics and Logging.
Scaling further.
Currently, everything is in us-east-1. That's sad. The CDN helps (potentially a lot), but mainly for guests, and not when voting. That's mostly because DynamoDB global tables do reconciliation-by-overwrite, which doesn't work very well for counters. We could make it store every vote separately and do a count, but that's sad too. Alternatively, if we assume that most guests are near the host, we could:
- Make events a global table (but not questions).
- Have a separate questions table in each region.
- Add a region column to events, set to the region that hosts the Lambda that serves the "create event" request.
- Update the server code to always access questions in the region of the associated event.
We'd probably need to tweak CloudFront (and maybe Route 53?) a little bit to make it do geo-aware routing, but I think that's a thing it supports.
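If we ever did that, the server-side piece might look something like the following. This is purely a sketch of the idea above; nothing like it exists in server/ today:

use aws_sdk_dynamodb::config::Region;
use aws_sdk_dynamodb::Client;

// Hypothetical: given the region stored on the event row, build a DynamoDB
// client pinned to that region for all questions access.
async fn questions_client(event_region: String) -> Client {
    let config = aws_config::from_env()
        .region(Region::new(event_region))
        .load()
        .await;
    Client::new(&config)
}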
Notes for me
To deploy server:
cd server
cargo lambda build --release --arm64
cargo lambda deploy --env-var RUST_LOG=info,tower_http=debug,wewerewondering_api=trace --profile qa
To deploy client:
cd client
npm run build
aws --profile qa s3 sync --delete dist/ s3://wewerewondering-static