AWS is a Pain Without a Big Budget

August 31, 2018

In my last website post I talked about my plans for setting up website notifications on AWS Lambda and DynamoDB. The idea was that a function on AWS Lambda would get called whenever the site had an update; it would fetch all the site data, diff it against the previous state, and determine which pages actually changed. Those changes would get saved to AWS DynamoDB, whose streaming feature can trigger other AWS Lambda functions for each event. Multiple Lambda functions (one for each service) would receive those updates and fire off whatever integration that service needed.
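The diffing step can be sketched roughly as follows. This is a minimal illustration, not the code from the project: the `PageMap` and `Change` shapes and the `diffPages` name are assumptions, and comparing pages via `JSON.stringify` is a simplification that treats a reordering of keys as a change.

```typescript
// Hypothetical shapes: the post does not show its real data model.
type PageMap = { [permalink: string]: unknown };

type Change =
  | { type: "insert"; permalink: string; newState: unknown }
  | { type: "update"; permalink: string; oldState: unknown; newState: unknown }
  | { type: "delete"; permalink: string; oldState: unknown };

// Own-property check, so inherited keys like "toString" are never mistaken for pages.
function has(map: PageMap, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(map, key);
}

// Compare two snapshots keyed by permalink and list what changed.
function diffPages(previous: PageMap, current: PageMap): Change[] {
  const changes: Change[] = [];

  // Pages present now: either brand new or possibly modified.
  for (const permalink of Object.keys(current)) {
    const newState = current[permalink];
    if (!has(previous, permalink)) {
      changes.push({ type: "insert", permalink, newState });
    } else if (JSON.stringify(previous[permalink]) !== JSON.stringify(newState)) {
      changes.push({ type: "update", permalink, oldState: previous[permalink], newState });
    }
  }

  // Pages that existed before but are gone now.
  for (const permalink of Object.keys(previous)) {
    if (!has(current, permalink)) {
      changes.push({ type: "delete", permalink, oldState: previous[permalink] });
    }
  }

  return changes;
}
```

Each returned entry corresponds to one record that would be written to the change store, which is what the downstream functions react to.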

This would put the burden of running the service and hosting the data on Amazon’s ops crew, whose setup is undoubtedly better than anything I would have built. As long as I stayed within the limits of the AWS free tier, which looked pretty decent, I would be able to run this in perpetuity, right?

Of course, nothing survives an encounter with free. Amazon’s AWS platform is clearly not designed for hobbyists; the free tier is really just there to get you running while you put a proper product together, without scaring you off with bills just yet. Its limits are there to be exceeded, which is what I discovered as I tried to use DynamoDB.

On the free tier page, Amazon says you get 25 GB of storage free. I’m storing a few hundred JSON objects in the 10 KB to 50 KB range, so I’m well within that limit, right? Well, in the fine print, the other limits they list are “25 Units of Read Capacity” and “25 Units of Write Capacity”.

I charged ahead and started building the app, only to get errors about insufficient units of provisioned capacity, which is vocabulary that definitely wasn’t clear to me. There’s nothing listed on the free tier page that says what a “unit” actually is; my initial thought was a concurrent database connection. You have to dig down to the DynamoDB pricing page to find this:

“One read capacity unit provides up to two reads per second, enough for 5.2 million reads per month. The first 25 read capacity units per month are free, and prices start from $0.09 per read capacity unit-month thereafter.”

So that means staying in the free tier limits me to 50 reads per second, right? So I set out to build a thing that throttles those reads, only to keep getting those provisioning errors. I tried throttling further, to no avail. What was going on? I ultimately stumbled onto a Stack Overflow answer that spelled out a bit more of what was happening: throughput is limited not only by the number of reads, but also by object size. You also have to decide what you want this number to be upfront; it’s not a calculation of how much data you are using, it’s how much you want to use. DynamoDB’s usage model is clearly designed for startup people who start getting 500 errors from their backend and want to fix them by just cranking that provisioned throughput number up (which, of course, increases a company’s AWS budget). It’s not meant for tinkerers.
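To make the object-size effect concrete, here is a rough sketch of the arithmetic. It assumes DynamoDB's documented provisioned-capacity rule that one read capacity unit covers an item of up to 4 KB (one strongly consistent read per second, or two eventually consistent ones), with larger items rounded up to the next 4 KB. The function names are hypothetical, and real operations such as queries and scans are metered somewhat differently.

```typescript
// Assumption: one read capacity unit covers an item of up to 4 KB.
const READ_UNIT_BYTES = 4 * 1024;

// Capacity units consumed by a single read of an item of the given size.
// Eventually consistent reads cost half as much as strongly consistent ones.
function readUnitsPerRead(itemBytes: number, stronglyConsistent: boolean): number {
  const blocks = Math.max(1, Math.ceil(itemBytes / READ_UNIT_BYTES));
  return stronglyConsistent ? blocks : blocks / 2;
}

// Sustained reads per second that a given amount of provisioned capacity allows.
function maxReadsPerSecond(
  provisionedUnits: number,
  itemBytes: number,
  stronglyConsistent: boolean = false
): number {
  return provisionedUnits / readUnitsPerRead(itemBytes, stronglyConsistent);
}

// The free tier's 25 units against small items versus the 10 KB to 50 KB objects in this post.
const smallItemRate = maxReadsPerSecond(25, 4 * 1024);
const tenKbRate = maxReadsPerSecond(25, 10 * 1024);
const fiftyKbRate = maxReadsPerSecond(25, 50 * 1024);
```

Under these assumptions the "50 reads per second" figure only holds for items of 4 KB or less. For objects between 10 KB and 50 KB, the same 25 units allow somewhere between roughly 17 and 4 eventually consistent reads per second.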

In every project, I come to a point where I have to decide whether to keep playing with the interesting but frustrating toy or revert to a more familiar approach and just finish the dang thing. It was at this point that I abandoned AWS, Lambda, and DynamoDB in favor of my traditional stack for building server apps: Node.js, Express, and SQL (in this case, SQLite, but I may switch to PostgreSQL at some point).

I didn’t abandon the shiny, interesting paradigms altogether. I really liked the idea of building the app as a set of weakly coupled modules that work together through a pub/sub model. Instead of building a monolithic server like I usually do, I used Lerna to create a bunch of mini-modules that snap together on top of a storage system (using Knex.js on top of SQLite or PostgreSQL) and a basic pub/sub system. Once that basic system was in place, the code could largely just be copied over from the Lambda version, with only a few modifications to storage to facilitate the change. Thanks to TypeScript and Visual Studio Code, this was really easy to port, and I have reasonable confidence that basic type errors haven’t slipped through.
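As an illustration of how small such a pub/sub layer can be, here is a minimal in-process sketch. The `PubSub` class and its method names are hypothetical and not taken from the project, which may well use an existing library instead.

```typescript
type Handler<T> = (message: T) => void;

// A tiny in-process publish/subscribe hub keyed by topic name.
class PubSub<T> {
  // Prototype-free object, so topic names like "toString" cannot collide with built-ins.
  private handlers: { [topic: string]: Handler<T>[] } = Object.create(null);

  // Register a handler and return a function that removes it again.
  subscribe(topic: string, handler: Handler<T>): () => void {
    const list = this.handlers[topic] || (this.handlers[topic] = []);
    list.push(handler);
    return () => {
      const index = list.indexOf(handler);
      if (index !== -1) list.splice(index, 1);
    };
  }

  // Deliver a message to every current subscriber; returns how many were called.
  publish(topic: string, message: T): number {
    // Snapshot first, so a handler may unsubscribe during delivery.
    const snapshot = (this.handlers[topic] || []).slice();
    snapshot.forEach((handler) => handler(message));
    return snapshot.length;
  }
}
```

A service module would subscribe to a topic such as "changes" and do its own work when notified, without the publisher knowing which services exist.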

The schema is pretty simple. The first table, pages, maps the identifier of a post permalink (e.g. /2018/08/30/aws-is-a-pain-without-a-big-budget/ for this page) to a JSON object describing everything on that page. It’s basically the current source of truth. A second table, changes, has a column for a change ID (just an autoincrementing integer), a timestamp, a type enum (either insert, update, or delete), and columns that show the new and old state, depending on what type of change it is. The third table is just a namespaced key/value store for whatever metadata each plugin might want. This is most useful for storing the last seen change ID for a given service. For example, when the plugin for Telegram gets a signal from the pub/sub system that there was a new change, it can get all the changes since its last seen change ID, filter down to the ones that are inserts, and create Telegram messages for each post, before writing the newest change ID back to the metadata table.
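The last-seen-ID pattern described above can be sketched against an in-memory stand-in for the tables. Everything named here is an assumption for illustration: the `ChangeRow` and `Store` shapes, the `permalink` field on a change, the metadata key format, and the `notify` callback that stands in for sending a Telegram message.

```typescript
type ChangeType = "insert" | "update" | "delete";

// Hypothetical row shape for the changes table.
interface ChangeRow {
  id: number; // autoincrementing change ID
  timestamp: string;
  type: ChangeType;
  permalink: string; // assumed here; the post does not list this column
  newState?: unknown;
  oldState?: unknown;
}

// In-memory stand-in for the changes table and the namespaced key/value table.
interface Store {
  changes: ChangeRow[];
  metadata: { [namespacedKey: string]: string };
}

// Announce every post inserted since this plugin last ran, then advance its cursor.
// Returns the number of notifications sent.
function processNewPosts(
  store: Store,
  namespace: string,
  notify: (permalink: string) => void
): number {
  const key = namespace + ":lastSeenChangeId";
  const lastSeen = Number(store.metadata[key] || "0");

  // Everything newer than the cursor, oldest first.
  const pending = store.changes
    .filter((change) => change.id > lastSeen)
    .sort((a, b) => a.id - b.id);
  if (pending.length === 0) return 0;

  // Only brand new posts are announced; updates and deletes are skipped.
  const inserts = pending.filter((change) => change.type === "insert");
  inserts.forEach((change) => notify(change.permalink));

  // Advance the cursor past every change seen, not just the inserts.
  store.metadata[key] = String(pending[pending.length - 1].id);
  return inserts.length;
}
```

Because each plugin keeps its own cursor under its own namespace, a service that was offline for a while simply catches up from wherever it left off the next time it is signalled.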

So far, this system is working and a proof of concept for Telegram does exist. Unfortunately, the detour has slowed down this setup considerably, and I now have to host it somewhere. I will probably just turn the whole thing into a Docker container and run it on one of my servers, which I can deploy to automatically through continuous integration on my GitLab server. The nice thing about this, though, is that I’ll have access to a full system and all of the resources that come along with it for future modifications, which will be really handy when adding more server capabilities like Micropub. But that’s a story for next time.