Time is money with serverless — learn how to avoid wait states
When you’re paying in 100ms increments for compute and millions of invocations only cost pennies, it can be overkill to worry too much about the execution length of your Lambda function. Until it isn’t.
One of the easiest ways to tighten your Lambda bill is to not pay for waiting — which means getting your arms around async. Paying for idle is bad.
The hangover from the “old world” of computing — where there’s a CPU already bought and paid for — leads to some “sunk cost” thinking: it doesn’t matter if tasks hang around with nothing to do, because the CPU is paid for whether it’s busy or not.
But in serverless, time is the biggest factor in the cost of an operation, so you have to look at your flow in a slightly different way.
Many in the serverless community warn against asynchronous operations in Lambda and with good reason — but they’re primarily warning about monolithic functions where devs have lifted and shifted from the old world to the new.
It’s common to see some sort of state management or controller function calling a sequence of other functions, leaving the calling function to sit around and twiddle its thumbs with nothing to do:
In Node, this has become more common since the arrival of async/await which makes it trivial to turn asynchronous code into sequential operations. With this approach, you’re paying twice — once for each ‘step’ function and again for the waiting, controlling function. It doesn’t matter much for infrequently-used functions, but in my experience it’s bad practice for two reasons:
- In the real world, these controller functions call other controller functions which call other nested controller functions, so you might have many functions all waiting for a return value, all running at the same time and ringing up your AWS bill.
- When anything scales up meaningfully, it becomes more noticeable and people start making claims about Lambda not being as cheap as they thought.
There are a couple of alternatives to this approach. The first is to break apart these steps by using events in the native AWS services they wait on. After the DynamoDB record is written, use the stream event to kick off the S3 task, and then use S3’s ObjectCreated event to start the Lambda that sends the email.
In this version you’re not paying for the time it takes DynamoDB or S3 to finish doing their thing:
Devs are starting to realize that us serverless kids are in the business of writing glue, but it’s still not common to see this design. That’s partly because back in the ‘old world’ before serverless, it was a travesty to have three separate applications doing something so trivial, especially when we’d already paid for the CPU. But when we’re paying for time, having more functions running isn’t our problem. It’s also much more scalable.
The second alternative is AWS Step Functions which picks up the role of the state management or controller task. It’s a really nifty service that solves this entire problem and you should use it for everything — except it’s incredibly expensive. Although $0.025 per 1,000 state transitions doesn’t sound like much, it can quickly add up if you have more complex workflows.
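To put a number on “quickly add up”, here is a back-of-the-envelope calculation under assumed usage (10 transitions per execution and one million executions a month are illustrative figures, not from the original):

```javascript
// Rough Step Functions cost at the quoted $0.025 per 1,000 state transitions.
const pricePerThousandTransitions = 0.025;
const transitionsPerExecution = 10; // assumed workflow size
const executionsPerMonth = 1000000; // assumed traffic

const monthlyCost =
  (transitionsPerExecution * executionsPerMonth / 1000) * pricePerThousandTransitions;

console.log(`$${monthlyCost.toFixed(2)} per month`); // prints "$250.00 per month"
```

That is $250 a month for orchestration alone, before any Lambda compute, which is why the cost dwarfs the functions it coordinates.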
Step Functions is fantastic for longer-lived workflows like order management but usually overkill for managing a handful of short-lived steps to move data around AWS. In practice, the cost keeps me away but the moment they make it a free service (#awswishlist!), state management in serverless will become a breeze.
Good Async: Multiple records in the event source
When Lambda invokes your function from an event source, you usually get a single event record, but not always: sources like Kinesis, DynamoDB Streams, and SQS can deliver a batch of records in one invocation. If you iterate over those records sequentially, this can have a surprising impact on the duration of your Lambda function.
Fortunately, in Node it’s trivial to process those records in parallel. In this example, our processRecord function sets a timer for 1 second and there are 3 records in the test event, yet the overall execution time was only 1048ms, not ~3000ms:
Why? The processRecord function is called for every record (almost) simultaneously, and Promise.all waits for all of the promises to resolve, effectively running the requests in parallel. In practice, you must also account for rejected promises, because Promise.all rejects as soon as any one of its promises rejects and you lose the other results.
Of course, you might not be able to run tasks in parallel in every case, but where you can, this is a major time-saver, and the Promise.all / Records.map approach is a simple answer to this problem.
Good Async: Multiple concurrent tasks
If your function has to check a number of things before executing, it’s worth seeing whether any of those checks can run concurrently. For example, my trading function allowedToTrade checks a restricted stock list, the user’s account balance, and whether the market is open, before approving or denying the trade:
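A sketch of the sequential version, with the three checks stubbed as 100ms timers (the check implementations and their timings are hypothetical):

```javascript
const delay = (ms, value) => new Promise((resolve) => setTimeout(() => resolve(value), ms));

// Stubbed independent checks: real versions would hit a database or market API.
const isRestrictedStock = (symbol) => delay(100, false);
const hasSufficientBalance = (user) => delay(100, true);
const isMarketOpen = () => delay(100, true);

// One check at a time: total time is the sum of all three (~300ms).
const allowedToTrade = async (user, symbol) => {
  if (await isRestrictedStock(symbol)) return false;
  if (!(await hasSufficientBalance(user))) return false;
  return isMarketOpen();
};
```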
I execute my trade and … miss the market. This function performs one check at a time in sequence when it could be much faster. Each of the tasks is independent and can happen concurrently, so with a small change it will run more quickly:
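The same hypothetical checks, now started together and awaited with a single Promise.all:

```javascript
const delay = (ms, value) => new Promise((resolve) => setTimeout(() => resolve(value), ms));

// Stubbed independent checks: real versions would hit a database or market API.
const isRestrictedStock = (symbol) => delay(100, false);
const hasSufficientBalance = (user) => delay(100, true);
const isMarketOpen = () => delay(100, true);

// All three checks start immediately and run concurrently,
// so the total is ~100ms instead of ~300ms.
const allowedToTrade = async (user, symbol) => {
  const [restricted, funded, open] = await Promise.all([
    isRestrictedStock(symbol),
    hasSufficientBalance(user),
    isMarketOpen(),
  ]);
  return !restricted && funded && open;
};
```

The small trade-off is that every check always runs, even when an early one would have denied the trade; for cheap, independent checks the time saved usually wins.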
- In serverless, time is money — we should optimize for the time used, not the number of functions created. The goal is to avoid waiting, and to avoid paying while Lambda does nothing.
- Async is easy in NodeJS — you can move from flowchart to code with just a few lines of Node, and the async/await keywords eliminate callback hell.
- Bad async creates controller functions that are waiting for AWS services to do something. When you are waiting on AWS, you can often use events or Step Functions to find a better (and cheaper) way.
- Good async can make substantial differences to execution time by executing in parallel where it makes sense. Common examples include multiple event records and multiple concurrent tasks.