AWS Lambda Explained: Pricing, Cold Starts, and When to Use It

7 min read · AWS Lambda, Serverless, AWS, Cold Starts

AWS Lambda is the compute layer that makes “serverless” serverless: you ship a function, AWS runs it on demand, and you pay only while it’s actually executing. Here’s what it is, how the pricing really works, why cold starts happen, and where Lambda fits — and doesn’t.

What AWS Lambda is

Lambda runs your code in response to events — an HTTP request, a queue message, a schedule — without any server for you to provision or manage. You hand AWS a function and a bit of config (memory, timeout); it handles the machine, the OS, scaling, and patching. When ten thousand requests arrive at once, Lambda runs many copies in parallel; when none arrive, it runs nothing and charges nothing.

That last property is the whole point. There’s no instance sitting idle between requests, which is why a Lambda-based app scales to zero, and scaling to zero is what underpins the near-zero idle cost of a serverless stack.
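To make "you hand AWS a function" concrete, here is a minimal sketch of a handler for an HTTP-style event. The `HttpEvent` and `HttpResult` types and the `buildResponse` helper are illustrative assumptions defined locally so the example is self-contained; a real project would more likely use the type definitions from the `aws-lambda` package or a framework such as Hono.

```typescript
// Hypothetical, trimmed-down event and result shapes (assumptions for this
// sketch); the real API Gateway payload carries many more fields.
interface HttpEvent {
  rawPath: string;
  queryStringParameters?: Record<string, string | undefined>;
}

interface HttpResult {
  statusCode: number;
  headers: Record<string, string>;
  body: string;
}

// Pure function holding the request logic, which keeps it easy to test locally.
function buildResponse(event: HttpEvent): HttpResult {
  const name = event.queryStringParameters?.name ?? "world";
  return {
    statusCode: 200,
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ message: `Hello, ${name}`, path: event.rawPath }),
  };
}

// The exported entry point: Lambda invokes this once per incoming event.
export const handler = async (event: HttpEvent): Promise<HttpResult> =>
  buildResponse(event);
```

Everything else (memory, timeout, the trigger) lives in configuration rather than in the function itself.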

How the pricing actually works

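Lambda bills on two axes, and understanding both demystifies the bill: the number of requests, and compute duration measured in GB-seconds (the memory you configure multiplied by the time your code runs).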

There’s also a standing free tier — on the order of a million requests and 400,000 GB-seconds a month — that doesn’t expire after your first year. For a low-traffic app, that free tier alone often covers the entire compute bill, which is why early-stage serverless costs round to the price of the domain’s hosted zone. (We break the full bill down in what it costs to run a serverless SaaS.)
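The arithmetic behind a Lambda bill can be sketched in a few lines. The two rates below are assumptions for illustration (roughly the published x86 rates in some regions at the time of writing); actual prices vary by region and architecture, so check the current AWS pricing page. The `Usage` type and `estimateMonthlyCost` function are hypothetical names.

```typescript
// ASSUMED illustrative rates in USD; not authoritative.
const PRICE_PER_MILLION_REQUESTS = 0.2;
const PRICE_PER_GB_SECOND = 0.0000166667;

// Monthly free tier figures quoted in the article.
const FREE_REQUESTS = 1_000_000;
const FREE_GB_SECONDS = 400_000;

interface Usage {
  requests: number;      // invocations per month
  avgDurationMs: number; // average billed duration per invocation
  memoryMb: number;      // configured memory
}

function estimateMonthlyCost(u: Usage) {
  // GB-seconds = invocations x seconds per invocation x memory in GB
  const gbSeconds = u.requests * (u.avgDurationMs / 1000) * (u.memoryMb / 1024);

  // Only usage above the free tier is billed.
  const billableRequests = Math.max(0, u.requests - FREE_REQUESTS);
  const billableGbSeconds = Math.max(0, gbSeconds - FREE_GB_SECONDS);

  const requestCost = (billableRequests / 1_000_000) * PRICE_PER_MILLION_REQUESTS;
  const durationCost = billableGbSeconds * PRICE_PER_GB_SECOND;
  return { gbSeconds, requestCost, durationCost, total: requestCost + durationCost };
}

// A low-traffic app: 500k requests, 100 ms each, 512 MB.
console.log(estimateMonthlyCost({ requests: 500_000, avgDurationMs: 100, memoryMb: 512 }));

// A busier app: 10M requests, 200 ms each, 1024 MB.
console.log(estimateMonthlyCost({ requests: 10_000_000, avgDurationMs: 200, memoryMb: 1024 }));
```

Under these assumed rates, the first workload uses 25,000 GB-seconds and stays entirely inside the free tier, while the second comes to roughly $28 a month, most of it duration rather than requests.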

Cold starts, explained

The most-discussed Lambda quirk. When a request arrives and no warm copy of your function is available, Lambda has to create a new execution environment — download your code, start the runtime, run your initialization — before handling the request. That first-request latency is a cold start. Subsequent requests reuse the warm environment and skip it.
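The split between "run your initialization" and "handle the request" maps directly onto how a function file is laid out: module-scope code runs once per execution environment, while the handler body runs on every invocation. The sketch below illustrates that with a hypothetical `loadConfig` standing in for expensive setup (creating SDK clients, reading configuration); the names and the `coldStart` flag are assumptions for illustration, not a Lambda API.

```typescript
// Counts how many times initialization ran in this execution environment.
let initCount = 0;

// Hypothetical stand-in for expensive setup work.
function loadConfig(): { table: string } {
  initCount += 1;
  return { table: "example-table" };
}

// Module scope: executed once, during the cold start, then reused while warm.
const config = loadConfig();
let invocations = 0;

// Per-request logic: runs on every invocation, warm or cold.
function handleRequest(): { coldStart: boolean; table: string } {
  invocations += 1;
  return { coldStart: invocations === 1, table: config.table };
}

// The exported entry point that Lambda invokes.
export const handler = async () => handleRequest();
```

Only the first invocation in a fresh environment pays for `loadConfig`; later invocations in the same environment reuse `config`. The less work sits at module scope, the shorter the cold start, but anything moved into the handler is then paid on every request.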

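What actually moves the needle on cold starts is the work in those three steps: how much code has to be downloaded, how quickly the runtime starts, and how much your initialization does.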

For the large majority of SaaS workloads, cold starts on a lean function are a non-issue: a few hundred milliseconds on an occasional first request, while static assets are served from the CDN and never touch Lambda at all.

Where Lambda fits

Lambda shines for spiky, request-driven, or event-driven work: HTTP APIs, server-rendered pages, webhook handlers, scheduled jobs, queue and stream processors, glue between AWS services. Anything that’s bursty or mostly idle is a near-perfect fit, because you’re billed for the bursts and nothing for the idle.
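As one example of the event-driven shape, here is a sketch of a queue processor that handles a batch of messages and reports only the failed ones back for retry. The local `QueueEvent` types are trimmed-down assumptions modelled on the SQS event, and `processMessage` with its `orderId` check is hypothetical business logic. The `batchItemFailures` response is only honoured when partial batch responses (`ReportBatchItemFailures`) are enabled on the event source mapping.

```typescript
// Trimmed-down shapes modelled on an SQS batch event (assumptions for this sketch).
interface QueueRecord {
  messageId: string;
  body: string;
}

interface QueueEvent {
  Records: QueueRecord[];
}

interface BatchResponse {
  batchItemFailures: { itemIdentifier: string }[];
}

// Hypothetical business logic: throws on malformed or incomplete messages.
function processMessage(body: string): void {
  const payload = JSON.parse(body);
  if (typeof payload.orderId !== "string") {
    throw new Error("missing orderId");
  }
}

// Process every record; collect the IDs of the ones that failed so that
// only those are retried instead of the whole batch.
function processBatch(event: QueueEvent): BatchResponse {
  const batchItemFailures: { itemIdentifier: string }[] = [];
  for (const record of event.Records) {
    try {
      processMessage(record.body);
    } catch {
      batchItemFailures.push({ itemIdentifier: record.messageId });
    }
  }
  return { batchItemFailures };
}

// The exported entry point that Lambda invokes with each batch.
export const handler = async (event: QueueEvent): Promise<BatchResponse> =>
  processBatch(event);
```

When the queue is empty nothing runs, and when a burst arrives Lambda fans out across batches, which is the billing pattern the paragraph above describes.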

Where it doesn’t

How it fits the SaaS stack

In a serverless SaaS, Lambda is the workhorse behind nearly everything that runs code: the Hono API, the server-rendered web app, the billing webhook handler, and background jobs — each a function behind an API Gateway HTTP API, with CloudFront in front for caching and static assets. Pair that with Aurora DSQL for data and you have a stack with no always-on compute at all, which is exactly why idle costs almost nothing.

The bottom line

Lambda trades a little control — runtime limits, the occasional cold start — for an enormous operational win: no servers, automatic scaling, and a bill that tracks usage instead of capacity. For the request-driven workloads most SaaS products are built from, that trade is overwhelmingly worth it, and it’s why Lambda is the default compute layer in the stack we build on.
