cold-start-budget-reference

Pure-reference catalog of cold-start budgets across serverless runtimes. Covers AWS Lambda's three-phase cold start (Init: download+unzip+runtime-bootstrap; Init code: imports + module load; Invoke: handler execution), Cloudflare Workers' isolate model (sub-millisecond cold starts via V8 isolates per developers.cloudflare.com), Vercel Edge Runtime, Lambda SnapStart for JVM (snapshot-restore for Java), and provisioned-concurrency trade-offs. Includes per-runtime typical cold-start ranges and the testable behaviours each model creates. Use when designing latency budgets, choosing a runtime, or auditing cold-start variance in production.

Overview

Per aws.amazon.com/blogs/compute on cold starts, a cold Lambda invocation passes through three phases:

  1. Init - download deployment package, unzip, bootstrap the runtime (Node/Python/Java/etc.).
  2. Init code - execute module-level imports + global setup.
  3. Invoke - the actual handler call.

Phases 1 + 2 are the "cold" part. Phase 3 is what runs every invocation (cold or warm).
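The split between init code and invoke can be observed from inside a function. The following is a minimal sketch, assuming a Python handler; the names `_is_cold` and `_INIT_STARTED` are hypothetical and exist only for illustration. Module-level code runs once per execution environment, so a module-level flag is true only for the first invocation that environment serves.

```python
import time

# Module scope: runs once per execution environment, during "Init code".
_INIT_STARTED = time.monotonic()
_is_cold = True  # flipped by the first invocation in this environment


def handler(event, context):
    """Report whether this invocation was the first in its environment."""
    global _is_cold
    was_cold = _is_cold
    _is_cold = False
    return {
        "cold_start": was_cold,
        "seconds_since_init": time.monotonic() - _INIT_STARTED,
    }
```

Logging `cold_start` on each request gives a rough application-side cold-start count to compare against the platform's own numbers.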

When to use

  • Designing a latency budget for a Lambda / Workers / Edge function.
  • Investigating "p95 is fine but p99 is 5s."
  • Choosing a runtime - Cloudflare's isolate model is qualitatively different from Lambda containers.
  • Auditing provisioned-concurrency / SnapStart configurations.

Per-runtime cold-start budgets

Per AWS, Cloudflare, Vercel docs (typical ranges; larger packages skew higher, and the configured memory class also shifts the numbers):

| Runtime | Typical cold start | Architecture |
|---|---|---|
| AWS Lambda Node.js (256MB) | 200-700ms | Container (Firecracker microVM) |
| AWS Lambda Python (256MB) | 250-800ms | Container |
| AWS Lambda Java 11 (512MB, no SnapStart) | 1.5-6s | Container + JVM warmup |
| AWS Lambda Java 11 (512MB, SnapStart) | 100-300ms | Snapshot-restore per docs.aws.amazon.com/lambda |
| AWS Lambda .NET (1GB) | 1-3s | Container + .NET runtime |
| AWS Lambda Go (256MB) | 100-300ms | Container; pre-compiled binary |
| AWS Lambda Rust (256MB) | 50-200ms | Container; pre-compiled binary |
| Cloudflare Workers | 0-5ms (V8 isolate spawn) | V8 isolate per developers.cloudflare.com/workers |
| Vercel Edge Runtime | 5-30ms | V8 isolate (similar to Workers) |
| Vercel Node.js Functions | 200-500ms (small) to 2-3s (large) | Lambda under the hood |
| Netlify Functions | 300ms-2s | Lambda under the hood |

The "Workers / Edge" qualitative leap is the isolate model: each function runs in a V8 isolate inside an already-running process, so startup takes a few milliseconds at most per developers.cloudflare.com, with no container and no OS startup.

Mitigations

Provisioned concurrency (AWS Lambda)

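Per docs.aws.amazon.com/lambda: keeps N execution environments pre-initialised. This eliminates cold starts for up to N concurrent requests; traffic beyond N falls back to on-demand environments and can still cold-start. You pay for the keep-warm time whether or not the environments serve requests.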

Trade-off: cost. A constant N=10 provisioned concurrency for 30 days ≈ $30-300 depending on memory class.
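The cost range can be reproduced with simple arithmetic. This is a rough sketch, not a pricing tool: the per-GB-second rate below is an assumed, illustrative figure and should be replaced with the current rate for your region and architecture. The function name and parameters are hypothetical.

```python
SECONDS_PER_DAY = 86_400

# ASSUMPTION: illustrative provisioned-concurrency rate in USD per GB-second.
# Check the current AWS pricing page for your region and architecture.
ASSUMED_USD_PER_GB_SECOND = 0.0000041667


def provisioned_concurrency_cost(instances, memory_mb, days,
                                 usd_per_gb_second=ASSUMED_USD_PER_GB_SECOND):
    """Keep-warm charge only; request and duration charges are excluded."""
    gb_seconds = instances * (memory_mb / 1024) * days * SECONDS_PER_DAY
    return gb_seconds * usd_per_gb_second


for memory_mb in (256, 1024, 3072):
    cost = provisioned_concurrency_cost(instances=10, memory_mb=memory_mb, days=30)
    print(f"{memory_mb} MB: ${cost:,.0f}")
```

Under the assumed rate, N=10 for 30 days comes to roughly $27 at 256MB, $108 at 1GB and $324 at 3GB, which is in line with the range quoted above.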

Lambda SnapStart (Java / .NET)

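Per docs.aws.amazon.com/lambda/latest/dg/snapstart.html: when a function version is published, Lambda runs the init phase once and snapshots the initialised execution environment (including the JVM); each cold start then restores from that snapshot instead of initialising from scratch. Reduces Java cold starts from 1.5-6s → 100-300ms.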

Caveats:

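  • The snapshot captures everything created during init, including open connections and seeded random generators, so any value that must be unique per instance must not be created before the snapshot.
  • The beforeCheckpoint / afterRestore hooks let you tear down such state before the snapshot and re-prime it after restore.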
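The frozen-state caveat is language-independent and can be sketched without any Lambda dependency. In the following illustration, `after_restore` is a hypothetical stand-in for whatever restore hook the runtime provides; the wiring that registers it with the runtime is deliberately omitted.

```python
import uuid

# Module scope runs before the snapshot is taken, so this value would be
# frozen into the snapshot and shared by every restored environment.
instance_id = uuid.uuid4().hex


def after_restore():
    """Hypothetical restore hook: regenerate per-instance state."""
    global instance_id
    instance_id = uuid.uuid4().hex


def handler(event, context):
    return {"instance_id": instance_id}
```

The same pattern applies to database connections and seeded random generators: create them lazily or rebuild them in the restore hook rather than relying on the value produced during init.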

Package-size discipline

Lambda cold-start time grows with deployment-package size, because the package must be downloaded and unpacked during init. Per AWS: keep under 50MB (zipped) → cold start in the 200-800ms range. Larger packages → seconds.
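A size budget is easy to enforce at build time. The sketch below assumes the build produces a single zip file; `check_package_size` and the artifact path are hypothetical names chosen for the example.

```python
import os

MAX_ZIPPED_BYTES = 50 * 1024 * 1024  # the 50MB zipped budget from the text


def check_package_size(zip_path, limit_bytes=MAX_ZIPPED_BYTES):
    """Return the artifact size in bytes, raising if it is not under budget."""
    size = os.path.getsize(zip_path)
    if size >= limit_bytes:
        raise ValueError(
            f"{zip_path} is {size} bytes, not under the {limit_bytes}-byte budget"
        )
    return size
```

Calling `check_package_size("build/function.zip")` as the last build step fails the pipeline before an oversized package reaches deployment.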

Avoid heavy module-level imports

Init code runs once per execution environment, which means it runs again on every cold start. Heavy module-level work (database connection pool init, large dependency trees) inflates init time.

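# Bad: top-level import, paid during init on every cold start
# (heavy_lib is a placeholder for any slow-to-import dependency)
import heavy_lib            # 2s import time

def handler(event, ctx):
    return heavy_lib.do_thing(event)

# Better: lazy import, paid only by the first invocation that reaches it
def handler(event, ctx):
    import heavy_lib
    return heavy_lib.do_thing(event)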

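A lazy import moves the cost out of the init phase and into the first invocation that reaches it, so it shortens cold starts only for requests that never need the module. Later warm calls pay just a cached module lookup, which is small but not zero.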
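To see where the import cost lands, the import itself can be timed. This is a small sketch using only the standard library; `timed_import` is a hypothetical helper, and `json` stands in for a heavy dependency.

```python
import importlib
import time


def timed_import(module_name):
    """Import a module by name and return (module, elapsed_ms)."""
    started = time.perf_counter()
    module = importlib.import_module(module_name)
    elapsed_ms = (time.perf_counter() - started) * 1000
    return module, elapsed_ms


# "json" stands in for a heavy dependency.
_, first_ms = timed_import("json")   # may have to load the module
_, repeat_ms = timed_import("json")  # served from the already-imported cache
print(f"first: {first_ms:.3f} ms, repeat: {repeat_ms:.3f} ms")
```

Running this for each top-level dependency shows which imports dominate init and are worth deferring.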

Runtime choice

Pre-compiled native binaries (Go, Rust) cold-start faster than interpreted or VM-based runtimes (Python, Java), as the table above shows. For latency-critical paths, runtime choice is a primary lever.

Testable behaviours

| Behaviour | Test |
|---|---|
| Cold start within budget | Force cold (deploy or wait > idle-evict time); measure p95 first-invocation latency |
| Warm performance | Subsequent invocations (50+) → p95 well within prod budget |
| SnapStart effective | Pre/post SnapStart cold start delta |
| Provisioned concurrency keeps warm | Run for an hour; no cold-start spikes observed |
| Package size in budget | Build-step assertion: zipped artifact < 50MB |
| No heavy init-time imports | Profile init phase; assert < 500ms |
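Several of these tests reduce to "collect latencies, take a percentile, compare to a budget". A minimal sketch of that check follows, using the nearest-rank definition of a percentile; the function names are hypothetical, and other percentile definitions would give slightly different values on small samples.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def within_budget(latencies_ms, budget_ms, pct=95):
    """True when the chosen percentile is at or under the budget."""
    return percentile(latencies_ms, pct) <= budget_ms
```

For the cold-start row, `latencies_ms` would hold only first-invocation measurements; for the warm row, only the subsequent invocations.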

Anti-patterns

| Anti-pattern | Why it fails | Fix |
|---|---|---|
| p99 latency surprise | Cold starts at the tail; not visible in p50/p95 | Watch p99; explicit cold-start monitoring (CloudWatch Init Duration metric) |
| Large dependency tree on init path | Cold start inflated 2-5x | Audit imports; lazy-import non-critical |
| Java on Lambda without SnapStart | 5s cold starts | Enable SnapStart |
| Provisioned concurrency without size analysis | Pay for unused warm instances | Tune to actual concurrency p99 |
| Cold-start test only on the local dev environment | Local doesn't simulate Lambda init | Deploy + test against AWS / Workers / Edge |
| Treat cold starts as "rare" | Bursty traffic → cold starts cluster | Account for both steady-state and burst patterns |
| Ignore module bundling | Unbundled code is larger and needs more import-resolution hops at init | Bundle (e.g. with Webpack) for production Lambdas |
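The "p95 is fine but p99 is 5s" pattern in the first row follows from arithmetic on the cold-start fraction. The sketch below makes the simplifying assumption that every cold request is slower than every warm one; both function names are hypothetical.

```python
def first_percentile_hit(cold_fraction):
    """Lowest percentile (0-100) that lands on a cold-start request.

    Assumes every cold request is slower than every warm one.
    """
    if not 0 <= cold_fraction <= 1:
        raise ValueError("cold_fraction must be between 0 and 1")
    return 100 * (1 - cold_fraction)


def percentile_is_cold(cold_fraction, pct):
    """True when the given percentile falls inside the cold-start tail."""
    return pct > first_percentile_hit(cold_fraction)
```

With 2% of requests cold, `percentile_is_cold(0.02, 95)` is false and `percentile_is_cold(0.02, 99)` is true: p95 reflects warm latency while p99 reflects cold-start latency.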

Limitations

  • Cold-start measurement is platform-side. The Init Duration value that Lambda reports in each cold invocation's REPORT log line (in CloudWatch Logs) is canonical for Lambda; Workers / Edge expose their own.
  • SnapStart caveats are subtle. State that survives the snapshot may be wrong (random seeds, connection state).
  • Provisioned-concurrency is regional. Multi-region Lambdas need PC per region.
  • Workers / Edge are not free of all variance. First request per (script, region) still has 5-30ms init.
  • Doesn't address steady-state throughput. Cold start is one metric; concurrency-limit + duration are separate.
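For the first limitation, the platform-side figure on Lambda can be pulled out of log text. The following is a sketch that assumes the REPORT line carries an "Init Duration: <number> ms" field on cold invocations only; the two sample lines use made-up values and a placeholder request ID, purely for illustration.

```python
import re

_INIT_DURATION = re.compile(r"Init Duration: ([\d.]+) ms")


def init_duration_ms(report_line):
    """Return Init Duration in ms, or None when the field is absent (warm)."""
    match = _INIT_DURATION.search(report_line)
    return float(match.group(1)) if match else None


# Made-up sample values, for illustration only.
cold = ("REPORT RequestId: example Duration: 12.30 ms Billed Duration: 13 ms "
        "Memory Size: 256 MB Max Memory Used: 70 MB Init Duration: 412.50 ms")
warm = ("REPORT RequestId: example Duration: 3.10 ms Billed Duration: 4 ms "
        "Memory Size: 256 MB Max Memory Used: 70 MB")

print(init_duration_ms(cold), init_duration_ms(warm))
```

Counting the lines where the result is not `None` gives the cold-start rate for a log window, and the values themselves feed a cold-start percentile.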

References