A note on this story: The numbers below are a composite of 6 Lambda cost audits we've run, on workloads ranging from 200K invocations/day (small SaaS backends) to 4.2B invocations/month (a streaming-data ingestion pipeline).
A founder DM'd us last month:
"Lambda is supposed to be the cheap option, right? Our serverless bill just hit $38K/month. What am I missing?"
What they were missing: 9 specific cost traps that compound.
We pulled their bill apart. Three functions accounted for 71% of the spend. All three were running 3072 MB of memory for workloads that fit in 512 MB. One had been recursively triggering itself for 11 days because of a bad S3 event filter. By the end of the audit we'd cut their Lambda bill from $38K to $11K/month: a $324K annualized saving with no code refactor.
This is the playbook. Run it function-by-function on your top 10 highest-cost Lambdas.
How Lambda actually charges you
Two dimensions, with duration metered in 1 ms increments:
Cost = (GB-seconds × $0.0000166667) + (Invocations × $0.0000002)
The first term is GB-seconds — the product of memory allocated × wall-clock duration. A function with 1024 MB memory running for 200ms costs:
1.0 GB × 0.2 s × $0.0000166667 = $0.00000333
That's about a third of a cent per thousand invocations, or roughly $3.33 per million in compute plus $0.20 in request charges. So why is your bill $38K? Because the real-world Lambda cost equation looks more like this:
Real cost = (GB-s) + (invocations) + (provisioned concurrency) +
(CloudWatch logs ingestion) + (CloudWatch logs storage) +
(data transfer out) + (NAT gateway egress for VPC functions)
The non-Lambda lines often dominate. CloudWatch Logs alone can match Lambda compute on a chatty function. Let's break down each trap.
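Before the traps, here is the base two-term formula as a small Python sketch, so you can plug in your own numbers. It uses only the two prices quoted above; the function name and the example workload are illustrative assumptions, not anything from an AWS SDK.

```python
# Prices as quoted in the formula above (x86 rates used in this post).
PRICE_PER_GB_SECOND = 0.0000166667
PRICE_PER_INVOCATION = 0.0000002


def lambda_compute_cost(memory_mb: int, duration_ms: float, invocations: int) -> float:
    """Return the two-term Lambda cost in USD for a batch of identical invocations."""
    gb_seconds = (memory_mb / 1024) * (duration_ms / 1000) * invocations
    return gb_seconds * PRICE_PER_GB_SECOND + invocations * PRICE_PER_INVOCATION


# The 1024 MB / 200 ms example from the text, scaled to one million invocations.
per_million = lambda_compute_cost(memory_mb=1024, duration_ms=200, invocations=1_000_000)
print(f"${per_million:.2f} per million invocations")
```

This covers only the first two terms of the real-world equation. Logs, NAT, and provisioned concurrency are billed separately and are handled trap by trap below.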
Trap 1: Memory wildly over-provisioned (the #1 waste)
The default Lambda memory is 128 MB. New developers panic at the first cold start latency, bump to 1024, then to 3008, then forget about it. Six months later 40% of their monthly bill is unused RAM.
The fix — AWS Lambda Power Tuning:
AWS Lambda Power Tuning is a Step Functions state machine that runs your function across 5-7 memory sizes and plots cost vs. latency. Free, takes 4 minutes to set up.
# Deploy the tuner once
git clone https://github.com/alexcasalboni/aws-lambda-power-tuning
cd aws-lambda-power-tuning && sam deploy --guided
# Then for each function:
aws stepfunctions start-execution \
--state-machine-arn arn:aws:states:us-east-1:...:lambda-power-tuning \
--input '{"lambdaARN":"arn:aws:lambda:us-east-1:...:function:my-fn","powerValues":[128,256,512,1024,1536,2048,3008],"num":50,"strategy":"cost"}'
The output gives you the cheapest memory size for your real workload. Median saving across 6 audits: 47% of compute spend.
Counter-intuitive truth: bumping memory often lowers total cost because you also get proportionally more CPU. A function at 1024 MB might cost less than the same function at 512 MB if it finishes 3× faster.
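To make the memory-versus-duration trade-off concrete, here is a rough sketch of the comparison Power Tuning automates. The durations are hypothetical measurements invented for illustration; substitute the averages from your own tuning run.

```python
PRICE_PER_GB_SECOND = 0.0000166667


def cost_per_invocation(memory_mb: int, duration_ms: float) -> float:
    """Compute-only cost in USD of one invocation at a given memory size."""
    return (memory_mb / 1024) * (duration_ms / 1000) * PRICE_PER_GB_SECOND


# Hypothetical measurements: memory size (MB) -> average duration (ms).
measurements = {512: 900, 1024: 300, 2048: 250}

costs = {mem: cost_per_invocation(mem, ms) for mem, ms in measurements.items()}
cheapest = min(costs, key=costs.get)

for mem, cost in sorted(costs.items()):
    print(f"{mem:>5} MB: ${cost * 1_000_000:.2f} per million invocations")
print(f"cheapest: {cheapest} MB")
```

The break-even rule is simple: doubling memory pays off only if duration drops by more than half. In this made-up data, 1024 MB wins because it runs 3× faster than 512 MB, while 2048 MB barely improves on 1024 MB and costs more.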
Trap 2: Provisioned concurrency you don't need
Provisioned Concurrency (PC) keeps Lambda environments warm to eliminate cold starts. It's billed at $0.0000041667/GB-s whether or not the function is invoked. That means 24/7 PC on a 1024 MB function costs about $10.80 per provisioned unit over a 30-day month (1 GB × 2,592,000 s × $0.0000041667), so ten idle units run ~$108/month, doing nothing.
Three traps here:
- Set-and-forget PC at peak capacity. Most teams provision for peak QPS and never scale it down. Use Application Auto Scaling scheduled actions to scale PC on a timetable (drop to 1 unit during nights/weekends).
- PC on async workloads. Cold starts don't matter for SQS-triggered or EventBridge-scheduled functions. PC there is pure waste.
- PC + ARM mismatch. Provisioned Concurrency on x86_64 when you could be on Graviton (arm64) means paying the x86 rate and giving up the roughly 20% Graviton discount for nothing.
Quick audit:
aws lambda list-provisioned-concurrency-configs --function-name $FN
# For each config: allocated units × seconds/month × $0.0000041667 × MemoryMB / 1024
If you can't justify the cold-start latency to a real user-facing path, kill the PC.
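The audit arithmetic can be sketched in a few lines of Python. This assumes the PC rate quoted above, a 30-day month, and a hypothetical schedule (10 units for 260 peak hours, 1 unit the rest of the time). None of these numbers come from a real account.

```python
PC_PRICE_PER_GB_SECOND = 0.0000041667  # rate quoted above
HOURS_PER_MONTH = 720                  # assumes a 30-day month


def pc_cost(memory_mb: int, unit_hours: float) -> float:
    """USD cost of keeping `unit_hours` of provisioned concurrency allocated."""
    return (memory_mb / 1024) * unit_hours * 3600 * PC_PRICE_PER_GB_SECOND


# Ten units left on around the clock.
always_on = pc_cost(1024, unit_hours=10 * HOURS_PER_MONTH)

# Hypothetical schedule: 10 units during 260 peak hours, 1 unit otherwise.
peak_hours = 260
scheduled = pc_cost(1024, unit_hours=10 * peak_hours + 1 * (HOURS_PER_MONTH - peak_hours))

print(f"always on: ${always_on:.2f}/month, scheduled: ${scheduled:.2f}/month")
```

The per-unit cost is small, which is exactly why idle PC goes unnoticed. It only shows up once you multiply by units, memory size, and the number of functions carrying it.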
Trap 3: Recursive / event-loop triggers (the silent budget killer)
This one always looks shocking in the audit:
A function listens to S3 ObjectCreated events. The function processes the file and writes a transformed version back to the same bucket. The transformed file triggers the function again. Infinite loop.
We've seen 4 variants of this:
- S3 → Lambda → S3 (same bucket)
- DynamoDB stream → Lambda → DynamoDB write (same table)
- EventBridge → Lambda → EventBridge put-events (same bus, same pattern)
- SNS → Lambda → SNS publish (same topic)
A single recursive trigger can rack up millions of invocations a day before anyone notices. Because each invocation is cheap, the cost builds quietly until the bill arrives.
The fix:
- Always scope event filters tightly — exclude the prefix or suffix the function writes to.
- Set a circuit-breaker metric: CloudWatch alarm on Invocations > 1.5× last week's same-hour baseline.
- Use Lambda recursion detection (now built-in for some triggers): https://docs.aws.amazon.com/lambda/latest/dg/invocation-recursion.html
Also: check your DLQ (Dead Letter Queue). One customer had a poison message replaying through their function 12,000 times/hour for 3 weeks before anyone noticed. Set maxReceiveCount = 5 in the redrive policy of every SQS source queue, so a poison message lands in the DLQ instead of replaying forever.
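As a belt-and-braces measure on top of a tight event filter, the function itself can refuse to process its own output. Below is a minimal sketch for the S3 variant, assuming the function writes under a hypothetical `processed/` prefix. The helper name is made up for illustration.

```python
from urllib.parse import unquote_plus

# Hypothetical prefix this function writes its output under.
OUTPUT_PREFIX = "processed/"


def keys_to_process(event: dict, output_prefix: str = OUTPUT_PREFIX) -> list[str]:
    """Return object keys from an S3 event, skipping anything this function wrote."""
    keys = []
    for record in event.get("Records", []):
        # Keys in S3 event notifications are URL-encoded.
        key = unquote_plus(record["s3"]["object"]["key"])
        if key.startswith(output_prefix):
            continue  # our own output: processing it would re-trigger the loop
        keys.append(key)
    return keys


sample_event = {
    "Records": [
        {"s3": {"object": {"key": "uploads/report+2024.csv"}}},
        {"s3": {"object": {"key": "processed/report.csv"}}},
    ]
}
print(keys_to_process(sample_event))
```

This guard does not stop the invocation itself (the event filter does that), but it breaks the loop even if someone later loosens the filter.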
Trap 4: CloudWatch Logs ingestion (the hidden 30%)
CloudWatch Logs charges:
- $0.50 per GB ingested
- $0.03 per GB-month stored
If your Lambda logs every request body at INFO level, you can spend more on logs than on compute. We had a customer ingesting 1.2 TB/month of debug logs from a single function. That's $600/month — almost 2× the function's compute cost.
The fix:
- Set log retention to 14 days by default (it's "never expire" out of the box). Older = useless and expensive.
aws logs put-retention-policy --log-group-name /aws/lambda/my-fn --retention-in-days 14
- Move debug logs out of production. Use environment-aware log levels.
- Sample. If you're logging every request, log 1% of them with a sampling middleware.
- Don't log entire request payloads. PII risk + cost balloon.
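Here is one way the environment-aware level and the 1% sampling could look, using only the standard library. The `LOG_LEVEL` variable, the logger name, and the helper names are assumptions for this sketch. Hashing the request ID keeps the sampling decision deterministic, so every log line for a given request is either kept or dropped together.

```python
import logging
import os
import zlib

# Hypothetical env var: set LOG_LEVEL=DEBUG in dev, leave it at INFO (or higher) in prod.
logger = logging.getLogger("my-fn")
logger.setLevel(os.environ.get("LOG_LEVEL", "INFO"))


def should_sample(request_id: str, percent: int = 1) -> bool:
    """Deterministically pick roughly `percent`% of request IDs for logging."""
    return zlib.crc32(request_id.encode("utf-8")) % 100 < percent


def log_request(request_id: str, summary: str) -> None:
    """Log a short summary (never the full payload) for sampled requests only."""
    if should_sample(request_id):
        logger.info("request %s: %s", request_id, summary)


sampled = sum(should_sample(f"req-{i}") for i in range(10_000))
print(f"sampled {sampled} of 10000 requests")
```

Errors and warnings should bypass the sampler. Sample only the routine per-request chatter.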
Trap 5: ARM64 (Graviton) migration left undone
AWS Lambda on Graviton is 20% cheaper for the same memory AND typically 5-15% faster on Python/Node/Java workloads. The migration is a single Architectures: arm64 line in your SAM/CDK/Terraform.
Why most teams haven't migrated:
- Native binary deps need a rebuild (pip install --platform=manylinux2014_aarch64 ...)
- Lambda layers need to be rebuilt for arm64
- Some image-processing libs (e.g., older Pillow versions) had ARM bugs
For 90%+ of pure-Python or Node functions: drop in, ship. Test in dev for a week, monitor errors, promote.
Across 6 audits, the median Lambda fleet was 7% ARM-migrated. That's leaving 12-15% of total Lambda spend on the table for free.
Trap 6: VPC NAT Gateway egress on Lambda
If your Lambda runs in a VPC (to talk to RDS, ElastiCache, an internal API) and needs internet access (for outbound API calls, S3 outside the VPC, etc.), it goes through a NAT Gateway.
NAT Gateway pricing:
- $0.045 per hour ($32.40/month)
- $0.045 per GB processed
Two traps:
- Many Lambdas, one NAT. If you have 12 Lambdas calling the Stripe API through a NAT Gateway, all of that traffic pays the $32/month base charge plus $0.045 per GB. For AWS services, use VPC endpoints instead (gateway endpoints for S3 and DynamoDB, interface endpoints for services like Secrets Manager) so that traffic never touches the NAT.
- Lambda outside VPC when possible. If a function only calls public APIs (Stripe, Slack, OpenAI), it doesn't need to be in a VPC. Take it out, save the NAT cost.
We saved one customer $2,800/month just by moving 4 outbound-only Lambdas out of their VPC and into the default Lambda networking model.
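To size your own NAT exposure, the two prices above are enough for a rough estimate. The traffic volumes below are hypothetical, and the sketch assumes a 720-hour month and a single gateway.

```python
NAT_HOURLY_USD = 0.045  # per gateway-hour, as quoted above
NAT_PER_GB_USD = 0.045  # per GB processed, as quoted above


def nat_monthly_cost(gb_processed: float, hours: float = 720) -> float:
    """Rough monthly cost in USD of one NAT Gateway for a given traffic volume."""
    return NAT_HOURLY_USD * hours + NAT_PER_GB_USD * gb_processed


# Hypothetical volumes: an idle gateway vs. one carrying 10 TB of Lambda egress.
for gb in (0, 1_000, 10_000):
    print(f"{gb:>6} GB processed: ${nat_monthly_cost(gb):.2f}/month")
```

The fixed charge is the same either way. It's the per-GB line that grows with every Lambda you route through the gateway.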
Trap 7: Synchronous invocations that should be async
# Bad: API Gateway → Lambda → SES (synchronous)
# The user waits ~800ms on every request
# You pay for Lambda time while it blocks on the SES API call
# Good: API Gateway → Lambda (acks fast) → SQS → Lambda (sends email)
# First Lambda: 50ms
# Second Lambda runs async, can fail+retry
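The async pattern takes the slow call off the request path (the user-facing Lambda drops from ~800ms to ~50ms in the sketch above), lets failures retry without the user waiting, and lets you batch (5-10 messages per Lambda invocation = fewer invocations, less cost).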
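The two handlers in that pattern might look roughly like this. To keep the sketch runnable without AWS, the queue and email calls are injected as plain callables (`send_message`, `send_email`), which is an assumption of this example. In a real function they would wrap your SQS and SES clients, and the handlers would take only `(event, context)`. The `batchItemFailures` return value takes effect only if partial batch responses are enabled on the SQS event source mapping.

```python
import json


def api_handler(event, context, send_message):
    """User-facing handler: enqueue the work and acknowledge immediately."""
    body = json.loads(event["body"])
    send_message(json.dumps({"to": body["email"], "template": "welcome"}))
    return {"statusCode": 202, "body": json.dumps({"queued": True})}


def worker_handler(event, context, send_email):
    """SQS-triggered handler: process a batch, report only the failed messages."""
    failures = []
    for record in event["Records"]:
        try:
            send_email(json.loads(record["body"]))
        except Exception:
            # Only this message is retried, not the whole batch.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}


# Local usage with stand-in callables (hypothetical data).
queue = []
print(api_handler({"body": json.dumps({"email": "a@example.com"})}, None, queue.append))
```

Pair the worker's queue with a DLQ and a bounded receive count (see Trap 3), so a message that always fails doesn't retry forever.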
Trap 8: API Gateway in front of every Lambda
Default pattern: API Gateway REST API → Lambda. AGW REST costs $3.50 per million requests + per-GB transferred.
For internal services, microservice-to-microservice calls, or webhooks under 100K req/day:
- Use Function URLs (free, built into Lambda)
- Use API Gateway HTTP APIs instead of REST ($1.00/million instead of $3.50)
- Use ALB + Lambda target for high-volume public APIs ($0.008 per LCU-hour on top of the hourly ALB charge)
For one customer, swapping from AGW REST to HTTP for their internal service mesh saved $8,400/month.
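A quick way to check whether the switch is worth it for you: compare the per-request charges at your own volume. The request count below is hypothetical, and the sketch ignores data transfer and any free tier.

```python
REST_PRICE_PER_MILLION = 3.50  # API Gateway REST API, as quoted above
HTTP_PRICE_PER_MILLION = 1.00  # API Gateway HTTP API, as quoted above


def request_cost(requests_per_month: int, price_per_million: float) -> float:
    """Monthly request charge in USD at a flat per-million price."""
    return requests_per_month / 1_000_000 * price_per_million


monthly_requests = 100_000_000  # hypothetical volume
rest = request_cost(monthly_requests, REST_PRICE_PER_MILLION)
http = request_cost(monthly_requests, HTTP_PRICE_PER_MILLION)
print(f"REST: ${rest:.2f}, HTTP: ${http:.2f}, saving: ${rest - http:.2f}/month")
```

Function URLs take the request charge to zero, but check first that you don't rely on REST-only features such as usage plans or request validation.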
Trap 9: Concurrent execution limits hitting throttles
If your account hits the regional concurrency limit (default 1000), invocations throttle. Throttles look free but they're not — they cause:
- Failed user requests (revenue impact)
- Dead-letter queue fan-out (more invocations)
- Retry storms (3× the original cost)
Audit: check ConcurrentExecutions and Throttles CloudWatch metrics weekly. If Throttles > 0, request a limit increase or refactor noisy functions to use SQS batching.
The 30-minute self-audit
- Cost Explorer → group by Service → Lambda → Top 10 functions by cost
- For each top-10 function, in Lambda console:
  - Memory: do you need it? (Run Power Tuning)
  - Provisioned Concurrency: justify it or kill it
  - Architecture: is it arm64?
- CloudWatch → Logs → biggest log groups by ingested volume → reduce / sample
- Search Lambda triggers for recursion patterns (S3-writes-to-source-bucket)
Total time: 30 minutes. Typical first-pass savings: 30-50% of Lambda spend.
Real numbers from one audit (composite)
Customer: B2B fintech, ~80 Lambdas in prod, $38K/month bill before audit.
| Trap | Annual savings |
|---|---|
| Memory over-provisioning (47 functions tuned) | $158K |
| Recursive S3 trigger (one function) | $61K |
| CloudWatch log retention + sampling | $44K |
| Removed unused PC on 8 async functions | $38K |
| ARM migration (52 of 80 functions) | $14K |
| Removed AGW REST in front of 12 internal services | $7K |
| Total annual | $324K (-71%) |
How CARTIE AI helps
CARTIE AI's Lambda audit runs all 9 trap checks against your AWS account, ranks functions by potential savings, and gives you copy-paste fixes (the Power Tuning JSON, the retention-policy CLI, etc.). Typical first-scan: $3K–$25K/month projected savings.
Even without a tool, the 30-minute self-audit will find $500–$3K/month of waste in any Lambda fleet over $5K/month of spend.
Now go check your top function's memory setting. ⚡