A note on this story: Sizing examples below come from a composite of 6 production BigQuery accounts (data-team sizes 8–60, monthly spend $15K–$180K). Pricing values use 2026-Q2 list rates.
If your BigQuery bill is bigger than you expected, you're probably on on-demand pricing.
On-demand BigQuery costs $6.25 per TB scanned. That sounds cheap until you realise:
- A poorly-written query against a 5TB table costs $31.25 every time it runs
- A nightly ETL job hitting 10TB → $62.50/night = $1,875/month for one job
- An analyst running ad-hoc queries against your warehouse all day = $200–$500/day, easily
The fix is slot reservations — pre-paying for compute capacity instead of paying per-query. Done right, this cuts your BigQuery bill 40–70%. Done wrong, it costs MORE than on-demand. This guide is how to do it right.
On-demand vs Reservations: the 30-second math
| Pricing model | Unit | Best for |
|---|
| On-demand | $6.25 / TB scanned | Sporadic / unpredictable workloads |
| Flat-rate (legacy) | $2,000 / 100 slots / month | Stable, predictable workloads |
| Editions (Standard / Enterprise / Plus) | $0.04–$0.10 / slot-hour, autoscaling | Almost everyone |
Editions is what Google introduced in 2023 to replace flat-rate. You set a min/max slot count, BigQuery autoscales between them, and you only pay for what you use.
The cost question becomes: what min/max do I set?
The slot-sizing playbook
Here's the exact process we run with customers.
Step 1: Measure your actual slot usage
-- Run this in your project. Returns the avg + p95 slot usage per hour.
SELECT
TIMESTAMP_TRUNC(creation_time, HOUR) AS hour,
AVG(total_slot_ms) / 1000 / 3600 AS avg_slots_used,
APPROX_QUANTILES(total_slot_ms, 100)[OFFSET(95)] / 1000 / 3600 AS p95_slots
FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE creation_time BETWEEN TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY) AND CURRENT_TIMESTAMP()
GROUP BY hour
ORDER BY hour DESC;
You'll see something like:
- Avg slots used: 80
- P95 slots used: 240
- Peak slots used: 600 (during the nightly ETL window)
Step 2: Pick a min that covers steady-state
Set min_slots to roughly your median hourly usage — 80 slots in our example.
This is your floor. You'll always pay for these, even at 3am when nobody's running queries. So don't over-set the floor.
Step 3: Pick a max that covers your peaks
Set max_slots to roughly 1.2x your P95 hourly usage — ~290 slots in our example.
The max is your ceiling. Going higher gets queries done faster but you pay for it. Lower means queries queue.
Step 4: Schedule the ETL window separately
If your nightly ETL needs 600 slots for 2 hours, don't size your reservation for the peak. Instead:
- Default reservation: 80 → 290 slots (24 × 7)
- ETL reservation: 290 → 700 slots, only during the ETL window (e.g., 1am–4am)
You can have multiple reservations and assign them to specific projects / labels. Massive savings vs. paying for 700 slots 24/7.
The 4 common traps (each costs $1K+/month)
Trap 1: Setting max too high "just in case"
If you set max=2000 because "what if we need it someday", BigQuery will auto-scale up to 2000 the first time a complex query lands — and stay there for the cooldown window (usually 60 seconds). One careless analyst running a CROSS JOIN in your warehouse can burn $50–$200 in a single 90-second window.
The fix: keep max within 2x your actual P99. If a query needs more, queue it.
Trap 2: One reservation for everything
If your data team and your customer-facing analytics share one reservation, the data team's nightly ETL starves the customer dashboards. Then someone bumps min/max to fix it, and you're paying for capacity 24/7 that's only used 4 hours/day.
The fix: separate reservations per workload class:
production_dashboards (small min, small max, customer-facing)
adhoc_analytics (medium min, medium max, analyst queries)
etl_pipeline (zero min, large max, scheduled job class)
Trap 3: Forgetting to disable on-demand fallback
When your reservation is exhausted, BigQuery falls back to on-demand pricing for queries that overflow. If you're not careful, this turns "we capped slots at 200" into "yeah but we paid $4K of on-demand last month because someone's query couldn't wait."
The fix: set your project's pricing model to strict reservation (Admin → Reservations → Project Assignment → "Use only reserved slots"). Queries that overflow will queue, not bypass.
Trap 4: Buying Annual commitments without measuring
Annual reservations are 40% cheaper than monthly. They're also a 12-month commitment. If you over-buy by 100 slots for a year, that's ~$24K of unused capacity you can't cancel.
The fix: start with monthly commits for the first 3 months. Only convert to annual once your usage is stable and predictable.
Real-world before/after
A 40-engineer SaaS company we worked with:
| Metric | Before | After 60 days |
|---|
| Monthly BigQuery cost | $42K | $15K (-64%) |
| Pricing model | All on-demand | Editions w/ autoscaling |
| Min / max slots | n/a | 50 / 280 (default) + 0 / 600 (ETL) |
| P99 query latency | 4.2s | 4.4s (negligibly slower) |
| Cost per query | $0.42 avg | $0.16 avg |
The 0.2-second latency increase was the only "cost" of the change. Engineers didn't even notice.
The 5-minute starter audit
- Pull last 30 days slot usage with the SQL above. Note your avg + P95.
- Check your current pricing model. If on-demand and >$5K/month → you're a candidate for Editions.
- Calculate the break-even. ~$3K/month spend = breakeven for Standard edition.
- Try a 1-month commit at min=median, max=P95×1.2. Worst case you're back on on-demand next month.
- Set a monitoring alert for daily reservation utilisation (Cloud Monitoring → BigQuery →
bigquery.googleapis.com/slots/allocated_for_reservation).
How CARTIE AI helps
CARTIE AI's BigQuery slot tuner ingests your project's INFORMATION_SCHEMA.JOBS data, computes the optimal min/max for every workload class, and shows you the dollar projection per scenario. Typical first-scan: 50–65% cost reduction projected.
But honestly? Even without a tool — run the SQL query above. Pick a reservation tier. Try it for a month. The math is straightforward; the trap is not running the math at all.
Now go meter your slots. 🥃