Databricks Lakehouse · FinOps Field Guide

Databricks Cost Optimization: The Complete Guide (2026)

Databricks bills balloon quietly: Photon's roughly 2x DBU rate, idle clusters left at the 120-minute auto-termination default, and the Serverless convenience premium stack up and can double a bill in under 12 months. This is the 10-pattern playbook to fix it.

42% · Average waste in Databricks
23% of DBUs · Idle minutes share
21 days · Median time-to-savings

The 10 patterns

01

Auto-terminate at 10 minutes

High impact · Low effort · 15-30%

The default auto-termination for all-purpose clusters is 120 minutes. Drop it to 10 via cluster policy. This is the single biggest no-regret savings lever in Databricks.
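To make the size of this lever concrete, here is a minimal back-of-the-envelope sketch. It assumes the worst case, where every session idles all the way to the auto-termination limit, and it counts DBU spend only (the cloud VM bill comes on top). The session count, DBU rate, and price are hypothetical inputs, not figures from this guide.

```python
def idle_cost_per_month(sessions_per_month, auto_terminate_minutes,
                        dbu_per_hour, usd_per_dbu):
    """Worst-case monthly DBU spend on idle time.

    Assumes each session ends with the cluster sitting idle for the full
    auto-termination window before it shuts down.
    """
    idle_hours = sessions_per_month * auto_terminate_minutes / 60
    return idle_hours * dbu_per_hour * usd_per_dbu


# Hypothetical workload: 200 sessions a month on a cluster that burns
# 8 DBU/hour, priced at $0.55 per DBU.
before = idle_cost_per_month(200, 120, dbu_per_hour=8, usd_per_dbu=0.55)
after = idle_cost_per_month(200, 10, dbu_per_hour=8, usd_per_dbu=0.55)

print(f"idle spend at 120 min: ${before:,.2f}")
print(f"idle spend at 10 min:  ${after:,.2f}")
print(f"reduction in idle spend: {1 - after / before:.0%}")
```

The ratio depends only on the two timeouts, so the relative reduction in idle spend is the same for any cluster size; the absolute dollars scale with the DBU rate.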

02

Audit Photon on I/O-bound jobs

High impact · Medium effort · 5-20%

Photon bills at roughly 2x the DBU rate. If your job is I/O-bound (>30% I/O wait), that markup buys you little or no speedup. Diagnose with `system.billing.usage` joined to query metrics.
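The break-even logic behind this pattern can be sketched as a small calculation. It assumes the job's hourly cost is the DBU charge plus the cloud VM charge, that Photon multiplies only the DBU part, and that any speedup shortens runtime proportionally. The function name and the sample rates are illustrative.

```python
def photon_break_even_speedup(dbu_cost_per_hour, vm_cost_per_hour,
                              photon_multiplier=2.0):
    """Speedup factor at which Photon costs the same as running without it.

    Without Photon: cost = (dbu + vm) * T
    With Photon:    cost = (multiplier * dbu + vm) * T / speedup
    Setting the two equal and solving for speedup gives the ratio below.
    """
    base = dbu_cost_per_hour + vm_cost_per_hour
    with_photon = photon_multiplier * dbu_cost_per_hour + vm_cost_per_hour
    return with_photon / base


# DBU-only view: a 2x rate needs a 2x speedup to break even.
print(photon_break_even_speedup(4.0, 0.0))
# With VM cost equal to DBU cost, the bar drops to 1.5x.
print(photon_break_even_speedup(4.0, 4.0))
```

An I/O-bound job that speeds up by less than this factor is paying the markup without recovering it.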

03

Cluster policies + size limits

High impact · Medium effort · 15-25%

JSON policies that cap `node_type_id` to mid-size instance types, force `autotermination_minutes ≤ 30`, and require tags. Even your senior engineers can't override them.
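A policy of this shape can be sketched as a Python dict and serialized to the JSON that the cluster policy editor expects. This is a minimal sketch, assuming the standard policy definition rule types (`allowlist`, `range`) and the `autotermination_minutes` attribute name used in policy definitions; the instance types listed are placeholder choices, and tag rules are left to Pattern 10.

```python
import json

# Hypothetical "mid-size only" policy. The node types are illustrative;
# substitute whatever counts as mid-size in your cloud and region.
policy_definition = {
    "node_type_id": {
        "type": "allowlist",
        "values": ["m5.large", "m5.xlarge", "m5.2xlarge"],
        "defaultValue": "m5.xlarge",
    },
    # minValue keeps users from setting 0, which disables auto-termination.
    "autotermination_minutes": {
        "type": "range",
        "minValue": 10,
        "maxValue": 30,
        "defaultValue": 10,
    },
}

policy_json = json.dumps(policy_definition, indent=2)
print(policy_json)
```

The resulting JSON is what you would paste into the policy definition, or send as the `definition` string when creating the policy through the API.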

04

Migrate Serverless ELT → Classic

High impact · High effort · 25-40%

Serverless wins for spiky analyst queries (Pattern 8 below). For 24/7 ELT, Classic is 25-40% cheaper. Audit `system.billing.usage WHERE sku_name LIKE '%SERVERLESS%'`.
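The audit filter can be expanded into a full query. The sketch below only builds the SQL text; it assumes the billing table exposes `sku_name`, `usage_unit`, `usage_quantity`, `usage_date`, and `usage_metadata.job_id`, which is worth verifying against the schema in your workspace. The function name and the 30-day default are arbitrary.

```python
def serverless_usage_query(lookback_days=30):
    """Return SQL that totals Serverless usage per SKU and job."""
    if not isinstance(lookback_days, int) or lookback_days <= 0:
        raise ValueError("lookback_days must be a positive integer")
    return f"""
SELECT
  sku_name,
  usage_metadata.job_id AS job_id,
  usage_unit,
  SUM(usage_quantity)   AS total_usage
FROM system.billing.usage
WHERE sku_name LIKE '%SERVERLESS%'
  AND usage_date >= date_sub(current_date(), {lookback_days})
GROUP BY sku_name, usage_metadata.job_id, usage_unit
ORDER BY total_usage DESC
""".strip()


query = serverless_usage_query(30)
print(query)
```

In a notebook you would pass the string to `spark.sql(query)`. Jobs that show steady usage every day of the window are the candidates for a move to Classic.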

05

Spot for non-prod compute

High impact · Medium effort · 60-80%

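Dev/staging clusters: run on spot with fallback to on-demand (`availability: SPOT_WITH_FALLBACK` in `aws_attributes` on AWS; `availability: SPOT_WITH_FALLBACK_AZURE` with `spot_bid_max_price: -1` in `azure_attributes` on Azure). Job clusters in prod: keep the driver on on-demand (`first_on_demand: 1`) and run the workers on spot with fallback.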
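As a sketch of how the split between environments might look in a cluster spec, the helper below returns an `aws_attributes` block per environment. It assumes AWS and the Clusters API field names `availability`, `first_on_demand`, and `spot_bid_price_percent`; the helper itself and the environment labels are hypothetical.

```python
def aws_attributes_for(env):
    """Return an aws_attributes block for a cluster spec (AWS only)."""
    if env in ("dev", "staging"):
        # Spot everywhere, falling back to on-demand if spot is unavailable.
        return {
            "availability": "SPOT_WITH_FALLBACK",
            "spot_bid_price_percent": 100,
        }
    if env == "prod":
        # first_on_demand=1 places the first node, the driver, on on-demand;
        # the remaining nodes (workers) use spot with fallback.
        return {
            "availability": "SPOT_WITH_FALLBACK",
            "first_on_demand": 1,
            "spot_bid_price_percent": 100,
        }
    raise ValueError(f"unknown environment: {env!r}")


print(aws_attributes_for("dev"))
print(aws_attributes_for("prod"))
```

Keeping the driver on on-demand matters for prod jobs because losing the driver to a spot reclaim fails the whole run, while a lost worker only costs a retry of its tasks.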

06

Right-size with system tables

High impact · Medium effort · 20-30%

`system.billing.usage` + `system.compute.clusters` reveal underutilized worker types. Many i3.xlarge clusters that never touch their local NVMe can move to a smaller general-purpose type such as m5.large.
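One way to start is to rank DBU spend by worker type, which shows where right-sizing would pay off most. The query below is a sketch held in a Python string; it assumes `system.compute.clusters` keeps one row per configuration change (hence the `ROW_NUMBER` to pick the latest) and exposes `worker_node_type` and `change_time`, and that usage rows carry `usage_metadata.cluster_id`. It ranks spend only; actual CPU and memory utilization has to come from a separate metrics source.

```python
USAGE_BY_WORKER_TYPE_SQL = """
WITH latest_cluster AS (
  -- Keep only the most recent configuration row per cluster.
  SELECT workspace_id, cluster_id, worker_node_type,
         ROW_NUMBER() OVER (
           PARTITION BY workspace_id, cluster_id
           ORDER BY change_time DESC
         ) AS rn
  FROM system.compute.clusters
)
SELECT
  c.worker_node_type,
  COUNT(DISTINCT u.usage_metadata.cluster_id) AS clusters,
  SUM(u.usage_quantity)                       AS dbus
FROM system.billing.usage AS u
JOIN latest_cluster AS c
  ON  u.workspace_id = c.workspace_id
  AND u.usage_metadata.cluster_id = c.cluster_id
  AND c.rn = 1
WHERE u.usage_unit = 'DBU'
  AND u.usage_date >= date_sub(current_date(), 30)
GROUP BY c.worker_node_type
ORDER BY dbus DESC
""".strip()

print(USAGE_BY_WORKER_TYPE_SQL)
```

Run it with `spark.sql(USAGE_BY_WORKER_TYPE_SQL)` and review the top few worker types first; storage-optimized types near the top are the ones to question.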

07

Photon ROI calculator (per job)

Medium impact · Medium effort · Job-by-job

For each scheduled job, run a week with Photon and a week without. Compare DBU/job. Disable Photon where ROI is negative.
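The week-on, week-off comparison reduces to comparing average DBUs per run. Here is a minimal sketch; it assumes you have already pulled per-run DBU totals for each week (for example from `system.billing.usage`), and the sample numbers are made up.

```python
from statistics import mean


def photon_roi(dbus_with_photon, dbus_without_photon):
    """Compare average DBUs per run with and without Photon.

    Both arguments are lists of per-run DBU totals for the same job.
    A positive saving_fraction means Photon lowered DBUs per run.
    """
    with_avg = mean(dbus_with_photon)
    without_avg = mean(dbus_without_photon)
    return {
        "dbu_per_run_with": with_avg,
        "dbu_per_run_without": without_avg,
        "saving_fraction": (without_avg - with_avg) / without_avg,
        "keep_photon": with_avg < without_avg,
    }


# Hypothetical job: Photon runs consumed more DBUs than non-Photon runs.
result = photon_roi([12.0, 11.0, 13.0], [9.0, 10.0, 11.0])
print(result)
```

DBUs per run already include the higher Photon rate, so a job only keeps Photon here when the speedup outweighs the markup. Compare weeks with similar data volumes, or normalize by rows processed, to keep the test fair.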

08

Serverless SQL warehouses for analysts

High impact · Low effort · 20-40%

Counter to Pattern 4: spiky analyst queries belong on Serverless. Cold-start penalty pays for itself within the first 10s of analysis time.

09

DBFS data lifecycle to cheap storage

Medium impact · Medium effort · 30-60% on storage

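Move historical data older than 90 days from DBFS to S3 Glacier IR or Azure Archive. Use Unity Catalog external locations to keep the query path stable. Note that Azure Archive blobs must be rehydrated before they can be read, so reserve that tier for data you no longer query.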
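On AWS, the 90-day move can be expressed as an S3 lifecycle rule. The sketch below only builds the configuration dict in the shape that boto3's `put_bucket_lifecycle_configuration` accepts; the helper name and the prefix are hypothetical. S3 counts `Days` from object creation, not from last access.

```python
def glacier_ir_lifecycle(prefix, age_days=90):
    """Build an S3 lifecycle configuration that moves objects under
    `prefix` to Glacier Instant Retrieval once they are `age_days` old."""
    if age_days <= 0:
        raise ValueError("age_days must be positive")
    return {
        "Rules": [
            {
                "ID": f"to-glacier-ir-after-{age_days}d",
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Transitions": [
                    {"Days": age_days, "StorageClass": "GLACIER_IR"}
                ],
            }
        ]
    }


# Hypothetical prefix holding historical data.
config = glacier_ir_lifecycle("history/", age_days=90)
print(config)
```

To apply it you would call `boto3.client("s3").put_bucket_lifecycle_configuration(Bucket=..., LifecycleConfiguration=config)` with your own bucket name. Glacier IR keeps objects readable in place, which is what lets the query path stay unchanged.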

10

Tag enforcement via cluster policy

High impact · High effort · Enables everything else

Cluster policy with `custom_tags` block requiring `team`, `cost_center`, `env`. Without per-cluster tags, you can't do showback.
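A tag-enforcing policy and a matching compliance check for clusters that already exist can be sketched as follows. This assumes that a limiting rule (`regex` or `allowlist`) on a `custom_tags.<name>` attribute makes that tag mandatory, which matches our reading of the policy definition rules but should be confirmed in your workspace. The allowed `env` values are placeholders.

```python
import json

REQUIRED_TAGS = ("team", "cost_center", "env")

# Any non-empty value is accepted for team and cost_center;
# env is restricted to a fixed (hypothetical) set of values.
tag_policy = {
    f"custom_tags.{tag}": {"type": "regex", "pattern": ".+"}
    for tag in REQUIRED_TAGS
}
tag_policy["custom_tags.env"] = {
    "type": "allowlist",
    "values": ["dev", "staging", "prod"],
}


def missing_tags(cluster_tags):
    """Return the required tags that are absent or blank on a cluster."""
    return [t for t in REQUIRED_TAGS
            if not str(cluster_tags.get(t, "")).strip()]


print(json.dumps(tag_policy, indent=2))
print(missing_tags({"team": "data-eng", "env": "dev"}))
```

The policy stops new untagged clusters; `missing_tags` is for auditing existing ones, for example by running it over the tags map of each cluster before you switch on showback reports.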

Free audit

Run a free Databricks cost audit

CARTIE AI runs all 10 patterns against your workspace using `system.billing.usage`. Read-only PAT, no agent install.
