Real-time health of every CARTIE AI service. Auto-refreshes every 60 seconds.
Customers with mistyped Databricks workspace URLs saw a 500 from /api/databricks/cost-summary instead of a graceful empty state. Resolved by hardening _safe_get to handle non-JSON HTML responses (see post-mortem on the Databricks integration page). No customer data was exposed.
Identified: affected only credentials that returned non-JSON responses from Databricks. Engineering is deploying a patch.
Patch merged and deploying to all regions.
All endpoints back to 200. RCA published; regression test added.
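For readers curious what this kind of hardening can look like, here is a minimal sketch, not the actual CARTIE AI code. It assumes a helper that treats any non-JSON reply (for example, an HTML error page returned for a mistyped workspace URL) as "no data" so the endpoint can return an empty state instead of a 500. The names `safe_get`, `cost_summary`, the `fetch` callable, and the `items`/`status` fields are all hypothetical and chosen only for illustration.

```python
import json
from typing import Any, Callable, Optional, Tuple

# Hypothetical fetcher: takes a URL, returns (status_code, content_type, body_text).
# Injecting it keeps the sketch independent of any particular HTTP library.
Fetcher = Callable[[str], Tuple[int, str, str]]


def safe_get(url: str, fetch: Fetcher) -> Optional[Any]:
    """Return the parsed JSON body, or None if the reply is unusable."""
    try:
        status, content_type, body = fetch(url)
    except OSError:
        # Network-level failures (DNS, refused connection, timeout).
        return None
    if status != 200:
        return None
    if "json" not in content_type.lower():
        # e.g. an HTML login or error page from a wrong workspace URL.
        return None
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        # The header claimed JSON but the body was something else.
        return None


def cost_summary(url: str, fetch: Fetcher) -> dict:
    """Build a response that degrades to an empty state instead of raising."""
    data = safe_get(url, fetch)
    if not isinstance(data, dict):
        return {"items": [], "status": "unavailable"}
    return {"items": data.get("items", []), "status": "ok"}
```

With a helper shaped like this, a caller never sees a parsing exception: an HTML reply and a malformed JSON reply both yield the same empty summary, which the UI can render as "no data yet".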
Probes every 60s
Real HTTP checks against every upstream, not cached marketing copy.
SOC 2-aligned monitoring
Audit logs, anomaly detection, on-call escalation paths in place.
Multi-region failover
Database, AI providers, and outbound webhooks all redundant.
Get notified
We'll email you the moment a component is degraded — and again the moment it's back to operational.
Need help?
If something looks wrong here that isn't already reported, please reach out.