Change one URL.
Get a real FinOps system for AI.
Drop-in replacement for the OpenAI / Anthropic base URL. Same SDK, same payloads — plus per-team attribution, automatic budget cutoffs, full audit trail, and 70%-cheaper routing.
import openai

client = openai.OpenAI(
    api_key="sk-...",  # your OpenAI key
    base_url="https://api.cartieai.com/v1",  # ← one-line change
    default_headers={
        "X-Cartie-Key": "cartie_proxy_xxxx",  # from app.cartieai.com
        "X-Cartie-Tenant": "marketing-usa",  # who's spending?
        "X-Cartie-Project": "social-media-bot",  # which project?
        "X-Cartie-Budget": "500",  # auto-block at $500/mo
    },
)

# Same OpenAI payload, but 70% cheaper via Smart Router,
# every call logged + attributed + cost-capped.
resp = client.chat.completions.create(
    model="gpt-5.2",
    messages=[{"role": "user", "content": "Summarise this PDF in 3 bullets..."}],
)

70% cheaper
Smart Router auto-picks the cheapest quality-matched provider per intent. Same prompt, fraction of the cost.
Budget cutoffs
Set X-Cartie-Budget. We block at the threshold, alert Slack, and route fallback intents to a cheaper tier.
Per-team P&L
X-Cartie-Tenant + X-Cartie-Project headers attribute every token to a cost center. Stripe-billable.
Audit trail
Immutable per-trace ledger. SOC 2 / ISO 27001-ready CSV export. No more "mystery bill".
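To make the per-team P&L and CSV export concrete, here is a minimal sketch of rolling ledger rows up by tenant and project and writing them out as CSV. The row fields, the sample values, and the function names `cost_by_cost_center` and `export_csv` are assumptions for illustration, not Cartie's actual schema or API.

```python
import csv
import io
from collections import defaultdict

# Hypothetical ledger rows: field names and values are illustrative only.
LEDGER = [
    {"trace_id": "t-001", "tenant": "marketing-usa", "project": "social-media-bot",
     "model": "gpt-5.2", "tokens": 1200, "cost_usd": 0.018},
    {"trace_id": "t-002", "tenant": "marketing-usa", "project": "social-media-bot",
     "model": "gpt-5.2", "tokens": 800, "cost_usd": 0.012},
    {"trace_id": "t-003", "tenant": "support-eu", "project": "helpdesk",
     "model": "gpt-5.2", "tokens": 2500, "cost_usd": 0.040},
]


def cost_by_cost_center(rows):
    """Sum cost per (tenant, project) pair, i.e. per cost center."""
    totals = defaultdict(float)
    for row in rows:
        totals[(row["tenant"], row["project"])] += row["cost_usd"]
    return dict(totals)


def export_csv(rows):
    """Serialise ledger rows to CSV text, one line per trace plus a header."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    for (tenant, project), total in cost_by_cost_center(LEDGER).items():
        print(f"{tenant}/{project}: ${total:.3f}")
    print(export_csv(LEDGER))
```

Keying the totals on the (tenant, project) pair mirrors the two attribution headers, so each request lands in exactly one cost center.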
- 01 Forwards to OpenAI (or Anthropic, Gemini, Mistral, Meta): the payload is sent verbatim. Your provider key is never stored.
- 02 Smart Router checks intent class: if a cheaper model would match quality, we re-route.
- 03 Calculates cost in real time, from the latest pricing table per provider.
- 04 Attributes to your tenant + project, per the X-Cartie-* headers.
- 05 Logs to immutable ledger, visible in the /admin/llm-proxy live console.
- 06 Enforces the budget envelope: soft warning at 80%, hard block at 100%.
- 07 Returns the response unchanged: your app sees the same OpenAI/Anthropic shape.
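Steps 02 and 03 above can be sketched roughly as follows. This is an illustration under stated assumptions, not the Smart Router's actual logic: the `ModelOption` type, the quality scores, the model names, and every price below are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    name: str
    quality: float            # hypothetical 0..1 quality score for one intent class
    usd_per_1k_input: float   # hypothetical price per 1,000 input tokens
    usd_per_1k_output: float  # hypothetical price per 1,000 output tokens


def pick_model(requested, candidates):
    """Return the cheapest option whose quality is at least the requested model's.

    The requested model is always eligible, so the result is never worse
    than what the caller asked for under this scoring.
    """
    eligible = [m for m in candidates if m.quality >= requested.quality]
    eligible.append(requested)
    return min(eligible, key=lambda m: m.usd_per_1k_input + m.usd_per_1k_output)


def request_cost(model, input_tokens, output_tokens):
    """Cost in USD of one request, from the per-1k-token prices."""
    return (
        (input_tokens / 1000) * model.usd_per_1k_input
        + (output_tokens / 1000) * model.usd_per_1k_output
    )


# Placeholder catalogue: names and numbers are not real provider pricing.
REQUESTED = ModelOption("model-requested", quality=0.90,
                        usd_per_1k_input=0.010, usd_per_1k_output=0.030)
CANDIDATES = [
    ModelOption("model-cheap-match", quality=0.92,
                usd_per_1k_input=0.002, usd_per_1k_output=0.006),
    ModelOption("model-too-weak", quality=0.50,
                usd_per_1k_input=0.0005, usd_per_1k_output=0.001),
]

if __name__ == "__main__":
    chosen = pick_model(REQUESTED, CANDIDATES)
    print(chosen.name, request_cost(chosen, input_tokens=1000, output_tokens=500))
```

The weakest model is the cheapest here but is filtered out by the quality floor, which is the point of matching on quality before price.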
Live cost console
Every request streams into the /admin/llm-proxy console — model, tenant, tokens, cost, latency. Like tail -f for your AI bill.
Budget Negotiator (Sub-System Z)
Pre-sign budget envelopes with Finance once. After that every PR + every request stays inside the envelope automatically — soft warning at 80%, hard block at 100%.
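The 80% / 100% thresholds amount to a small check on spend against the envelope. Here is a minimal sketch, assuming spend is tracked in USD per envelope and that the hard block applies once spend reaches the budget. The names `BudgetStatus` and `check_budget` are hypothetical.

```python
from enum import Enum


class BudgetStatus(Enum):
    OK = "ok"
    SOFT_WARNING = "soft_warning"
    HARD_BLOCK = "hard_block"


def check_budget(spent_usd, budget_usd, warn_ratio=0.80):
    """Classify current spend against a budget envelope.

    Assumption: a hard block applies once spend reaches 100% of the budget,
    and a soft warning once it reaches warn_ratio (80% by default).
    """
    if budget_usd <= 0:
        raise ValueError("budget_usd must be positive")
    if spent_usd >= budget_usd:
        return BudgetStatus.HARD_BLOCK
    if spent_usd >= warn_ratio * budget_usd:
        return BudgetStatus.SOFT_WARNING
    return BudgetStatus.OK


if __name__ == "__main__":
    # With the $500/mo envelope from the X-Cartie-Budget example header:
    for spent in (100.0, 450.0, 500.0):
        print(spent, check_budget(spent, 500.0).value)
```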
How the envelope works
Stop paying for the "mystery bill".
Get a proxy key in 60 seconds. No credit card. Same SDK. Real FinOps from your first request.