60-second quickstart

Change one URL.
Get a real FinOps system for AI.

Drop-in replacement for the OpenAI / Anthropic base URL. Same SDK, same payloads — plus per-team attribution, automatic budget cutoffs, full audit trail, and 70%-cheaper routing.

import openai

client = openai.OpenAI(
    api_key="sk-...",                        # your OpenAI key
    base_url="https://api.cartieai.com/v1",  # ← one-line change
    default_headers={
        "X-Cartie-Key":     "cartie_proxy_xxxx",       # from app.cartieai.com
        "X-Cartie-Tenant":  "marketing-usa",           # who's spending?
        "X-Cartie-Project": "social-media-bot",        # which project?
        "X-Cartie-Budget":  "500",                     # auto-block at $500/mo
    },
)

# Same OpenAI payload — but 70% cheaper via Smart Router,
# every call logged + attributed + cost-capped.
resp = client.chat.completions.create(
    model="gpt-5.2",
    messages=[{"role": "user", "content": "Summarise this PDF in 3 bullets..."}],
)

70% cheaper

Smart Router auto-picks the cheapest quality-matched provider per intent. Same prompt, fraction of the cost.
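As a rough illustration of "cheapest quality-matched provider per intent", the sketch below picks the lowest-priced model that still clears a quality bar. It is not the Smart Router's actual algorithm: the model names, quality scores, intent classes, and prices are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    name: str                  # hypothetical model identifier
    quality: float             # hypothetical quality score in [0, 1]
    usd_per_1k_tokens: float   # placeholder price, not a real provider rate


# Hypothetical minimum quality required per intent class.
MIN_QUALITY = {"summarise": 0.70, "code": 0.90}


def pick_model(intent: str, options: list[ModelOption]) -> ModelOption:
    """Return the cheapest option whose quality meets the intent's bar."""
    bar = MIN_QUALITY[intent]
    eligible = [o for o in options if o.quality >= bar]
    if not eligible:
        raise ValueError(f"no model meets quality {bar} for intent {intent!r}")
    return min(eligible, key=lambda o: o.usd_per_1k_tokens)


# Placeholder catalogue used only for this example.
CATALOGUE = [
    ModelOption("large-model", quality=0.95, usd_per_1k_tokens=0.010),
    ModelOption("medium-model", quality=0.80, usd_per_1k_tokens=0.003),
    ModelOption("small-model", quality=0.60, usd_per_1k_tokens=0.001),
]
```

With this catalogue, a "summarise" intent resolves to the mid-priced option, while "code" stays on the most capable one because nothing cheaper clears its bar.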

Budget cutoffs

Set X-Cartie-Budget. We block at the threshold, alert Slack, and route fallback intents to a cheaper tier.

Per-team P&L

X-Cartie-Tenant + X-Cartie-Project headers attribute every token to a cost center. Stripe-billable.
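If several teams share one codebase, a small helper can keep the attribution headers consistent. This is a minimal sketch: `cartie_headers` is a hypothetical helper name, and treating `X-Cartie-Budget` as optional is an assumption rather than documented behaviour. The header names themselves come from the quickstart snippet.

```python
from typing import Optional


def cartie_headers(proxy_key: str, tenant: str, project: str,
                   monthly_budget_usd: Optional[int] = None) -> dict:
    """Build the X-Cartie-* headers for one cost center (tenant + project)."""
    headers = {
        "X-Cartie-Key": proxy_key,
        "X-Cartie-Tenant": tenant,
        "X-Cartie-Project": project,
    }
    # Assumption: the budget header can be omitted when no cap is wanted.
    if monthly_budget_usd is not None:
        # HTTP header values are strings, so the amount is stringified.
        headers["X-Cartie-Budget"] = str(monthly_budget_usd)
    return headers
```

The returned dict can be passed as `default_headers` when constructing the OpenAI client, in place of the literal dict in the quickstart.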

Audit trail

Immutable per-trace ledger. SOC 2 / ISO 27001-ready CSV export. No more "mystery bill".

What happens on your first request
  1. Forwards the payload verbatim to OpenAI (or Anthropic, Gemini, Mistral, Meta). Your provider key is never stored.
  2. Smart Router checks the intent class. If a cheaper model would match quality, we re-route.
  3. Calculates cost in real time from the latest pricing table for each provider.
  4. Attributes the request to your tenant + project, per the X-Cartie-* headers.
  5. Logs to the immutable ledger, visible in the /admin/llm-proxy live console.
  6. Enforces the budget envelope: soft warning at 80%, hard block at 100%.
  7. Returns the response unchanged. Your app sees the same OpenAI/Anthropic shape.
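The cost, attribution, and logging steps above can be pictured with a short sketch. Everything here is illustrative: the pricing table holds placeholder rates (not real provider prices), `example-model` is a made-up name, and a plain in-memory list stands in for the immutable ledger.

```python
from collections import defaultdict

# Placeholder prices in USD per 1M tokens as (input, output); not real rates.
PRICING = {"example-model": (2.00, 8.00)}

ledger = []                                # append-only list stands in for the ledger
spend_by_cost_center = defaultdict(float)  # (tenant, project) -> USD spent


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Price one request from the per-model rates in PRICING."""
    in_rate, out_rate = PRICING[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


def record(headers: dict, model: str, input_tokens: int, output_tokens: int) -> float:
    """Attribute one request to its tenant + project and log it."""
    cost = request_cost(model, input_tokens, output_tokens)
    key = (headers["X-Cartie-Tenant"], headers["X-Cartie-Project"])
    ledger.append({
        "tenant": key[0],
        "project": key[1],
        "model": model,
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "cost_usd": cost,
    })
    spend_by_cost_center[key] += cost
    return cost
```

Keying spend on the (tenant, project) pair is what lets each ledger entry roll up to a single cost center.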

Live cost console

Every request streams into the /admin/llm-proxy console — model, tenant, tokens, cost, latency. Like tail -f for your AI bill.


Budget Negotiator (Sub-System Z)

Pre-sign budget envelopes with Finance once. After that every PR + every request stays inside the envelope automatically — soft warning at 80%, hard block at 100%.
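The 80% / 100% thresholds can be expressed as a simple check. This is a sketch under stated assumptions, not the Budget Negotiator's implementation: `envelope_status` is a hypothetical function name, and treating both thresholds as inclusive is an assumption.

```python
def envelope_status(spent_usd: float, budget_usd: float) -> str:
    """Classify spend against a budget envelope: 'ok', 'warn', or 'block'."""
    if budget_usd <= 0:
        raise ValueError("budget must be positive")
    used = spent_usd / budget_usd
    if used >= 1.0:   # hard block at 100% of the envelope
        return "block"
    if used >= 0.8:   # soft warning at 80%
        return "warn"
    return "ok"
```

For the $500/mo budget in the quickstart, this would start warning at $400 of spend and block at $500.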

Free tier · 100k tokens/mo

Stop paying for the "mystery bill".

Get a proxy key in 60 seconds. No credit card. Same SDK. Real FinOps from your first request.
