Last verified 2026-02-13

LLM Token Pricing Index

Every major model, side by side. Free forever. Powered by CARTIE AI.

| Provider | Model | Model ID | $/1M in | $/1M out | Context | Tier | Notes |
| --- | --- | --- | --- | --- | --- | --- | --- |
| AWS | Amazon Nova Lite | amazon-nova-lite | $0.060 | $0.240 | 300k | Cheap | Bedrock. One of the cheapest multimodal models. |
| Google | Gemini 2.0 Flash | gemini-2.0-flash | $0.075 | $0.300 | 1M | Cheap | Cheapest 1M-context option on the market. |
| Google | Gemini 3 Flash | gemini-3-flash | $0.100 | $0.400 | 1M | Cheap | Insanely cheap for 1M context. |
| OpenAI | GPT-4o mini | gpt-4o-mini | $0.150 | $0.600 | 128k | Cheap | Best price/quality for high-volume. |
| Cohere | Command R | command-r | $0.150 | $0.600 | 128k | Cheap | Cheap RAG-tuned model. |
| Mistral | Mistral Small 3 | mistral-small-3 | $0.200 | $0.600 | 32k | Cheap | Apache 2.0 license, self-hostable. |
| DeepSeek | DeepSeek V3 | deepseek-v3 | $0.270 | $1.10 | 64k | Balanced | Strong code/math. Cache-hit pricing as low as $0.07. |
| OpenAI | GPT-5.2 mini | gpt-5.2-mini | $0.300 | $2.40 | 400k | Balanced | Cheapest GPT-5 family. |
| xAI | Grok 3 mini | grok-3-mini | $0.300 | $0.500 | 131k | Cheap | Fastest in xAI family. |
| DeepSeek | DeepSeek R1 | deepseek-r1 | $0.550 | $2.19 | 64k | Reasoning | Open reasoning model. ~30× cheaper than o1. |
| Meta | Llama 3.3 70B | llama-3.3-70b | $0.590 | $0.790 | 128k | Balanced | Open weights. Pricing shown is Together AI; self-host = free. |
| Anthropic | Claude Haiku 4.5 | claude-haiku-4-5 | $0.800 | $4.00 | 200k | Cheap | Anthropic's fastest. Code completion sweet spot. |
| AWS | Amazon Nova Pro | amazon-nova-pro | $0.800 | $3.20 | 300k | Balanced | Bedrock-native. Best for AWS-shop integration. |
| OpenAI | GPT-5.2 | gpt-5.2 | $1.75 | $14.00 | 400k | Frontier | Highest reasoning tier. Cache-friendly. |
| Google | Gemini 3 Pro | gemini-3-pro | $2.00 | $12.00 | 1M | Frontier | Massive 1M context window. |
| Mistral | Mistral Large 3 | mistral-large-3 | $2.00 | $6.00 | 128k | Balanced | Strong European option. EU data residency. |
| OpenAI | GPT-4o | gpt-4o | $2.50 | $10.00 | 128k | Balanced | Workhorse multimodal model. |
| Cohere | Command R+ | command-r-plus | $2.50 | $10.00 | 128k | Balanced | Optimized for RAG and tool use. |
| OpenAI | o1-mini | o1-mini | $3.00 | $12.00 | 128k | Reasoning | Reasoning at 5× cheaper than o1. |
| Anthropic | Claude Sonnet 4.5 | claude-sonnet-4-5 | $3.00 | $15.00 | 200k | Balanced | Excellent for code + writing. Strong cache pricing. |
| Meta | Llama 3.3 405B | llama-3.3-405b | $3.50 | $3.50 | 128k | Frontier | Open weights. Pricing via Fireworks/Together. |
| xAI | Grok 3 | grok-3 | $5.00 | $15.00 | 131k | Frontier | X-integrated. Real-time web context. |
| OpenAI | o1 (reasoning) | o1 | $15.00 | $60.00 | 200k | Reasoning | Chain-of-thought reasoning, expensive per output token. |
| Anthropic | Claude Opus 4.5 | claude-opus-4-5 | $15.00 | $75.00 | 200k | Frontier | Anthropic's largest. Best for complex reasoning. |

Cheapest per tier

The cheapest model in each quality tier, ranked by blended input/output price.

Frontier: Llama 3.3 405B (Meta) · $3.50 / 1M in · $3.50 / 1M out
Reasoning: DeepSeek R1 (DeepSeek) · $0.550 / 1M in · $2.19 / 1M out
Balanced: DeepSeek V3 (DeepSeek) · $0.270 / 1M in · $1.10 / 1M out
Cheap: Amazon Nova Lite (AWS) · $0.060 / 1M in · $0.240 / 1M out
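
The page does not say how the input and output prices are blended. As a minimal sketch, assuming an equal-weight average of the two rates (an assumption, not the site's stated method), the four picks above can be reproduced from the rows of the pricing table. The function names and the `in_weight` parameter are hypothetical.

```python
# Rows copied from the pricing table: (model, tier, $/1M in, $/1M out).
ROWS = [
    ("Amazon Nova Lite", "Cheap", 0.060, 0.240),
    ("Gemini 2.0 Flash", "Cheap", 0.075, 0.300),
    ("Gemini 3 Flash", "Cheap", 0.100, 0.400),
    ("GPT-4o mini", "Cheap", 0.150, 0.600),
    ("Command R", "Cheap", 0.150, 0.600),
    ("Mistral Small 3", "Cheap", 0.200, 0.600),
    ("DeepSeek V3", "Balanced", 0.270, 1.10),
    ("GPT-5.2 mini", "Balanced", 0.300, 2.40),
    ("Grok 3 mini", "Cheap", 0.300, 0.500),
    ("DeepSeek R1", "Reasoning", 0.550, 2.19),
    ("Llama 3.3 70B", "Balanced", 0.590, 0.790),
    ("Claude Haiku 4.5", "Cheap", 0.800, 4.00),
    ("Amazon Nova Pro", "Balanced", 0.800, 3.20),
    ("GPT-5.2", "Frontier", 1.75, 14.00),
    ("Gemini 3 Pro", "Frontier", 2.00, 12.00),
    ("Mistral Large 3", "Balanced", 2.00, 6.00),
    ("GPT-4o", "Balanced", 2.50, 10.00),
    ("Command R+", "Balanced", 2.50, 10.00),
    ("o1-mini", "Reasoning", 3.00, 12.00),
    ("Claude Sonnet 4.5", "Balanced", 3.00, 15.00),
    ("Llama 3.3 405B", "Frontier", 3.50, 3.50),
    ("Grok 3", "Frontier", 5.00, 15.00),
    ("o1", "Reasoning", 15.00, 60.00),
    ("Claude Opus 4.5", "Frontier", 15.00, 75.00),
]


def blended(price_in, price_out, in_weight=0.5):
    """Assumed blend: weighted average of input and output $/1M tokens."""
    return in_weight * price_in + (1 - in_weight) * price_out


def cheapest_per_tier(rows, in_weight=0.5):
    """Return {tier: model} for the lowest blended price in each tier."""
    best = {}
    for model, tier, price_in, price_out in rows:
        score = blended(price_in, price_out, in_weight)
        if tier not in best or score < best[tier][1]:
            best[tier] = (model, score)
    return {tier: model for tier, (model, _) in best.items()}


print(cheapest_per_tier(ROWS))
```

The weighting matters: under an output-heavy blend such as `in_weight=0.25`, the Balanced pick changes from DeepSeek V3 to Llama 3.3 70B, so it is worth setting the weight to match your own input/output ratio.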

Cost Calculator

Estimate your monthly bill across every major model.

Spread

Cheapest: Amazon Nova Lite at $14.40/mo · Most expensive: Claude Opus 4.5 at $4.2K/mo · Spread: about $4.2K/mo.

1. AWS · Amazon Nova Lite · $14.40
2. Google · Gemini 2.0 Flash · $18.00
3. Google · Gemini 3 Flash · $24.00
4. OpenAI · GPT-4o mini · $36.00
5. Cohere · Command R · $36.00
6. Mistral · Mistral Small 3 · $40.00
7. xAI · Grok 3 mini · $44.00
8. DeepSeek · DeepSeek V3 · $65.60

Math = (requests × input_tokens / 1M × $input) + (requests × output_tokens / 1M × $output). Excludes cache discounts, fine-tuning, multimodal, and committed-use rates.
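
The formula above can be sketched in a few lines of Python. The workload below (40,000 requests a month at 2,000 input and 1,000 output tokens each) is a hypothetical split. Only its totals, 80M input and 40M output tokens, were chosen because they are consistent with the ranked figures listed above. The function name `monthly_cost` is likewise illustrative.

```python
def monthly_cost(requests, input_tokens, output_tokens, price_in, price_out):
    """Monthly USD cost; token counts are per request, prices are per 1M tokens."""
    input_cost = requests * input_tokens / 1_000_000 * price_in
    output_cost = requests * output_tokens / 1_000_000 * price_out
    return input_cost + output_cost


# Hypothetical workload: 80M input and 40M output tokens per month in total.
REQUESTS, IN_TOKENS, OUT_TOKENS = 40_000, 2_000, 1_000

# ($/1M in, $/1M out), copied from the pricing table.
PRICES = {
    "Amazon Nova Lite": (0.060, 0.240),
    "GPT-4o mini": (0.150, 0.600),
    "DeepSeek V3": (0.270, 1.10),
    "Claude Opus 4.5": (15.00, 75.00),
}

for model, (price_in, price_out) in PRICES.items():
    cost = monthly_cost(REQUESTS, IN_TOKENS, OUT_TOKENS, price_in, price_out)
    print(f"{model}: ${cost:,.2f}/mo")
```

With these inputs the four models come to $14.40, $36.00, $65.60, and $4,200.00 per month, matching the ranking and the spread line. As the note above says, cache discounts and other adjustments are not modeled.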

Hidden costs nobody talks about

Provider pricing pages don't advertise these, but they can wreck your monthly bill.

Cache pricing diff

Anthropic discounts cache READS by 90% but charges a 25% premium on cache WRITES. OpenAI also discounts cached input (the rate varies by model) and adds no write surcharge. Get the hit rate wrong and you can pay 3× more.
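
To see why the read/write split matters, here is a rough sketch of the effective input price as a function of cache hit rate. It assumes the multipliers quoted above (reads at 0.10× base, writes at 1.25× base) and, as a simplification, that every cache miss pays the write premium. Real billing also depends on cache lifetime and on how much of the prompt is cacheable. Both function names are hypothetical.

```python
def effective_input_price(base_price, hit_rate, read_mult=0.10, write_mult=1.25):
    """Average $/1M tokens for the cached prefix, given a hit rate in [0, 1]."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return base_price * (hit_rate * read_mult + (1.0 - hit_rate) * write_mult)


def break_even_hit_rate(read_mult=0.10, write_mult=1.25):
    """Hit rate at which caching costs the same as sending the prompt uncached."""
    return (write_mult - 1.0) / (write_mult - read_mult)


BASE = 3.00  # Claude Sonnet 4.5 input price from the table, $/1M tokens

for rate in (0.0, 0.5, 0.9):
    price = effective_input_price(BASE, rate)
    print(f"hit rate {rate:.0%}: ${price:.3f} per 1M input tokens")
print(f"break-even hit rate: {break_even_hit_rate():.1%}")
```

Under these assumptions, a prompt that is cached but hit less than roughly 22% of the time costs more than not caching it at all, while a 90% hit rate brings the $3.00 rate down to about $0.65.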

Fine-tuning surcharge

Fine-tuned models can cost 4-6× the base model's per-token rate. A smaller base model with well-crafted prompts is sometimes cheaper.
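
A back-of-the-envelope way to test that trade-off: assume the fine-tuned model is billed at a flat multiple of the base rate, and ask how many extra prompt tokens (instructions, few-shot examples) the base model could spend per request before it costs as much. The multiplier, the token counts, and the premise that both setups reach comparable quality are illustrative assumptions, and the function names are hypothetical.

```python
def request_cost(input_tokens, output_tokens, price_in, price_out, multiplier=1.0):
    """USD cost of one request; prices are per 1M tokens."""
    return multiplier * (input_tokens * price_in + output_tokens * price_out) / 1_000_000


def break_even_extra_prompt_tokens(input_tokens, output_tokens,
                                   price_in, price_out, multiplier):
    """Extra prompt tokens the base model can add before matching the fine-tune's cost."""
    fine_tuned = request_cost(input_tokens, output_tokens, price_in, price_out, multiplier)
    base = request_cost(input_tokens, output_tokens, price_in, price_out)
    return (fine_tuned - base) * 1_000_000 / price_in


# GPT-4o mini rates from the table; a 4x surcharge and a 500-in / 200-out
# request are assumed for illustration.
extra = break_even_extra_prompt_tokens(500, 200, 0.150, 0.600, multiplier=4)
print(f"break-even: about {extra:,.0f} extra prompt tokens per request")
```

In this example the base model can carry about 3,900 additional prompt tokens per request before the fine-tuned variant becomes the cheaper option.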

Image/audio token math

A single image can consume 1,000-12,000 tokens depending on the detail tier. Audio is often billed per second rather than per token.

Context-window amplifier

Filling a 1M-token context on Gemini 3 Pro costs $2.00 in input tokens per request, before any output is generated. Use the smallest window you can.
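
The $2 figure matches Gemini 3 Pro's $2.00 per 1M input tokens in the table. A short sketch of how input cost per request scales with how much of the window is filled (the fill levels and the function name are illustrative):

```python
def input_cost_per_request(context_tokens, price_in_per_1m):
    """USD spent on input tokens for one request of the given context size."""
    return context_tokens / 1_000_000 * price_in_per_1m


GEMINI_3_PRO_IN = 2.00  # $/1M input tokens, from the table

for context_tokens in (8_000, 128_000, 1_000_000):
    cost = input_cost_per_request(context_tokens, GEMINI_3_PRO_IN)
    print(f"{context_tokens:>9,} tokens: ${cost:.3f} per request")
```

At Gemini 2.0 Flash's $0.075 rate the same full-window request costs $0.075, so model choice matters as much as window size.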

Free, citable JSON API

Embed this in your tools. No key required. Updated weekly.

GET https://demo-cartie.emergent.host/api/v1/prices
{
  "powered_by": "CARTIE AI",
  "docs": "https://cartieai.com/tokens",
  "last_updated": "2026-02-13",
  "currency": "USD",
  "unit": "per_1m_tokens",
  "models": [
    { "provider": "OpenAI", "model": "gpt-5.2", "input": 1.75, "output": 14.00, "context_k": 400, "tier": "frontier", ... },
    ...
  ]
}

Please cache responses for at least 1 hour, and cite CARTIE AI if you build with this data.
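
One way to honor the one-hour caching request is a small in-process cache around the endpoint. This is a sketch using only the Python standard library. It relies on the `models`, `model`, `input`, and `output` fields shown in the sample response and assumes nothing about the elided fields. The helper names `get_prices` and `price_for` are hypothetical.

```python
import json
import time
import urllib.request

PRICES_URL = "https://demo-cartie.emergent.host/api/v1/prices"
CACHE_TTL_SECONDS = 3600  # the page asks clients to cache for at least 1 hour

_cache = {"fetched_at": None, "data": None}


def _fetch_json(url):
    """Download and decode the JSON document at url."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


def get_prices(fetch=_fetch_json, now=time.time):
    """Return the price payload, calling the API at most once per TTL window."""
    fetched_at = _cache["fetched_at"]
    if fetched_at is None or now() - fetched_at >= CACHE_TTL_SECONDS:
        _cache["data"] = fetch(PRICES_URL)
        _cache["fetched_at"] = now()
    return _cache["data"]


def price_for(model_id, payload):
    """Return (input, output) $/1M for a model id, or None if it is not listed."""
    for entry in payload.get("models", []):
        if entry.get("model") == model_id:
            return entry["input"], entry["output"]
    return None
```

Calling `price_for("gpt-5.2", get_prices())` would return the input and output rates for that model if it is present in the live response. The `fetch` and `now` parameters exist so the cache can be exercised without network access.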

Want this math on your actual usage?

CARTIE plugs into your LLM proxy and shows the same numbers — line by line, route by route, with model-swap recommendations.
