Pick any model, enter your expected token usage and daily call volume, and instantly see your projected costs — per call, daily, monthly, and yearly. Compare with cheaper alternatives below.
| Component | Tokens | Rate (per 1M) | Cost / Call |
|---|---|---|---|
| Input tokens | 1,000 | $0.00 | $0.00 |
| Output tokens | 500 | $0.00 | $0.00 |
| Total | 1,500 | | $0.00 |
Models that can do the same job for less, sorted by savings.
| Model | Provider | Cost / Month | Savings |
|---|---|---|---|
Calculate costs here, then count tokens or compress prompts to save more.
This calculator uses official pricing from each provider as of June 2026. Actual costs may vary based on prompt caching discounts (up to 90% off cached input tokens with Anthropic), batch API pricing (50% off with OpenAI), and volume commitments. The estimates assume standard on-demand pricing with no discounts.
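The arithmetic behind these estimates can be sketched in a few lines. This is a minimal illustration, not the calculator's actual code: the `Rates` class, the function names, the 30-day month, the 365-day year, and the example prices are all assumptions made for the sketch. The discount parameters are a simplification, since real caching discounts apply only to the cached portion of the input.

```python
from dataclasses import dataclass

TOKENS_PER_MILLION = 1_000_000


@dataclass(frozen=True)
class Rates:
    """Hypothetical per-1M-token prices in USD; substitute published rates."""
    input_per_million: float
    output_per_million: float


def cost_per_call(input_tokens, output_tokens, rates,
                  input_discount=0.0, output_discount=0.0):
    """Cost of one call in USD. Discounts are fractions in [0, 1]."""
    for d in (input_discount, output_discount):
        if not 0.0 <= d <= 1.0:
            raise ValueError("discount must be between 0 and 1")
    input_cost = (input_tokens / TOKENS_PER_MILLION
                  * rates.input_per_million * (1 - input_discount))
    output_cost = (output_tokens / TOKENS_PER_MILLION
                   * rates.output_per_million * (1 - output_discount))
    return input_cost + output_cost


def project(per_call, calls_per_day, days_per_month=30, days_per_year=365):
    """Scale a per-call cost to daily, monthly, and yearly totals."""
    daily = per_call * calls_per_day
    return {
        "per_call": per_call,
        "daily": daily,
        "monthly": daily * days_per_month,  # assumes a 30-day month
        "yearly": daily * days_per_year,
    }


# Placeholder prices, not any provider's real rates.
example_rates = Rates(input_per_million=1.00, output_per_million=4.00)
on_demand = cost_per_call(1_000, 500, example_rates)
batched = cost_per_call(1_000, 500, example_rates,
                        input_discount=0.5, output_discount=0.5)
for label, value in project(on_demand, calls_per_day=10_000).items():
    print(f"{label}: ${value:,.4f}")
print(f"per_call with a 50% batch discount: ${batched:,.4f}")
```

Passing a discount of `0.5` to both parameters approximates batch pricing; leaving both at `0.0` reproduces the standard on-demand estimate.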
A rough rule: 1 token ≈ 4 English characters or ¾ of a word, so a 500-word prompt is roughly 667 tokens (500 ÷ 0.75). For precise counts, use our token counter tool. Output tokens depend on your use case: classification might produce 10 tokens, while content generation could produce 2,000+.
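The rule of thumb translates directly into a quick estimator. This is a rough sketch only: the function names are made up for illustration, and real tokenizers vary by model and language, so treat the result as a ballpark figure rather than a billing-accurate count.

```python
import math

CHARS_PER_TOKEN = 4      # rule of thumb for English text
WORDS_PER_TOKEN = 0.75   # 1 token is about three quarters of a word


def estimate_tokens_from_words(word_count):
    """Approximate token count from a word count, rounded up."""
    return math.ceil(word_count / WORDS_PER_TOKEN)


def estimate_tokens_from_text(text):
    """Approximate token count from raw text length, rounded up."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


print(estimate_tokens_from_words(300))
print(estimate_tokens_from_text("Classify the sentiment of this review."))
```

Both helpers round up so that the estimate errs slightly toward overstating cost.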
For text tasks (classification, summarization, Q&A), DeepSeek V4-Flash and Gemini 2.0 Flash offer the lowest per-token pricing. GPT-4o-mini and Claude Haiku 4.5 are strong mid-range options. The cheapest model depends on your quality requirements — see our cheapest LLM per use case guide.
Five proven strategies: (1) enable prompt caching for repeated instructions, (2) use smaller models for simple tasks, (3) compress prompts to reduce input tokens, (4) batch requests when latency isn't critical, (5) set hard budget caps to prevent runaway spending. Read our cost optimization guide for implementation details.
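Strategy (5) can be as simple as a counter that refuses any call once a cap would be exceeded. The sketch below is one hypothetical way to do it in application code; `BudgetGuard` and `BudgetExceeded` are illustrative names, not part of any provider SDK, and a production setup would also need to persist the total and reset it on a schedule.

```python
class BudgetExceeded(RuntimeError):
    """Raised when a call would push spending past the cap."""


class BudgetGuard:
    """Tracks estimated spend in memory and blocks calls over a hard cap."""

    def __init__(self, cap_usd):
        self.cap_usd = cap_usd
        self.spent_usd = 0.0

    def charge(self, estimated_cost_usd):
        """Record a call's estimated cost, or raise if it would exceed the cap."""
        if self.spent_usd + estimated_cost_usd > self.cap_usd:
            raise BudgetExceeded(
                f"cap ${self.cap_usd:.2f} would be exceeded "
                f"(spent ${self.spent_usd:.2f})"
            )
        self.spent_usd += estimated_cost_usd


guard = BudgetGuard(cap_usd=50.00)
guard.charge(0.003)  # call this before each API request
```

Checking before the request rather than after means a rejected call costs nothing, at the price of relying on an estimate instead of the billed amount.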