Compare 49+ LLM models side by side. Filter by use case, budget, or provider to find the best model for your project. Sort by price, context window, or provider.
| Model | Provider | Input ($/M tokens) | Output ($/M tokens) | Context | Best For |
|---|---|---|---|---|---|
Found the right model? Estimate your costs or generate the API code.
Start with your use case. For customer-facing chatbots, prioritize quality (GPT-4o, Claude Sonnet). For high-volume tasks like classification or extraction, use a low-cost model (GPT-4o-mini, DeepSeek Chat). For coding, Claude Sonnet and GPT-4o lead benchmarks. Use the filters above to narrow down by budget and use case.
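The same narrowing can be sketched in code. This is a minimal illustration that assumes each table row is available as a record with the columns shown above. The `ModelRow` type, the `shortlist` helper, and every row in `ROWS` are hypothetical, made up for the example rather than taken from the real table.

```python
from dataclasses import dataclass


@dataclass
class ModelRow:
    name: str
    provider: str
    input_price: float   # $ per million input tokens
    output_price: float  # $ per million output tokens
    context: int         # context window in tokens
    best_for: str


# Made-up rows for illustration only; not real models or prices.
ROWS = [
    ModelRow("model-a", "provider-x", 5.00, 15.00, 128_000, "chatbots"),
    ModelRow("model-b", "provider-y", 0.25, 1.00, 128_000, "classification"),
    ModelRow("model-c", "provider-x", 0.50, 1.50, 32_000, "classification"),
]


def shortlist(rows, use_case, max_input_price):
    """Keep rows matching the use case and budget, cheapest input price first."""
    matches = [
        r for r in rows
        if r.best_for == use_case and r.input_price <= max_input_price
    ]
    return sorted(matches, key=lambda r: r.input_price)


for row in shortlist(ROWS, "classification", max_input_price=1.00):
    print(row.name, row.input_price)
```

Filtering on use case first and sorting by price second mirrors the advice above: decide what the model has to do, then pick the cheapest option that does it.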
The context window is the maximum number of tokens a model can process in a single request (input + output combined). A 128K context window holds roughly 96,000 words of English text, at about 0.75 words per token. Larger windows let you process long documents, but every token you send is billed, so filling them raises the cost per call. For RAG use cases, 128K+ is ideal.
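The arithmetic behind the 96,000-word figure, and the fact that input and output share one window, can be sketched as follows. The 0.75 words-per-token ratio is a common rule of thumb for English text and an assumption here, not a tokenizer guarantee. The function names are illustrative.

```python
# Assumed rule of thumb for English; real ratios vary by tokenizer and language.
WORDS_PER_TOKEN = 0.75


def approx_words(tokens: int) -> int:
    """Rough number of English words that fit in a given number of tokens."""
    return round(tokens * WORDS_PER_TOKEN)


def fits_in_context(input_tokens: int, max_output_tokens: int, context_window: int) -> bool:
    """Input and output share the same window, so their sum must fit."""
    return input_tokens + max_output_tokens <= context_window


print(approx_words(128_000))                      # 96000
print(fits_in_context(120_000, 4_000, 128_000))   # True
print(fits_in_context(126_000, 4_000, 128_000))   # False
```

For an exact count, run the provider's own tokenizer over your text. The ratio above is only good for quick sizing.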
Smaller, cheaper models (GPT-4o-mini, Haiku, DeepSeek Chat) trade some reasoning ability for 10-50x lower cost. For most routine production tasks (classification, extraction, simple Q&A) they often perform on par with premium models, though it is worth checking on your own data. See our cheapest LLM guide for benchmarks.
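To see how a per-token price gap turns into a monthly bill, here is a rough sketch. Both price pairs and the workload numbers are placeholders chosen for the example, not quotes for any real model. Substitute the figures from the table above.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost in dollars of one request, given prices per million tokens."""
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


# Placeholder prices ($ per million tokens), NOT real quotes for any model.
premium = {"input": 5.00, "output": 15.00}
small = {"input": 0.25, "output": 1.00}

# Hypothetical classification workload: short prompt, tiny answer, high volume.
requests_per_month = 1_000_000
in_tok, out_tok = 500, 10

premium_monthly = requests_per_month * request_cost(
    in_tok, out_tok, premium["input"], premium["output"])
small_monthly = requests_per_month * request_cost(
    in_tok, out_tok, small["input"], small["output"])

print(f"premium: ${premium_monthly:,.2f}/month")
print(f"small:   ${small_monthly:,.2f}/month")
print(f"ratio:   {premium_monthly / small_monthly:.1f}x")
```

Because input and output are priced separately, the ratio depends on the shape of the workload: prompt-heavy tasks track the input price gap, while generation-heavy tasks track the output price gap.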
We sync pricing from official provider documentation weekly. The latest update covers all major providers: OpenAI, Anthropic, Google, DeepSeek, Mistral, Groq, xAI, and Cohere. Use our cost calculator for detailed monthly cost projections.