Free Tool

LLM Model Comparison Matrix

Compare 49+ LLMs side by side. Filter by use case, budget, or provider to find the best model for your project, then sort the results by price, context window, or provider.

Related Free Tools

Found the right model? Estimate your costs or generate the API code.

💰 Cost Calculator
Estimate monthly AI spend and find cheaper alternatives.
🔧 API Request Builder
Generate ready-to-copy code in cURL, Python, Node.js, and PHP.

Track costs across all these models automatically

Tokonomics works with every model listed here. One URL change and every call is metered with budget alerts.

Frequently Asked Questions

How do I choose the right LLM?

Start with your use case. For customer-facing chatbots, prioritize quality (GPT-4o, Claude Sonnet). For high-volume tasks like classification or extraction, use a low-cost model (GPT-4o-mini, DeepSeek Chat). For coding, Claude Sonnet and GPT-4o rank near the top of common benchmarks. Use the filters above to narrow the list by budget and use case.
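The filter-then-sort approach can be sketched in a few lines of Python. This is only an illustration: the `Model` fields, the `filter_models` helper, and every name and price in `CATALOG` are hypothetical placeholders, not real models or real pricing data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    provider: str
    input_price_per_m: float   # USD per million input tokens
    output_price_per_m: float  # USD per million output tokens
    context_window: int        # tokens
    best_for: tuple[str, ...]


# Placeholder catalog: names and prices are made up for illustration.
CATALOG = [
    Model("premium-model", "provider-a", 5.00, 15.00, 128_000, ("chatbot", "coding")),
    Model("budget-model", "provider-b", 0.25, 0.75, 128_000, ("classification", "extraction")),
    Model("small-context-model", "provider-a", 0.50, 1.50, 32_000, ("classification",)),
]


def filter_models(catalog, use_case=None, max_input_price=None, provider=None):
    """Keep models that satisfy every filter that was set, cheapest input price first."""
    matches = [
        m for m in catalog
        if (use_case is None or use_case in m.best_for)
        and (max_input_price is None or m.input_price_per_m <= max_input_price)
        and (provider is None or m.provider == provider)
    ]
    return sorted(matches, key=lambda m: m.input_price_per_m)


cheap_classifiers = filter_models(CATALOG, use_case="classification", max_input_price=1.00)
print([m.name for m in cheap_classifiers])
```

Leaving a filter as `None` means "no constraint", so the same helper covers filtering by use case only, by budget only, or by any combination.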

What does "context window" mean?

The context window is the maximum number of tokens a model can process in a single request (input and output combined). At roughly 0.75 English words per token, a 128K context window holds about 96,000 words. Larger windows let you process long documents, but every token you send is billed, so filling the window costs more per call. For RAG use cases, 128K or more is ideal.
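The arithmetic behind that estimate, and a quick check that a request fits in a window, might look like the sketch below. It assumes the common rule of thumb of about 0.75 English words per token. Real token counts depend on the model's tokenizer and the text, and both function names are hypothetical.

```python
# Assumption: about 0.75 English words per token (a rule of thumb, not an exact figure).
WORDS_PER_TOKEN = 0.75


def tokens_to_words(tokens: int) -> int:
    """Rough word capacity for a given number of tokens."""
    return round(tokens * WORDS_PER_TOKEN)


def fits_in_context(input_tokens: int, max_output_tokens: int, context_window: int) -> bool:
    """Input and output share the same window, so their sum must fit."""
    return input_tokens + max_output_tokens <= context_window


print(tokens_to_words(128_000))                  # 96000
print(fits_in_context(120_000, 4_000, 128_000))  # True
print(fits_in_context(126_000, 4_000, 128_000))  # False
```

The second check fails because the reserved output tokens count against the window too, which is easy to overlook when a prompt is already close to the limit.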

Why are some models so much cheaper than others?

Smaller, cheaper-to-run models (GPT-4o-mini, Haiku, DeepSeek Chat) trade some reasoning ability for 10-50x lower cost. For roughly 80% of production use cases, such as classification, extraction, and simple Q&A, they perform on par with premium models. See our cheapest LLM guide for benchmarks.
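To see how per-million-token pricing turns into a cost gap, here is a minimal sketch. The prices are placeholders chosen for round numbers, not quotes for any real model, and `request_cost` is a hypothetical helper.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost in USD of one request, given prices in USD per million tokens."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


# Placeholder prices (USD per million tokens), for illustration only.
premium = request_cost(1_000, 200, input_price_per_m=5.00, output_price_per_m=15.00)
budget = request_cost(1_000, 200, input_price_per_m=0.25, output_price_per_m=0.75)

print(f"premium: ${premium:.4f} per request")  # premium: $0.0080 per request
print(f"budget:  ${budget:.4f} per request")   # budget:  $0.0004 per request
print(f"ratio:   {premium / budget:.0f}x")     # ratio:   20x
```

A fraction of a cent per request looks negligible in isolation. At a million requests a month, these placeholder prices work out to about $8,000 versus $400.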

How often are prices updated?

We sync pricing from official provider documentation weekly. The latest update covers all major providers: OpenAI, Anthropic, Google, DeepSeek, Mistral, Groq, xAI, and Cohere. Use our cost calculator for detailed monthly cost projections.