AI Model Comparison Table
| Model | Provider | Context | Max Output | Input $/1M | Output $/1M | MMLU | HumanEval | Released |
|---|---|---|---|---|---|---|---|---|
| GPT-5.4 | OpenAI | 1.0M | 128K | $2.50 | $15.00 | — | — | 2026-03 |
| GPT-5.4 Pro | OpenAI | 1.0M | 128K | $30.00 | $180.00 | — | — | 2026-03 |
| Gemini 3.1 Flash-Lite | Google | 1.0M | 66K | $0.25 | $1.50 | — | — | 2026-03 |
| GPT-5.3-Codex | OpenAI | 400K | 128K | $1.75 | $14.00 | — | — | 2026-02 |
| Claude Opus 4.6 | Anthropic | 200K | 128K | $5.00 | $25.00 | — | — | 2026-02 |
| Gemini 3.1 Pro | Google | 1.0M | 66K | $2.00 | $12.00 | — | — | 2026-02 |
| Claude Sonnet 4.6 | Anthropic | 200K | 64K | $3.00 | $15.00 | — | — | 2026-01 |
| Gemini 3 Flash | Google | 1.0M | 66K | $0.50 | $3.00 | — | — | 2026-01 |
| Grok 4.1 Fast | xAI | 2.0M | 33K | $0.20 | $0.50 | — | — | 2026-01 |
| DeepSeek Chat (V3.2) | DeepSeek | 128K | 8K | $0.28 | $0.42 | — | — | 2025-12 |
| DeepSeek Reasoner (V3.2) | DeepSeek | 128K | 64K | $0.28 | $0.42 | — | — | 2025-12 |
| Mistral Large 3 | Mistral | 262K | 33K | $0.50 | $1.50 | — | — | 2025-12 |
| Claude Haiku 4.5 | Anthropic | 200K | 64K | $1.00 | $5.00 | — | 88.1 | 2025-10 |
| GPT-5-mini | OpenAI | 200K | 33K | $0.25 | $2.00 | — | — | 2025-09 |
| GPT-5-nano | OpenAI | 200K | 33K | $0.05 | $0.40 | — | — | 2025-09 |
| Claude Sonnet 4.5 | Anthropic | 200K | 64K | $3.00 | $15.00 | — | 93.0 | 2025-09 |
| Gemini 2.5 Flash | Google | 1.0M | 66K | $0.30 | $2.50 | — | — | 2025-09 |
| Grok 4 | xAI | 256K | 33K | $3.00 | $15.00 | — | — | 2025-09 |
| Grok 4 Fast | xAI | 2.0M | 33K | $0.20 | $0.50 | — | — | 2025-09 |
| Mistral Medium 3.1 | Mistral | 128K | 33K | $0.40 | $2.00 | — | — | 2025-08 |
| o3 | OpenAI | 200K | 100K | $2.00 | $8.00 | — | — | 2025-06 |
| Mistral Small 3.2 | Mistral | 128K | 8K | $0.10 | $0.30 | — | — | 2025-06 |
| GPT-4.1 | OpenAI | 1.0M | 33K | $2.00 | $8.00 | — | — | 2025-04 |
| GPT-4.1-mini | OpenAI | 1.0M | 33K | $0.40 | $1.60 | — | — | 2025-04 |
| GPT-4.1-nano | OpenAI | 1.0M | 33K | $0.10 | $0.40 | — | — | 2025-04 |
| Llama 4 Maverick (Groq) | Meta/Groq | 131K | 8K | $0.20 | $0.60 | — | — | 2025-04 |
| Llama 4 Scout (Groq) | Meta/Groq | 131K | 8K | $0.11 | $0.34 | — | — | 2025-04 |
| Gemini 2.5 Pro | Google | 1.0M | 66K | $1.25 | $10.00 | — | — | 2025-03 |
What This Tool Does
AI Model Comparison Table is built for deterministic developer and agent workflows.
Compare 35+ AI models side by side: pricing, context windows, and specs across OpenAI GPT, Claude, Gemini, Grok, DeepSeek, Llama, and more.
See How to Use for execution steps and the FAQ for constraints, policies, and edge cases.
This tool is provided as-is for convenience. Output should be verified before use in any production or critical context.
Agent Invocation
Browser workflow
Runs instantly in the browser with private local processing and copy/export-ready output.
For automation planning, fetch the canonical contract at /api/tool/ai-model-comparison.json.
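The contract's response schema is not documented on this page; as a minimal sketch, a client might validate the fetched JSON before automating against it. The field names `name`, `endpoint`, and `models` below are assumptions, not a published schema:

```python
import json

def parse_tool_contract(raw: str) -> dict:
    """Parse the tool-contract JSON and check for the fields a client
    would need. The expected keys here are illustrative assumptions."""
    contract = json.loads(raw)
    for key in ("name", "endpoint", "models"):
        if key not in contract:
            raise KeyError(f"contract missing expected field: {key}")
    return contract

# Stand-in payload (the real response shape may differ):
sample = '{"name": "ai-model-comparison", "endpoint": "/api/tool/ai-model-comparison.json", "models": []}'
contract = parse_tool_contract(sample)
print(contract["name"])  # ai-model-comparison
```

In a real workflow you would fetch the URL above and pass the response body to `parse_tool_contract` in place of the stand-in string.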
How to Use AI Model Comparison Table
1. Add models to compare: Search for models by name (GPT, Claude, Gemini, Llama) or select from the curated list. You can add up to 10 models at once.
2. View pricing details: Check input/output token costs (per 1M tokens), review pricing tiers for different volume levels, and identify free-tier models if cost is a priority.
3. Compare specs side by side: Filter by context window size, max output tokens, vision capabilities, and training-data cutoff to find the model that fits your use case.
4. Check benchmarks and performance: Review MMLU, GSM8K, and coding benchmark scores to see which model performs best on your task type (math, reasoning, code generation).
5. Export and save your comparison: Download the comparison as a CSV or PDF to share with your team or reference later when making model decisions.
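The CSV export in step 5 can also be reproduced programmatically. A minimal sketch using the standard library, with a few rows copied from the tables above as plain dicts (the column names are illustrative, not the tool's export schema):

```python
import csv
import io

# Sample rows taken from the comparison table above
rows = [
    {"model": "GPT-5.4", "provider": "OpenAI", "input_per_1m": 2.50, "output_per_1m": 15.00},
    {"model": "Claude Opus 4.6", "provider": "Anthropic", "input_per_1m": 5.00, "output_per_1m": 25.00},
    {"model": "Gemini 3.1 Pro", "provider": "Google", "input_per_1m": 2.00, "output_per_1m": 12.00},
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
writer.writeheader()
writer.writerows(rows)
csv_text = buf.getvalue()
print(csv_text)
```

Writing to an in-memory buffer keeps the example self-contained; swap in `open("comparison.csv", "w", newline="")` to save a file.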
Frequently Asked Questions
- What is AI Model Comparison?
- How do I use AI Model Comparison?
- Is AI Model Comparison free?
- Does AI Model Comparison store or send my data?
- How often is the model data updated?
AI Model Specifications — April 12, 2026
28 current models and 13 legacy models compared. Context windows, output limits, pricing, and release dates.
Current Models
| Model | Provider | Context | Max Output | Input/1M | Output/1M | Released |
|---|---|---|---|---|---|---|
| GPT-5.4 | OpenAI | 1.0M | 128K | $2.50 | $15.00 | 2026-03 |
| GPT-5.4 Pro | OpenAI | 1.0M | 128K | $30.00 | $180.00 | 2026-03 |
| GPT-5.3-Codex | OpenAI | 400K | 128K | $1.75 | $14.00 | 2026-02 |
| GPT-5-mini | OpenAI | 200K | 33K | $0.25 | $2.00 | 2025-09 |
| GPT-5-nano | OpenAI | 200K | 33K | $0.05 | $0.40 | 2025-09 |
| o3 | OpenAI | 200K | 100K | $2.00 | $8.00 | 2025-06 |
| GPT-4.1 | OpenAI | 1.0M | 33K | $2.00 | $8.00 | 2025-04 |
| GPT-4.1-mini | OpenAI | 1.0M | 33K | $0.40 | $1.60 | 2025-04 |
| GPT-4.1-nano | OpenAI | 1.0M | 33K | $0.10 | $0.40 | 2025-04 |
| Claude Opus 4.6 | Anthropic | 200K | 128K | $5.00 | $25.00 | 2026-02 |
| Claude Sonnet 4.6 | Anthropic | 200K | 64K | $3.00 | $15.00 | 2026-01 |
| Claude Sonnet 4.5 | Anthropic | 200K | 64K | $3.00 | $15.00 | 2025-09 |
| Claude Haiku 4.5 | Anthropic | 200K | 64K | $1.00 | $5.00 | 2025-10 |
| Gemini 3.1 Pro | Google | 1.0M | 66K | $2.00 | $12.00 | 2026-02 |
| Gemini 3 Flash | Google | 1.0M | 66K | $0.50 | $3.00 | 2026-01 |
| Gemini 3.1 Flash-Lite | Google | 1.0M | 66K | $0.25 | $1.50 | 2026-03 |
| Gemini 2.5 Pro | Google | 1.0M | 66K | $1.25 | $10.00 | 2025-03 |
| Gemini 2.5 Flash | Google | 1.0M | 66K | $0.30 | $2.50 | 2025-09 |
| DeepSeek Chat (V3.2) | DeepSeek | 128K | 8K | $0.28 | $0.42 | 2025-12 |
| DeepSeek Reasoner (V3.2) | DeepSeek | 128K | 64K | $0.28 | $0.42 | 2025-12 |
| Mistral Large 3 | Mistral | 262K | 33K | $0.50 | $1.50 | 2025-12 |
| Mistral Medium 3.1 | Mistral | 128K | 33K | $0.40 | $2.00 | 2025-08 |
| Mistral Small 3.2 | Mistral | 128K | 8K | $0.10 | $0.30 | 2025-06 |
| Llama 4 Maverick (Groq) | Meta/Groq | 131K | 8K | $0.20 | $0.60 | 2025-04 |
| Llama 4 Scout (Groq) | Meta/Groq | 131K | 8K | $0.11 | $0.34 | 2025-04 |
| Grok 4 | xAI | 256K | 33K | $3.00 | $15.00 | 2025-09 |
| Grok 4.1 Fast | xAI | 2.0M | 33K | $0.20 | $0.50 | 2026-01 |
| Grok 4 Fast | xAI | 2.0M | 33K | $0.20 | $0.50 | 2025-09 |
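The per-1M-token rates above translate directly into per-request costs. A quick sketch using the table's published rates (prompt caching and batch discounts, where offered, would lower the effective cost):

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_per_1m: float, output_per_1m: float) -> float:
    """Dollar cost of one request at the listed per-1M-token rates."""
    return (input_tokens / 1_000_000 * input_per_1m
            + output_tokens / 1_000_000 * output_per_1m)

# A 10K-token prompt with a 2K-token completion on GPT-5.4 ($2.50 / $15.00):
print(round(request_cost(10_000, 2_000, 2.50, 15.00), 4))  # 0.055
```

The same call with GPT-5.4 Pro's rates ($30.00 / $180.00) costs twelve times as much, which is why the fast/budget tiers matter for high-volume workloads.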
Legacy Models (13) — still available in APIs
| Model | Provider | Context | Input/1M | Output/1M | Released |
|---|---|---|---|---|---|
| GPT-5.2 | OpenAI | 400K | $1.75 | $14.00 | 2026-01 |
| GPT-5.1 | OpenAI | 400K | $1.25 | $10.00 | 2025-10 |
| GPT-5 | OpenAI | 400K | $1.25 | $10.00 | 2025-08 |
| o3-pro | OpenAI | 200K | $20.00 | $80.00 | 2025-07 |
| o4-mini | OpenAI | 200K | $1.10 | $4.40 | 2025-07 |
| o3-mini | OpenAI | 200K | $1.10 | $4.40 | 2025-01 |
| GPT-4o | OpenAI | 128K | $2.50 | $10.00 | 2024-05 |
| GPT-4o-mini | OpenAI | 128K | $0.15 | $0.60 | 2024-07 |
| Claude Opus 4.5 | Anthropic | 200K | $5.00 | $25.00 | 2025-11 |
| Claude Sonnet 4 | Anthropic | 200K | $3.00 | $15.00 | 2025-05 |
| Gemini 2.0 Flash | Google | 1.0M | $0.10 | $0.40 | 2025-02 |
| Mistral Large 2 | Mistral | 128K | $2.00 | $6.00 | 2024-07 |
| Llama 3.3 70B | Meta/Groq | 131K | $0.59 | $0.79 | 2024-12 |
Market Trends — April 12, 2026
- Competition across OpenAI, Anthropic, Google, xAI, DeepSeek, and open-weight ecosystems continues to compress pricing.
- High-context tiers are becoming standard across major providers, improving long-document and agent workflows.
- Fast and budget model tiers are increasingly viable for high-volume production use cases.
- Prompt caching and retrieval-aware workflows are now key levers for reducing effective per-request cost.
- Model selection is moving from single-model strategies to portfolio-based routing by task, latency, and budget.
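Portfolio-based routing, per the last point above, can start as simply as picking the cheapest model that satisfies a request's constraints. A minimal sketch using a few entries from the tables above (a production router would also weigh latency, quality, and rate limits):

```python
# (model, context_window_tokens, output_price_per_1m) — values from the tables above
CANDIDATES = [
    ("Grok 4.1 Fast", 2_000_000, 0.50),
    ("Gemini 3.1 Flash-Lite", 1_000_000, 1.50),
    ("Claude Opus 4.6", 200_000, 25.00),
]

def route(context_needed: int) -> str:
    """Return the cheapest model (by output price) whose context window fits."""
    fitting = [c for c in CANDIDATES if c[1] >= context_needed]
    if not fitting:
        raise ValueError("no model fits this context size")
    return min(fitting, key=lambda c: c[2])[0]

print(route(150_000))    # Grok 4.1 Fast (cheapest overall)
print(route(1_500_000))  # Grok 4.1 Fast (only 2M-context option here)
```

Extending the candidate tuples with a quality score per task type turns this into the task/latency/budget routing the trend describes.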
Key Comparisons
- OpenAI GPT vs Claude Flagships — Compare pricing, max output, and context windows for top-tier reasoning and coding performance.
- Claude Balanced Tier vs GPT Balanced Tier — Useful when you need strong quality without paying flagship rates.
- Gemini vs GPT Families — Often strong on context capacity and multimodal workflows; compare against your latency and cost targets.
- DeepSeek vs Open-Weight Llama/Mistral Options — Good comparison set for budget-sensitive production pipelines.
- Fast/Low-Cost Tiers Across Providers — Compare context limits and output pricing before committing to high-volume jobs.