Which LLM should I pick?
Decision matrix: speed vs quality vs cost for each supported model.
Updated 2026-04-13 · By Jon Lasley
| Use case | Top pick | Runner-up |
|---|---|---|
| Complex reasoning, long context, highest quality | Claude Opus 4.7 | GPT-5.5 or Claude Opus 4.6 |
| Most general-purpose prompt work | Claude Sonnet 4.6 | GPT-4.1 or GPT-5.4 |
| Fast, cheap, high volume | Claude Haiku 4.5 | GPT-4.1 Nano or Gemini 2.5 Flash-Lite |
| Multi-modal (images, audio) | Gemini 2.5 Pro | Claude Sonnet 4.6 (images only) |
| Cost-optimized inference at scale | Gemini 2.5 Flash-Lite | GPT-4.1 Nano |
| Hardest reasoning problems | GPT-5.5 (or GPT-5.5 Pro) | Claude Opus 4.7 or Gemini 2.5 Pro |
| Reasoning at low cost | GPT-5.4 Mini | GPT-5.4 Nano |
| Coding-task prompts | GPT-5.3 Codex | GPT-5.4 or Claude Opus 4.7 |
| Latency-critical | Claude Haiku 4.5 | GPT-4.1 Mini or Gemini 2.5 Flash-Lite |
Experiment in the playground
The playground offers two ways to compare models. You can switch the model dropdown and re-run the prompt sequentially; each run is recorded in test-run history with its latency, token count, and cost. Alternatively, click Compare models to run the same prompt against 2–5 models in parallel and view outputs, costs, and latencies side by side. Adding a judge step scores every model's output against a shared rubric and surfaces the best model for each criterion. See Comparing models in the playground.
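To make the "best per criterion" idea concrete, here is a minimal Python sketch of how judge scores from a comparison run could be reduced to one winner per rubric criterion. It is an illustration only and does not reflect the playground's actual API or export format: the `RunResult` record, its field names, the `best_per_criterion` helper, the tie-breaking rule (cheaper run first, then faster), and the `model-a`/`model-b`/`model-c` placeholder values are all assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class RunResult:
    """One model's run of the shared prompt (hypothetical record shape)."""
    model: str
    latency_ms: float
    cost_usd: float
    scores: dict[str, float]  # rubric criterion -> judge score, higher is better


def best_per_criterion(results: list[RunResult]) -> dict[str, str]:
    """Return the top-scoring model for each rubric criterion.

    Assumed tie-break: the cheaper run wins, then the faster one.
    """
    if not 2 <= len(results) <= 5:
        raise ValueError("expected between 2 and 5 model runs")
    criteria = set(results[0].scores)
    if any(set(r.scores) != criteria for r in results):
        raise ValueError("every run must be scored on the same rubric criteria")

    winners: dict[str, str] = {}
    for criterion in sorted(criteria):
        # Negate the score so that min() picks the highest score first,
        # then falls back to lowest cost and lowest latency on ties.
        top = min(
            results,
            key=lambda r: (-r.scores[criterion], r.cost_usd, r.latency_ms),
        )
        winners[criterion] = top.model
    return winners


# Placeholder values for illustration only, not measurements of any real model.
runs = [
    RunResult("model-a", latency_ms=900.0, cost_usd=0.012,
              scores={"accuracy": 4.5, "tone": 3.0}),
    RunResult("model-b", latency_ms=300.0, cost_usd=0.002,
              scores={"accuracy": 3.5, "tone": 4.0}),
    RunResult("model-c", latency_ms=500.0, cost_usd=0.004,
              scores={"accuracy": 4.5, "tone": 3.5}),
]

print(best_per_criterion(runs))
```

In this sketch, `model-a` and `model-c` tie on the `accuracy` criterion, so the assumed tie-break hands it to the cheaper run. A different rule, such as preferring the lower-latency run, may suit latency-critical workloads better.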