§ The Long Form · Benchmarks · Model Survey
Best AI models April 2026, ranked.
A month into the GPT-5.5 / Claude Opus 4.7 / Gemini 3 era, the league table is volatile — and the right model now depends more on workload than vendor loyalty.
2 May 2026 · 11 min read · Kelford Press editorial
The frontier reshuffled twice in April. Anthropic shipped Opus 4.7, OpenAI moved GPT-5.5 Instant into the default slot, Google quietly rolled Gemini 3 to enterprise, and the open-weights camp closed the gap further than anyone expected. None of this means there is now one best model. It means 'best' is now a workload-specific question.
§ 01
Where each frontier model leads
On benchmark snapshots from MMMU-Pro and AIME 2025, GPT-5.5 Instant pulls ahead in math and structured reasoning, Claude Opus 4.7 leads on agentic tool-use harnesses and long-context retrieval, and Gemini 3 dominates anything involving live video, images, and grounded retrieval. Open weights — Kimi K2.6 and DeepSeek V4 — are now a credible third tier for self-hosted deployments, especially where data residency matters.
- Math + structured reasoning: GPT-5.5 Instant (AIME 2025 81.2 vs prior 65.4).
- Agentic tool-use + long context: Claude Opus 4.7 (Colossus 1 unlocked new rate limits).
- Multimodal + grounded retrieval: Gemini 3, with Gemini 4.0 incoming at I/O.
- Open weights: Kimi K2.6 + DeepSeek V4, viable for self-hosted production.
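For teams that want to encode this split rather than remember it, the list above can be sketched as a small lookup table. This is a minimal illustration, not a recommendation engine: the workload keys and the `pick_models` helper are hypothetical names invented for this example, and only the model names come from the list.

```python
# Hypothetical routing table. The workload keys are illustrative labels;
# the model names are the ones listed in the article.
PICKS: dict[str, list[str]] = {
    "math_reasoning": ["GPT-5.5 Instant"],
    "agentic_long_context": ["Claude Opus 4.7"],
    "multimodal_grounded": ["Gemini 3"],
    "self_hosted": ["Kimi K2.6", "DeepSeek V4"],
}


def pick_models(workload: str) -> list[str]:
    """Return the candidate models for a workload label, or raise ValueError."""
    try:
        # Return a copy so callers cannot mutate the shared table.
        return list(PICKS[workload])
    except KeyError:
        known = ", ".join(sorted(PICKS))
        raise ValueError(f"unknown workload {workload!r}; expected one of: {known}") from None


if __name__ == "__main__":
    for label in PICKS:
        print(f"{label}: {', '.join(pick_models(label))}")
```

Keeping the mapping in data rather than in branching logic means a reshuffle like April's becomes a one-line change.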