Skip to content

Kelford Press

Signal from the noise

§ The Long FormBenchmarks · Model Survey

Best AI models April 2026, ranked.

A month into the GPT-5.5 / Claude Opus 4.7 / Gemini 3 era, the league table is volatile — and the right model now depends more on workload than vendor loyalty.

2 May 2026 · 11 min read · Kelford Press editorial

The frontier reshuffled twice in April. Anthropic shipped Opus 4.7, OpenAI moved GPT-5.5 Instant into the default slot, Google quietly rolled Gemini 3 to enterprise, and the open-weights camp closed the gap further than anyone expected. None of this means there is now one best model. It means the bar for 'best' has moved into workload-specific territory.

§ 01

Where each frontier model leads

On benchmark snapshots from MMMU-Pro and AIME 2025, GPT-5.5 Instant pulls ahead in math and structured reasoning, Claude Opus 4.7 leads on agentic tool-use harnesses and long-context retrieval, and Gemini 3 dominates anything involving live video, images, and grounded retrieval. Open weights — Kimi K2.6 and DeepSeek V4 — are now a credible third tier for self-hosted deployments, especially where data residency matters.

  • Math + structured reasoning: GPT-5.5 Instant (AIME 2025 81.2 vs prior 65.4).
  • Agentic tool-use + long context: Claude Opus 4.7 (Colossus 1 unlocked new rate limits).
  • Multimodal + grounded retrieval: Gemini 3, with Gemini 4.0 incoming at I/O.
  • Open weights: Kimi K2.6 + DeepSeek V4, viable for self-hosted production.

§ Subscriber-only · 2 sections ahead

The rest of this dive is for subscribers.

Kelford Press deep dives are sourced, fact-checked, and free to read in part. The full piece — including 2 remaining sections and the takeaways — unlocks for subscribers from $4.99/month.

No tracking pixels. One-click cancel. Newsletter stays free either way.

§ Further reading

  1. 01GPT-5.4 vs Gemini 3.1 Pro (2026): which AI wins?
  2. 02What is Claude Cowork? The 2026 guide