The AI Leaderboard

Daily ranking of the strongest AI models. No vendor spin, no marketing pages — just the picks teams actually ship with.

Refreshed 14 Jun, 04:01 CEST · Daily, 04:00 Europe/Amsterdam
Best closed model

Claude Fable 5 (with fallback)

Anthropic

Intelligence: 64.9
Speed: 72 t/s
Cost in/out (€/Mtok): €8.65 / €43.23

State-of-the-art reasoning and complex agentic workflows requiring maximum intelligence

Best open model

Kimi K2.6

Kimi

Intelligence: 53.9
Speed: 41 t/s
Cost in/out: Provider €0.82 / €3.46 per Mtok · Sovereign ~€66 / hr*

Large-scale open-source reasoning with a modified MIT license

Best value (value score 260.5)

MiniMax-M3

MiniMax

Intelligence: 54.7
Speed: 57 t/s
Cost in/out (€/Mtok): €0.26 / €1.04

Highly economical long-context processing and value-tier reasoning

Top 10, side by side

| # | Model | Vendor | Intelligence | Speed (tok/s) | Cost in / out (€/Mtok) | Context | Best for |
|---|-------|--------|--------------|---------------|------------------------|---------|----------|
| 1 | Claude Fable 5 (with fallback) | Anthropic | 64.9 | 72 | €8.65 / €43.23 | 1.0M | State-of-the-art reasoning and complex agentic workflows requiring maximum intelligence |
| 2 | Claude Opus 4.8 (max) | Anthropic | 61.4 | 63 | €4.32 / €21.61 | 1.0M | Premium enterprise reasoning and deep analysis with a massive context window |
| 3 | GPT-5.5 (xhigh) | OpenAI | 60.2 | 70 | €4.32 / €25.94 | 922K | High-performance proprietary reasoning and versatile multimodal tasks |
| 4 | GPT-5.5 (high) | OpenAI | 58.9 | 61 | €4.32 / €25.94 | 922K | Advanced enterprise problem solving and high-speed logical reasoning |
| 5 | Claude Opus 4.7 (max) | Anthropic | 57.3 | 55 | €4.32 / €21.61 | 1.0M | Deep document analysis and complex structured data extraction |
| 6 | Gemini 3.1 Pro Preview | Google | 57.2 | 137 | €1.73 / €10.37 | 1.0M | High-speed long-context analysis and multilingual translation tasks |
| 7 | GPT-5.5 (medium) | OpenAI | 56.7 | 63 | €4.32 / €25.94 | 922K | Balanced high-tier reasoning and efficient enterprise automation |
| 8 | Qwen3.7 Max | Alibaba | 56.6 | 188 | €2.16 / €6.48 | 1.0M | Ultra-fast high-intelligence processing and multilingual tasks |
| 9 | Gemini 3.5 Flash | Google | 55.3 | 217 | €1.30 / €7.78 | 1.0M | Blazing fast customer support and high-volume chat applications |
| 10 | Gemini 3.5 Flash (medium) | Google | 54.8 | 215 | €1.30 / €7.78 | 1.0M | High-speed conversational AI and cost-effective text processing |
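
To make the €/Mtok prices concrete, here is a minimal sketch of per-request cost arithmetic. The function name `request_cost_eur` and the workload of 20,000 input and 2,000 output tokens are assumptions chosen for illustration; only the prices come from the table.

```python
def request_cost_eur(input_tokens: int, output_tokens: int,
                     in_price_per_mtok: float, out_price_per_mtok: float) -> float:
    """Estimate the cost of one request in euros from per-million-token prices."""
    total = input_tokens * in_price_per_mtok + output_tokens * out_price_per_mtok
    return total / 1_000_000


# Hypothetical workload: 20,000 input tokens and 2,000 output tokens per request.
fable = request_cost_eur(20_000, 2_000, 8.65, 43.23)  # Claude Fable 5 (with fallback)
flash = request_cost_eur(20_000, 2_000, 1.30, 7.78)   # Gemini 3.5 Flash

print(f"Claude Fable 5 (with fallback): €{fable:.4f} per request")
print(f"Gemini 3.5 Flash: €{flash:.4f} per request")
```

On this assumed workload the top-ranked model costs roughly six times as much per request as Gemini 3.5 Flash. The ratio shifts with the input/output mix, so it is worth rerunning with your own token counts.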

All ten models in the top 10 are proprietary.

Which model for which job

Nine scenarios SevenLab teams ship every week. Steal our picks.

  1. Customer support

    • Blazing speed of 217 tok/s minimises customer wait times
    • Highly economical pricing at €1.30 per Mtok input and €7.78 per Mtok output
    • Generous 1,000,000-token context window handles long chat histories
    Pick: Gemini 3.5 Flash
  2. Code generation

    • Dedicated code-tuned architecture for software development tasks
    • Solid speed of 117 tok/s ensures rapid code generation
    • Cost-effective value-tier pricing at €0.26 per Mtok input and €1.04 per Mtok output
    Pick: KAT-Coder-Pro V2
  3. Document extraction

    • Massive 1,000,000-token context window processes entire books
    • High intelligence score of 57.28 ensures accurate data capture
    • Reliable proprietary architecture with closed-weights security
    Pick: Claude Opus 4.7 (max)
  4. Long-context analysis

    • Expansive 1,000,000-token context window for massive datasets
    • Strong intelligence rating of 57.18 for complex reasoning
    • Fast processing speed of 137 tok/s accelerates deep analysis
    Pick: Gemini 3.1 Pro Preview
  5. On-prem sovereign

    • Permissive MIT license allows flexible on-premises deployment
    • Highly efficient 42B parameter size reduces infrastructure costs
    • Strong open-source intelligence score of 53.83
    Pick: MiMo-V2.5-Pro
  6. Multimodal vision

    • Industry-leading intelligence score of 64.88 for complex tasks
    • Massive 1,000,000-token context window for multi-image analysis
    • Robust proprietary model from Anthropic with fallback reliability
    Pick: Claude Fable 5 (with fallback)
  7. Agentic automation

    • Peak intelligence score of 64.88 ensures reliable tool execution
    • Massive 1,000,000-token context window for long-running agents
    • Steady speed of 72 tok/s supports interactive agent loops
    Pick: Claude Fable 5 (with fallback)
  8. Multilingual translation

    • Excellent intelligence score of 57.18 from Google's Pro tier
    • Fast translation speed of 137 tok/s for real-time localization
    • Massive 1,000,000-token context window handles entire books
    Pick: Gemini 3.1 Pro Preview
  9. Structured extraction

    • Massive 1,000,000-token context window for large-scale parsing
    • Highly economical open-source pricing at €0.38 per Mtok input and €0.75 per Mtok output
    • Strong intelligence score of 51.51 ensures schema adherence
    Pick: DeepSeek V4 Pro (Max)
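
The picks above trade intelligence against speed and cost. One way to reason about that trade-off is a simple constraint filter, sketched below. This is an illustration only, not the method behind the picks: the `Model` type, the `shortlist` helper, and the thresholds are all hypothetical, and the rows are a subset of the top-10 table.

```python
from typing import NamedTuple, Optional, Sequence


class Model(NamedTuple):
    name: str
    intelligence: float
    speed_tps: float
    in_cost: float   # €/Mtok input
    out_cost: float  # €/Mtok output


# A few rows copied from the top-10 table (not the full list).
MODELS = [
    Model("Claude Fable 5 (with fallback)", 64.9, 72, 8.65, 43.23),
    Model("Gemini 3.1 Pro Preview", 57.2, 137, 1.73, 10.37),
    Model("Qwen3.7 Max", 56.6, 188, 2.16, 6.48),
    Model("Gemini 3.5 Flash", 55.3, 217, 1.30, 7.78),
]


def shortlist(models: Sequence[Model], min_speed_tps: float = 0.0,
              max_out_cost: float = float("inf")) -> Optional[Model]:
    """Return the most intelligent model that meets the speed and cost limits."""
    eligible = [m for m in models
                if m.speed_tps >= min_speed_tps and m.out_cost <= max_out_cost]
    return max(eligible, key=lambda m: m.intelligence, default=None)


fast_pick = shortlist(MODELS, min_speed_tps=200)  # hypothetical latency floor
open_pick = shortlist(MODELS)                     # no constraints
print(fast_pick.name)
print(open_pick.name)
```

With an assumed floor of 200 tok/s the filter lands on Gemini 3.5 Flash, and with no constraints it lands on Claude Fable 5 (with fallback). Both happen to agree with the customer support and agentic automation picks, but real selection also weighs factors that are not in the table.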

How we rank

  • Snapshot taken 6/14/2026, 2:01:38 AM
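  • Value score = (intelligence − 25)² × min(1, speed ÷ 30) ÷ (input cost + 3 × output cost), with speed in tok/s and costs in €/Mtok. Output cost is weighted ×3 because real usage is output-heavy. Intelligence is squared so cheap-but-weak models can't dominate. The speed factor is capped at 1, so speed above 30 tok/s adds nothing to the score.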
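
The value score formula can be sketched in a few lines of Python. The function and parameter names are assumptions for illustration. The inputs are the rounded MiniMax-M3 figures shown on this page, so the result will not exactly match the published 260.5, which presumably comes from unrounded inputs.

```python
def value_score(intelligence: float, speed_tps: float,
                input_cost: float, output_cost: float) -> float:
    """Value score as stated above: squared quality, capped speed, output cost x3."""
    quality = (intelligence - 25) ** 2
    speed_factor = min(1.0, speed_tps / 30)     # no extra credit above 30 tok/s
    weighted_cost = input_cost + output_cost * 3
    return quality * speed_factor / weighted_cost


# MiniMax-M3 with the rounded figures from this page.
minimax = value_score(54.7, 57, 0.26, 1.04)
print(round(minimax, 1))

# The same model at a hypothetical 15 tok/s would score half as much.
slow = value_score(54.7, 15, 0.26, 1.04)
print(round(slow, 1))
```

Note that the formula as written does not guard against intelligence below 25: squaring makes the score positive again, so a real implementation would probably clamp or exclude such models.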

Daily refresh at 04:00 Europe/Amsterdam. Numbers move; we don't smooth them.
