LLMs 101

Frontier model directory

Every major AI model, actually explained

No benchmark scores. No jargon. Just what each model family is genuinely good at, where it falls short, and who should use it.

Updated June 2026
OpenAI Market leader
GPT & o-Series
GPT-4o · GPT-4.1 · o3 · o4-mini
Core superpower
Versatile excellence across almost every task — the safe, proven default for business use
Key trade-off
Premium cost at scale; o3 is slow and expensive for heavy reasoning tasks
Speed
8/10
Reasoning
9.5/10
Cost
Standard
Best non-technical use
Complex business analysis, structured data extraction, advanced code debugging, and multi-step logic tasks
Cost tier
Standard

OpenAI's GPT-4o and o-series models are among the most widely deployed large language models in the world. GPT-4o provides fast, multimodal responses combining text, image, and audio understanding. The o3 and o4-mini reasoning models spend extra time on chain-of-thought processing before answering, which improves results on mathematics, coding, and complex logic tasks at the price of slower responses. GPT-4.1 is optimised for long-context coding and instruction following. All are available via the OpenAI API, and most via ChatGPT.

Anthropic Dominant in writing
Claude
Claude Opus 4 · Claude Sonnet 4 · Claude Haiku
Core superpower
Elite long-form writing, nuanced reasoning, and complex multi-step document workflows
Key trade-off
Opus tier is premium-priced; occasionally over-cautious on edge-case content requests
Speed
7.5/10
Reasoning
9/10
Cost
Premium
Best non-technical use
Long-form analysis and editing, nuanced writing, agentic document workflows, and tasks requiring careful, principled reasoning
Cost tier
Premium

Anthropic's Claude models are trained with Constitutional AI, an approach in which the model learns to follow a written set of principles, with the aim of making its answers more reliable and honest. Claude Opus 4 offers the highest capability for complex reasoning and writing tasks. Claude Sonnet 4 balances capability and cost for everyday professional use. Claude Haiku is optimised for speed and cost efficiency. Claude models support a 200,000-token context window, large enough to process a full-length book or a mid-sized codebase in a single conversation.
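To make the 200,000-token figure more concrete, here is a minimal sketch of a fit check. It assumes a rule of thumb of about 4 characters per token for English prose. That is an approximation, not a figure from Anthropic, and real tokenizers vary by model and language. The function names and the reply reserve are hypothetical.

```python
# Rough check of whether a text fits in a model's context window.
# ASSUMPTION: ~4 characters per token is a rule of thumb for English prose;
# real tokenizers vary by model and by language.
CHARS_PER_TOKEN = 4  # hypothetical average, not a vendor-published figure


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate (rounded up) for `text`."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def fits_in_window(text: str, window_tokens: int = 200_000,
                   reserve_for_reply: int = 4_000) -> bool:
    """True if the estimated prompt size still leaves room for the reply."""
    return estimate_tokens(text) + reserve_for_reply <= window_tokens


# A 300-page book at ~1,800 characters per page (both numbers illustrative).
book = "x" * (300 * 1_800)
print(estimate_tokens(book))  # 135000
print(fits_in_window(book))   # True
```

For a real decision, count tokens with the provider's own tooling rather than a character heuristic.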

Google DeepMind Highly competitive
Gemini
Gemini 2.5 Pro · Gemini 2.0 Flash · Gemini Ultra
Core superpower
Handling enormous inputs — full video files, 1M-token codebases, entire document archives at once
Key trade-off
Personality and writing style feel less refined than Claude; Pro tier pricing has caught up with competitors
Speed
8.8/10
Reasoning
8.8/10
Cost
Value
Best non-technical use
Analysing massive uploads — video files, lengthy research PDFs, entire codebases — and anything deeply integrated with Google Workspace
Cost tier
Value

Google DeepMind's Gemini models are natively multimodal, processing text, images, audio, and video as first-class inputs. Gemini 2.5 Pro offers a 1-million-token context window, one of the largest among widely available models, enabling analysis of entire large codebases or lengthy video content. Gemini 2.0 Flash is optimised for speed and cost efficiency in production applications. Gemini powers Google Search AI Overviews and is deeply integrated into Google Workspace products including Docs, Gmail, and Sheets.

Meta AI Open weight leader
Llama
Llama 3.3 70B · Llama 3.1 405B · Llama 3.2
Core superpower
Open weights: download and run privately on your own hardware, with no per-request API fees
Key trade-off
Requires technical setup to run locally; cloud versions via third parties still incur cost
Speed
8/10
Reasoning
8.5/10
Cost
Free*
Best non-technical use
Privacy-sensitive workflows where data cannot leave your machine, high-volume automation where API costs would otherwise be prohibitive
Cost tier
Free (self-hosted)

Meta's Llama 3 series are among the most capable openly available large language models. Llama 3.3 70B delivers performance competitive with proprietary frontier models, with no licensing fee for most users under Meta's community licence. Llama 3.1 405B approaches GPT-4 class performance, though at that size it needs server-grade hardware rather than a desktop machine. Tools such as Ollama make it possible to run the smaller Llama models locally on consumer hardware, including Apple Silicon Macs. Llama models can be fine-tuned on private data without exposing that data to any third party.
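Whether a model runs on your own hardware depends mostly on how much memory its weights need. The sketch below is a back-of-the-envelope estimate, assuming memory is roughly parameters times bits per weight divided by 8. It ignores the KV cache, activations, and runtime overhead, so real requirements are higher. The function name is hypothetical.

```python
# Rough estimate of the memory needed just to hold a model's weights.
# ASSUMPTION: memory ~ parameters * bits per weight / 8. This ignores the
# KV cache, activations, and runtime overhead, so real usage is higher.

def weight_memory_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate weight size in gigabytes (1 GB = 1e9 bytes)."""
    bytes_total = params_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9


for name, size_b in [("Llama 3.3 70B", 70), ("Llama 3.1 405B", 405)]:
    for bits in (16, 4):  # 16-bit weights vs. a common 4-bit quantisation
        gb = weight_memory_gb(size_b, bits)
        print(f"{name} at {bits}-bit: ~{gb:.1f} GB")
```

Under these assumptions, the 70B model needs about 35 GB at 4-bit, which a high-memory workstation or Mac can hold. The 405B model needs about 200 GB even at 4-bit, which in practice means server hardware.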

DeepSeek Market disruptor
DeepSeek
DeepSeek V3 · DeepSeek R1 · DeepSeek-Coder
Core superpower
o1-level reasoning performance at a fraction of the cost — the most significant price disruption in AI history
Key trade-off
Chinese-operated; some content restrictions; data privacy considerations for sensitive enterprise use
Speed
7/10
Reasoning
9/10
Cost
Ultra-low
Best non-technical use
High-volume automation where cost-per-prompt needs to be near zero; advanced coding tasks where quality needs to approach OpenAI's at a fraction of the cost
Cost tier
Ultra-low cost

DeepSeek's January 2025 release of R1 shocked the AI industry by matching OpenAI o1 on reasoning benchmarks at a fraction of the price. The widely quoted figure of roughly $6 million is DeepSeek's reported compute cost for the final training run of V3, the base model behind R1. It excludes research, earlier experiments, and hardware, but is still far below the sums reportedly spent on rival frontier models. DeepSeek V3 is a 671-billion-parameter Mixture of Experts model that activates only 37 billion parameters per token, achieving frontier performance at dramatically lower inference cost. DeepSeek R1 uses chain-of-thought reasoning and is available as open weights. DeepSeek-Coder is specialised for software development tasks.
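The arithmetic behind the Mixture of Experts saving is short enough to check by hand. This sketch uses only the two DeepSeek V3 figures quoted above. The assumption that per-token compute scales with active parameters is a simplification that ignores memory bandwidth, routing overhead, and attention cost.

```python
# Share of parameters used per token in a Mixture of Experts model.
# Figures are the ones quoted above for DeepSeek V3.
TOTAL_PARAMS_B = 671   # total parameters, in billions
ACTIVE_PARAMS_B = 37   # parameters activated per token, in billions

active_share = ACTIVE_PARAMS_B / TOTAL_PARAMS_B
print(f"Active per token: {active_share:.1%}")  # Active per token: 5.5%

# ASSUMPTION: per-token compute scales roughly with active parameters,
# which ignores memory bandwidth, routing overhead, and attention cost.
dense_ratio = TOTAL_PARAMS_B / ACTIVE_PARAMS_B
print(f"About {dense_ratio:.0f}x fewer active parameters than a dense 671B model")
```

Note that all 671 billion parameters still have to be stored in memory. The saving is in compute per token, not in the size of the model on disk.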

Mistral AI European challenger
Mistral
Mistral Large · Mixtral 8x7B · Mistral 7B
Core superpower
Punches far above its weight — small efficient models that outperform much larger competitors
Key trade-off
Smaller company means slower release cadence; less ecosystem tooling than OpenAI or Meta
Speed
9/10
Reasoning
8/10
Cost
Low
Best non-technical use
Fast, cost-efficient production applications where European data sovereignty and open weights matter; real-time summarisation and classification at scale
Cost tier
Low cost

Mistral AI, founded in France in 2023, created a stir with Mistral 7B, which at launch outperformed the larger Llama 2 13B on the benchmarks Mistral reported. Mixtral 8x7B is an open-weight Mixture of Experts model with 46.7 billion total parameters that activates only 12.9 billion per token, delivering strong performance at low inference cost. Mistral Large competes with GPT-4 class models. Mistral is a strong advocate for open-source AI and European digital sovereignty. The models are available via the Mistral API, and several, including Mistral 7B and Mixtral 8x7B, are available as open weights for local deployment.
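The gap between total and active parameters comes from routing: in each Mixture of Experts layer, a small router sends every token to only a few experts. Below is a toy sketch of top-2 routing over 8 experts, the configuration Mixtral 8x7B uses. The experts here are hypothetical stand-ins that just scale a number, and the router scores are random, so this illustrates the mechanism and is not Mistral's implementation.

```python
import math
import random

NUM_EXPERTS = 8   # Mixtral 8x7B has 8 experts per MoE layer
TOP_K = 2         # and routes each token to 2 of them


def softmax(scores):
    """Convert raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(router_scores, top_k=TOP_K):
    """Pick the top_k experts and return {expert_index: mixing_weight}."""
    chosen = sorted(range(len(router_scores)),
                    key=lambda i: router_scores[i], reverse=True)[:top_k]
    weights = softmax([router_scores[i] for i in chosen])
    return dict(zip(chosen, weights))


def moe_layer(token_value, router_scores, experts):
    """Run only the selected experts and blend their outputs."""
    selected = route(router_scores)
    return sum(w * experts[i](token_value) for i, w in selected.items())


# Hypothetical stand-ins: each "expert" just scales its input.
experts = [lambda x, k=k: x * (k + 1) for k in range(NUM_EXPERTS)]
random.seed(0)
scores = [random.gauss(0, 1) for _ in range(NUM_EXPERTS)]  # fake router output

selected = route(scores)
print(sorted(selected))                 # indices of the 2 active experts
print(moe_layer(1.0, scores, experts))  # blended output of those 2 experts
```

The other six experts are never called for this token, which is why inference cost tracks the active parameter count and not the total.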