AI Model Releases Timeline

A chronological timeline of major AI model releases


2026

Mistral AI · open source · 128B dense · Milestone

Mistral Medium 3.5: The 128B Open-Source Flagship for 2026

Released on April 29, 2026

New flagship model merging instruction-following, reasoning, and coding into a single 128B dense architecture

Released as open weights under a modified MIT license

Runs self-hosted on as few as four GPUs

API pricing at $1.50/mtok input and $7.50/mtok output

Powers the new Mistral Vibe remote agents for async cloud coding sessions

Drives Work mode in Le Chat for multi-step agentic task execution with parallel tool calling

Sessions can be spawned from CLI or Le Chat, and local CLI sessions can be teleported to the cloud

NVIDIA · multimodal · 30B-A3B (MoE) · Open Source

NVIDIA Nemotron 3 Nano Omni: Open Multimodal AI Release

Released on April 28, 2026

Multimodal model unifying video, audio, image, and text understanding in a single architecture

Hybrid Mixture-of-Experts (MoE) 30B-A3B architecture with 30B total and 3B active parameters

Up to 9x higher throughput compared to similar open omnimodal models

256K unified context window with single-pass perception

Hybrid architecture combining Mamba layers for memory efficiency and transformers for precise reasoning

Integrates vision encoders (C3D for video) and audio encoders (Paraquet), eliminating need for separate models

Supports FP8/NVFP4 quantization with optimized inference on NVIDIA Ampere, Hopper, and Blackwell GPUs

Designed for enterprise multimodal agents: document intelligence (OCR, tables), GUI navigation, audio-video reasoning

Runs locally with 25-36GB RAM in 4/8-bit quantization via Unsloth or vLLM

Available on Hugging Face, Ollama, OpenRouter, and NVIDIA NIM

poolside · coding model · 225B total (MoE), 23B activated per token · Closed · Milestone

Poolside Laguna-M.1: The 225B Coding Giant Arrives in 2026

Released on April 28, 2026

225B total parameter Mixture-of-Experts model with 23B activated parameters per token

poolside's most capable model to date, with pre-training completed at the end of 2025

Trained from scratch on 30T tokens using Muon optimizer

Trained on 6,144 interconnected NVIDIA Hopper GPUs entirely in-house

Achieves 72.5% on SWE-bench Verified, 67.3% on SWE-bench Multilingual, 46.9% on SWE-bench Pro, 40.7% on Terminal-Bench 2.0

128K context window with up to 8K output tokens

Agentic coding model built for long-horizon software engineering tasks

Foundation for the entire Laguna model family

Uses custom async on-policy RL system with Agent Client Protocol (ACP) server

Free to use for a limited time via poolside API and OpenRouter

Weights available on request for startups, institutions, and universities

poolside · coding model · 33B total (MoE), 3B activated per token · Open Source · Milestone

Laguna-XS.2: Poolside's Open-Source Coding Model Release

Released on April 28, 2026

33B total parameter Mixture-of-Experts model with 3B activated parameters per token

First open-weight release from poolside, licensed under Apache 2.0

Trained on 30T tokens using Muon optimizer

Supports native reasoning with interleaved thinking between tool calls

Uses Sliding Window Attention with per-head gating in 30 of 40 layers

KV cache quantized to FP8 for reduced memory per token

Compact enough to run locally on a Mac with 36 GB RAM

128K context window with up to 8K output tokens

Achieves 68.2% on SWE-bench Verified, 62.4% on SWE-bench Multilingual, 44.5% on SWE-bench Pro, 30.1% on Terminal-Bench 2.0

Supports vLLM, Transformers, TRT-LLM, and Ollama

Agentic coding model built for long-horizon software engineering tasks

Free to use for a limited time via poolside API and OpenRouter

DeepSeek · open source · V4-Pro: 1.6T total / 49B active (MoE) | V4-Flash: 284B total / 13B active (MoE) · Milestone

DeepSeek-V4 Release: Open Source & Pricing

Released on April 24, 2026

Two models: DeepSeek-V4-Pro (1.6T total / 49B active params) and DeepSeek-V4-Flash (284B total / 13B active params)

Context length of 1M tokens, maximum output of 384K tokens

Supports thinking mode (default) and non-thinking mode

Ultra-aggressive pricing: Flash at $0.14/M input tokens (cache miss), $0.028/M (cache hit), $0.28/M output, roughly 7x cheaper than Claude Opus 4.7

Pro at $1.74/M input tokens (cache miss), $0.145/M (cache hit), $3.48/M output

Open-source models, with weights available on Hugging Face

Compatible with the OpenAI and Anthropic API formats (https://api.deepseek.com and https://api.deepseek.com/anthropic)

Supports JSON output, Tool Calls, Chat Prefix Completion (Beta), and FIM Completion (Beta)

Performance rivaling the best closed-source models worldwide
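With split cache-miss/cache-hit input rates, per-request cost comes down to simple arithmetic. A back-of-the-envelope sketch using the rates quoted above (a hypothetical helper, not an official SDK):

```python
def request_cost_usd(miss_tokens, hit_tokens, output_tokens,
                     price_miss, price_hit, price_out):
    """Cost in USD given token counts and per-million-token prices."""
    return (miss_tokens * price_miss
            + hit_tokens * price_hit
            + output_tokens * price_out) / 1_000_000

# DeepSeek-V4-Flash rates from above: $0.14 miss, $0.028 hit, $0.28 output
flash = request_cost_usd(1_000_000, 0, 1_000_000, 0.14, 0.028, 0.28)
# DeepSeek-V4-Pro rates from above: $1.74 miss, $0.145 hit, $3.48 output
pro = request_cost_usd(1_000_000, 0, 1_000_000, 1.74, 0.145, 3.48)
print(flash, pro)
```

On an all-cache-miss workload this puts Pro at roughly 12x the cost of Flash for the same token volume; a high cache-hit ratio narrows the gap considerably.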

OpenAI · language model · Undisclosed (frontier model) · Closed · Milestone

OpenAI Launches GPT-5.5: The Frontier of Agentic Intelligence

Released on April 23, 2026

GPT-5.5 is OpenAI's smartest and most intuitive model yet, described as the next step toward a new way of getting work done on a computer

Achieves 82.7% on Terminal-Bench 2.0, 73.1% on Expert-SWE (Internal), and 84.9% on GDPval — all state-of-the-art scores

Matches GPT-5.4 per-token latency while performing at a much higher level of intelligence

Significantly more token efficient — uses fewer tokens to complete the same tasks compared to GPT-5.4

Scores 78.7% on OSWorld-Verified for real computer environment operation and 81.8% on CyberGym

GPT-5.5 Pro achieves 90.1% on BrowseComp and 52.4% on FrontierMath Tier 1-3

On SWE-bench Pro, reaches 58.6%, solving more tasks end-to-end in a single pass than previous models

Proactively deployed with industry-leading cybersecurity safeguards; classified as High under OpenAI's Preparedness Framework

Helped discover a new proof about Ramsey numbers in combinatorics, later verified in Lean

Scores 25.0% on GeneBench for multi-stage scientific data analysis in genetics

API pricing: $5/1M input tokens and $30/1M output tokens with 1M context window

GPT-5.5 Pro API pricing: $30/1M input tokens and $180/1M output tokens

Co-designed, trained with, and served on NVIDIA GB200 and GB300 NVL72 systems

Rolling out to Plus, Pro, Business, and Enterprise users in ChatGPT and Codex

GPT-5.5 Thinking unlocks faster help for harder problems with smarter, more concise answers

Outperforms Claude Opus 4.7 and Gemini 3.1 Pro on most coding and professional benchmarks

More than 85% of OpenAI employees now use Codex every week across all company functions

Xiaomi · language model · 1T+ total (42B active, MoE) · Closed · Milestone

Xiaomi MiMo-V2.5-Pro: The 1T+ Parameter Agent Revolution

Released on April 22, 2026

Multimodal Mixture-of-Experts (MoE) architecture with 1T+ total parameters (42B active)

Extended context window up to 1M tokens

Native multimodal perception supporting text, images, video, and audio

Advanced autonomous agent capabilities handling 1000+ tool calls

40-60% better token efficiency compared to Claude Opus and GPT-5.x

ClawEval benchmark: 64% Pass@3 score

SWE-bench Pro: 57.2% task resolution rate

Surpasses Claude 4.6 Sonnet in coding tasks, approaches Claude Opus in agentic performance

Part of the MiMo-V2.5 family alongside MiMo-V2.5 and MiMo-V2.5-TTS

Available via mimo.mi.com with affordable token plans (monthly/annual subscriptions)

Qwen · language model · 27B · Open Source · Milestone

Qwen3.6-27B Release: 27B Dense Model Beats 397B on Coding

Released on April 22, 2026

27B dense open-source model with Apache 2.0 license

Surpasses Qwen3.5-397B-A17B on all major agentic coding benchmarks

SWE-bench Verified: 77.2 vs 76.2, Terminal-Bench 2.0: 59.3 vs 52.5, SkillsBench: 48.2 vs 30.0

Supports both multimodal thinking and non-thinking modes natively

Native vision-language support for images and video understanding

GPQA Diamond: 87.8, competitive with models several times its size

Compatible with OpenClaw, Claude Code, and Qwen Code coding assistants

Available on Hugging Face, ModelScope, and Alibaba Cloud Model Studio API

Moonshot AI · open source · open-weights · Milestone

Moonshot AI Unveils Kimi K2.6: Open-Source SOTA for Autonomous Agents

Released on April 20, 2026

Open-source SOTA on HLE w/ tools (54.0), SWE-Bench Pro (58.6), SWE-bench Multilingual (76.7), BrowseComp (83.2), Toolathlon (50.0), Charxiv w/ python (86.7), Math Vision w/ python (93.2)

Long-horizon coding: 4,000+ tool calls, over 12 hours continuous execution

Generalization across languages (Rust, Go, Python) and tasks (frontend, devops, perf optimization)

300 parallel sub-agents x 4,000 steps per run (up from K2.5: 100 / 1,500)

Proactive Agents: powers OpenClaw, Hermes Agent for 24/7 autonomous ops

Claw Groups research preview: bring your own agents, command friends' bots, and keep humans in the loop

API pricing for kimi-k2.6: input $0.16/M tokens (cache hit), $0.95/M tokens (cache miss), output $4.00/M tokens; 262,144-token context window

Sources: https://platform.moonshot.ai, https://kimi.com/blog/kimi-k2-6, https://huggingface.co/moonshotai/Kimi-K2.6

Live on kimi.com in chat and agent mode, plus Kimi Code at https://kimi.com/code for production-grade coding
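With input priced differently on cache hits and misses, the effective input rate depends on your cache hit ratio. A minimal sketch using the kimi-k2.6 rates quoted above (a hypothetical helper, not part of the Moonshot API):

```python
def blended_input_price(hit_ratio, price_hit=0.16, price_miss=0.95):
    """Effective $/M input tokens for a given cache hit ratio in [0, 1]."""
    return hit_ratio * price_hit + (1 - hit_ratio) * price_miss

# A 50% hit ratio lands halfway between the two published rates
print(blended_input_price(0.5))
```

For agentic workloads that repeatedly resend a long shared prefix, the hit ratio tends to be high, so the effective rate sits much closer to the cache-hit price.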

Anthropic · reasoning · Closed · Milestone

Anthropic Unveils Claude Opus 4.7: The New Reasoning Standard

Released on April 16, 2026

Most capable generally available Anthropic model for complex reasoning and agentic coding

High-resolution image support: 2576px / 3.75MP (up from 1568px / 1.15MP) with 1:1 pixel mapping

New "xhigh" effort level for coding and agentic use cases

Task budgets (beta) — advisory token budget across full agentic loops

128K max output tokens, 1M context window at standard pricing

+12 points on CursorBench coding benchmarks vs Opus 4.6

New tokenizer (up to ~35% more tokens per text, improved performance)

Adaptive thinking only — extended thinking budgets removed

Sampling parameters (temperature, top_p, top_k) removed

Pricing: $5/$25 per MTok input/output, batch $2.50/$12.50 per MTok

Zhipu AI · reasoning · 744B MoE (40B active) · Open Source · Milestone

GLM-5.1: The Open-Source Reasoning King Arrives

Released on April 7, 2026

#1 on SWE-Bench Pro (58.4%), beating GPT-5.4 and Claude Opus 4.6

Post-training upgrade to GLM-5 — same 744B MoE architecture (40B active)

Trained entirely on Huawei Ascend chips — no NVIDIA hardware

MIT license, compatible with Claude Code and OpenClaw

202K context window, strong on cybersecurity (CyberGym 68.7%)

Anthropic · language model · Closed

Anthropic Unveils Claude Opus 4.6 Fast: Speed Meets Smarts

Released on April 7, 2026

Faster variant of Claude Opus 4.6 with comparable intelligence

Anthropic · reasoning · Closed · Milestone

Anthropic Unveils Claude Mythos Preview: The Capybara Tier Breakthrough

Released on April 7, 2026

New Capybara tier above Opus — the most powerful Anthropic model

93.9% on SWE-bench Verified, 77.8% on SWE-bench Pro

97.6% on USAMO 2026, 94.5% on GPQA Diamond

1M context window, limited preview for ~50 partner organizations

Google DeepMind · open source · 31B · Milestone

Google Unveils Gemma 4: The Apache 2.0 Frontier

Released on April 2, 2026

Google's most capable open models, built from Gemini 3 research

Four sizes: E2B, E4B, 26B MoE (3.8B active), 31B Dense

First Gemma release under Apache 2.0 license

Native multimodal, 140+ languages, up to 256K context

Agent-ready with function calling and structured JSON output

Zhipu AI · multimodal · Closed

GLM-5V Turbo Review: Zhipu's Multimodal Coding Powerhouse

Released on April 1, 2026

Vision + Code model from Z.ai

Multimodal coding capabilities

API only

Alibaba Cloud · language model · Closed

Qwen 3.6 Plus Review: Speed, Code, and 1M Context

Released on March 31, 2026

1M token context window with always-on chain-of-thought reasoning

78.8% on SWE-bench Verified — competitive with Claude Opus 4.6

2-3x faster output speed than Claude Opus 4.6

Free preview via OpenRouter, successor to Qwen 3.5

Mistral AI · multimodal · Open Source

Mistral Unleashes Voxtral TTS: The Open-Weight Voice AI That Challenges ElevenLabs

Released on March 23, 2026

Mistral's first audio model — direct competitor to ElevenLabs

Zero-shot voice cloning with multilingual support

Real-time streaming capabilities

Open weights under CC BY-NC 4.0 (non-commercial)

Xiaomi · reasoning · 309B MoE · Open Source

Xiaomi MiMo-V2-Pro: The 309B MoE Reasoning Powerhouse

Released on March 18, 2026

Xiaomi's reasoning model with strong math and code performance

309B MoE architecture

MiniMax · coding model · 230B MoE (10B active) · Open Source

MiniMax M2.7: The Self-Evolving Coding Model Revolution

Released on March 18, 2026

Self-evolving agent model — first to participate in its own development

56.22% on SWE-Pro, matching GPT-5.3-Codex

57.0% on Terminal Bench 2, GDPval-AA ELO 1495 (highest open-source)

230B MoE (10B active), 200K context, open weights on HuggingFace

Agent Teams for native multi-agent collaboration

OpenAI · language model · Closed

OpenAI Releases GPT-5.4 Mini: Efficient AI for Developers

Released on March 17, 2026

Efficient variant of GPT-5.4 with native computer use

Lower cost while maintaining strong reasoning capabilities

Mistral AI · coding model · 119B MoE (6.5B active) · Open Source

Leanstral by Mistral AI: The Open-Source Proof Agent Revolutionizing Code Verification

Released on March 16, 2026

First open-source code agent for Lean 4 formal proof engineering

Generates code AND machine-checkable mathematical proofs

119B MoE with 6.5B active, outperforms Claude Sonnet 4.6 on FLTEval

Apache 2.0 license, 15x cheaper than Claude Opus for formal verification

Mistral AI · open source · 119B MoE (6.5B active)

Mistral Small 4: The Unified Frontier Model Released

Released on March 16, 2026

Unifies instruct, reasoning, coding, and multimodal in a single model

119B MoE with 6.5B active parameters, 256K context window

Replaces Magistral (reasoning), Pixtral (vision), and Devstral (coding)

Apache 2.0 license, configurable reasoning parameter

xAI · language model · Closed

xAI Grok 4.20: The Parallel Agent Revolution

Released on March 12, 2026

Beta release with parallel agents architecture

500K context window

Iterative improvement via user feedback

NVIDIA · open source · 120B MoE (12B active)

NVIDIA Nemotron 3 Super: The Open MoE Powerhouse for Enterprise Agents

Released on March 11, 2026

Open MoE model from NVIDIA

120B total parameters with 12B active

Strong enterprise performance

OpenAI · language model · Closed

OpenAI GPT-5.4 Series: The 1M Token Frontier Model

Released on March 6, 2026

Latest OpenAI flagship with 1M token context window

Available in Standard, Mini, and Nano variants

Supports reasoning effort with 4 effort levels

128K max output tokens

Prompt caching with $0.02-$0.25/M cached read

Google DeepMind · language model · Closed

Gemini 3.1 Flash Lite Preview: The Speed-Cost King for Developers

Released on March 3, 2026

Google's high-efficiency model optimized for high-volume use cases

1M token context window, 65.5K max output

Supports prompt caching, reasoning effort, and reasoning budget

Native tool calling and vision capabilities

Google DeepMind · multimodal · Closed

Gemini 3.1 Pro Release: Google's 2026 AI Flagship

Released on February 19, 2026

Google's latest flagship model

More than doubles reasoning performance over Gemini 3 Pro

Released in preview via Gemini API, AI Studio, and Vertex AI

xAI · language model · Closed

xAI Grok 4.2: The 2026 AI Revolution for Developers

Released on February 17, 2026

Beta release with rapid learning architecture — improves weekly via user feedback

256K context window

4-agent parallel reasoning

Medical document analysis added

Anthropic · language model · Closed

Anthropic Unveils Claude Sonnet 4.6: The Ultimate Developer Model

Released on February 17, 2026

Most capable Sonnet yet with full upgrade across coding, computer use, long-context reasoning

1M token context window in beta

200K token context window, 64K max output

Supports prompt caching, reasoning effort, and reasoning budget

Native tool calling and vision capabilities

Alibaba Cloud · language model · 397B MoE (17B active) · Closed

Qwen 3.5 Release: 397B MoE Agentic Powerhouse Review

Released on February 14, 2026

Agentic AI model with built-in tools for web search and code execution

1M token context window

Qwen3.5-Plus hosted; open weights planned

MiniMax · coding model · 230B MoE (10B active) · Open Source

MiniMax M2.5: The Open-Source Frontier Coding Model for 2026

Released on February 12, 2026

Frontier MoE model with 80.2% on SWE-Bench Verified

Strong coding and agentic capabilities

230B total parameters, 10B activated per token

DeepSeek AI · open source · 671B MoE

DeepSeek V3.2: The 671B MoE Open Source King

Released on February 12, 2026

Major update to the V3 series with 1M token context

671B MoE focused on code generation and reasoning improvements

Open weights on HuggingFace, MIT license

Zhipu AI · reasoning · Open Source

GLM-5: Zhipu AI's Open-Source Reasoning Frontier

Released on February 11, 2026

Frontier model from China's first publicly listed AI company

Targets complex systems engineering and long-horizon agentic tasks

OpenBMB · multimodal · 9B · Open Source

MiniCPM-o 4.5: 9B Multimodal AI Model Release

Released on February 8, 2026

On-device multimodal LLM with full-duplex real-time audio, image, video

Built on Qwen3-8B architecture

Gemini 2.5 Flash-level performance at only 9B parameters

OpenAI · coding model · Closed

OpenAI Unveils GPT-5.3-Codex: The Ultimate Agentic Developer Assistant

Released on February 5, 2026

Most capable agentic coding model from OpenAI

Available via Codex app, CLI, IDE extensions

Optimized for software engineering workflows

Anthropic · reasoning · Closed · Milestone

Claude Opus 4.6: The Agentic Reasoning Breakthrough

Released on February 5, 2026

Huge leap for agentic planning with parallel subtask execution

Tool and subagent orchestration capabilities

Terminal-Bench record holder

1M token context window, 32K max output

State-of-the-art agentic AI behaviors

StepFun · reasoning · 196B MoE (11B active) · Open Source

Step-3.5-Flash: The Open-Source Reasoning King

Released on February 1, 2026

Open-source sparse MoE with 3-way Multi-Token Prediction

100-350 tok/s generation speed

Frontier reasoning at low cost

Arcee AI · open source · 400B MoE (13B active)

Arcee AI Unveils Trinity Large: The 400B Open Source MoE Powerhouse

Released on January 27, 2026

400B sparse MoE with only 13B active parameters

Built in the US with open weights

One of the largest open-source foundation models

Apache 2.0 license

Alibaba Cloud · reasoning · Closed

Qwen3-Max-Thinking: Alibaba's New Reasoning Powerhouse

Released on January 27, 2026

Top-tier reasoning model with adaptive tool use

Retrieves information and runs code during inference

Rivals leading frontier models

Moonshot AI · open source · 1T MoE (32B active)

Moonshot AI Unveils Kimi K2: The 1T Parameter Open-Source Giant

Released on January 20, 2026

Massive 1T MoE with 32B active parameters

First open-weight model to rank #1 on LMSYS Chatbot Arena

2M token context window, 200+ language support

$0.15/$2.50 per 1M tokens, Modified MIT license

Sarvam AI · language model · 2B · Open Source

Sarvam-2B: India's Lightweight Sovereign LLM for Edge Deployment

Released on January 15, 2026

India's multilingual LLM — part of sovereign AI initiative

Supports 10+ Indian languages natively

2025

Upstage · open source · 102B MoE (12B active)

Upstage Unveils SOLAR 102B: Korea's Open Frontier Model

Released on December 31, 2025

Korea's answer to open frontier models

102B MoE model with 12B active parameters

Google DeepMind · language model · Closed

Gemini 3 Flash: The Speed Revolution from Google DeepMind

Released on December 17, 2025

Fast frontier-class model rivaling larger models at a fraction of the cost

Default model in the Gemini app

Allen AI · multimodal · 8B · Open Source

Molmo 2: The Open-Source Multimodal Revolution from Allen AI

Released on December 16, 2025

Multimodal model from AI2

Fully open weights, data, and code

Xiaomi · reasoning · 309B MoE · Open Source

Xiaomi MiMo V2 Flash: The Open-Source Reasoning Powerhouse

Released on December 16, 2025

Xiaomi's large reasoning model

309B MoE architecture

Strong on math and code

OpenAI · language model · Closed · Milestone

OpenAI GPT-5.2 Release: A Milestone in AI Reasoning and Multimodal Power

Released on December 11, 2025

Improved reasoning and multimodal capabilities over GPT-5.1

Enhanced mental health protections

128K max output tokens

Available on Plus ($20/month), Pro ($200/month), and API

Expert-level performance on 44 knowledge work tasks

Mistral AI · coding model · 24B · Open Source

Devstral Small 2 Release: Mistral AI's New 24B Coding Powerhouse

Released on December 9, 2025

Successor to Devstral Small 1, derived from Mistral Small 3.1

Portable coding agent

Apache 2.0 license

Mistral AI · coding model · 123B · Open Source

Mistral AI Unveils Devstral 2: The 123B Coding Giant

Released on December 9, 2025

Next-gen coding model with top SWE-Bench score

Modified MIT license (free unless high revenue)

Mistral AI · multimodal · 14B · Open Source

Mistral AI Unveils Ministral 3 14B: The Multimodal Powerhouse

Released on December 2, 2025

Largest Ministral 3 model with vision

Best-in-class text and vision capabilities

Apache 2.0 license

Mistral AI · language model · 8B · Open Source

Ministral 3 8B: The Open-Source Vision Frontier

Released on December 2, 2025

Powerful and efficient model with vision

Best-in-class text and vision at this size

Apache 2.0 license

Mistral AI · language model · 3B · Open Source

Ministral 3 3B: The Edge AI Revolution Arrives

Released on December 2, 2025

Tiny and efficient edge model with vision

Runs on phones, drones, and laptops

Apache 2.0 license

Amazon · language model · Closed

Amazon Nova 2: The Next-Gen LLM for AWS Bedrock

Released on December 2, 2025

Amazon's next-generation foundation model

Available via AWS Bedrock

Announced at re:Invent

Mistral AI · language model · 41B active (MoE) · Open Source

Mistral Large 3: The Open-Weight Frontier Model for 2025

Released on December 2, 2025

Sparse MoE with 41B active parameters

Open weights

Strong reasoning and multilingual capabilities

Zhipu AI · coding model · Open Source

GLM-4.7 Release: Zhipu AI Open-Source Coding Model

Released on December 1, 2025

Open-weights model topping global coding and reasoning leaderboards

Includes GLM-4.7 Flash variant

Cost-effective compared to Western competitors

MiniMax · coding model · 230B MoE (10B active) · Open Source

MiniMax M2.1: The 230B Open Source Coding Model That Disrupts Pricing

Released on December 1, 2025

Fully open-source SOTA coding model

230B params MoE architecture, 10B activated per token

SWE-bench score of 74.0%

92% cheaper than Western alternatives

Anthropic · reasoning · Closed · Milestone

Anthropic Unveils Claude Opus 4.5: The Reasoning Milestone

Released on November 24, 2025

Exceeds Sonnet 4.5 by 4.3% using 48% fewer tokens at max effort

200K token context, 64K max output

Hybrid reasoning with instant or extended thinking

Multimodal: text, image, and audio support

20% accuracy gain, Excel and financial modeling breakthrough

Allen AI · open source · 32B

Allen AI Unveils OLMo 3: The 32B Open-Source Powerhouse

Released on November 20, 2025

Fully open model with weights, data, and training code

From AI2 research lab

Deep Cogito · reasoning · 671B MoE · Open Source

Deep Cogito Releases Cogito v2.1: 671B MoE Reasoning Powerhouse

Released on November 19, 2025

Large 671B MoE reasoning model

Strong on complex reasoning tasks

Google DeepMind · reasoning · Closed

Gemini 3 Deep Think: Google's New Reasoning Powerhouse

Released on November 18, 2025

Reasoning variant of Gemini 3

Deep chain-of-thought for complex scientific problems

Google DeepMind · multimodal · Closed · Milestone

Google DeepMind Unveils Gemini 3 Pro: The Multimodal Leap of 2025

Released on November 18, 2025

Over 50% improvement over Gemini 2.5 Pro

Most powerful Google model — replaces 2.5 series

1M token context window

Advanced multimodal: text, image, video, audio, code

OpenAI · language model · Closed

OpenAI Unveils GPT-5.1: Smarter, Faster, and More Conversational

Released on November 12, 2025

Family of four models with adaptive reasoning

Faster, more conversational, improved coding

Rolled out to all ChatGPT users

Moonshot AI · reasoning · Closed

Moonshot AI Unveils Kimi K2.5: The New Reasoning King

Released on November 6, 2025

Upgraded Kimi model with thinking and reasoning capabilities

Amazon · language model · Closed

Amazon Nova Premier: The 1M Context Multimodal Powerhouse

Released on October 31, 2025

Amazon's most capable model

1M context window

Multimodal capabilities

Teacher for distillation on Bedrock

Yandex · language model · Closed

Yandex Alice AI 1.0: The Global LLM Breakthrough

Released on October 28, 2025

First major Russian-developed large language model on the global stage

From Yandex

MiniMax · open source · 230B MoE

MiniMax M2: 230B Open-Source Model Released 2025

Released on October 23, 2025

Upgraded MiniMax model with improved reasoning and generation

Open weights

Zhipu AI · language model · Open Source

GLM-4.6 Release: Zhipu AI's Domestic Chip Powerhouse

Released on October 9, 2025

First GLM model with native support for China's domestic chips

Cambricon and Moore Threads support

FP8 and Int4 quantization

IBM · open source

IBM Granite 4.0: The Hybrid Mamba-2 Revolution for Enterprise AI

Released on October 2, 2025

IBM's open enterprise model

Hybrid Mamba-2 Transformer architecture

Apache 2.0 license

Anthropic · language model · Closed

Anthropic Unveils Claude Haiku 4.5: The Speed King Arrives

Released on October 1, 2025

Anthropic's fastest model with near-frontier intelligence

200K token context window, 64K max output

21K+ tokens per second for prompts under 32K tokens

Supports reasoning budget and effort control

Most cost-effective in the Claude family: $1/M input

DeepSeek AI · open source · 671B MoE

DeepSeek V3.2: The Open-Source Challenger to GPT-5

Released on September 29, 2025

Further iteration on V3 series

Enhanced capabilities across all benchmarks

Open weights

Anthropic · coding model · Closed

Anthropic Unveils Claude Sonnet 4.5: The New King of Code

Released on September 29, 2025

Anthropic's best model for coding tasks

1M token context window (beta feature)

64K max output tokens

Strong agentic behavior and computer-use skills

Optimized for efficient coding and parallel processing

Alibaba Cloud · open source · 80B MoE (3B active)

Qwen3-Next: The 80B MoE Revolution for Local Deployment

Released on September 10, 2025

Ultra-efficient MoE from Alibaba

80B total, only 3B active parameters

Strong reasoning with minimal compute

Apache 2.0 license

Moonshot AI · open source · 1T MoE (32B active) · Milestone

Moonshot AI Releases Kimi K2: A 1T MoE Open-Source Breakthrough

Released on September 4, 2025

Massive 1T MoE model with open weights

Highly competitive with frontier models

Major Chinese AI milestone

32B activated parameters

Cost-effective: ~$0.15/M input, $2.50/M output

Strong coding performance across 32+ languages

xAI · language model · Closed

xAI Grok 4 Fast Release: 98% Cheaper, Faster AI

Released on September 1, 2025

98% cost reduction compared to Grok 4 Standard

40% increase in token efficiency

Real-time search integration via X

$0.20/M input, $1.50/M output

Mistral AI · reasoning · ~45B · Closed

Magistral Medium 1.2: Vision-Enhanced Reasoning Power Unveiled

Released on September 1, 2025

Adds vision to Magistral Medium

Multimodal frontier reasoning

Closed API only

Mistral AI · reasoning · 24B · Open Source

Mistral AI Unveils Magistral Small 1.2: The Multimodal Reasoning Powerhouse

Released on September 1, 2025

Adds vision to Magistral Small

Multimodal reasoning model

Apache 2.0 license

NousResearch · open source · 405B

Hermes 4 405B: The New Open-Source Reasoning Giant

Released on August 28, 2025

Latest in the Hermes series

Advanced function calling and structured output

Built on Llama 3.1

DeepSeek AI · open source · 671B MoE

DeepSeek V3.1: The Open-Source GPT-5 Rival Released

Released on August 21, 2025

Major upgrade to V3 with improved reasoning and coding

Open weights

Mistral AI · multimodal · Closed · Milestone

Mistral Medium 3.1 Release: Frontier Multimodal AI Breakthrough

Released on August 12, 2025

Frontier-class multimodal model

Competitive with GPT-4o and Claude 3.5

Strong vision and reasoning capabilities

Zhipu AI · multimodal · 106B · Open Source

Zhipu AI Unveils GLM-4.5V: 106B Open Source Multimodal Powerhouse

Released on August 11, 2025

Vision-language model from Z.ai

106B parameters with strong multimodal understanding

OpenAI · language model · Closed · Milestone

OpenAI GPT-5 Release: The 2025 AI Revolution

Released on August 7, 2025

Next-generation flagship with major intelligence leap

400K token context window

Built-in reasoning with 4 effort levels

Multimodal: text, image, and video-based reasoning

Available in Standard, Mini, and Nano variants

OpenAI · open source · 120B · Milestone

OpenAI Releases GPT-OSS: The Historic Open-Weight Shift

Released on August 5, 2025

OpenAI's first open-weight models since GPT-2

20B and 120B variants

Historic open-source move from OpenAI

Anthropic · reasoning · Closed

Anthropic Unveils Claude Opus 4.1: The Reasoning King

Released on August 5, 2025

Upgrade to Claude 4 with improved coding and instruction following

200K token context window

Extended thinking support

Vision and tool calling capabilities

Anthropic · language model · Closed

Anthropic Unveils Claude 4.5 Sonnet: The New Coding Powerhouse

Released on July 29, 2025

Newest Anthropic model with improved creative writing

Enhanced nuance and multi-step reasoning

Zhipu AI · language model · 106B MoE · Open Source

Zhipu AI Unveils GLM-4.5 Air: The Efficient 106B MoE Powerhouse

Released on July 28, 2025

Lightweight variant of GLM-4.5

106B MoE, efficient inference on 8x H20 GPUs

Zhipu AI · open source · 355B MoE

Zhipu AI Unveils GLM-4.5: The 355B MoE Open Source Powerhouse

Released on July 28, 2025

Z.ai's flagship open MoE model

355B total parameters

Strong reasoning, coding, and agentic capabilities

Claimed cheaper to run than DeepSeek

xAI · language model · Closed · Milestone

xAI Grok 4 Release: The 2025 Reasoning Milestone

Released on July 11, 2025

xAI's most powerful model at the time

Major reasoning leap

Trained on expanded Colossus cluster

Google DeepMind · open source · 4B

Google Unveils Gemma 3n: The 4B Edge AI Breakthrough

Released on June 26, 2025

Efficient on-device model designed for mobile

Runs on phones and edge devices

OpenAI · reasoning · Closed

OpenAI Unveils GPT-o3 Pro: The New Reasoning King

Released on June 10, 2025

Most powerful OpenAI reasoning model

Extended thinking for frontier problems

Mistral AI · language model · 24B · Open Source

Mistral Small 3.2: The 24B Open-Source Reasoning King

Released on June 10, 2025

Update to Mistral Small 3.1

Improved instruction following and reasoning

Apache 2.0 license

Xiaohongshu (RedNote) · open source · 142B MoE (14B active)

Xiaohongshu Releases dots.llm1: 142B MoE Open Source Breakthrough

Released on June 6, 2025

Open-source MoE from RedNote (China's Instagram)

142B total, 14B active

Performance on par with frontier models at time of release

Mistral AI · reasoning · 24B · Open Source

Mistral AI Unveils Magistral Small: 24B Reasoning Powerhouse

Released on June 5, 2025

Mistral's reasoning model with extended thinking

Strong STEM performance

Apache 2.0 license

Google DeepMind · multimodal · Closed

Gemini 2.5 Pro (06-05): The New Frontier for Agentic AI

Released on June 5, 2025

Latest 2.5 Pro with enhanced coding, reasoning, and agentic capabilities

MiniMax · language model · Open Source

MiniMax-M1: The Open-Source Hybrid Attention Breakthrough

Released on June 1, 2025

The Chinese AI lab's flagship, with strong long-context performance

Lightning attention architecture

Anthropic · language model · Closed

Anthropic Unveils Claude Sonnet 4: The New Coding Powerhouse

Released on May 22, 2025

High-performance model balancing speed and intelligence

200K context window, 64K max output

Best model for complex agents and coding

Native tool calling and computer use

Available on free tier of Claude.ai

AnthropicreasoningClosedMilestone

Anthropic Unveils Claude Opus 4: The New Reasoning King

Released on May 22, 2025

Most powerful Anthropic model at launch

Parallel tool use, long autonomous tasks

200K token context window

Extended thinking support

Vision capabilities for image understanding

Mistral AIcoding model24BOpen Source

Mistral Devstral: The New 24B Coding Model for AI Engineers

Released on May 21, 2025

Mistral dedicated coding model

Optimized for software engineering and agentic coding tasks

Apache 2.0 license

TIIopen source0.5B–34B

Falcon H1: TII's Hybrid SSM Powerhouse Release

Released on May 20, 2025

Hybrid SSM+attention architecture

Six model sizes from 0.5B to 34B

Punches above its weight class on benchmarks

Apache 2.0 license

Google DeepMindlanguage modelClosed

Gemini 2.5 Flash: Google's Speed King Arrives with Controllable Reasoning

Released on May 20, 2025

Cost-efficient reasoning with controllable thinking depth

#1 Chatbot Arena for speed

Mistral AIlanguage modelClosed

Mistral Medium 3: The GPT-4o Challenger at Lower Cost

Released on May 14, 2025

Frontier-class model, competitive with GPT-4o

Strong multilingual capabilities

Proprietary model available via API and self-hosted enterprise deployments

Alibaba Cloudopen source235B MoE (22B active)

Qwen 3 Release: The 235B MoE Powerhouse for Developers

Released on April 29, 2025

Excellent multilingual performance (Chinese, English, and more)

0.6B to 235B variants with hybrid thinking

119 languages supported

22B active parameters in MoE architecture

Strong coding performance

Apache 2.0 license

Zhipu AImultimodal32BOpen Source

Zhipu GLM-4.1V: Open-Source Multimodal Reasoning Powerhouse

Released on April 25, 2025

Open 32B and 9B multimodal with reasoning

Competitive on vision tasks

OpenAIreasoningClosed

OpenAI o4-mini: The New King of Efficient Reasoning for Developers

Released on April 16, 2025

Efficient reasoning model

Best cost-performance for coding and STEM

OpenAIreasoningClosed

OpenAI o3: The Ultimate Reasoning Model Released for Developers

Released on April 16, 2025

Full o3 reasoning model — successor to o1

Deep chain-of-thought capabilities

OpenAIlanguage modelClosed

OpenAI Unveils GPT-4.1 Series: 1M Context & Coding Power

Released on April 14, 2025

Optimized for coding and instruction following

1M token context window

Available in Standard, Mini, and Nano variants

Nano: $0.10/M input, $0.40/M output

Meta AIopen source400B+ MoEMilestone

Meta Unveils Llama 4: The Open-Source Multimodal Milestone

Released on April 5, 2025

Open-weight natively multimodal models

Scout: 109B, runs on single H100 GPU, 10M token context

Maverick: 400B, requires H100 DGX system

Early fusion for native text, image, and video understanding

Google DeepMindmultimodalClosedMilestone

Google DeepMind Unveils Gemini 2.5 Pro: The Multimodal Reasoning Benchmark

Released on March 25, 2025

#1 on LMArena at launch

Built-in reasoning capabilities

1M token context window

Native code execution and Google Search grounding

Best overall model at launch

NVIDIAreasoning253BOpen Source

NVIDIA Unveils Nemotron Ultra: The 253B Reasoning Powerhouse

Released on March 18, 2025

Open reasoning model derived from Llama 3.1 405B

253B dense architecture, optimized via neural architecture search

Strong on enterprise reasoning tasks

Mistral AIopen source24B

Mistral Small 3.1: The Multimodal Leap in Open-Source AI

Released on March 17, 2025

Adds vision capabilities to Small 3.0

Multimodal, 128K context

Apache 2.0 license

Coherelanguage model111BOpen Source

Command A: Cohere's 111B Open Source Enterprise Model

Released on March 13, 2025

Cohere's 111B flagship model

Enterprise RAG and agentic tasks

Multilingual capabilities

Runs on 2 GPUs

Google DeepMindmultimodal27BOpen Source

Gemma 3 Release: Google DeepMind's Multimodal Open-Weight Revolution

Released on March 12, 2025

1B/4B/12B/27B variants

Multimodal (text+vision)

Single GPU capable, 128K context

Shanghai AI Labopen source8B

InternLM 3 8B Release: Deep Thinking & Apache 2.0

Released on March 5, 2025

8B bilingual (English + Chinese) model with deep thinking mode

Surpasses Llama 3.1 8B and Qwen2.5 7B on reasoning/knowledge tasks

128K context, trained on 4T tokens with 75%+ cost savings

Apache 2.0 license

Alibaba Cloudreasoning32BOpen Source

QwQ-32B: Alibaba Cloud's New Open-Source Reasoning Powerhouse

Released on March 5, 2025

Dedicated reasoning model from Qwen team

Strong mathematical and logical reasoning

Apache 2.0 license

OpenAIlanguage modelClosed

OpenAI GPT-4.5 Release: The EQ-Driven Leap in AI Reasoning

Released on February 27, 2025

Largest OpenAI model at the time

Focus on EQ, creativity, reduced hallucinations

Anthropiccoding modelClosed

Anthropic Unveils Claude 3.7 Sonnet: The New Coding Powerhouse

Released on February 24, 2025

Hybrid reasoning — toggle instant/extended thinking

Best coding model at launch

200K context window, 64K max output

Microsoftopen source3.8B

Microsoft Phi-4-Mini: 3.8B Open Source Model Release

Released on February 18, 2025

3.8B dense model outperforming similarly sized and larger models (Phi-3.5-mini, Llama 3.2 3B)

128K context, 22 languages, function calling and tool use

Trained on 5T tokens (synthetic + filtered public data + code)

MIT license — smallest Phi model with strong reasoning

xAIlanguage modelClosed

xAI Grok 3 Release: 100K GPUs Meet Advanced Reasoning

Released on February 17, 2025

Trained on Colossus supercluster (100K GPUs)

Strong reasoning capabilities

DeepSeek AIreasoning671B MoEOpen SourceMilestone

DeepSeek R1: The Open-Source Reasoning Revolution

Released on January 20, 2025

Open-source reasoning model rivaling o1

Pure reinforcement learning approach

Caused global market shockwaves

671B MoE architecture

Mistral AIlanguage model24BOpen Source

Mistral Small 3.0: The Open-Source Frontier Model Arrives

Released on January 15, 2025

Refreshed Small with state-of-the-art performance

Apache 2.0 license

Allen AIopen source7B / 13B

Allen AI Unveils OLMo 2: The New Standard in Open-Source LLMs

Released on January 6, 2025

Truly open: weights + training data + training code + evaluation all released

7B and 13B sizes — 7B competitive with Llama 3.1 8B, 13B with Gemma 2 9B

Trained on 4T–5T tokens, 9-point MMLU increase over OLMo 1

Apache 2.0 license

2024

DeepSeek AIopen source671B MoEMilestone

DeepSeek V3: The $5.5M Model That Rivals GPT-4o

Released on December 26, 2024

671B MoE trained for $5.5M — matches GPT-4o/Claude 3.5 Sonnet

Revolutionized cost efficiency

Open-source on GitHub and HuggingFace

Strong coding and mathematical reasoning

TIIopen source10B

Falcon 3 10B: The New Open-Source Powerhouse from TII

Released on December 17, 2024

1B/3B/7B/10B sizes

Enhanced multilingual and multimodal

Apache 2.0 license

Microsoftopen source14B

Microsoft Unveils Phi-4: 14B Open-Source Powerhouse

Released on December 12, 2024

14B excelling at STEM reasoning

Outperforms much larger models on math

Google DeepMindmultimodalClosed

Gemini 2.0 Flash: Google's Agentic Leap into Multimodal Speed

Released on December 11, 2024

Google's model for the agentic era with native image and audio generation

Outperforms Gemini 1.5 Pro at twice the speed

Native tool use including Google Search and code execution

Foundation for Project Astra and Project Mariner

Meta AIopen source70B

Meta Unveils Llama 3.3: 70B Model Matches 405B Performance

Released on December 6, 2024

70B matching Llama 3.1 405B performance

Massive efficiency gain

OpenAIreasoningClosed

OpenAI o1-Pro Release: The New Standard for Complex Reasoning

Released on December 5, 2024

Enhanced reasoning with more compute for complex tasks

Available in ChatGPT Pro tier

Amazonlanguage modelClosed

Amazon Nova 2024: The New Enterprise LLM Standard

Released on December 3, 2024

Foundation model family: Micro/Lite/Pro/Premier

Multimodal, optimized for AWS Bedrock

Alibaba Cloudcoding model0.5B–32BOpen Source

Qwen2.5-Coder: Open Source Coding LLM Rivals GPT-4o

Released on November 22, 2024

Code-specialized model in 6 sizes: 0.5B, 1.5B, 3B, 7B, 14B, 32B

32B variant matches GPT-4o coding ability — state-of-the-art open code LLM

Trained on 5.5T tokens (source code + text-code grounding + synthetic)

300+ programming languages, 128K context with YaRN extension

Apache 2.0 license

Mistral AImultimodal124BOpen Source

Mistral Pixtral Large: The 124B Multimodal Open-Source Frontier

Released on November 17, 2024

Mistral's large multimodal model

128K context, native image understanding at scale

Open weights

Tencentopen source389B MoE (52B active)

Tencent Unveils Hunyuan-Large: 389B MoE Model Challenges Llama 3.1

Released on November 5, 2024

Largest open-source Transformer-based MoE model at release

389B total parameters with 52B active per token

256K context window

Outperforms Llama 3.1 405B on benchmarks

Anthropiclanguage modelClosed

Anthropic Releases Claude Haiku 3.5: Fast & Efficient

Released on October 22, 2024

Fast and cost-effective model

200K token context window, 8K max output

Multilingual and vision capabilities

$0.80/M input, $4/M output

Ideal for high-volume tasks like chatbots and moderation

01.AIlanguage modelClosed

01.AI Yi-Lightning Release: Top-Tier Proprietary Model Analysis

Released on October 16, 2024

Ranked #6 on LMSYS Chatbot Arena at launch — #1 in China

Surpassed GPT-4o-0513 and Claude 3.5 Sonnet in overall ranking

Top-3 in Chinese, Math, Coding, and Hard Prompts categories

Founded by Kai-Fu Lee, proprietary model

Meta AImultimodal90BOpen Source

Meta Unveils Llama 3.2: Multimodal Leap for Developers

Released on September 25, 2024

First Llama models with vision capabilities — 11B and 90B multimodal variants

Lightweight 1B and 3B edge models for on-device deployment

128K context window, competitive with Claude 3 Haiku and GPT-4o-mini

Drop-in replacements for Llama 3.1 text models

Alibaba Cloudopen source72B

Qwen2.5 Release: The 72B Open-Source Coding Powerhouse

Released on September 19, 2024

0.5B to 72B range

SOTA open model for coding and math

18T training tokens

Apache 2.0 license

Mistral AIopen source22B

Mistral Small 2409: The 22B Open-Source Powerhouse Released September 2024

Released on September 18, 2024

Updated Mistral Small with improved instruction following

22B parameters, Apache 2.0 license

Mistral AImultimodal12BOpen Source

Mistral Pixtral 12B: The Open-Source Multimodal Breakthrough

Released on September 17, 2024

Built on Mistral NeMo 12B with native vision support

128K context, Apache 2.0 license

OpenAIreasoningClosedMilestone

OpenAI o1-Preview: The Reasoning Model Revolution

Released on September 12, 2024

First 'reasoning' model with chain-of-thought at inference

PhD-level science and math performance

DeepSeek AIopen source236B MoE (21B active)

DeepSeek V2.5 Release: The 236B MoE Powerhouse for Developers

Released on September 5, 2024

Merged DeepSeek-V2-Chat and DeepSeek-Coder-V2 into a single model

236B MoE with 21B active parameters, 128K context

Strong coding and general capabilities in one model

MIT license, available on HuggingFace

AI21 Labsopen source398B MoE (94B active)

Jamba 1.5: The New Hybrid MoE Standard for Long Context

Released on August 22, 2024

Mamba-Transformer hybrid MoE

94B active, 256K context

Fastest long-context model at release

Microsoftopen source3.8B / 42B MoE

Phi-3.5 Release: Microsoft's Compact Model Family for Edge AI

Released on August 20, 2024

3.8B mini and 42B MoE (6.6B active) variants optimized for edge devices

Phone-capable AI with 128K context window

Improved multilingual support over Phi-3

Strong reasoning for its size class

xAIlanguage modelClosed

xAI Grok-2 Release: Competing with GPT-4o and Claude 3.5

Released on August 13, 2024

Competitive with GPT-4o and Claude 3.5 Sonnet

Available on X platform

Naverlanguage model104BClosed

HyperCLOVA X: Naver's 104B Korean LLM Review

Released on August 7, 2024

Korean web giant Naver's flagship LLM optimized for Korean language and culture

Two sizes: HCX-L (largest) and HCX-S (lighter), built on LLaMA 2 architecture

100K context window with Korean-optimized tokenizer

Strong cross-lingual reasoning in Asian languages — Korean, Japanese, Chinese

Black Forest Labsimage generation12BOpen Source

FLUX.1 by Black Forest Labs: The Open Source Image King

Released on August 1, 2024

State-of-the-art text-to-image model from ex-Stability AI founders

12B rectified flow transformer architecture

FLUX.1 [schnell] open under Apache 2.0, [dev] non-commercial

Surpassed closed-source alternatives in image quality

Mistral AIlanguage model123BOpen Source

Mistral Large 2: The 123B Open-Weight Frontier Model

Released on July 24, 2024

128K context, competitive with GPT-4o and Llama 3.1 405B

12 languages supported

Open weights

Meta AIopen source405BMilestone

Meta Llama 3.1: The 405B Open-Source Benchmark

Released on July 23, 2024

Largest open model — 405B parameters

Matches GPT-4 on many benchmarks

128K context window

Mistral AI & NVIDIAopen source12B

Mistral NeMo 12B: The New Standard for Efficient Open-Source AI

Released on July 18, 2024

Co-built with NVIDIA, runs on a single GPU

12B parameters with 128K context window

Drop-in replacement for Mistral 7B with SOTA performance in its class

Apache 2.0 license, strong multilingual support

Shanghai AI Labopen source20B

InternLM 2.5 Release: Open-Source Reasoning Powerhouse from Shanghai AI Lab

Released on July 3, 2024

Strong reasoning from China's national lab

Competitive on math and coding

Google DeepMindopen source27B

Gemma 2 Release: Google's New Open-Source AI Model

Released on June 27, 2024

9B and 27B sizes

Outperforms models 2x its size

Knowledge distillation from Gemini

Anthropiclanguage modelClosedMilestone

Anthropic Unveils Claude 3.5 Sonnet: The Coding Powerhouse

Released on June 20, 2024

Surpassed GPT-4o and Gemini 1.5 Pro at launch

2x faster than Claude 3 Opus at lower cost

DeepSeek AIcoding model236B MoEOpen Source

DeepSeek Coder V2: The Open-Source GPT-4 Turbo Rival

Released on June 17, 2024

First open MoE code model matching GPT-4 Turbo on coding

338 programming languages supported

NVIDIAopen source340B

NVIDIA Unveils Nemotron-4 340B: The Open-Source Powerhouse for Synthetic Data

Released on June 14, 2024

NVIDIA's open model for synthetic data generation

Permissive enterprise license

Alibaba Cloudopen source72B

Qwen2 Release: The 72B Open-Source Challenger to Llama 3

Released on June 7, 2024

Major upgrade, 0.5B to 72B range

Competitive with Llama 3 70B

Apache 2.0 license

Zhipu AIopen source9B

GLM-4 by Zhipu AI: 9B Parameter Open-Source Powerhouse

Released on June 5, 2024

128K context, 26 languages

Competitive with Llama 3 8B

Open-source GLM-4 series

Mistral AIcoding model22BOpen Source

Codestral by Mistral AI: The Open-Source Code Model Revolution

Released on May 29, 2024

Specialized code model, 80+ languages

32K context, fill-in-the-middle support

ByteDancelanguage modelClosed

Doubao: ByteDance's Flagship LLM Family

Released on May 15, 2024

ByteDance's flagship LLM, powering one of the most popular AI assistants in China

Available via the Doubao app and the Volcano Engine API

Supports 50+ application scenarios including voice, vision, and coding

OpenAImultimodalClosedMilestone

OpenAI GPT-4o: The Multimodal AI Milestone

Released on May 13, 2024

'Omni' model with native audio/vision/text

2x faster, 50% cheaper than GPT-4 Turbo

Real-time voice conversation capabilities

DeepSeek AIopen source236B MoE (21B active)

DeepSeek V2 Release: 236B MoE Open Source Power Unleashed

Released on May 7, 2024

236B MoE with only 21B active parameters

Multi-head Latent Attention for efficiency

Open weights

Snowflakeopen source480B MoE (17B active)

Snowflake Arctic: The Enterprise-Grade Open-Source LLM for SQL & Code

Released on April 24, 2024

480B MoE with 17B active parameters

Enterprise-focused, strong on SQL and coding

Apache 2.0 license

Microsoftopen source14B

Phi-3 Release: Microsoft's 14B Open-Source Powerhouse

Released on April 23, 2024

Mini/Small/Medium variants

Phi-3 Mini (3.8B) rivals Mixtral 8x7B

Phone-capable AI

Meta AIopen source70BMilestone

Meta Unveils Llama 3: The New Open-Source Standard

Released on April 18, 2024

Trained on 15T tokens, 8B and 70B sizes

New open-source SOTA with massive community adoption

Mistral AIopen source141B MoE (39B active)

Mixtral 8x22B: Mistral AI's 141B Open-Source Mixture of Experts Model Delivers Enterprise-Level Performance

Released on April 17, 2024

Sparse MoE (141B total, 39B active) with strong multilingual and code performance

Open weights

Coherelanguage model104BOpen Source

Cohere's Command R+ 104B: Enterprise RAG Powerhouse with 128K Context

Released on April 4, 2024

Optimized for RAG and enterprise

128K context, 10 languages

Grounded generation capabilities

AI21 Labsopen source52B

Jamba 52B: AI21's Revolutionary Open-Source Mamba-Transformer Hybrid Model

Released on March 28, 2024

First production Mamba-Transformer hybrid

256K context, novel SSM architecture

Databricksopen source132B MoE (36B active)

DBRX 132B MoE: Databricks' Open-Source AI Challenger Surpasses Llama 2 70B

Released on March 27, 2024

Open MoE with 36B active parameters

Outperformed Llama 2 70B and Mixtral

Apache 2.0 license

xAIopen source314B MoE

Grok-1 Released: xAI's 314B Parameter Open-Source Model Breaks New Ground

Released on March 17, 2024

xAI's first open-source model

314B MoE under Apache 2.0

Largest open MoE at time of release

Anthropiclanguage modelClosedMilestone

Claude 3 by Anthropic: The Game-Changing Language Model That Rivals GPT-4

Released on March 4, 2024

Haiku/Sonnet/Opus family

Opus matched GPT-4 on most benchmarks

200K context window, vision capabilities

Mistral AIlanguage modelClosed

Mistral Large: Mistral AI's Flagship Commercial Model Breaks New Ground

Released on February 26, 2024

Mistral's first flagship commercial model

32K context, top-tier reasoning

Google DeepMindopen source7B

Google DeepMind's Gemma: The Open-Source AI Revolution Starts with 7B Parameters

Released on February 21, 2024

Google's open-source model from Gemini research

2B and 7B sizes, strong for its class

Google DeepMindmultimodalClosedMilestone

Gemini 1.5 Pro: Google DeepMind's Revolutionary 1M Token Multimodal AI Breakthrough

Released on February 15, 2024

1 million token context window — 10x previous record

MoE architecture, processes entire codebases

Google DeepMindmultimodalClosed

Gemini 1.0 Ultra: Google's Most Capable Multimodal AI Model

Released on February 8, 2024

Most capable Gemini 1.0 model

Beat GPT-4 on 30/32 benchmarks

Powers Gemini Advanced

Stability AIopen source1.6B / 12B

StableLM 2: Stability AI's New Open-Source LLMs Challenge Industry Giants

Released on February 6, 2024

Open language model in two sizes: 1.6B and 12B

Trained on 2T tokens (Falcon RefinedWeb, RedPajama, The Pile, CulturaX)

Competitive with Mistral-7B despite smaller footprint

Stability AI Community License

BigCode / ServiceNowcoding model3B / 7B / 15BOpen Source

StarCoder 2: Revolutionary Open-Source Code Generation Models with 3B, 7B, and 15B Parameters

Released on February 6, 2024

Open code LLM in 3 sizes: 3B, 7B, 15B — trained on 4T+ tokens from The Stack v2

600+ programming languages, fill-in-the-middle capability

16K context with sliding window attention

Trained on permissively licensed code only

2023

Upstageopen source10.7B

SOLAR 10.7B: Upstage's Revolutionary Open-Source Model Dominates HuggingFace Leaderboards

Released on December 13, 2023

Korean startup Upstage's open model using depth up-scaling

Topped HuggingFace Open LLM Leaderboard at release

Apache 2.0 license

Mistral AIopen source46.7B MoE (12.9B active)Milestone

Mixtral 8x7B: The Open-Source Mixture of Experts Revolution That Matches GPT-3.5

Released on December 11, 2023

Open-source MoE matching GPT-3.5 quality with only 12.9B active params

Game-changer for open-source efficiency

Apache 2.0 license
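
The entry above hinges on MoE sparsity: only 2 of Mixtral's 8 experts fire per token, which is why 46.7B total parameters cost roughly 12.9B parameters' worth of compute. A minimal sketch of top-2 routing, with toy dimensions and hypothetical parameter splits (`router_w`, `experts`, `e`, `s` are all illustrative, not Mixtral's real values):

```python
import numpy as np

# Toy sketch of Mixtral-style top-2 routing (illustrative, not the real
# implementation): a router scores 8 expert FFNs per token, and only the
# two highest-scoring experts actually run, so most weights stay idle.
rng = np.random.default_rng(0)

def top2_route(token, router_w, experts):
    logits = router_w @ token                    # one score per expert
    top2 = np.argsort(logits)[-2:]               # pick the best two experts
    gates = np.exp(logits[top2])
    gates /= gates.sum()                         # softmax over the chosen two
    # weighted sum of only the two selected experts' outputs
    return sum(g * (experts[i] @ token) for g, i in zip(gates, top2))

d, n_experts = 16, 8
router_w = rng.normal(size=(n_experts, d))
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]
out = top2_route(rng.normal(size=d), router_w, experts)
print(out.shape)  # (16,)

# The "46.7B total / 12.9B active" accounting: 8 expert FFNs exist per
# layer, but only 2 run per token (plus shared attention/embedding weights).
e, s = 5.63, 1.63        # approx per-expert and shared params, in billions
print(round(8 * e + s, 1), round(2 * e + s, 1))  # ~46.7 total, ~12.9 active
```

The same accounting explains every "NB total (MB active)" tag in this timeline: compute per token scales with the active parameters, memory with the total.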

Google DeepMindmultimodalClosedMilestone

Gemini 1.0: Google DeepMind's Revolutionary Multimodal AI Model

Released on December 6, 2023

Google's multimodal model family (Nano/Pro/Ultra)

Natively multimodal from training

NousResearchopen source34B

Nous Hermes 2: The Open-Source LLM That's Revolutionizing Local AI Deployment

Released on November 13, 2023

Community fine-tuned model on Mistral/Yi

Strong at instruction following

Popular for local AI

01.AIopen source34B

Yi 34B: The Bilingual Open-Source LLM That's Outperforming Llama 2 70B

Released on November 2, 2023

Founded by Kai-Fu Lee

Strong bilingual (English/Chinese) model

Competitive with Llama 2 70B

Zhipu AIopen source6B

ChatGLM3-6B: Zhipu AI's Third-Generation Open-Source Model with Advanced Agent Capabilities

Released on October 27, 2023

Third gen GLM with function calling, code interpreter, and agent capabilities

HuggingFaceopen source7B

Zephyr 7B: HuggingFace's Game-Changing Open-Source Model Built on Mistral

Released on October 25, 2023

Mistral 7B fine-tuned with DPO

Showed distilled alignment can match RLHF quality
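
Zephyr's recipe replaced the RLHF reward-model loop with DPO, which directly optimizes the policy's preference margin against a frozen reference model. A minimal sketch of the DPO loss on illustrative log-probabilities (all numbers hypothetical):

```python
import math

# Sketch of the DPO (Direct Preference Optimization) objective: raise the
# policy's log-probability margin on the preferred answer over the rejected
# one, measured relative to a frozen reference model, no reward model or RL.
def dpo_loss(logp_chosen, logp_rejected, ref_chosen, ref_rejected, beta=0.1):
    margin = (logp_chosen - ref_chosen) - (logp_rejected - ref_rejected)
    return -math.log(1 / (1 + math.exp(-beta * margin)))  # -log(sigmoid)

# Policy already prefers the chosen answer more than the reference does:
low = dpo_loss(-10.0, -14.0, -12.0, -13.0)
# Policy prefers the rejected answer instead: the loss is higher.
high = dpo_loss(-14.0, -10.0, -12.0, -13.0)
print(low < high)  # True
```

The `beta` parameter controls how far the policy may drift from the reference; Zephyr's authors reported this distilled setup matching RLHF-tuned chat quality at far lower training cost.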

Mistral AIopen source7BMilestone

Mistral 7B: The Open-Source AI Model That Redefined Performance Expectations

Released on September 27, 2023

Outperformed Llama 2 13B on all benchmarks and Llama 1 34B on many, despite its size

Sliding window attention

Apache 2.0 license
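
Sliding window attention, which Mistral 7B helped popularize, restricts each token to attending over the previous W positions instead of the full causal history, trading O(n²) attention cost for O(n·W). A toy mask sketch (Mistral uses W=4096; a tiny window is shown here for readability):

```python
import numpy as np

# Sliding-window attention mask: position i may attend to position j only
# if j is causal (j <= i) AND within the last `window` positions. Stacked
# layers still propagate information beyond the window, hop by hop.
def sliding_window_mask(n, window):
    i = np.arange(n)[:, None]          # query positions (rows)
    j = np.arange(n)[None, :]          # key positions (columns)
    return (j <= i) & (j > i - window)

mask = sliding_window_mask(6, window=3)
print(mask.astype(int))
```

Each row has at most `window` ones, so attention memory and compute grow linearly in sequence length rather than quadratically.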

Alibaba Cloudopen source72B

Qwen 72B: Alibaba's Open-Source Giant Challenges AI Leaders with Multilingual Powerhouse

Released on September 25, 2023

Alibaba's multilingual model series

Strong on Chinese and English tasks

Open weights

WizardLM Teamcoding model34BOpen Source

WizardCoder 34B: Revolutionary Open-Source Coding Model Surpasses GPT-3.5 Performance

Released on August 26, 2023

Evol-Instruct tuned Code Llama

Top open-source coding model of its era

Strong on HumanEval

Meta AIcoding model34BOpen Source

Code Llama 34B: Meta's Specialized Coding Model Revolutionizes AI-Assisted Development

Released on August 24, 2023

Specialized Llama 2 for code generation

Supports Python, C++, Java, and more

100K context window

Meta AIopen source70BMilestone

Llama 2: How Meta's Open-Source Milestone Revolutionized AI Development

Released on July 18, 2023

First truly open-weight large model for commercial use

7B/13B/70B sizes with RLHF-tuned chat variants

Kick-started the modern open LLM ecosystem

Anthropiclanguage modelClosed

Claude 2 Review: Anthropic's Breakthrough Language Model with Constitutional AI

Released on July 11, 2023

100K context window

Constitutional AI approach

Strong coding and analysis capabilities

Zhipu AIopen source6B

ChatGLM2: Zhipu AI's 6B Parameter Powerhouse Delivers 42% Faster Inference

Released on June 25, 2023

Second generation GLM, 32K context

42% faster inference

Stronger math and coding

TIIopen source180B

Falcon 180B: The New Open-Source Giant That's Redefining AI Performance

Released on September 6, 2023

Trained on 3.5T tokens of RefinedWeb

Topped the Open LLM Leaderboard

TII Falcon 180B license permitting royalty-free commercial use

Googlelanguage model340BClosed

Google PaLM 2: The 340B Parameter Language Model Revolutionizing AI

Released on May 10, 2023

Google's next-gen model powering Bard/Gemini

Improved multilingual, reasoning, and coding

MosaicMLopen source7B

MPT-7B: The Open-Source Transformer Revolution from MosaicML

Released on May 5, 2023

Commercially usable open-source model

Trained on 1T tokens

Apache 2.0 license

BigCode / HuggingFacecoding model15.5BOpen Source

StarCoder 15.5B: The Open-Source Code Generation Revolution by BigCode

Released on May 4, 2023

Open-source code LLM trained on The Stack (1T tokens, 80+ languages)

8K context window

Stability AIopen source7B

StableLM 7B: Stability AI's Open-Source Language Model Revolution

Released on April 19, 2023

Stability AI's open-source LLM family

3B and 7B sizes, trained on 1.5T tokens

CC-BY-SA license

LMSYSopen source13B

Vicuna 13B: The Open-Source Chatbot That Achieves 90% ChatGPT Quality

Released on March 30, 2023

Fine-tuned LLaMA on ShareGPT conversations

Achieved ~90% of ChatGPT quality

Launched the Chatbot Arena

Anthropiclanguage modelClosed

Claude 1 Review: Anthropic's First Public Language Model Breaks New Ground in AI Safety

Released on March 14, 2023

Anthropic's first public model

Constitutional AI for safety

9K context window at launch (expanded to 100K in May 2023)

OpenAImultimodal~1.8T (MoE)ClosedMilestone

GPT-4: OpenAI's Revolutionary Multimodal AI That Changed Everything

Released on March 14, 2023

Multimodal (text + vision), passed the bar exam (90th percentile)

Massive leap in reasoning over GPT-3.5

~1.8T parameters (MoE estimated)

Stanfordopen source7B

Stanford's Alpaca 7B: How a $600 Fine-Tune Achieved GPT-3.5-Level Performance

Released on March 13, 2023

Fine-tuned LLaMA on 52K instructions generated by GPT-3.5

Showed cheap instruction tuning works

Meta AIopen source65BMilestone

LLaMA 1: The Revolutionary Open-Source Model That Changed AI Forever

Released on February 24, 2023

Leaked weights ignited the open-source LLM revolution

Showed small models can match GPT-3

65B parameters

2022

OpenAIlanguage model175BClosedMilestone

ChatGPT: The Revolutionary Language Model That Ignited the AI Era

Released on November 30, 2022

GPT-3.5 with RLHF in a chat interface

Reached 100M users in 2 months

Defined the AI era

Googlelanguage model11BOpen Source

Flan-T5: Google's Instruction-Tuned T5 Model Revolutionizes Few-Shot Learning

Released on October 20, 2022

Instruction-tuned T5

Demonstrated instruction tuning dramatically improves task generalization

BigScienceopen source176BMilestone

BLOOM: The 176B Parameter Revolution That Democratized AI in 2022

Released on July 6, 2022

First 100B+ open-source multilingual model

Built by 1000+ researchers across 70+ countries

46 languages supported

Meta AIopen source175B

Meta's OPT Model: The Open-Source Alternative to GPT-3 That Changed AI Research

Released on May 3, 2022

Meta's open-source GPT-3 equivalent

Full model weights released for research

175B parameters

EleutherAIopen source20B

GPT-NeoX: EleutherAI's 20B Breakthrough That Changed Open-Source LLMs Forever

Released on April 14, 2022

EleutherAI's 20B open model

First glimpse that local LLMs could scale to GPT-3 territory

Predecessor to today's open-source ecosystem

Googlelanguage model540BClosed

Google's PaLM: The 540B Parameter Language Model That Changed Everything

Released on April 4, 2022

540B parameter model

Breakthrough capabilities in reasoning, code, and multilingual tasks

Google DeepMindlanguage model70BClosedMilestone

Chinchilla: How Google DeepMind Revolutionized LLM Scaling Laws in 2022

Released on March 29, 2022

Proved smaller models trained on more data outperform larger undertrained ones

Redefined scaling laws for LLMs
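
Chinchilla's result reduces to a budgeting rule: with training compute roughly C ≈ 6·N·D FLOPs, the compute-optimal ratio is about 20 training tokens per parameter. So Chinchilla (70B params, 1.4T tokens) spends a similar budget to Gopher (280B params, 300B tokens) while seeing far more data, and outperforms it. A back-of-envelope sketch:

```python
# Chinchilla accounting in two one-liners: the paper's headline rule of
# thumb (~20 tokens per parameter) and the standard dense-transformer
# training-compute estimate C ~= 6 * N * D FLOPs.
def compute_optimal_tokens(n_params, tokens_per_param=20):
    return n_params * tokens_per_param

def training_flops(n_params, n_tokens):
    return 6 * n_params * n_tokens

chinchilla = training_flops(70e9, 1.4e12)   # 70B params, 1.4T tokens
gopher     = training_flops(280e9, 300e9)   # 280B params, 300B tokens
print(f"{chinchilla:.2e} vs {gopher:.2e}")  # same order of magnitude
print(compute_optimal_tokens(70e9) / 1e12)  # 1.4 (trillion tokens)
```

The later entries in this timeline reflect the shift: Llama 3's 8B model trained on 15T tokens goes far past the 20-tokens-per-parameter optimum, trading extra training compute for a better small model at inference time.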

OpenAIlanguage model175BClosedMilestone

InstructGPT: The Revolutionary Language Model That Changed AI Alignment Forever

Released on January 27, 2022

Introduced RLHF for alignment

Pioneered training models to follow human instructions safely

2021

Google DeepMindlanguage model280BClosed

Gopher: Google DeepMind's 280B Parameter Breakthrough That Changed NLP Forever

Released on December 8, 2021

280B parameter model

Extensive analysis of scaling laws across 152 tasks

OpenAIcoding model12BClosedMilestone

OpenAI Codex: The Revolutionary Coding Model That Changed Everything

Released on August 10, 2021

GPT-3 fine-tuned on code

Powered GitHub Copilot

Proved LLMs could write functional programs

EleutherAIopen source6B

GPT-J: The Game-Changing 6B Parameter Open-Source Model That Democratized Large Language Models

Released on June 9, 2021

First open model runnable on consumer hardware

6B params, GPT-2 architecture

Widely deployed in early local AI applications

Googlelanguage model1571BOpen Source

Google's Switch Transformer: The 1.6 Trillion Parameter Breakthrough That Changed AI Scaling Forever

Released on January 11, 2021

1.6 trillion parameter MoE model

Demonstrated efficient scaling through sparse expert routing

2020

Googlelanguage model600B MoEClosed

GShard: Google's Revolutionary 600B Parameter Mixture of Experts Language Model

Released on June 30, 2020

First Mixture of Experts model at massive scale

600B parameters for machine translation

OpenAIlanguage model175BClosedMilestone

GPT-3: The 175-Billion Parameter Revolution That Changed AI Forever

Released on May 28, 2020

175B parameters — demonstrated few-shot learning without fine-tuning

Sparked the modern LLM revolution

2019

Googlelanguage model11BOpen SourceMilestone

T5: How Google's Text-to-Text Transformer Revolutionized NLP Architecture

Released on October 23, 2019

Text-to-Text Transfer Transformer

Unified framework treating all NLP tasks as text generation

Meta AIlanguage model355MOpen Source

RoBERTa: Meta AI's Revolutionary BERT Optimization Breakthrough

Released on July 26, 2019

Robustly Optimized BERT

Showed BERT was significantly undertrained

Achieved new SOTA with better training

Google / CMUlanguage model340MOpen Source

XLNet: The Revolutionary Autoregressive Language Model That Surpassed BERT

Released on June 19, 2019

Generalized autoregressive pretraining

Outperformed BERT on 20 NLP tasks

OpenAIlanguage model1.5BOpen SourceMilestone

GPT-2: The Language Model That Changed Everything in AI History

Released on February 14, 2019

Initially withheld due to misuse concerns — "Too dangerous to release"

Showed emergent text generation quality at scale

2018

Googlelanguage model340MOpen SourceMilestone

BERT: The Revolutionary Language Model That Changed NLP Forever

Released on October 11, 2018

Bidirectional Encoder Representations from Transformers

Revolutionized NLP benchmarks

Became the foundation for search engines

OpenAIlanguage model117MOpen Source

GPT-1: The Revolutionary Foundation That Started the Modern LLM Era

Released on June 11, 2018

First GPT model — decoder-only transformer

Demonstrated generative pre-training for language understanding

Allen AIlanguage model94MOpen Source

ELMo: The Groundbreaking Contextual Language Model That Changed NLP Forever

Released on February 15, 2018

Embeddings from Language Models

Contextualized word representations using bidirectional LSTMs

2017

Googlelanguage modelOpen SourceMilestone

Transformer: The Revolutionary Architecture That Changed AI Forever

Released on June 12, 2017

'Attention Is All You Need' paper introduces the Transformer architecture

The foundation of all modern LLMs
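
The mechanism the paper introduced, scaled dot-product attention, can be sketched in a few lines: queries score against keys, the scores are softmax-normalized, and the result mixes the values. A minimal NumPy illustration (toy shapes, single head, no masking):

```python
import numpy as np

# softmax(Q K^T / sqrt(d_k)) V -- the core operation of the Transformer.
# Scaling by sqrt(d_k) keeps dot products from saturating the softmax.
def attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # query-key similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                               # mix values by weight

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(5, 8)) for _ in range(3))
out = attention(Q, K, V)
print(out.shape)  # (5, 8)
```

Every model in this timeline, from BERT to the frontier systems of 2026, builds on this operation, varying mainly in how it is masked, sparsified, or scaled.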