AI Model Releases Timeline
A chronological timeline of major AI model releases
2026
Mistral Medium 3.5: The 128B Open-Source Flagship for 2026
Released on April 29, 2026
New flagship model merging instruction-following, reasoning, and coding into a single 128B dense architecture
Released as open weights under a modified MIT license
Runs self-hosted on as few as four GPUs
API pricing at $1.50/mtok input and $7.50/mtok output
Powers the new Mistral Vibe remote agents for async cloud coding sessions
Drives Work mode in Le Chat for multi-step agentic task execution with parallel tool calling
Sessions can be spawned from CLI or Le Chat, and local CLI sessions can be teleported to the cloud
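As a rough illustration of the API pricing above, here is a minimal sketch of calling the model through the official mistralai Python SDK; the "mistral-medium-latest" alias is an assumption, and the release may expose a versioned model id instead.

```python
# Minimal sketch, assuming Mistral Medium 3.5 is reachable through the standard
# Mistral chat completions API. The model alias below is hypothetical.
from mistralai import Mistral

client = Mistral(api_key="YOUR_MISTRAL_API_KEY")

resp = client.chat.complete(
    model="mistral-medium-latest",  # assumed alias for Medium 3.5
    messages=[{"role": "user", "content": "Draft a plan for a multi-step agentic task."}],
)
print(resp.choices[0].message.content)
```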
NVIDIA Nemotron 3 Nano Omni: Open Multimodal AI Release
Released on April 28, 2026
Multimodal model unifying video, audio, image, and text understanding in a single architecture
Hybrid Mixture-of-Experts (MoE) 30B-A3B architecture with 30B total and 3B active parameters
Up to 9x higher throughput compared to similar open omnimodal models
256K unified context window with single-pass perception
Hybrid architecture combining Mamba layers for memory efficiency and transformers for precise reasoning
Integrates vision encoders (C3D for video) and audio encoders (Paraquet), eliminating the need for separate models
Supports FP8/NVFP4 quantization with optimized inference on NVIDIA Ampere, Hopper, and Blackwell GPUs
Designed for enterprise multimodal agents: document intelligence (OCR, tables), GUI navigation, audio-video reasoning
Runs locally with 25-36GB RAM in 4/8-bit quantization via Unsloth or vLLM
Available on Hugging Face, Ollama, OpenRouter, and NVIDIA NIM
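Since vLLM is listed among the supported local runtimes, a text-only local serving sketch could look like the following; the Hugging Face repo id is hypothetical, and the model card should be consulted for the real id, quantization settings, and multimodal input format.

```python
# Minimal sketch of running Nemotron 3 Nano Omni locally with vLLM.
# The repo id below is an assumption, not a confirmed identifier.
from vllm import LLM, SamplingParams

llm = LLM(
    model="nvidia/Nemotron-3-Nano-Omni",  # hypothetical repo id
    max_model_len=32768,                  # well under the advertised 256K window
    trust_remote_code=True,
)
params = SamplingParams(max_tokens=256, temperature=0.2)
outputs = llm.generate(["Extract the table from this OCR text: ..."], params)
print(outputs[0].outputs[0].text)
```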
Poolside Laguna-M.1: The 225B Coding Giant Arrives in 2026
Released on April 28, 2026
225B total parameter Mixture-of-Experts model with 23B activated parameters per token
poolside's most capable model to date; pre-training completed at the end of 2025
Trained from scratch on 30T tokens using Muon optimizer
Trained on 6,144 interconnected NVIDIA Hopper GPUs entirely in-house
Achieves 72.5% on SWE-bench Verified, 67.3% on SWE-bench Multilingual, 46.9% on SWE-bench Pro, 40.7% on Terminal-Bench 2.0
128K context window with up to 8K output tokens
Agentic coding model built for long-horizon software engineering tasks
Foundation for the entire Laguna model family
Uses custom async on-policy RL system with Agent Client Protocol (ACP) server
Free to use for a limited time via poolside API and OpenRouter
Weights available on request for startups, institutions, and universities
Laguna-XS.2: Poolside's Open-Source Coding Model Release
Released on April 28, 2026
33B total parameter Mixture-of-Experts model with 3B activated parameters per token
First open-weight release from poolside, licensed under Apache 2.0
Trained on 30T tokens using Muon optimizer
Supports native reasoning with interleaved thinking between tool calls
Uses Sliding Window Attention with per-head gating in 30 of 40 layers
KV cache quantized to FP8 for reduced memory per token
Compact enough to run locally on a Mac with 36 GB RAM
128K context window with up to 8K output tokens
Achieves 68.2% on SWE-bench Verified, 62.4% on SWE-bench Multilingual, 44.5% on SWE-bench Pro, 30.1% on Terminal-Bench 2.0
Supports vLLM, Transformers, TRT-LLM, and Ollama
Agentic coding model built for long-horizon software engineering tasks
Free to use for a limited time via poolside API and OpenRouter
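Given that Transformers is listed as a supported runtime, a local generation sketch might look like this; the repository id is hypothetical, since the exact open-weight repo name is not stated above.

```python
# Minimal sketch of loading Laguna-XS.2 with Hugging Face Transformers.
# "poolside/laguna-xs-2" is an assumed repo id for illustration only.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "poolside/laguna-xs-2"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

prompt = "Write a unit test for a function that parses RFC 3339 timestamps."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```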
DeepSeek-V4 Release: Open Source & Pricing
Released on April 24, 2026
Two models: DeepSeek-V4-Pro (1.6T total / 49B active params) and DeepSeek-V4-Flash (284B total / 13B active params)
1M token context length, up to 384K output tokens
Supports thinking mode (default) and non-thinking mode
Ultra-aggressive pricing: Flash at $0.14/M input tokens (cache miss), $0.028/M (cache hit), $0.28/M output (roughly 7x cheaper than Claude Opus 4.7)
Pro at $1.74/M input tokens (cache miss), $0.145/M (cache hit), $3.48/M output
Open-source models with weights available on HuggingFace
Compatible with the OpenAI and Anthropic API formats (https://api.deepseek.com and https://api.deepseek.com/anthropic)
Supports JSON output, Tool Calls, Chat Prefix Completion (Beta), and FIM Completion (Beta)
Performance rivaling the world's best closed-source models
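Because the entry advertises OpenAI-compatible endpoints and JSON output support, a call could plausibly look like the sketch below; the "deepseek-chat" alias is an assumption carried over from earlier DeepSeek releases, not a confirmed V4 identifier.

```python
# Minimal sketch: calling DeepSeek-V4 through the OpenAI-compatible endpoint
# listed above. The model alias is assumed; check the DeepSeek docs for V4 ids.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",
    base_url="https://api.deepseek.com",
)

response = client.chat.completions.create(
    model="deepseek-chat",  # assumed alias, may differ for V4
    messages=[
        {"role": "user", "content": "Return JSON with a single key 'summary' describing DeepSeek-V4 in one sentence."},
    ],
    response_format={"type": "json_object"},  # JSON output mode mentioned above
)
print(response.choices[0].message.content)
```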
OpenAI Launches GPT-5.5: The Frontier of Agentic Intelligence
Released on April 23, 2026
GPT-5.5 is OpenAI's smartest and most intuitive-to-use model yet, described as the next step toward a new way of getting work done on a computer
Achieves 82.7% on Terminal-Bench 2.0, 73.1% on Expert-SWE (Internal), and 84.9% on GDPval — all state-of-the-art scores
Matches GPT-5.4 per-token latency while performing at a much higher level of intelligence
Significantly more token efficient — uses fewer tokens to complete the same tasks compared to GPT-5.4
Scores 78.7% on OSWorld-Verified for real computer environment operation and 81.8% on CyberGym
GPT-5.5 Pro achieves 90.1% on BrowseComp and 52.4% on FrontierMath Tier 1-3
On SWE-Bench Pro, reaches 58.6%, solving more tasks end-to-end in a single pass than previous models
Proactively deployed with industry-leading cybersecurity safeguards, classified as High under OpenAI's Preparedness Framework
Helped discover a new proof about Ramsey numbers in combinatorics, later verified in Lean
Scores 25.0% on GeneBench for multi-stage scientific data analysis in genetics
API pricing: $5/1M input tokens and $30/1M output tokens with 1M context window
GPT-5.5 Pro API pricing: $30/1M input tokens and $180/1M output tokens
Co-designed, trained with, and served on NVIDIA GB200 and GB300 NVL72 systems
Rolling out to Plus, Pro, Business, and Enterprise users in ChatGPT and Codex
GPT-5.5 Thinking unlocks faster help for harder problems with smarter, more concise answers
Outperforms Claude Opus 4.7 and Gemini 3.1 Pro on most coding and professional benchmarks
More than 85% of OpenAI employees now use Codex every week across all company functions
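A minimal sketch of what an API call at the pricing above might look like, assuming GPT-5.5 is served through the standard Responses API under a "gpt-5.5" model id (the id itself is an assumption):

```python
# Minimal sketch of calling GPT-5.5 via the Responses API.
# The model identifier is assumed, not taken from official documentation.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.responses.create(
    model="gpt-5.5",            # assumed model identifier
    input="Refactor this function to remove the nested loops: ...",
    max_output_tokens=2048,     # pricing above is per 1M input/output tokens
)
print(response.output_text)
```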
Xiaomi MiMo-V2.5-Pro: The 1T+ Parameter Agent Revolution
Released on April 22, 2026
Multimodal Mixture-of-Experts (MoE) architecture with 1T+ total parameters (42B active)
Extended context window up to 1M tokens
Native multimodal perception supporting text, images, video, and audio
Advanced autonomous agent capabilities handling 1000+ tool calls
40-60% better token efficiency compared to Claude Opus and GPT-5.x
ClawEval benchmark: 64% Pass@3 score
SWE-bench Pro: 57.2% task resolution rate
Surpasses Claude 4.6 Sonnet in coding tasks, approaches Claude Opus in agentic performance
Part of the MiMo-V2.5 family alongside MiMo-V2.5 and MiMo-V2.5-TTS
Available via mimo.mi.com with affordable token plans (monthly/annual subscriptions)
Qwen3.6-27B Release: 27B Dense Model Beats 397B on Coding
Released on April 22, 2026
27B dense open-source model with Apache 2.0 license
Surpasses Qwen3.5-397B-A17B on all major agentic coding benchmarks
SWE-bench Verified: 77.2 vs 76.2, Terminal-Bench 2.0: 59.3 vs 52.5, SkillsBench: 48.2 vs 30.0
Supports both multimodal thinking and non-thinking modes natively
Native vision-language support for images and video understanding
GPQA Diamond: 87.8, competitive with models several times its size
Compatible with OpenClaw, Claude Code, and Qwen Code coding assistants
Available on Hugging Face, ModelScope, and Alibaba Cloud Model Studio API
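To illustrate the thinking/non-thinking switch, here is a sketch following the enable_thinking chat-template flag documented for earlier Qwen3 releases; both the repo id and the presence of this flag in Qwen3.6 are assumptions.

```python
# Minimal sketch of toggling thinking mode on Qwen3.6-27B with Transformers.
# Repo id and the enable_thinking switch are assumed from prior Qwen3 releases.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3.6-27B"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

messages = [{"role": "user", "content": "Fix the off-by-one bug in this loop: ..."}]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,  # set False for the non-thinking mode mentioned above
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=1024)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```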
Moonshot AI Unveils Kimi K2.6: Open-Source SOTA for Autonomous Agents
Released on April 20, 2026
Open-source SOTA on HLE w/ tools (54.0), SWE-Bench Pro (58.6), SWE-bench Multilingual (76.7), BrowseComp (83.2), Toolathlon (50.0), Charxiv w/ python (86.7), Math Vision w/ python (93.2)
Long-horizon coding: 4,000+ tool calls, over 12 hours continuous execution
Generalization across languages (Rust, Go, Python) and tasks (frontend, devops, perf optimization)
300 parallel sub-agents x 4,000 steps per run (up from K2.5: 100 / 1,500)
Proactive Agents: powers OpenClaw, Hermes Agent for 24/7 autonomous ops
Claw Groups research preview: bring your own agents, command friends' bots, and keep humans in the loop
API pricing for kimi-k2.6: $0.16/M input tokens (cache hit), $0.95/M input tokens (cache miss), $4.00/M output tokens, 262,144-token context window
Sources: https://platform.moonshot.ai, https://kimi.com/blog/kimi-k2-6, https://huggingface.co/moonshotai/Kimi-K2.6
Live on kimi.com in chat and agent mode, plus Kimi Code at https://kimi.com/code for production-grade coding
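Assuming the platform keeps its OpenAI-compatible interface, a call to kimi-k2.6 might look like the sketch below; the base URL is an assumption, so check https://platform.moonshot.ai for the authoritative endpoint.

```python
# Minimal sketch of calling kimi-k2.6 through Moonshot's OpenAI-compatible API.
# Base URL and exact model id are assumptions based on the entry above.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_MOONSHOT_API_KEY",
    base_url="https://api.moonshot.ai/v1",  # assumed OpenAI-compatible endpoint
)

response = client.chat.completions.create(
    model="kimi-k2.6",  # model name as listed in the pricing above
    messages=[{"role": "user", "content": "Plan a multi-step refactor of a Rust crate."}],
    max_tokens=4096,    # output billed at $4.00/M tokens
)
print(response.choices[0].message.content)
```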
Anthropic Unveils Claude Opus 4.7: The New Reasoning Standard
Released on April 16, 2026
Most capable generally available Anthropic model for complex reasoning and agentic coding
High-resolution image support: 2576px / 3.75MP (up from 1568px / 1.15MP) with 1:1 pixel mapping
New "xhigh" effort level for coding and agentic use cases
Task budgets (beta) — advisory token budget across full agentic loops
128K max output tokens, 1M context window at standard pricing
+12 points on CursorBench coding benchmarks vs Opus 4.6
New tokenizer (up to ~35% more tokens per text, improved performance)
Adaptive thinking only — extended thinking budgets removed
Sampling parameters (temperature, top_p, top_k) removed
Pricing: $5/$25 per MTok input/output, batch $2.50/$12.50 per MTok
GLM-5.1: The Open-Source Reasoning King Arrives
Released on April 7, 2026
#1 on SWE-Bench Pro (58.4%), beating GPT-5.4 and Claude Opus 4.6
Post-training upgrade to GLM-5 — same 744B MoE architecture (40B active)
Trained entirely on Huawei Ascend chips — no NVIDIA hardware
MIT license, compatible with Claude Code and OpenClaw
202K context window, strong on cybersecurity (CyberGym 68.7%)
Anthropic Unveils Claude Opus 4.6 Fast: Speed Meets Smarts
Released on April 7, 2026
Faster variant of Claude Opus 4.6 with comparable intelligence
Anthropic Unveils Claude Mythos Preview: The Capybara Tier Breakthrough
Released on April 7, 2026
New Capybara tier above Opus — the most powerful Anthropic model
93.9% on SWE-bench Verified, 77.8% on SWE-bench Pro
97.6% on USAMO 2026, 94.5% on GPQA Diamond
1M context window, limited preview for ~50 partner organizations
Google Unveils Gemma 4: The Apache 2.0 Frontier
Released on April 2, 2026
Google's most capable open models, built from Gemini 3 research
Four sizes: E2B, E4B, 26B MoE (3.8B active), 31B Dense
First Gemma release under Apache 2.0 license
Native multimodal, 140+ languages, up to 256K context
Agent-ready with function calling and structured JSON output
GLM-5V Turbo Review: Zhipu's Multimodal Coding Powerhouse
Released on April 1, 2026
Vision + Code model from Z.ai
Multimodal coding capabilities
API only
Qwen 3.6 Plus Review: Speed, Code, and 1M Context
Released on March 31, 2026
1M token context window with always-on chain-of-thought reasoning
78.8% on SWE-bench Verified — competitive with Claude Opus 4.6
2-3x faster output speed than Claude Opus 4.6
Free preview via OpenRouter, successor to Qwen 3.5
Mistral Unleashes Voxtral TTS: The Open-Weight Voice AI That Challenges ElevenLabs
Released on March 23, 2026
Mistral's first audio model — direct competitor to ElevenLabs
Zero-shot voice cloning with multilingual support
Real-time streaming capabilities
Open weights under CC BY-NC 4.0 (non-commercial)
Xiaomi MiMo-V2-Pro: The 309B MoE Reasoning Powerhouse
Released on March 18, 2026
Xiaomi reasoning model with strong math and code performance
309B MoE architecture
MiniMax M2.7: The Self-Evolving Coding Model Revolution
Released on March 18, 2026
Self-evolving agent model — first to participate in its own development
56.22% on SWE-Pro, matching GPT-5.3-Codex
57.0% on Terminal Bench 2, GDPval-AA ELO 1495 (highest open-source)
230B MoE (10B active), 200K context, open weights on HuggingFace
Agent Teams for native multi-agent collaboration
OpenAI Releases GPT-5.4 Mini: Efficient AI for Developers
Released on March 17, 2026
Efficient variant of GPT-5.4 with native computer use
Lower cost while maintaining strong reasoning capabilities
Leanstral by Mistral AI: The Open-Source Proof Agent Revolutionizing Code Verification
Released on March 16, 2026
First open-source code agent for Lean 4 formal proof engineering
Generates code AND machine-checkable mathematical proofs
119B MoE with 6.5B active, outperforms Claude Sonnet 4.6 on FLTEval
Apache 2.0 license, 15x cheaper than Claude Opus for formal verification
Mistral Small 4: The Unified Frontier Model Released
Released on March 16, 2026
Unifies instruct, reasoning, coding, and multimodal in a single model
119B MoE with 6.5B active parameters, 256K context window
Replaces Magistral (reasoning), Pixtral (vision), and Devstral (coding)
Apache 2.0 license, configurable reasoning parameter
xAI Grok 4.20: The Parallel Agent Revolution
Released on March 12, 2026
Beta release with parallel agents architecture
500K context window
Iterative improvement via user feedback
NVIDIA Nemotron 3 Super: The Open MoE Powerhouse for Enterprise Agents
Released on March 11, 2026
Open MoE model from NVIDIA
120B total parameters with 12B active
Strong enterprise performance
OpenAI GPT-5.4 Series: The 1M Token Frontier Model
Released on March 6, 2026
Latest OpenAI flagship with 1M token context window
Available in Standard, Mini, and Nano variants
Supports reasoning effort with 4 effort levels
128K max output tokens
Prompt caching with $0.02-$0.25/M cached read
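A minimal sketch of selecting one of the reasoning effort levels via the Responses API; the "gpt-5.4" model id and the specific level names are assumptions.

```python
# Minimal sketch of setting reasoning effort on GPT-5.4 via the Responses API.
# The model id is assumed; "high" stands in for one of the four effort levels.
from openai import OpenAI

client = OpenAI()
response = client.responses.create(
    model="gpt-5.4",                # assumed model identifier
    input="Prove that the sum of two even integers is even.",
    reasoning={"effort": "high"},   # one of the four effort levels noted above
)
print(response.output_text)
```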
Gemini 3.1 Flash Lite Preview: The Speed-Cost King for Developers
Released on March 3, 2026
Google's high-efficiency model optimized for high-volume use cases
1M token context window, 65.5K max output
Supports prompt caching, reasoning effort, and reasoning budget
Native tool calling and vision capabilities
Gemini 3.1 Pro Release: Google's 2026 AI Flagship
Released on February 19, 2026
Google's latest flagship model
More than doubles reasoning performance over Gemini 3 Pro
Released in preview via Gemini API, AI Studio, and Vertex AI
xAI Grok 4.2: The 2026 AI Revolution for Developers
Released on February 17, 2026
Beta release with rapid learning architecture — improves weekly via user feedback
256K context window
4-agent parallel reasoning
Medical document analysis added
Anthropic Unveils Claude Sonnet 4.6: The Ultimate Developer Model
Released on February 17, 2026
Most capable Sonnet yet, with upgrades across coding, computer use, and long-context reasoning
1M token context window in beta
200K token context window, 64K max output
Supports prompt caching, reasoning effort, and reasoning budget
Native tool calling and vision capabilities
Qwen 3.5 Release: 397B MoE Agentic Powerhouse Review
Released on February 14, 2026
Agentic AI model with built-in tools for web search and code execution
1M token context window
Qwen3.5-Plus hosted; open weights planned
MiniMax M2.5: The Open-Source Frontier Coding Model for 2026
Released on February 12, 2026
Frontier MoE model with 80.2% on SWE-Bench Verified
Strong coding and agentic capabilities
230B total parameters, 10B activated per token
DeepSeek V3.2: The 671B MoE Open Source King
Released on February 12, 2026
Major update to the V3 series with 1M token context
671B MoE focused on code generation and reasoning improvements
Open weights on HuggingFace, MIT license
GLM-5: Zhipu AI's Open-Source Reasoning Frontier
Released on February 11, 2026
Frontier model from China's first publicly listed AI company
Targets complex systems engineering and long-horizon agentic tasks
MiniCPM-o 4.5: 9B Multimodal AI Model Release
Released on February 8, 2026
On-device multimodal LLM with full-duplex real-time audio, image, video
Built on Qwen3-8B architecture
Gemini 2.5 Flash-level performance at only 9B parameters
OpenAI Unveils GPT-5.3-Codex: The Ultimate Agentic Developer Assistant
Released on February 5, 2026
Most capable agentic coding model from OpenAI
Available via Codex app, CLI, IDE extensions
Optimized for software engineering workflows
Claude Opus 4.6: The Agentic Reasoning Breakthrough
Released on February 5, 2026
Huge leap for agentic planning with parallel subtask execution
Tool and subagent orchestration capabilities
Terminal-Bench record holder
1M token context window, 32K max output
State-of-the-art agentic AI behaviors
Step-3.5-Flash: The Open-Source Reasoning King
Released on February 1, 2026
Open-source sparse MoE with 3-way Multi-Token Prediction
100-350 tok/s generation speed
Frontier reasoning at low cost
Arcee AI Unveils Trinity Large: The 400B Open Source MoE Powerhouse
Released on January 27, 2026
400B sparse MoE with only 13B active parameters
Built in the US with open weights
One of the largest open-source foundation models
Apache 2.0 license
Qwen3-Max-Thinking: Alibaba's New Reasoning Powerhouse
Released on January 27, 2026
Top-tier reasoning model with adaptive tool use
Retrieves information and runs code during inference
Rivals leading frontier models
Moonshot AI Unveils Kimi K2: The 1T Parameter Open-Source Giant
Released on January 20, 2026
Massive 1T MoE with 32B active parameters
First open-weight model to rank #1 on LMSYS Chatbot Arena
2M token context window, 200+ language support
$0.15/$2.50 per 1M tokens, Modified MIT license
Sarvam-2B: India's Lightweight Sovereign LLM for Edge Deployment
Released on January 15, 2026
India's multilingual LLM — part of sovereign AI initiative
Supports 10+ Indian languages natively
2025
Upstage Unveils SOLAR 102B: Korea's Open Frontier Model
Released on December 31, 2025
Korea's answer to open frontier models
102B MoE model with 12B active parameters
Gemini 3 Flash: The Speed Revolution from Google DeepMind
Released on December 17, 2025
Fast frontier-class model rivaling larger models at a fraction of the cost
Default model in the Gemini app
Molmo 2: The Open-Source Multimodal Revolution from Allen AI
Released on December 16, 2025
Multimodal model from AI2
Fully open weights, data, and code
Xiaomi MiMo V2 Flash: The Open-Source Reasoning Powerhouse
Released on December 16, 2025
Xiaomi's large reasoning model
309B MoE architecture
Strong on math and code
OpenAI GPT-5.2 Release: A Milestone in AI Reasoning and Multimodal Power
Released on December 11, 2025
Improved reasoning and multimodal capabilities over GPT-5.1
Enhanced mental health protections
128K max output tokens
Available on Plus ($20/month), Pro ($200/month), and API
Expert-level performance on 44 knowledge work tasks
Devstral Small 2 Release: Mistral AI's New 24B Coding Powerhouse
Released on December 9, 2025
Successor to Devstral Small 1, derived from Mistral Small 3.1
Portable coding agent
Apache 2.0 license
Mistral AI Unveils Devstral 2: The 123B Coding Giant
Released on December 9, 2025
Next-gen coding model with top SWE-Bench score
Modified MIT license (free unless high revenue)
Mistral AI Unveils Ministral 3 14B: The Multimodal Powerhouse
Released on December 2, 2025
Largest Ministral 3 model with vision
Best-in-class text and vision capabilities
Apache 2.0 license
Ministral 3 8B: The Open-Source Vision Frontier
Released on December 2, 2025
Powerful and efficient model with vision
Best-in-class text and vision at this size
Apache 2.0 license
Ministral 3 3B: The Edge AI Revolution Arrives
Released on December 2, 2025
Tiny and efficient edge model with vision
Runs on phones, drones, and laptops
Apache 2.0 license
Amazon Nova 2: The Next-Gen LLM for AWS Bedrock
Released on December 2, 2025
Amazon's next-gen foundation model
Available via AWS Bedrock
Announced at re:Invent
Mistral Large 3: The Open-Weight Frontier Model for 2025
Released on December 2, 2025
Sparse MoE with 41B active parameters
Open weights
Strong reasoning and multilingual capabilities
GLM-4.7 Release: Zhipu AI Open-Source Coding Model
Released on December 1, 2025
Open-weights model topping global coding and reasoning leaderboards
Includes GLM-4.7 Flash variant
Cost-effective compared to Western competitors
MiniMax M2.1: The 230B Open Source Coding Model That Disrupts Pricing
Released on December 1, 2025
Fully open-source SOTA coding model
230B params MoE architecture, 10B activated per token
SWE-bench score of 74.0%
92% cheaper than Western alternatives
Anthropic Unveils Claude Opus 4.5: The Reasoning Milestone
Released on November 24, 2025
Exceeds Sonnet 4.5 by 4.3% using 48% fewer tokens at max effort
200K token context, 64K max output
Hybrid reasoning with instant or extended thinking
Multimodal: text, image, and audio support
20% accuracy gain, Excel and financial modeling breakthrough
Allen AI Unveils OLMo 3: The 32B Open-Source Powerhouse
Released on November 20, 2025
Fully open model with weights, data, and training code
From AI2 research lab
Deep Cogito Releases Cogito v2.1: 671B MoE Reasoning Powerhouse
Released on November 19, 2025
Large 671B MoE reasoning model
Strong on complex reasoning tasks
Gemini 3 Deep Think: Google's New Reasoning Powerhouse
Released on November 18, 2025
Reasoning variant of Gemini 3
Deep chain-of-thought for complex scientific problems
Google DeepMind Unveils Gemini 3 Pro: The Multimodal Leap of 2025
Released on November 18, 2025
Over 50% improvement over Gemini 2.5 Pro
Most powerful Google model — replaces 2.5 series
1M token context window
Advanced multimodal: text, image, video, audio, code
OpenAI Unveils GPT-5.1: Smarter, Faster, and More Conversational
Released on November 12, 2025
Family of four models with adaptive reasoning
Faster, more conversational, improved coding
Rolled out to all ChatGPT users
Moonshot AI Unveils Kimi K2.5: The New Reasoning King
Released on November 6, 2025
Upgraded Kimi model with thinking and reasoning capabilities
Amazon Nova Premier: The 1M Context Multimodal Powerhouse
Released on October 31, 2025
Most capable Amazon model
1M context window
Multimodal capabilities
Teacher for distillation on Bedrock
Yandex Alice AI 1.0: The Global LLM Breakthrough
Released on October 28, 2025
First major Russian-developed large language model on the global stage
From Yandex
MiniMax M2: 230B Open-Source Model Released 2025
Released on October 23, 2025
Upgraded MiniMax model with improved reasoning and generation
Open weights
GLM-4.6 Release: Zhipu AI's Domestic Chip Powerhouse
Released on October 9, 2025
First GLM model with native support for China's domestic chips
Cambricon and Moore Threads support
FP8 and Int4 quantization
IBM Granite 4.0: The Hybrid Mamba-2 Revolution for Enterprise AI
Released on October 2, 2025
IBM open enterprise model
Hybrid Mamba-2 Transformer architecture
Apache 2.0 license
Anthropic Unveils Claude Haiku 4.5: The Speed King Arrives
Released on October 1, 2025
Anthropic's fastest model with near-frontier intelligence
200K token context window, 64K max output
21K+ tokens per second for prompts under 32K tokens
Supports reasoning budget and effort control
Most cost-effective in the Claude family: $1/M input
DeepSeek V3.2: The Open-Source Challenger to GPT-5
Released on September 29, 2025
Further iteration on V3 series
Enhanced capabilities across all benchmarks
Open weights
Anthropic Unveils Claude Sonnet 4.5: The New King of Code
Released on September 29, 2025
Anthropic's best model for coding tasks
1M token context window (beta feature)
64K max output tokens
Strong agentic behavior and computer-use skills
Optimized for efficient coding and parallel processing
Qwen3-Next: The 80B MoE Revolution for Local Deployment
Released on September 10, 2025
Ultra-efficient MoE from Alibaba
80B total, only 3B active parameters
Strong reasoning with minimal compute
Apache 2.0 license
Moonshot AI Releases Kimi K2: A 1T MoE Open-Source Breakthrough
Released on September 4, 2025
Massive 1T MoE model with open weights
Highly competitive with frontier models
Major Chinese AI milestone
32B activated parameters
Cost-effective: ~$0.15/M input, $2.50/M output
Strong coding performance across 32+ languages
xAI Grok 4 Fast Release: 98% Cheaper, Faster AI
Released on September 1, 2025
98% cost reduction compared to Grok 4 Standard
40% increase in token efficiency
Real-time search integration via X
$0.20/M input, $1.50/M output
Magistral Medium 1.2: Vision-Enhanced Reasoning Power Unveiled
Released on September 1, 2025
Adds vision to Magistral Medium
Multimodal frontier reasoning
Closed API only
Mistral AI Unveils Magistral Small 1.2: The Multimodal Reasoning Powerhouse
Released on September 1, 2025
Adds vision to Magistral Small
Multimodal reasoning model
Apache 2.0 license
Hermes 4 405B: The New Open-Source Reasoning Giant
Released on August 28, 2025
Latest in the Hermes series
Advanced function calling and structured output
Built on Llama 3.1
DeepSeek V3.1: The Open-Source GPT-5 Rival Released
Released on August 21, 2025
Major upgrade to V3 with improved reasoning and coding
Open weights
Mistral Medium 3.1 Release: Frontier Multimodal AI Breakthrough
Released on August 12, 2025
Frontier-class multimodal model
Competitive with GPT-4o and Claude 3.5
Strong vision and reasoning capabilities
Zhipu AI Unveils GLM-4.5V: 106B Open Source Multimodal Powerhouse
Released on August 11, 2025
Vision-language model from Z.ai
106B parameters with strong multimodal understanding
OpenAI GPT-5 Release: The 2025 AI Revolution
Released on August 7, 2025
Next-generation flagship with major intelligence leap
400K token context window
Built-in reasoning with 4 effort levels
Multimodal: text, image, and video-based reasoning
Available in Standard, Mini, and Nano variants
OpenAI Releases GPT-OSS: The Historic Open-Weight Shift
Released on August 5, 2025
OpenAI's first open-weight models since GPT-2
20B and 120B variants
Historic open-source move from OpenAI
Anthropic Unveils Claude Opus 4.1: The Reasoning King
Released on August 5, 2025
Upgrade to Claude 4 with improved coding and instruction following
200K token context window
Extended thinking support
Vision and tool calling capabilities
Anthropic Unveils Claude 4.5 Sonnet: The New Coding Powerhouse
Released on July 29, 2025
Newest Anthropic model with improved creative writing
Enhanced nuance and multi-step reasoning
Zhipu AI Unveils GLM-4.5 Air: The Efficient 106B MoE Powerhouse
Released on July 28, 2025
Lightweight variant of GLM-4.5
106B MoE, efficient inference on 8x H20 GPUs
Zhipu AI Unveils GLM-4.5: The 355B MoE Open Source Powerhouse
Released on July 28, 2025
Z.ai flagship open MoE model
355B total parameters
Strong reasoning, coding, and agentic capabilities
Claimed cheaper to run than DeepSeek
xAI Grok 4 Release: The 2025 Reasoning Milestone
Released on July 11, 2025
xAI's most powerful model at the time
Major reasoning leap
Trained on expanded Colossus cluster
Google Unveils Gemma 3n: The 4B Edge AI Breakthrough
Released on June 26, 2025
Efficient on-device model designed for mobile
Runs on phones and edge devices
OpenAI Unveils GPT-o3 Pro: The New Reasoning King
Released on June 10, 2025
Most powerful OpenAI reasoning model
Extended thinking for frontier problems
Mistral Small 3.2: The 24B Open-Source Reasoning King
Released on June 10, 2025
Update to Mistral Small 3.1
Improved instruction following and reasoning
Apache 2.0 license
Xiaohongshu Releases dots.llm1: 142B MoE Open Source Breakthrough
Released on June 6, 2025
Open-source MoE from RedNote (China's Instagram)
142B total, 14B active
Performance on par with frontier models at time of release
Mistral AI Unveils Magistral Small: 24B Reasoning Powerhouse
Released on June 5, 2025
Mistral's reasoning model with extended thinking
Strong STEM performance
Apache 2.0 license
Gemini 2.5 Pro (06-05): The New Frontier for Agentic AI
Released on June 5, 2025
Latest 2.5 Pro with enhanced coding, reasoning, and agentic capabilities
MiniMax-M1: The Open-Source Hybrid Attention Breakthrough
Released on June 1, 2025
Flagship from the Chinese AI lab MiniMax with strong long-context performance
Lightning attention architecture
Anthropic Unveils Claude Sonnet 4: The New Coding Powerhouse
Released on May 22, 2025
High-performance model balancing speed and intelligence
200K context window, 64K max output
Best model for complex agents and coding
Native tool calling and computer use
Available on free tier of Claude.ai
Anthropic Unveils Claude Opus 4: The New Reasoning King
Released on May 22, 2025
Most powerful Anthropic model at launch
Parallel tool use, long autonomous tasks
200K token context window
Extended thinking support
Vision capabilities for image understanding
Mistral Devstral: The New 24B Coding Model for AI Engineers
Released on May 21, 2025
Mistral's dedicated coding model
Optimized for software engineering and agentic coding tasks
Apache 2.0 license
Falcon H1: TII's Hybrid SSM Powerhouse Release
Released on May 20, 2025
Hybrid SSM+attention architecture
Six model sizes from 0.5B to 34B
Punches above its weight class on benchmarks
Apache 2.0 license
Gemini 2.5 Flash: Google's Speed King Arrives with Controllable Reasoning
Released on May 20, 2025
Cost-efficient reasoning with controllable thinking depth
#1 Chatbot Arena for speed
Mistral Medium 3: The Open-Source GPT-4o Challenger
Released on May 14, 2025
Frontier-class model, competitive with GPT-4o
Strong multilingual capabilities
Apache 2.0 license
Qwen 3 Release: The 235B MoE Powerhouse for Developers
Released on April 29, 2025
Excellent multilingual performance (Chinese, English, and more)
0.6B to 235B variants with hybrid thinking
119 languages supported
22B active parameters in MoE architecture
Strong coding performance
Apache 2.0 license
Zhipu GLM-4.1V: Open-Source Multimodal Reasoning Powerhouse
Released on April 25, 2025
Open 32B and 9B multimodal with reasoning
Competitive on vision tasks
OpenAI o4-mini: The New King of Efficient Reasoning for Developers
Released on April 16, 2025
Efficient reasoning model
Best cost-performance for coding and STEM
OpenAI o3: The Ultimate Reasoning Model Released for Developers
Released on April 16, 2025
Full o3 reasoning model — successor to o1
Deep chain-of-thought capabilities
OpenAI Unveils GPT-4.1 Series: 1M Context & Coding Power
Released on April 14, 2025
Optimized for coding and instruction following
1M token context window
Available in Standard, Mini, and Nano variants
Nano: $0.10/M input, $0.40/M output
Meta Unveils Llama 4: The Open-Source Multimodal Milestone
Released on April 5, 2025
Open-weight natively multimodal models
Scout: 109B, runs on single H100 GPU, 10M token context
Maverick: 400B, requires H100 DGX system
Early fusion for native text, image, and video understanding
Google DeepMind Unveils Gemini 2.5 Pro: The Multimodal Reasoning Benchmark
Released on March 25, 2025
#1 on LMArena at launch
Built-in reasoning capabilities
1M token context window
Native code execution and Google Search grounding
Best overall model at launch
NVIDIA Unveils Nemotron Ultra: The 253B MoE Reasoning Powerhouse
Released on March 18, 2025
Open reasoning model based on Llama
253B MoE architecture
Strong enterprise tasks
Mistral Small 3.1: The Multimodal Leap in Open-Source AI
Released on March 17, 2025
Adds vision capabilities to Small 3.0
Multimodal, 128K context
Apache 2.0 license
Command A: Cohere's 111B Open Source Enterprise Model
Released on March 13, 2025
Cohere's 111B flagship model
Enterprise RAG and agentic tasks
Multilingual capabilities
Runs on 2 GPUs
Gemma 3 Release: Google DeepMind's Multimodal Open-Weight Revolution
Released on March 12, 2025
1B/4B/12B/27B variants
Multimodal (text+vision)
Single GPU capable, 128K context
InternLM 3 8B Release: Deep Thinking & Apache 2.0
Released on March 5, 2025
8B bilingual (English + Chinese) model with deep thinking mode
Surpasses Llama 3.1 8B and Qwen2.5 7B on reasoning/knowledge tasks
128K context, trained on 4T tokens with 75%+ cost savings
Apache 2.0 license
QwQ-32B: Alibaba Cloud's New Open-Source Reasoning Powerhouse
Released on March 5, 2025
Dedicated reasoning model from Qwen team
Strong mathematical and logical reasoning
Apache 2.0 license
OpenAI GPT-4.5 Release: The EQ-Driven Leap in AI Reasoning
Released on February 27, 2025
Largest OpenAI model at the time
Focus on EQ, creativity, reduced hallucinations
Anthropic Unveils Claude 3.7 Sonnet: The New Coding Powerhouse
Released on February 24, 2025
Hybrid reasoning — toggle instant/extended thinking
Best coding model at launch
200K context window, 64K max output
Microsoft Phi-4-Mini: 3.8B Open Source Model Release
Released on February 18, 2025
3.8B dense model that outperforms models twice its size, as well as peers like Phi-3.5-mini and Llama 3.2 3B
128K context, 22 languages, function calling and tool use
Trained on 5T tokens (synthetic + filtered public data + code)
MIT license — smallest Phi model with strong reasoning
xAI Grok 3 Release: 100K GPUs Meet Advanced Reasoning
Released on February 17, 2025
Trained on Colossus supercluster (100K GPUs)
Strong reasoning capabilities
DeepSeek R1: The Open-Source Reasoning Revolution
Released on January 20, 2025
Open-source reasoning model rivaling o1
Pure reinforcement learning approach
Caused global market shockwaves
671B MoE architecture
Mistral Small 3.0: The Open-Source Frontier Model Arrives
Released on January 15, 2025
Refreshed Small with state-of-the-art performance
Apache 2.0 license
Allen AI Unveils OLMo 2: The New Standard in Open-Source LLMs
Released on January 6, 2025
Truly open: weights + training data + training code + evaluation all released
7B and 13B sizes — 7B competitive with Llama 3.1 8B, 13B with Gemma 2 9B
Trained on 4T–5T tokens, 9-point MMLU increase over OLMo 1
Apache 2.0 license
2024
DeepSeek V3: The $5.5M Model That Rivals GPT-4o
Released on December 26, 2024
671B MoE trained for $5.5M — matches GPT-4o/Claude 3.5 Sonnet
Revolutionized cost efficiency
Open-source on GitHub and HuggingFace
Strong coding and mathematical reasoning
Falcon 3 10B: The New Open-Source Powerhouse from TII
Released on December 17, 2024
1B/3B/7B/10B sizes
Enhanced multilingual and multimodal
Apache 2.0 license
Microsoft Unveils Phi-4: 14B Open-Source Powerhouse
Released on December 12, 2024
14B excelling at STEM reasoning
Outperforms much larger models on math
Gemini 2.0 Flash: Google's Agentic Leap into Multimodal Speed
Released on December 11, 2024
Google's model for the agentic era with native image and audio generation
Outperforms Gemini 1.5 Pro at twice the speed
Native tool use including Google Search and code execution
Foundation for Project Astra and Project Mariner
Meta Unveils Llama 3.3: 70B Model Matches 405B Performance
Released on December 6, 2024
70B matching Llama 3.1 405B performance
Massive efficiency gain
OpenAI o1-Pro Release: The New Standard for Complex Reasoning
Released on December 5, 2024
Enhanced reasoning with more compute for complex tasks
Available in ChatGPT Pro tier
Amazon Nova 2024: The New Enterprise LLM Standard
Released on December 3, 2024
Foundation model family: Micro/Lite/Pro/Premier
Multimodal, optimized for AWS Bedrock
Qwen2.5-Coder: Open Source Coding LLM Rivals GPT-4o
Released on November 22, 2024
Code-specialized model in 6 sizes: 0.5B, 1.5B, 3B, 7B, 14B, 32B
32B variant matches GPT-4o coding ability — state-of-the-art open code LLM
Trained on 5.5T tokens (source code + text-code grounding + synthetic)
300+ programming languages, 128K context with YaRN extension
Apache 2.0 license
Mistral Pixtral Large: The 124B Multimodal Open-Source Frontier
Released on November 17, 2024
Mistral's large multimodal model
128K context, native image understanding at scale
Open weights
Tencent Unveils Hunyuan-Large: 389B MoE Model Challenges Llama 3.1
Released on November 5, 2024
Largest open-source Transformer-based MoE model at release
389B total parameters with 52B active per token
256K context window
Outperforms Llama 3.1 405B on benchmarks
Anthropic Releases Claude Haiku 3.5: Fast & Efficient
Released on October 22, 2024
Fast and cost-effective model
200K token context window, 8K max output
Multilingual and vision capabilities
$0.80/M input, $4/M output
Ideal for high-volume tasks like chatbots and moderation
01.AI Yi-Lightning Release: Top-Tier Proprietary Model Analysis
Released on October 16, 2024
Ranked #6 on LMSYS Chatbot Arena at launch — #1 in China
Surpassed GPT-4o-0513 and Claude 3.5 Sonnet in overall ranking
Top-3 in Chinese, Math, Coding, and Hard Prompts categories
Proprietary model from 01.AI, founded by Kai-Fu Lee
Meta Unveils Llama 3.2: Multimodal Leap for Developers
Released on September 25, 2024
First Llama models with vision capabilities — 11B and 90B multimodal variants
Lightweight 1B and 3B edge models for on-device deployment
128K context window, competitive with Claude 3 Haiku and GPT-4o-mini
Drop-in replacements for Llama 3.1 text models
Qwen2.5 Release: The 72B Open-Source Coding Powerhouse
Released on September 19, 2024
0.5B to 72B range
SOTA open model for coding and math
18T training tokens
Apache 2.0 license
Mistral Small 2409: The 22B Open-Source Powerhouse Released September 2024
Released on September 18, 2024
Updated Mistral Small with improved instruction following
22B parameters, Apache 2.0 license
Mistral Pixtral 12B: The Open-Source Multimodal Breakthrough
Released on September 17, 2024
Built on the Mistral NeMo architecture with native vision support
128K context, Apache 2.0 license
OpenAI o1-Preview: The Reasoning Model Revolution
Released on September 12, 2024
First 'reasoning' model with chain-of-thought at inference
PhD-level science and math performance
DeepSeek V2.5 Release: The 236B MoE Powerhouse for Developers
Released on September 5, 2024
Merged DeepSeek-V2-Chat and DeepSeek-Coder-V2 into a single model
236B MoE with 21B active parameters, 128K context
Strong coding and general capabilities in one model
MIT license, available on HuggingFace
Jamba 1.5: The New Hybrid MoE Standard for Long Context
Released on August 22, 2024
Mamba-Transformer hybrid MoE
94B active, 256K context
Fastest long-context model at release
Phi-3.5 Release: Microsoft's 4B MoE Model for Edge AI
Released on August 20, 2024
4B MoE and 3.8B variants optimized for edge devices
Phone-capable AI with 128K context window
Improved multilingual support over Phi-3
Strong reasoning for its size class
xAI Grok-2 Release: Competing with GPT-4o and Claude 3.5
Released on August 13, 2024
Competitive with GPT-4o and Claude 3.5 Sonnet
Available on X platform
HyperCLOVA X: Naver's 104B Korean LLM Review
Released on August 7, 2024
Korean web giant Naver's flagship LLM optimized for Korean language and culture
Two sizes: HCX-L (largest) and HCX-S (lighter), built on LLaMA 2 architecture
100K context window with Korean-optimized tokenizer
Strong cross-lingual reasoning in Asian languages — Korean, Japanese, Chinese
FLUX.1 by Black Forest Labs: The Open Source Image King
Released on August 1, 2024
State-of-the-art text-to-image model from ex-Stability AI founders
12B rectified flow transformer architecture
FLUX.1 [schnell] open under Apache 2.0, [dev] non-commercial
Surpassed closed-source alternatives in image quality
Mistral Large 2: The 123B Open-Weight Frontier Model
Released on July 24, 2024
128K context, competitive with GPT-4o and Llama 3.1 405B
12 languages supported
Open weights
Meta Llama 3.1: The 405B Open-Source Benchmark
Released on July 23, 2024
Largest open model — 405B parameters
Matches GPT-4 on many benchmarks
128K context window
Mistral NeMo 12B: The New Standard for Efficient Open-Source AI
Released on July 18, 2024
Co-built with NVIDIA, runs on a single GPU
12B parameters with 128K context window
Drop-in replacement for Mistral 7B with SOTA performance in its class
Apache 2.0 license, strong multilingual support
InternLM 2.5 Release: Open-Source Reasoning Powerhouse from Shanghai AI Lab
Released on July 3, 2024
Strong reasoning from China's national lab
Competitive on math and coding
Gemma 2 Release: Google's New Open-Source AI Model
Released on June 27, 2024
9B and 27B sizes
Outperforms models 2x its size
Knowledge distillation from Gemini
Anthropic Unveils Claude 3.5 Sonnet: The Coding Powerhouse
Released on June 20, 2024
Surpassed GPT-4o and Gemini 1.5 Pro at launch
2x faster than Claude 3 Opus at lower cost
DeepSeek Coder V2: The Open-Source GPT-4 Turbo Rival
Released on June 17, 2024
First open MoE code model matching GPT-4 Turbo on coding
338 programming languages supported
NVIDIA Unveils Nemotron-4 340B: The Open-Source Powerhouse for Synthetic Data
Released on June 14, 2024
NVIDIA's open model for synthetic data generation
Permissive enterprise license
Qwen2 Release: The 72B Open-Source Challenger to Llama 3
Released on June 7, 2024
Major upgrade, 0.5B to 72B range
Competitive with Llama 3 70B
Apache 2.0 license
GLM-4 by Zhipu AI: 9B Parameter Open-Source Powerhouse
Released on June 5, 2024
128K context, 26 languages
Competitive with Llama 3 8B
Open-source GLM-4 series
Codestral by Mistral AI: The Open-Source Code Model Revolution
Released on May 29, 2024
Specialized code model, 80+ languages
32K context, fill-in-the-middle support
Doubao Seed 1.5: ByteDance's Open-Source LLM Powerhouse
Released on May 15, 2024
ByteDance's flagship LLM, most popular AI product in China
Available via Doubao app and Volcano Engine API
Supports 50+ application scenarios including voice, vision, and coding
Open-source Seed 1.5 variants released under permissive license
OpenAI GPT-4o: The Multimodal AI Milestone
Released on May 13, 2024
'Omni' model with native audio/vision/text
2x faster, 50% cheaper than GPT-4 Turbo
Real-time voice conversation capabilities
DeepSeek V2 Release: 236B MoE Open Source Power Unleashed
Released on May 7, 2024
236B MoE with only 21B active parameters
Multi-head Latent Attention for efficiency
Open weights
Snowflake Arctic: The Enterprise-Grade Open-Source LLM for SQL & Code
Released on April 24, 2024
480B MoE with 17B active parameters
Enterprise-focused, strong on SQL and coding
Apache 2.0 license
Phi-3 Release: Microsoft's 14B Open-Source Powerhouse
Released on April 23, 2024
Mini/Small/Medium variants
Phi-3 Mini (3.8B) rivals Mixtral 8x7B
Phone-capable AI
Meta Unveils Llama 3: The New Open-Source Standard
Released on April 18, 2024
Trained on 15T tokens, 8B and 70B sizes
New open-source SOTA with massive community adoption
Mixtral 8x22B: Mistral AI's 176B Open-Source Mixture of Experts Model Delivers Enterprise-Level Performance
Released on April 17, 2024
Large MoE with strong multilingual and code performance
Open weights
Cohere's Command R+ 104B: Enterprise RAG Powerhouse with 128K Context
Released on April 4, 2024
Optimized for RAG and enterprise
128K context, 10 languages
Grounded generation capabilities
Jamba 52B: AI21's Revolutionary Open-Source Mamba-Transformer Hybrid Model
Released on March 28, 2024
First production Mamba-Transformer hybrid
256K context, novel SSM architecture
DBRX 132B MoE: Databricks' Open-Source AI Challenger Surpasses Llama 2 70B
Released on March 27, 2024
Open MoE with 36B active parameters
Outperformed Llama 2 70B and Mixtral
Apache 2.0 license
Grok-1 Released: xAI's 314B Parameter Open-Source Model Breaks New Ground
Released on March 17, 2024
xAI's first open-source model
314B MoE under Apache 2.0
Largest open MoE at time of release
Claude 3 by Anthropic: The Game-Changing Language Model That Rivals GPT-4
Released on March 4, 2024
Haiku/Sonnet/Opus family
Opus matched GPT-4 on most benchmarks
200K context window, vision capabilities
Claude Opus 3: Anthropic's Milestone Reasoning Model Breaks New Ground
Released on March 4, 2024
First Claude Opus model with advanced reasoning
200K context window
Pioneered extended thinking capabilities
Vision and tool use support
Mistral Large: Mistral AI's Flagship Commercial Model Breaks New Ground
Released on February 26, 2024
Mistral's first flagship commercial model
32K context, top-tier reasoning
Google DeepMind's Gemma: The Open-Source AI Revolution Starts with 7B Parameters
Released on February 21, 2024
Google's open-source model from Gemini research
2B and 7B sizes, strong for its class
Gemini 1.5 Pro: Google DeepMind's Revolutionary 1M Token Multimodal AI Breakthrough
Released on February 15, 2024
1 million token context window — 10x previous record
MoE architecture, processes entire codebases
Gemini 1.0 Ultra: Google's Most Capable Multimodal AI Model
Released on February 8, 2024
Most capable Gemini 1.0 model
Beat GPT-4 on 30/32 benchmarks
Powers Gemini Advanced
StableLM 2: Stability AI's New Open-Source LLMs Challenge Industry Giants
Released on February 6, 2024
Open language model in two sizes: 1.6B and 12B
Trained on 2T tokens (Falcon RefinedWeb, RedPajama, The Pile, CulturaX)
Competitive with Mistral-7B despite smaller footprint
Stability AI Community License
StarCoder 2: Revolutionary Open-Source Code Generation Models with 3B, 7B, and 15B Parameters
Released on February 6, 2024
Open code LLM in 3 sizes: 3B, 7B, 15B — trained on 4T+ tokens from The Stack v2
600+ programming languages, fill-in-the-middle capability
16K context with sliding window attention
Trained on permissively licensed code only
2023
SOLAR 10.7B: Upstage's Revolutionary Open-Source Model Dominates HuggingFace Leaderboards
Released on December 13, 2023
Korean startup Upstage's open model using depth up-scaling
Topped HuggingFace Open LLM Leaderboard at release
Apache 2.0 license
Mixtral 8x7B: The Open-Source Mixture of Experts Revolution That Matches GPT-3.5
Released on December 11, 2023
Open-source MoE matching GPT-3.5 quality with only 12.9B active params
Game-changer for open-source efficiency
Apache 2.0 license
Gemini 1.0: Google DeepMind's Revolutionary Multimodal AI Model
Released on December 6, 2023
Google's multimodal model family (Nano/Pro/Ultra)
Natively multimodal from training
Nous Hermes 2: The Open-Source LLM That's Revolutionizing Local AI Deployment
Released on November 13, 2023
Community fine-tuned model on Mistral/Yi
Strong at instruction following
Popular for local AI
Yi 34B: The Bilingual Open-Source LLM That's Outperforming Llama 2 70B
Released on November 2, 2023
From 01.AI, the startup founded by Kai-Fu Lee
Strong bilingual (English/Chinese) model
Competitive with Llama 2 70B
ChatGLM3-6B: Zhipu AI's Third-Generation Open-Source Model with Advanced Agent Capabilities
Released on October 27, 2023
Third gen GLM with function calling, code interpreter, and agent capabilities
Zephyr 7B: HuggingFace's Game-Changing Open-Source Model Built on Mistral
Released on October 25, 2023
Mistral 7B fine-tuned with DPO
Showed distilled alignment can match RLHF quality
Mistral 7B: The Open-Source AI Model That Redefined Performance Expectations
Released on September 27, 2023
Outperformed Llama 2 13B on all benchmarks despite being smaller
Sliding window attention
Apache 2.0 license
Qwen 72B: Alibaba's Open-Source Giant Challenges AI Leaders with Multilingual Powerhouse
Released on September 25, 2023
Alibaba's multilingual model series
Strong on Chinese and English tasks
Open weights
WizardCoder 34B: Revolutionary Open-Source Coding Model Surpasses GPT-3.5 Performance
Released on August 26, 2023
Evol-Instruct tuned Code Llama
Top open-source coding model of its era
Strong on HumanEval
Code Llama 34B: Meta's Specialized Coding Model Revolutionizes AI-Assisted Development
Released on August 24, 2023
Specialized Llama 2 for code generation
Supports Python, C++, Java, and more
100K context window
Llama 2: How Meta's Open-Source Milestone Revolutionized AI Development
Released on July 18, 2023
First truly open-weight large model for commercial use
7B/13B/70B sizes with RLHF-tuned chat variants
Founded the modern open LLM ecosystem
Claude 2 Review: Anthropic's Breakthrough Language Model with Constitutional AI
Released on July 11, 2023
100K context window
Constitutional AI approach
Strong coding and analysis capabilities
ChatGLM2: Zhipu AI's 6B Parameter Powerhouse Delivers 42% Faster Inference
Released on June 25, 2023
Second generation GLM, 32K context
42% faster inference
Stronger math and coding
Falcon 180B: The New Open-Source Giant That's Redefining AI Performance
Released on May 25, 2023
Trained on 3.5T tokens of RefinedWeb
Topped the Open LLM Leaderboard
Apache 2.0 license
Google PaLM 2: The 340B Parameter Language Model Revolutionizing AI
Released on May 10, 2023
Google's next-gen model powering Bard/Gemini
Improved multilingual, reasoning, and coding
MPT-7B: The Open-Source Transformer Revolution from MosaicML
Released on May 5, 2023
Commercially usable open-source model
Trained on 1T tokens
Apache 2.0 license
StarCoder 15.5B: The Open-Source Code Generation Revolution by BigCode
Released on May 4, 2023
Open-source code LLM trained on The Stack (1T tokens, 80+ languages)
8K context window
StableLM 7B: Stability AI's Open-Source Language Model Revolution
Released on April 19, 2023
Stability AI's open-source LLM family
3B and 7B sizes, trained on 1.5T tokens
CC-BY-SA license
Vicuna 13B: The Open-Source Chatbot That Achieves 90% ChatGPT Quality
Released on March 30, 2023
Fine-tuned LLaMA on ShareGPT conversations
Achieved ~90% of ChatGPT quality
Launched the Chatbot Arena
Claude 1 Review: Anthropic's First Public Language Model Breaks New Ground in AI Safety
Released on March 14, 2023
Anthropic's first public model
Constitutional AI for safety
9K context window at launch (expanded to 100K in May 2023)
GPT-4: OpenAI's Revolutionary Multimodal AI That Changed Everything
Released on March 14, 2023
Multimodal (text + vision), passed the bar exam (90th percentile)
Massive leap in reasoning over GPT-3.5
~1.8T parameters (MoE estimated)
Stanford's Alpaca 7B: How a $600 Fine-Tune Achieved GPT-3.5-Level Performance
Released on March 13, 2023
Fine-tuned LLaMA on 52K instructions generated by GPT-3.5
Showed cheap instruction tuning works
LLaMA 1: The Revolutionary Open-Source Model That Changed AI Forever
Released on February 24, 2023
Leaked weights ignited the open-source LLM revolution
Showed small models can match GPT-3
65B parameters
2022
ChatGPT: The Revolutionary Language Model That Ignited the AI Era
Released on November 30, 2022
GPT-3.5 with RLHF in a chat interface
Reached 100M users in 2 months
Defined the AI era
Flan-T5: Google's Instruction-Tuned T5 Model Revolutionizes Few-Shot Learning
Released on October 20, 2022
Instruction-tuned T5
Demonstrated instruction tuning dramatically improves task generalization
BLOOM: The 176B Parameter Revolution That Democratized AI in 2022
Released on July 6, 2022
First 100B+ open-source multilingual model
Built by 1000+ researchers across 70+ countries
46 languages supported
Meta's OPT Model: The Open-Source Alternative to GPT-3 That Changed AI Research
Released on May 3, 2022
Meta's open-source GPT-3 equivalent
Full model weights released for research
175B parameters
GPT-NeoX: EleutherAI's 20B Breakthrough That Changed Open-Source LLMs Forever
Released on April 14, 2022
EleutherAI's 20B open model
First glimpse that local LLMs could scale to GPT-3 territory
Predecessor to today's open-source ecosystem
Google's PaLM: The 540B Parameter Language Model That Changed Everything
Released on April 4, 2022
540B parameter model
Breakthrough capabilities in reasoning, code, and multilingual tasks
Chinchilla: How Google DeepMind Revolutionized LLM Scaling Laws in 2022
Released on March 29, 2022
Proved smaller models trained on more data outperform larger undertrained ones
Redefined scaling laws for LLMs
InstructGPT: The Revolutionary Language Model That Changed AI Alignment Forever
Released on January 27, 2022
Introduced RLHF for alignment
Pioneered training models to follow human instructions safely
2021
Gopher: Google DeepMind's 280B Parameter Breakthrough That Changed NLP Forever
Released on December 8, 2021
280B parameter model
Extensive analysis of scaling laws across 152 tasks
OpenAI Codex: The Revolutionary Coding Model That Changed Everything
Released on August 10, 2021
GPT-3 fine-tuned on code
Powered GitHub Copilot
Proved LLMs could write functional programs
GPT-J: The Game-Changing 6B Parameter Open-Source Model That Democratized Large Language Models
Released on June 9, 2021
First open model runnable on consumer hardware
6B params, GPT-2 architecture
Widely deployed in early local AI applications
Google's Switch Transformer: The 1.6 Trillion Parameter Breakthrough That Changed AI Scaling Forever
Released on January 11, 2021
1.6 trillion parameter MoE model
Demonstrated efficient scaling through sparse expert routing
2020
GShard: Google's Revolutionary 600B Parameter Mixture of Experts Language Model
Released on June 30, 2020
First Mixture of Experts model at massive scale
600B parameters for machine translation
GPT-3: The 175-Billion Parameter Revolution That Changed AI Forever
Released on May 28, 2020
175B parameters — demonstrated few-shot learning without fine-tuning
Sparked the modern LLM revolution
2019
T5: How Google's Text-to-Text Transformer Revolutionized NLP Architecture
Released on October 23, 2019
Text-to-Text Transfer Transformer
Unified framework treating all NLP tasks as text generation
RoBERTa: Meta AI's Revolutionary BERT Optimization Breakthrough
Released on July 26, 2019
Robustly Optimized BERT
Showed BERT was significantly undertrained
Achieved new SOTA with better training
XLNet: The Revolutionary Autoregressive Language Model That Surpassed BERT
Released on June 19, 2019
Generalized autoregressive pretraining
Outperformed BERT on 20 NLP tasks
GPT-2: The Language Model That Changed Everything in AI History
Released on February 14, 2019
Initially withheld due to misuse concerns — "Too dangerous to release"
Showed emergent text generation quality at scale
2018
BERT: The Revolutionary Language Model That Changed NLP Forever
Released on October 11, 2018
Bidirectional Encoder Representations from Transformers
Revolutionized NLP benchmarks
Became the foundation for search engines
GPT-1: The Revolutionary Foundation That Started the Modern LLM Era
Released on June 11, 2018
First GPT model — decoder-only transformer
Demonstrated generative pre-training for language understanding
ELMo: The Groundbreaking Contextual Language Model That Changed NLP Forever
Released on February 15, 2018
Embeddings from Language Models
Contextualized word representations using bidirectional LSTMs
2017
Transformer: The Revolutionary Architecture That Changed AI Forever
Released on June 12, 2017
'Attention Is All You Need' paper introduces the Transformer architecture
The foundation of all modern LLMs