
InternLM 3 8B Release: Deep Thinking & Apache 2.0

Shanghai AI Lab unveils InternLM 3, an 8B parameter model surpassing Llama 3.1 on reasoning tasks with 128K context and Apache 2.0 licensing.

March 5, 2025

Introduction

Shanghai AI Lab has officially announced the release of InternLM 3 on March 5, 2025, marking a pivotal moment for the open-source AI ecosystem. This new iteration represents a significant leap forward, offering an 8B parameter bilingual model specifically engineered for high-performance reasoning and complex logical deduction.

Unlike previous iterations, InternLM 3 introduces a specialized deep thinking mode that enhances problem-solving capabilities without sacrificing inference speed. For developers seeking a robust, cost-effective alternative to proprietary closed models, this release provides a compelling option with full Apache 2.0 licensing, ensuring freedom for commercial and academic use alike.

  • Release Date: 2025-03-05
  • Parameters: 8B
  • License: Apache 2.0

Key Features & Architecture

The underlying architecture of InternLM 3 is built upon a robust foundation featuring a massive 128K context window, allowing for the seamless processing of extensive documents, long-form content, and complex data streams. Training data spans over 4 trillion high-quality tokens, ensuring the model possesses a vast and diverse knowledge base across both English and Chinese languages.

A key innovation is the reported 75% reduction in training cost relative to models of similar scale, achieved largely through optimized data curation. Paired with efficient inference engines designed for modern hardware, this translates into faster deployment cycles and reduced operational overhead for organizations running the model in production.

  • 8B Parameters
  • Bilingual Support (English/Chinese)
  • 128K Context Window
  • Apache 2.0 License
  • Deep Thinking Mode
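The release notes above don't spell out how deep thinking mode is switched on. A minimal sketch, assuming it is enabled through a dedicated system prompt; the `DEEP_THINKING_PROMPT` text here is hypothetical, so consult the official documentation for the real trigger:

```python
# Hypothetical helper: toggles a "deep thinking" behavior via a system
# prompt. The exact prompt text is an assumption, not the official one.

DEEP_THINKING_PROMPT = (
    "You are an expert assistant. Think through the problem step by step "
    "before giving your final answer."
)

def build_messages(user_msg: str, deep_thinking: bool = False) -> list:
    """Return a chat-format message list, optionally enabling deep thinking."""
    messages = []
    if deep_thinking:
        messages.append({"role": "system", "content": DEEP_THINKING_PROMPT})
    messages.append({"role": "user", "content": user_msg})
    return messages
```

Keeping the toggle at the message-construction layer means the same code path serves both fast chat and slower, reasoning-heavy requests.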

Performance & Benchmarks

Independent benchmarks reveal that InternLM 3 surpasses Llama 3.1 8B and Qwen2.5 7B on critical reasoning and knowledge tasks, setting a new standard for open-source performance. On the MMLU benchmark, it achieves a score of 82.5, outperforming the previous best open-source 8B models by a significant margin.

On HumanEval, the model demonstrates strong code generation capabilities with a 78% pass rate, making it a serious contender for developer tooling. SWE-bench results likewise indicate solid problem-solving in software engineering contexts, validating its utility for technical workflows and automated coding agents.

  • MMLU: 82.5
  • HumanEval: 78%
  • SWE-bench: 65%

API Pricing

While the model is open-source, API access for enterprise deployment is available through Shanghai AI Lab's cloud platform for those preferring managed services. The pricing structure is highly competitive, offering free tier availability for hobbyists and developers to test the model's capabilities before committing to larger volumes.

For high-volume usage, the input cost is set at $0.20 per million tokens, while output costs are $0.60 per million tokens. This pricing model makes it viable for production RAG pipelines and agent workflows without prohibitive expenses, offering significant value compared to major cloud providers.

  • Free Tier Available
  • Input: $0.20/M Tokens
  • Output: $0.60/M Tokens
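The per-token rates listed above translate directly into a quick cost estimator. The helper below is plain arithmetic over the published prices, nothing more:

```python
# Cost estimator based on the listed pay-as-you-go rates:
# $0.20 per million input tokens, $0.60 per million output tokens.

INPUT_RATE = 0.20 / 1_000_000   # dollars per input token
OUTPUT_RATE = 0.60 / 1_000_000  # dollars per output token

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated API cost in dollars."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Example: a long-context RAG request with 100K prompt tokens and
# 1K generated tokens costs roughly 0.02 + 0.0006 = $0.0206.
```

At these rates, even prompt-heavy workloads that routinely fill most of the context window stay in the cents-per-request range.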

Comparison Table

InternLM 3 stands out in the current landscape by balancing context length with parameter efficiency more effectively than its peers. When compared directly to Llama 3.1 8B and Qwen2.5 7B, it offers better multilingual capabilities and a larger context window at a similar price point.

The Apache 2.0 license also removes commercial restrictions, allowing for broader integration into proprietary software stacks compared to restricted models that limit commercial usage or require attribution. This flexibility is crucial for enterprise adoption where legal compliance is a priority.

  • Better Multilingual Support
  • Larger Context Window
  • No Commercial Restrictions

Use Cases

This model is best suited for applications requiring deep reasoning and long-context understanding, such as legal document analysis and technical support. Developers can leverage InternLM 3 for complex coding tasks, autonomous agent orchestration, and advanced Retrieval-Augmented Generation (RAG) systems where context retention is critical.

The bilingual nature makes it ideal for cross-border enterprise applications where both English and Chinese documentation must be processed simultaneously, bridging language gaps effectively. It is particularly well-suited for scenarios requiring precise instruction following over extended contexts.

  • Coding & Agents
  • RAG Systems
  • Bilingual Chat
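For RAG systems, the 128K window still has to be budgeted between retrieved context and the generated answer. A minimal sketch of greedy context packing, using a rough four-characters-per-token heuristic (a real pipeline would count tokens with the model's own tokenizer):

```python
# Sketch: pack ranked retrieval chunks into a 128K-token context budget,
# reserving headroom for the prompt template and the generated answer.
# approx_tokens is a crude heuristic, not the model's real tokenizer.

CONTEXT_WINDOW = 128_000

def approx_tokens(text: str) -> int:
    """Rough token estimate: ~4 characters per token."""
    return max(1, len(text) // 4)

def pack_chunks(chunks: list, reserve_for_answer: int = 4_000) -> list:
    """Greedily keep ranked chunks until the context budget is exhausted."""
    budget = CONTEXT_WINDOW - reserve_for_answer
    packed, used = [], 0
    for chunk in chunks:
        cost = approx_tokens(chunk)
        if used + cost > budget:
            break
        packed.append(chunk)
        used += cost
    return packed
```

Because the chunks arrive ranked by relevance, stopping at the first overflow keeps the highest-scoring evidence while staying inside the window.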

Getting Started

Accessing InternLM 3 is straightforward for the developer community, with multiple pathways for integration. You can download the weights directly from HuggingFace or clone the repository from the official GitHub page to begin local deployment immediately.

For API integration, use the provided SDK which supports Python and Node.js environments, simplifying the connection process. Documentation is comprehensive, covering fine-tuning guides and deployment best practices for cloud environments, ensuring a smooth transition for new users.

  • HuggingFace Download
  • GitHub Repository
  • Python & Node.js SDK
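For local deployment, conversations must be rendered into the model's chat format. InternLM models use a ChatML-style template with `<|im_start|>`/`<|im_end|>` markers; the hand-rolled formatter below is illustrative only, and in practice `tokenizer.apply_chat_template` from the HuggingFace checkpoint is the authoritative source of the template:

```python
# Illustrative ChatML-style formatter. Prefer the chat template shipped
# with the HuggingFace checkpoint; this sketch only shows the shape.

def format_chat(messages: list) -> str:
    """Render a message list into a ChatML-style prompt string."""
    parts = []
    for msg in messages:
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n")
    # Leave the assistant turn open so generation continues from here.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)
```

Ending the prompt with an open assistant turn is what cues the model to generate its reply rather than continue the user's text.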


Sources

Shanghai AI Lab Official

InternLM GitHub Repository