Liquid AI's new 230M parameter LFM2.5-230M is redefining on-device AI, delivering high-speed agentic performance on everything from smartphones to robotics.

The era of massive, cloud-dependent LLMs is meeting its most significant challenger: the hyper-efficient edge model. On June 25, 2026, Liquid AI released LFM2.5-230M, a groundbreaking small language model (SLM) designed to bring sophisticated intelligence directly to the hardware it controls.
Unlike previous generations of small models that struggled with complex reasoning, LFM2.5-230M is purpose-built for agentic workflows. It isn't just a chatbot; it is a lightweight reasoning engine capable of running on CPUs, NPUs, and GPUs, making it the ideal candidate for the next wave of autonomous devices, from smartphones to industrial robots.
At the heart of LFM2.5-230M is the advanced LFM2 architecture, optimized for extreme efficiency without sacrificing the depth of understanding. Despite its diminutive 230M parameter count, the model is a powerhouse of information density, having been pre-trained on a massive 19 trillion tokens.
To ensure high-quality performance at this scale, Liquid AI utilized a sophisticated post-training regime involving distillation from the larger LFM2.5-350M model. This allows the 230M variant to inherit complex reasoning patterns and instruction-following capabilities that typically require much larger parameter counts.
The most striking aspect of LFM2.5-230M is how it punches above its weight class. In rigorous testing, it consistently competes with—and often outperforms—models that are more than twice its size in critical areas such as instruction following, data extraction, and tool use.
On hardware-constrained devices, the throughput is nothing short of revolutionary. On a Samsung Galaxy S25 Ultra (utilizing the Snapdragon Gen4), the model achieves a staggering 213 tokens per second (tok/s) on the CPU. Even on a low-power Raspberry Pi 5, it maintains a usable 42 tok/s, delivering the highest prefill and decode throughput in its class while maintaining the smallest memory footprint.
Liquid AI has already demonstrated the model's practical utility through an impressive robotics demo. Deployed on a Unitree G1 robot, LFM2.5-230M ran entirely on-device via an onboard Jetson Orin. In this setup, the model acted as a 'skill-selection layer,' taking natural language instructions and decomposing them into structured, multi-step tool-call plans.
For enterprise users, Liquid AI offers an internal GPU inference stack that ensures extremely low-latency serving. When deployed via SGLang, LFM2.5-230M achieves significantly lower end-to-end latency compared to other small models across all concurrency levels, making it ready for production-grade pipelines.
Developers should look toward LFM2.5-230M for workloads where latency, privacy, and cost are paramount. It is exceptionally well-suited for large-scale data extraction pipelines where processing millions of documents locally can save massive cloud costs.
Furthermore, its ability to run on edge hardware makes it the gold standard for lightweight on-device agentic workloads. Whether it is controlling home automation, managing network devices, or powering the 'brain' of a consumer robot, this model provides the intelligence needed without the round-trip delay of a cloud API.
Liquid AI has ensured that LFM2.5-230M is accessible across the entire developer ecosystem. Whether you are working on mobile, desktop, or server-side deployments, there is a format ready for you. The model and its base version (LFM2.5-230M-Base) are available for immediate download.
The model is supported by all major deployment frameworks, ensuring seamless integration into your existing CI/CD pipelines. From edge-optimized GGUF files to high-performance GPU serving, the barrier to entry has never been lower.
As LFM2.5-230M is primarily designed for local and edge deployment, official cloud API pricing for this specific model is currently N/A. However, its efficiency suggests that any hosted version would be significantly more cost-effective than larger counterparts.
API Pricing — Context: 32K