
Holo3.1 Brings Fast, Private Computer-Use AI Agents to Your Own Machine
H Company's open-weight Holo3.1 agents automate desktop and mobile tasks locally, with sizes from 0.8B to 35B and quantized builds that run on consumer hardware.
The most interesting frontier in AI right now isn't bigger chatbots — it's agents that can actually use a computer. This week H Company released *Holo3.1*, an open-weight family of computer-use models that click, type, and navigate interfaces on your behalf, and crucially, can do it entirely on your own hardware.
A Full Range of Open-Weight Agent Models
Holo3.1 ships in four sizes — 0.8B, 4B, 9B, and a 35B-A3B mixture-of-experts model that activates only about 3B parameters per step. That spread is deliberate: the tiny 0.8B model can run on modest devices, while the 35B variant targets the most demanding automation tasks. All of them are published openly on Hugging Face, so developers can download, inspect, and fine-tune them freely.
Mobile Automation Joins the Lineup
The headline addition in this release is mobile GUI automation. On the AndroidWorld benchmark, the 35B-A3B model jumped from 67% to 79.3%, while the 4B and 9B variants climbed from 58% to 72%. That means a single open model family can now drive both desktop and smartphone interfaces — a meaningful step toward general-purpose, on-device automation agents.
Built to Run Locally and Privately
What earns Holo3.1 a place in any self-hosting enthusiast's toolkit is how seriously it takes local inference. H Company ships quantized checkpoints in FP8, NVFP4, and Q4 GGUF formats so the models fit on consumer GPUs and laptops. The NVFP4 build delivers 1.74x the throughput of BF16 while giving up only about two points of accuracy.
On an NVIDIA DGX Spark, end-to-end average step time dropped from 6.8 seconds under FP8 to 3.3 seconds with NVFP4 plus optimizations — roughly a 2x speedup. For an agent that may take dozens of steps to complete a task, halving per-step latency is the difference between a demo and a daily driver.
Why Local Computer-Use Agents Matter
Running automation agents locally keeps your screen contents, keystrokes, and credentials on your own machine instead of streaming them to a cloud API. Holo3.1 also adds native function-calling protocol support alongside JSON output and posts a 25%+ cross-harness improvement over the previous Holo3, making it easier to wire into existing developer tooling.
Open weights, real mobile support, and quantized builds tuned for consumer hardware add up to a genuinely democratizing release. The future of AI agents doesn't have to live exclusively in the cloud — and Holo3.1 is a strong argument for that.
Sources: H Company / Hugging Face blog, "Holo3.1" (June 2, 2026); Hugging Face model collection (June 2, 2026).
