
AMD Strix Halo Mini PCs Are Here — And They Can Run 120-Billion-Parameter AI Models Locally
A wave of AMD Ryzen AI Max+ 395 mini PCs is shipping with 128GB unified memory, bringing serious local AI inference to a box on your desk.
Something remarkable is happening in the mini PC space: you can now buy a desktop box the size of a thick paperback that runs 120-billion-parameter language models without breaking a sweat. The AMD Ryzen AI Max+ 395 Strix Halo platform has arrived, and multiple manufacturers are already shipping production units.
The Hardware That Changes Local AI
The Ryzen AI Max+ 395 is currently the most powerful x86 APU on the market for AI workloads. It pairs 16 Zen 5 CPU cores with a Radeon 8060S integrated GPU and up to 128GB of unified LPDDR5X memory running at 8000 MT/s. That unified memory architecture is the key feature: the CPU and GPU share the same massive memory pool, which is exactly what large language models need for efficient inference.
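To see why the 128GB pool matters, a quick back-of-envelope sketch helps. This is illustrative Python, not a sizing tool: the bit-widths are common quantization levels, and real runtimes need additional memory for the KV cache and activations on top of the raw weights.

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Raw weight size in GB (1 GB = 1e9 bytes) for a dense model
    quantized to the given number of bits per weight."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

# A 120B model at 4-bit quantization needs ~60 GB just for weights,
# which fits comfortably in a 128 GB unified pool.
print(weight_gb(120, 4))   # 60.0
# The same model at full FP16 would need 240 GB and would not fit.
print(weight_gb(120, 16))  # 240.0
# A 70B model at 4 bits is only ~35 GB, leaving plenty of headroom.
print(weight_gb(70, 4))    # 35.0
```

This is why unified memory is the headline spec: on a discrete GPU, the weights must fit in VRAM, and 60GB-class models are out of reach for anything short of datacenter cards.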
Benchmarks from early adopters show the platform sustaining over 21 tokens per second on 120-billion-parameter models. Llama-based 70B models run with memory to spare. This is the kind of performance that previously required a dedicated GPU server or a cloud instance.
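What does 21 tokens per second feel like in practice? The arithmetic is simple; the 500-token reply length here is an assumed, typical medium-length answer, not a benchmark figure.

```python
# Back-of-envelope responsiveness at the reported generation rate.
tokens_per_second = 21   # reported figure for 120B-class models
reply_tokens = 500       # assumed medium-length answer

seconds = reply_tokens / tokens_per_second
print(f"{seconds:.1f} s for a {reply_tokens}-token reply")  # roughly 24 s
```

Interactive chat reads comfortably at that rate, since text appears faster than most people read.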
Which Mini PCs Are Shipping Now
The OEM ecosystem has mobilized quickly. GMKtec’s EVO-X2 is already available with 128GB LPDDR5X and 2TB NVMe storage. Minisforum showcased both the MS-S1 MAX workstation and the BD395i MAX Mini-ITX motherboard at CES 2026 for builders who want to choose their own enclosure. Thunderobot debuted a cube-style design, and Beelink has the GTR9 Pro in its lineup.
Prices range from roughly $1,000 for barebones configurations to $2,500 or more for fully loaded systems: expensive by traditional mini PC standards, but a fraction of what comparable inference capacity costs as ongoing cloud compute.
AMD’s Own Ryzen AI Halo Reference Platform
AMD also announced the Ryzen AI Halo, a palm-sized reference platform designed specifically for AI developers. Running both Windows and Linux with full ROCm support, it positions AMD as a direct competitor to Nvidia’s DGX Spark. The Ryzen AI Halo is expected to launch in Q2 2026 at a price point significantly below the DGX Spark.
Why Running AI Locally Matters
Running AI models locally offers privacy, no network round-trip latency, no per-token API costs, and the ability to work entirely offline. For developers, researchers, and enterprises with sensitive data, local inference is not just convenient; it is essential.
The Strix Halo generation of mini PCs represents a genuine inflection point: the moment when running your own AI stopped being a hobby project and became a practical desktop workflow.
Sources: VideoCardz, February 2026; TechRadar, February 2026; Tweaktown, January 2026; XDA Developers, February 2026
