MiniMax M3: An Open-Weight Model With Frontier Coding and a 1M-Token Context

MiniMax M3, released June 1, 2026, is an open-weight LLM pairing frontier-level coding, a 1-million-token context window, and native multimodality — and the weights are coming to Hugging Face.

Dr. Nova Chen | Jun 6, 2026

Why MiniMax M3 Is a Milestone for Open-Weight AI

On June 1, 2026, MiniMax released MiniMax M3, one of the most interesting open-weight model launches of the year. The pitch is simple but ambitious: M3 is designed to be the first open-weight model to combine three capabilities that, until now, lived almost exclusively behind proprietary APIs: frontier-level coding, a 1-million-token context window, and native multimodality trained on mixed text and image data from the ground up. For anyone following the steady democratization of large language models, this is the kind of release that widens what builders can do on their own infrastructure.

What makes M3 notable is not a single headline number but the combination. Plenty of open-weight models do one of these things well. Bringing all three into one openly released system — with a promise to publish the full weights and a technical report — is what moves the conversation forward.

Frontier Coding and Agentic Web Search Benchmarks

MiniMax reports that M3 scores 59% on SWE-Bench Pro, placing it ahead of several leading proprietary systems on that coding benchmark and just behind the current frontier. On BrowseComp, a test of autonomous web search and multi-step reasoning, MiniMax reports a score of 83.5, edging out strong proprietary baselines. As always with self-reported figures at launch, the right posture is measured optimism: these are MiniMax's numbers, and independent replication will come once the weights and technical report are public. But even discounted, they point to a capable agentic coder.

MiniMax Sparse Attention Tames the Long-Context Cost

The 1-million-token context window is enabled by a new architecture the team calls MiniMax Sparse Attention (MSA). Instead of attending to every prior token, MSA uses a lightweight index branch to select only the relevant blocks of past context for each query. MiniMax says this cuts per-token compute at the 1-million-token length to roughly one-twentieth of what the previous generation required. That efficiency is the whole game for long context: a giant window is only useful if it is affordable to actually fill, and sparse attention is how you keep the bill reasonable.
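To make the block-selection idea concrete, here is a minimal Python sketch of top-k block-sparse attention for a single query. It is not MiniMax's implementation: the technical report is not yet public, so the scoring rule (dot product against each block's mean key), the function names, and the block_size and top_k parameters are all illustrative assumptions. A real index branch would presumably be learned rather than a simple mean.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def select_blocks(query, keys, block_size, top_k):
    """Cheap 'index' pass: score each block of past keys, keep the top_k.

    Assumption: a block is summarized by its mean key vector. This stands
    in for whatever the real index branch computes.
    """
    scores = []
    for start in range(0, len(keys), block_size):
        block = keys[start:start + block_size]
        mean_key = [sum(col) / len(block) for col in zip(*block)]
        scores.append((dot(query, mean_key), start))
    best = sorted(scores, reverse=True)[:top_k]
    # Return block start offsets in their original sequence order.
    return sorted(start for _, start in best)


def sparse_attention(query, keys, values, block_size=4, top_k=2):
    """Softmax attention restricted to the tokens in the selected blocks."""
    starts = select_blocks(query, keys, block_size, top_k)
    idx = [i for s in starts for i in range(s, min(s + block_size, len(keys)))]
    scale = 1.0 / math.sqrt(len(query))
    logits = [dot(query, keys[i]) * scale for i in idx]
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(l - peak) for l in logits]
    total = sum(weights)
    dim = len(values[0])
    out = [
        sum(w * values[i][d] for w, i in zip(weights, idx)) / total
        for d in range(dim)
    ]
    return out, idx
```

With block_size=4 and top_k=2, the full attention step touches at most 8 past tokens per query regardless of sequence length, while the index pass only compares the query against one summary vector per block. Setting top_k to the total number of blocks recovers ordinary dense attention.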

What Open Weights Mean for Developers

API pricing lands at a low $0.30 per million input tokens and $1.20 per million output tokens, but the bigger story for the open-weight community is the commitment to release full weights and a technical report on Hugging Face and GitHub. Once the weights are published, that means self-hosting, fine-tuning, and auditing, the things that make a model genuinely yours. For teams building long-context agents, document-analysis pipelines, or multimodal coding assistants, an open-weight model with this profile is a meaningful new option. As we often note in our AI coverage, the most durable progress tends to come when capable models become things anyone can run, study, and extend.
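For a rough sense of what the quoted API prices mean per request, the arithmetic can be sketched in a few lines of Python. This assumes flat per-token pricing with no tiering, caching discounts, or long-context surcharges, none of which the announcement coverage specifies. The function name is illustrative.

```python
# Prices quoted in the article, in US dollars per million tokens.
INPUT_PRICE_PER_M = 0.30
OUTPUT_PRICE_PER_M = 1.20


def request_cost(input_tokens, output_tokens):
    """Estimated cost in dollars for one request, assuming flat pricing."""
    return (
        input_tokens * INPUT_PRICE_PER_M
        + output_tokens * OUTPUT_PRICE_PER_M
    ) / 1_000_000


# Filling the full 1M-token window and generating a 10,000-token answer.
full_window = request_cost(1_000_000, 10_000)
print(f"${full_window:.3f}")
```

Under these assumptions, a request that fills the entire 1-million-token window costs about $0.30 in input tokens, plus $0.012 for a 10,000-token reply.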

Sources: MiniMax Research blog, "MiniMax M3: Frontier Coding, 1M Context, Native Multimodality" (June 1, 2026); The Decoder, "MiniMax M3: Open-weight model with a million-token context challenges proprietary leaders" (June 1, 2026).