Microsoft's MDASH Sends 100+ AI Agents to Find and Prove Bugs — Now Wired Into Defender

At Build 2026 on June 2, Microsoft expanded MDASH, its 100-plus-agent vulnerability hunter, integrating it with Defender after a 96.55% score on the CyberGym benchmark.

Kai Aegis · Jun 4, 2026 · 6 min read

A Swarm of AI Agents That Proves Bugs Before Alerting Humans

One of the more encouraging defensive-security stories of the week came out of Build 2026. On June 2, 2026, Microsoft expanded the preview of MDASH, its Microsoft Security multi-model agentic scanning harness, and wired it directly into the Defender portal. MDASH is an autonomous system that orchestrates more than 100 specialized AI agents to find and confirm exploitable code vulnerabilities, and the key word there is confirm. This is AI built to help defenders fix real problems, not to chase false alarms.

Here is how the pipeline works, in plain terms. Auditor agents generate hypotheses about where a vulnerability might live. Debater agents then argue both sides of each hypothesis, making the case for and against the flaw being exploitable. Finally, prover agents build a working triggering input that demonstrates the bug is real, and only then is a human alerted. That find-debate-prove structure is what keeps the noise down and the signal high.
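To make the find-debate-prove flow concrete, here is a minimal toy sketch in Python. It is not Microsoft's implementation: every name in it (Hypothesis, Finding, audit, debate, prove, and the two toy target functions) is hypothetical, and simple functions stand in for what the article describes as AI agents. The only point it illustrates is the gate at the end, where a hypothesis becomes a finding only if a concrete triggering input was actually observed to crash the target.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class Hypothesis:
    target: Callable[[bytes], object]  # function suspected of being flawed
    claim: str                         # auditor's description of the suspected bug


@dataclass
class Finding:
    claim: str
    trigger: bytes  # concrete input that was observed to crash the target


def read_indexed(data: bytes) -> int:
    """Toy buggy target: trusts the first byte as an index into the input."""
    return data[data[0]]


def read_first(data: bytes) -> int:
    """Toy safe target: returns the first byte, or 0 when the input is empty."""
    return data[0] if data else 0


def audit() -> List[Hypothesis]:
    # Stand-in for auditor agents: propose suspects, whether right or wrong.
    return [
        Hypothesis(read_indexed, "out-of-bounds read via attacker-chosen index"),
        Hypothesis(read_first, "out-of-bounds read on short input"),
    ]


def debate(hyp: Hypothesis) -> bool:
    # Stand-in for debater agents: in this toy, every claim is judged worth proving.
    return True


def prove(hyp: Hypothesis, candidates: Iterable[bytes]) -> Optional[bytes]:
    # Stand-in for prover agents: only a reproduced crash counts as proof.
    for data in candidates:
        try:
            hyp.target(data)
        except IndexError:
            return data
    return None


def run_pipeline() -> List[Finding]:
    candidates = [b"\x00", b"\x01a", b"\x05ab"]
    findings = []
    for hyp in audit():
        if not debate(hyp):
            continue
        trigger = prove(hyp, candidates)
        if trigger is not None:  # unproven hypotheses never reach a human
            findings.append(Finding(hyp.claim, trigger))
    return findings


findings = run_pipeline()
for f in findings:
    print(f.claim, f.trigger)
```

In this toy run the auditor raises two hypotheses, but only the one against read_indexed survives, because it is the only one for which a crashing input is found. The wrong guess about read_first is dropped silently, which is the noise-reduction property the pipeline is built around.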

A Benchmark Jump That Shows Rapid Progress

MDASH scored 96.55% on UC Berkeley's CyberGym benchmark, a demanding suite of 1,507 real-world vulnerability tasks across 188 open-source projects. What makes that figure impressive is the trajectory: it is up from 88.45% just three weeks earlier, at the system's May 12 reveal. A gain of 8.1 percentage points in three weeks signals how fast defensive AI tooling is maturing.
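The arithmetic behind that jump is worth spelling out. The short Python sketch below uses only the figures quoted above, plus one labeled assumption: that the score is a simple pass rate over all 1,507 tasks, which the announcement coverage does not state. Under that assumption, the task counts are approximations, not reported numbers.

```python
TOTAL_TASKS = 1507                       # CyberGym task count quoted above
earlier_pct, latest_pct = 88.45, 96.55   # May 12 score and Build 2026 score

# Gain in percentage points between the two reported scores.
gain_points = round(latest_pct - earlier_pct, 2)

# Share of tasks missed at each point, and how much that share shrank.
earlier_miss = round(100 - earlier_pct, 2)
latest_miss = round(100 - latest_pct, 2)
miss_reduction_pct = round((earlier_miss - latest_miss) / earlier_miss * 100, 1)

# ASSUMPTION: score = solved tasks / total tasks. Counts are estimates only.
approx_solved_earlier = round(TOTAL_TASKS * earlier_pct / 100)
approx_solved_latest = round(TOTAL_TASKS * latest_pct / 100)

print(gain_points, earlier_miss, latest_miss, miss_reduction_pct)
print(approx_solved_earlier, approx_solved_latest)
```

Read this way, the miss rate fell from 11.55% to 3.45%, a reduction of roughly 70% in unsolved tasks, which says more about the pace of improvement than the headline gain does on its own.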

Why "Proving" a Vulnerability Matters So Much

If you have ever managed a security backlog, you know the real enemy is noise. A scanner that flags a thousand maybe-bugs just buries your team. Because a prover agent must build a working trigger before anything is escalated, hypotheses that cannot be reproduced never reach a human, which attacks the false-positive problem at its source. Defenders get a confirmed, reproducible finding instead of a hunch, and that should shorten the gap between finding a bug and fixing it.

Findings Flow Straight Into Defender and Copilot Autofix

The Build 2026 announcement also made MDASH's results easier to act on. Findings now flow natively into the Defender portal, enriched with production risk signals like internet exposure and data sensitivity, so teams can triage by real-world impact. Microsoft also connected GitHub Copilot Autofix, giving developers AI-assisted remediation without leaving their tools. Alongside this, the Defender and GitHub Code Security integration reached general availability, and a new Defender model-scanning capability can vet AI models before deployment.
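What triaging by real-world impact could look like is easy to sketch, with the caveat that this is an illustration and not Defender's actual data model or ranking logic. The EnrichedFinding class, its field names, the 0-to-3 sensitivity scale, and the sample findings are all hypothetical. The sketch simply orders confirmed findings so that internet-exposed ones come first, with data sensitivity as the tiebreaker.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class EnrichedFinding:
    title: str
    internet_exposed: bool   # hypothetical signal: reachable from the internet?
    data_sensitivity: int    # hypothetical scale: 0 (public) to 3 (highly sensitive)


def triage_key(finding: EnrichedFinding) -> Tuple[bool, int]:
    # Compare exposure first, then sensitivity. This ordering is an assumption.
    return (finding.internet_exposed, finding.data_sensitivity)


def triage(findings: List[EnrichedFinding]) -> List[EnrichedFinding]:
    # Highest-risk findings first.
    return sorted(findings, key=triage_key, reverse=True)


backlog = [
    EnrichedFinding("heap overflow in image parser", True, 1),
    EnrichedFinding("SQL injection in admin report", False, 3),
    EnrichedFinding("auth bypass in payments API", True, 3),
]

queue = triage(backlog)
for item in queue:
    print(item.title)
```

A lexicographic sort key is used here instead of a weighted score so that the sketch does not invent weights. A real deployment would presumably blend more signals than these two.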

This is a textbook win for the defenders. AI is being used to discover and prove vulnerabilities, route them into the workflows security teams already use, and even draft the fix. As local and cloud AI keep advancing, it is genuinely reassuring to see that much of that capability is being aimed squarely at making software safer.

Sources: Microsoft Security Blog (June 2, 2026); Tech Times (June 3, 2026); Windows Report (June 2026).