Microsoft Debuts Bug Looking 100-Agent AI System


Synthetic Intelligence & Machine Studying
,
Subsequent-Technology Applied sciences & Safe Growth
,
The Way forward for AI & Cybersecurity

Computing Big Touts Multi-Agentic ‘MDASH’ Strategy as Superior to Single Fashions

Microsoft Debuts Bug Hunting 100-Agent AI System
Picture: Samuel Boivin/Shutterstock

Microsoft says its new strategy to discovering vulnerabilities with synthetic intelligence brokers outclasses the only fashions touted by Anthropic and OpenAI.

See Additionally: Context Drives Security in Agentic AI Era

The computing giant in a Tuesday blog post mentioned it orchestrated greater than 100 specialised AI brokers “throughout an ensemble of frontier and distilled fashions” to find 16 new vulnerabilities within the Home windows networking and authentication stack.

The corporate refers back to the “multi-model agentic scanning harness” system as MDASH.

“The strategic implication is evident: AI vulnerability discovery has crossed from analysis curiosity into production-grade protection at enterprise scale, and the sturdy benefit lies within the agentic system across the mannequin moderately than any single mannequin itself,” wrote Taesoo Kim, vice chairman of safety analysis at Microsoft.

Of the 16 vulnerabilities discovered, 4 are “essential distant code execution flaws in parts such because the Home windows kernel TCP/IP stack” and the IKEv2 key administration protocol, the corporate reported. Microsoft patched the failings as a part of its most up-to-date month-to-month dump of software program fixes. AI is accelerating “the size and velocity of vulnerability discovery,” wrote Tom Gallagher, who leads Microsoft’s Microsoft Safety Response Heart, in a word accompanying Might’s Patch Tuesday publication.

Microsoft’s agentic strategy contrasts with Anthropic and OpenAI, which have touted the bug-finding properties of their particular person Mythos and GPT 5.5 fashions, respectively. MDASH scored an 88.4% success price on the College of California-Berkeley developed CyberGym benchmark, a technique for testing AI talents on precise vulnerabilities from manufacturing software program. Mythos presently scores 83.1% and GPT 5.5 scores 81.8%. The scores are primarily based on self-reporting from firms.

Microsoft did not disclose what fashions it used nor who made them. It famously has had an in depth relationship with OpenAI, built-in GPT fashions throughout its merchandise. However that relationship has frayed and Microsoft has pressed improvement of its personal proprietary fashions, announcing in April three new “MAI” fashions, MAI-Transcribe-1, MAI-Voice-1 and MAI-Picture-2.

Kim touted the agentic strategy as superior since “no single mannequin is finest at each stage.” The brokers fulfilled completely different roles equivalent to “auditor,” “debater” and “prover.”

“We don’t count on one immediate to do all the things; we don’t count on one agent to acknowledge, validate and exploit a bug in a single move,” he mentioned. Disagreement between underlying fashions itself can act as a sign, he wrote. “When an auditor flags one thing as suspect and the debater can’t refute it, that discovering’s posterior credibility goes up,” he mentioned.

MDASH is barely being utilized internally by Microsoft engineers and examined by a “small set of consumers as a part of a restricted non-public preview.”

Microsoft talked about no plans of an upcoming public launch, positioning MDASH as a analysis and “production-grade protection at enterprise scale.”