Item: Microsoft
Author: Karl Strauchman

At Build 2026 in San Francisco, Microsoft shipped seven first-party MAI models, and the one that matters is MAI-Thinking-1: a 35B-active mixture-of-experts reasoner with a 256K context window that posts 97% on AIME 2025 and 53% on SWE-Bench Pro. The keynote framed the family as a portfolio. The benchmark sheet framed it as a divorce filing.

The 53% on SWE-Bench Pro is the number to sit with. That's the same coding benchmark Anthropic's Claude Opus 4.6 anchors, and Microsoft is putting a mid-sized in-house model within striking distance of it while explicitly noting the weights weren't distilled from OpenAI. "This is all about long term self-sufficiency for Microsoft and our partners. It's about models you can trust," said Mustafa Suleyman, CEO of Microsoft AI. The provenance line is doing real work: it's the argument enterprise procurement teams have been waiting for.

The rest of the family fills out the stack. MAI-Code-1-Flash, a 5B-parameter model in roughly the Claude Haiku size class, lands at 51% on SWE-Bench Pro and ships today as the default in VS Code and GitHub Copilot CLI. MAI-Image-2.5 scores 1403±9 on the Arena Image Edit leaderboard as of June 2, above Gemini 3 Pro Image Preview 2K at 1388±3, and sits at #3 on text-to-image and #2 on image-to-image. It's live in PowerPoint today, rolling out to OneDrive. MAI-Transcribe-1.5 claims state-of-the-art word error rate on FLEURS across 43 languages at a 5× speed advantage. MAI-Voice-2 covers 15 languages, with a Flash variant forthcoming.

The hardware story rhymes with the model story. Satya Nadella cited a 30% generational improvement on Maia 200, Microsoft's accelerator, with a 1.4× performance-per-watt gain running the MAI stack end-to-end versus the Nvidia GB200 cluster. CNBC reported the MAI models surpassed GPT-5.5 on McKinsey's internal benchmarks at "a tenfold reduction in costs."

That cost claim is where the strategy becomes legible. Microsoft spent the last four years as OpenAI's primary distribution channel and its largest customer; the MAI launch reframes it as a hyperscaler with a full first-party stack, distributed through Azure AI Foundry and externally via OpenRouter, Fireworks, and Baseten. The OpenAI partnership isn't ending. It's being made optional.

There's a structural precedent worth naming: Apple's 2020 transition off Intel silicon, where a partner of roughly fifteen years became, over two product cycles, a vendor whose absence the platform was engineered to survive. Microsoft isn't there yet. But shipping seven models on the same day, including a reasoner trained from scratch and a coding model that's already the VS Code default, is the move you make when you no longer want a single supplier to be load-bearing. The provenance pitch is the soft version of that argument. The benchmark sheet is the hard one.

Microsoft ships seven MAI models at Build, puts a 35B-active reasoner alongside Opus 4.6 on SWE-Bench Pro
