Infrastructure · JUNE 7, 2026
Apple licenses a 1.2T-parameter Gemini MoE for Siri, runs it on B200s inside Private Cloud Compute
Bloomberg, TechTimes and Google Cloud's own CEO line up the same architecture ahead of Monday's WWDC keynote: a custom mixture-of-experts Gemini, ~$1B/year, weights sitting on Nvidia B200s inside Apple-controlled enclaves.
Apple has licensed a custom ~1.2-trillion-parameter mixture-of-experts Gemini model from Google for roughly $1 billion a year, and will serve the weights itself, on Nvidia B200 chips, inside its own Private Cloud Compute enclaves. Bloomberg's Mark Gurman reported the architecture this weekend; TechTimes filled in the silicon; Google Cloud CEO Thomas Kurian confirmed the partnership on stage at Google Cloud Next '26. Tim Cook walks out at 10 a.m. PT Monday to introduce it as the spine of a rebuilt Siri.
This is the deal Apple spent two years insisting it would never need to do.
The structural read is straightforward. Apple's on-device Foundation Models, announced at WWDC 2025 and gated to the iPhone 15 Pro and up, handle latency-sensitive and offline work. Anything heavier now routes to a Google frontier model that Apple rents but physically controls. Gene Munster of Deepwater Asset Management puts the multi-year value at up to $5 billion. The recipient is the default assistant on roughly a billion devices.
The hosting topology is the part worth staring at. Google's weights, on Nvidia's hardware, inside Apple's hardware-isolated cloud, with an ACM conference paper landing in June 2026 to bless the privacy claims. Each of the three companies gets what it actually wanted: Google books the largest external Gemini contract in its history, Nvidia sells B200s into a customer that has spent a decade publicly preferring its own silicon, and Apple gets to tell users that no prompt ever leaves an Apple-controlled boundary. The price of that story is a billion dollars a year and the admission that the in-house model wasn't going to close the gap in time.
Inside Cupertino the politics track the architecture. Mike Rockwell has taken command of Siri engineering; John Giannandrea, who ran AI through the Apple Intelligence rollout, is out of the seat. Cook steps down September 1. The keynote Monday is the last one he delivers as CEO, and the headline product is a Google model.
The optionality is real but constrained. Apple's Extensions API will let users designate Claude or ChatGPT as the default engine behind Writing Tools and Image Playground. Gemini is what ships in the box. iOS 27 drops the iPhone 11; macOS 27 drops the 2019 16-inch MacBook Pro, the 2020 13-inch MacBook Pro, the 2020 27-inch iMac and the 2019 Mac Pro, with Tom's Guide noting the Neural Engine absence as the cited reason. The hardware floor rises because the software stack now assumes it.
For two years the open-weights side argued that mixture-of-experts was the structural trade that lets you ship trillion-parameter capability at sub-trillion serving cost. Apple just signed the largest check in consumer AI to agree with them, on someone else's weights.
Sources
- https://www.bloomberg.com/news/newsletters/2026-06-07/wwdc-2026-apple-s-secret-meeting-that-led-it-to-take-ai-seriously-ios-27
- https://techcrunch.com/2026/06/06/what-to-expect-from-wwdc-2026-siris-highly-anticipated-revamp-and-apple-intelligence-updates/
- https://gizmodo.com/what-were-expecting-at-apple-wwdc-2026-2000768245
- https://www.tomsguide.com/news/live/wwdc-2026-live-news-updates
- https://www.techtimes.com/articles/317902/20260606/wwdc-2026-opens-monday-gemini-powers-rebuilt-siri-iphone-11-faces-ios-27-cut.htm