iPhone chips have apparently gotten so powerful that they can now run small AI models directly on the phone. Apple’s response: the cloud is now optional.
At WWDC — Apple’s annual developer conference that’s happening this week — the company announced its freshly expanded Foundation Models framework. It’ll let iOS and macOS apps run AI inference directly on the device, with no API key, CDN, or per-token cost.
Until now, adding AI to an app has meant sending a request to a cloud-hosted model (the likes of OpenAI, Anthropic, Google, whoever) and waiting for the answer to come back. Apple is now saying there’s no reason to do it this way anymore because the model is already on the phone.
Plus, what Apple is offering is really tempting. Devs with fewer than 2 million first-time App Store downloads get free access to the cloud tier of the next-gen Foundation Models.
Always Read the Fine Print
What you may not have known is that Apple’s most powerful cloud model runs on Google’s servers and was trained using Google’s own AI.
Apple signed a multi-year deal for that arrangement, reportedly around $1 billion a year. So the “on-device AI” angle isn’t quite the whole story, because what Apple is actually describing is a three-tier system: The device handles what it can, Apple’s Private Cloud Compute handles what’s next (which is notably not on-device), and then Google Cloud handles the rest.
And the tier routing is automatic: neither developers, hosts, nor users choose which tier handles a request.
“We believe privacy in AI is non-negotiable,” Craig Federighi said at the keynote, adding that “data is only used to execute your request, and outside experts can continue to verify this promise at any time.”
While Apple is reportedly keeping the “Private Cloud Compute” name, Google’s infra is also part of that tier. So, yeah: When Apple says private, they mean private from you, not private from Google.
Where Does Hosting Fit?
This obviously isn’t the first hosting-adjacent partnership that raises a couple of questions.
Look at the Broadcom/VMware acquisition that occurred in 2023. Companies that had been using VMware for years one day woke up to pricing increases of 800%, 1,000%, even 1,500%. Broadcom also replaced existing reseller agreements with an invite-only partner program.
[Figure: VMware Licensing Costs Exploded After Broadcom's Acquisition. Source: Software Guide]
None of VMware’s customers had a say in these terms until it was time to re-sign a contract, and there really weren’t many alternatives, so people felt stuck, to say the least.
Apple and Google’s arrangement isn’t quite the same, but it’s still a story of two companies deciding where user data goes, and once again, hosting providers are treated as irrelevant to that decision, at least for now.
But it looks like there’s an opening coming: In addition to its Foundation Models framework announcement, Apple also confirmed that it will go open source later this summer.
If that means developers can run Apple’s models on their own servers (and that's a big "if" right now, because the details are scarce), then those AI inference workloads have to live somewhere, right?
That’s a hosting problem, but luckily, hosting providers are very good at hosting problems.
