There was a time when AI was something companies were experimenting with and showing off with genuine excitement: a demo here, a chatbot there. But it looks like that era is over.
“Businesses are very quickly turning AI from an interesting bar trick into mission critical,” said Mike Campbell, CEO of Fusion Risk Management.
And once AI becomes mission critical, it inherits the same issues as every other mission-critical system: What happens when it goes down?
Think about it. In those early days, AI was sitting on top of systems, not embedded inside of them. Agents are now handling customer interactions, processing transactions, even making logistics decisions.
And unlike a website outage — where you see the 404 right away and can at least redirect traffic — a problem with the AI can look a lot more like a slow burn.
#1: Your AI Agents Have a Single Point of Failure, Too
Cloud teams have been asking the same question for years: What happens when your provider goes down? AI is forcing that question back to the surface.
Cloud resilience teams in enterprise tech know the checklist by heart: What platform are you running on? What happens if it goes down? Do you have backups, or are you just waiting for it to come back?
These talks almost write themselves, especially with the handful of outages the internet experienced late last year. First AWS went down, then Cloudflare followed, and just when it looked like things were stabilizing, another major provider like Azure or GCP completely stumbled.
With AI, and now agentic AI, Campbell says we’re basically adding another layer of dependency to that existing problem.

“The agents are operating on very specific platforms,” Campbell said. “You have, in essence, the same discussion with additional players.”
Businesses that have already integrated AI agents into their operations — customer service, logistics, internal tooling, you name it — are now realizing that those agents have to come from somewhere. They run on specific infrastructure and are connected to specific providers.
If that provider goes down — outage, maintenance, whatever — your agents go down with it.
One risk Campbell raised that doesn’t get talked about enough is that data centers are deliberately holding AI performance back because they need that energy to run their massive cooling systems. “Self-imposed degradation of performance” is how he put it.
The question businesses are only starting to ask is whether their AI setup can fail over to something else, the same way they’d want their databases or their website infrastructure to. For most, Campbell says the honest answer is no.
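For readers who want to picture what that failover could look like in practice, here is a minimal Python sketch. It is an illustration under stated assumptions, not how Fusion or any vendor actually does it: the `Provider` wrapper, the `ProviderUnavailable` exception, and the two stand-in functions are hypothetical names invented for this example, and a real setup would wrap each vendor's own SDK call inside `complete`.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


class ProviderUnavailable(Exception):
    """Hypothetical error a provider wrapper raises when its platform cannot serve a request."""


@dataclass
class Provider:
    name: str                       # label only, e.g. "primary" or "secondary"
    complete: Callable[[str], str]  # hypothetical wrapper around a vendor SDK call


def run_with_failover(prompt: str, providers: Sequence[Provider]) -> tuple[str, str]:
    """Try each provider in order and return (provider_name, answer) from the first that works."""
    errors = []
    for provider in providers:
        try:
            return provider.name, provider.complete(prompt)
        except ProviderUnavailable as exc:
            # Record the failure and fall through to the next provider in the list.
            errors.append(f"{provider.name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Stand-in callables so the sketch runs without any network access.
def primary_down(prompt: str) -> str:
    raise ProviderUnavailable("simulated outage")


def secondary_ok(prompt: str) -> str:
    return f"handled: {prompt}"


used, answer = run_with_failover(
    "Where is order 1234?",
    [Provider("primary", primary_down), Provider("secondary", secondary_ok)],
)
print(used, "->", answer)
```

The ordered list is the whole design: the agent's caller never needs to know which platform answered, only that one did. Anything beyond that, such as health checks, retries, or keeping prompts portable across vendors, is left out of this sketch.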
#2: Redundancy Just Got Easier, and That Changes Things
Single points of failure are so last year.
Campbell says this is where things get a little ironic: AI is the new thing that needs to be made redundant, but it’s also the thing that’s making redundancy dramatically cheaper to build.
He said the cost of supporting multiple platforms within his own company, Fusion, has dropped by something in the range of 99% compared to six months ago. The reason: AI can now handle the translation work between systems that used to require a lot more effort and know-how. Moving flows and functionalities between platforms is something you could essentially hand off to AI agents.

“Having a set of agents spun up and being your primary mechanism on one platform, OpenAI spinning off these minimum viable components of my agentic answer over to Anthropic — very straightforward,” he said.
The tooling problem that made multi-platform redundancy so expensive is starting to fade. You no longer need a massive enterprise budget to run across multiple environments. And since it’s cheaper and easier to do, customers are going to begin expecting it, which in turn puts pressure on the whole idea of vendor lock-in.
Locking customers in made sense when switching was painful and expensive. But if moving between platforms gets easier, well, what’s the point of discouraging migration?
Customers will be free to come and go as they please, and hosting providers are going to have to give them a reason to stay without relying on vendor lock-in or long-term contracts as the gatekeeper.
#3: Platforms That Concentrate Risk Are Running Out of Time
For anyone in the cloud/hosting industry listening, Campbell had some pointed advice: If you’ve become so essential to businesses that your downtime creates problems, Big Brother is going to start paying attention to you.
“When you become, as a platform, so critical to so many people who are also critical to the operating of an economy, that’s when you start becoming in the crosshairs of the regulators,” he said.
The U.K. banking sector is already there. Regulators have been fairly explicit that depending on a single cloud provider and just waiting for it to come back online is not an acceptable resilience strategy for a bank.

Campbell’s argument is that platforms have a choice: Get ahead of it by actively helping customers build resilience plans and reducing concentration risk, or wait for the regulators to force the issue on their own time.
“They can help spread that risk or they can become a concentrator of that risk,” he said. “Concentrating it is going to become more and more visible — not just in individual conversations, but as a much more important focal point for government agencies, for regulators.”
It’s an interesting point for hosting providers in particular. The businesses that are figuring out resilience are going to make redundancy easier, right? Becoming a part of that solution — rather than a bottleneck — is probably the better place to be.




