The confusion is understandable
When an engineering team says "we want to be multi-cloud," they almost always mean one of two different things — and they usually don't know which.
The first thing they might mean: "We want resilience. If AWS goes down, we want to keep running." That's a high availability problem. Because the most common serious outages are regional rather than provider-wide, the right solution is usually multi-region, not multi-cloud.
The second thing they might mean: "We don't want to be locked into one vendor." That's a commercial and strategic concern. Multi-cloud can address it — but at a significant operational cost.
Conflating the two leads to architectures that are expensive, complex, and still don't actually achieve what the team wanted.
What multi-region solves
Multi-region means running your workloads across two or more geographic regions within a single cloud provider — for example, AWS us-east-1 and eu-west-1.
This protects you against the most common causes of serious outages: regional network failures, availability zone degradation, and localised infrastructure incidents. AWS, Azure, and GCP each publish per-service SLAs and document their cross-region redundancy patterns, and the tooling to implement it (Route 53 health checks, Azure Traffic Manager, GCP global load balancing) is mature and well-understood.
For most teams asking about "multi-cloud for resilience," multi-region on a single provider is the correct answer. It is usually simpler and cheaper than the multi-cloud alternative, and because it has fewer moving parts it tends to be more reliable in practice.
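To make the failover idea concrete, here is a minimal sketch of the decision a health-check-based router makes: send traffic to the preferred region while it is healthy, otherwise fall back to the next one. It does not call Route 53, Traffic Manager, or any real cloud API. The `RegionEndpoint` type, the `pick_endpoint` function, and the example.com URLs are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class RegionEndpoint:
    """One regional deployment. Names and URLs here are hypothetical."""
    region: str
    url: str
    priority: int  # lower number = preferred region


def pick_endpoint(
    endpoints: Sequence[RegionEndpoint],
    is_healthy: Callable[[RegionEndpoint], bool],
) -> Optional[RegionEndpoint]:
    """Return the healthy endpoint with the lowest priority number, or None."""
    for endpoint in sorted(endpoints, key=lambda e: e.priority):
        if is_healthy(endpoint):
            return endpoint
    return None


ENDPOINTS = [
    RegionEndpoint("us-east-1", "https://us-east-1.example.com", priority=1),
    RegionEndpoint("eu-west-1", "https://eu-west-1.example.com", priority=2),
]

if __name__ == "__main__":
    # Simulate a regional outage: us-east-1 fails its health check.
    down = {"us-east-1"}
    chosen = pick_endpoint(ENDPOINTS, lambda e: e.region not in down)
    print(chosen.region if chosen else "no healthy region")
```

In a real deployment the health check and the routing decision live in the provider's DNS or load-balancing service rather than in your own code. The point of the sketch is only that the logic is small and sits entirely within one provider.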
What multi-cloud actually means
Multi-cloud means your workloads run across two or more cloud providers simultaneously — for example, some services on AWS and others on Azure.
This is architecturally demanding. You now have:
- Two sets of IAM and access control systems to manage
- Two billing models, two cost optimisation strategies
- Two observability stacks, or a third-party solution that abstracts both
- Two sets of provider-specific services your team needs to understand
- Terraform (or Pulumi) configurations that must work across both provider APIs
- Deployment pipelines that target multiple environments with different auth models
None of this is insurmountable. But it is real overhead, and it compounds as your system grows.
When multi-cloud genuinely makes sense
There are legitimate reasons to run multi-cloud — but they're more specific than "resilience":
Regulatory requirements. Some industries require data to be processed or stored with specific providers in specific regions. A single cloud may not cover all the jurisdictions you need.
Best-of-breed services. Azure's integration with Active Directory (the cloud directory service is now called Microsoft Entra ID) is genuinely superior for enterprises already on Microsoft. GCP's BigQuery is meaningfully better for certain analytics workloads. If specific services on specific clouds give you real advantages, the overhead may be worth it.
Avoiding commercial lock-in at scale. If you're spending $2M+ per year on a single cloud provider, the negotiating position you gain by being able to migrate is worth something. Below that threshold, the engineering cost outweighs the leverage.
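One way to sanity-check the lock-in argument for your own numbers is a simple break-even calculation. The sketch below is an illustration, not how the threshold above was derived. It assumes that a credible ability to migrate earns you some percentage discount, and that staying portable costs a fixed amount of engineering effort per year. The function names, the 10% discount, and the $200,000 portability cost are all hypothetical inputs that you should replace with your own estimates.

```python
def leverage_net_benefit(annual_spend, discount_percent, annual_portability_cost):
    """Yearly discount gained from leverage minus the yearly cost of staying portable."""
    return annual_spend * discount_percent / 100 - annual_portability_cost


def break_even_spend(discount_percent, annual_portability_cost):
    """Annual cloud spend at which the discount exactly pays for portability."""
    if discount_percent <= 0:
        raise ValueError("discount_percent must be positive")
    return annual_portability_cost * 100 / discount_percent


if __name__ == "__main__":
    # Hypothetical inputs: a 10% negotiated discount, and $200,000 per year
    # of engineering effort to keep workloads portable.
    discount = 10
    portability_cost = 200_000

    print(break_even_spend(discount, portability_cost))
    for spend in (1_000_000, 3_000_000):
        print(spend, leverage_net_benefit(spend, discount, portability_cost))
```

With these assumed inputs, the net benefit is negative at $1M of annual spend and positive at $3M. The shape of the result matters more than the exact figures: the leverage scales with spend, while much of the engineering cost does not.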
Acquisition and merger scenarios. You acquired a company that runs on a different cloud. Multi-cloud is now your reality whether you planned for it or not.
The honest operational cost
Before committing to multi-cloud, run this exercise: count the cloud-specific managed services you currently use. On AWS, that list might include RDS, EKS, Lambda, S3 event triggers, SQS, SNS, and CloudFront. Every one of those is a service you'll need to either replace with a cloud-agnostic alternative or duplicate on the second provider.
Teams that do this exercise honestly often discover they are far more deeply embedded in their current provider than they realised. That's not a criticism — using managed services is the right call in most cases. It just means the cost of going multi-cloud is higher than expected.
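The exercise is easy to run as a short script. The sketch below records each managed service with the plan you would follow for it, then counts services per plan. The service names come from the list above, but the `Plan` and `ManagedService` types and the plan assigned to each service are hypothetical examples, not recommendations.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Plan(Enum):
    REPLACE = "replace with a cloud-agnostic alternative"
    DUPLICATE = "duplicate on the second provider"
    UNDECIDED = "undecided"


@dataclass(frozen=True)
class ManagedService:
    name: str
    plan: Plan = Plan.UNDECIDED


def summarise(inventory):
    """Count services per plan so the size of the migration is visible."""
    return Counter(service.plan for service in inventory)


# Plans below are placeholders; fill them in for your own system.
INVENTORY = [
    ManagedService("RDS", Plan.REPLACE),
    ManagedService("EKS", Plan.DUPLICATE),
    ManagedService("Lambda"),
    ManagedService("S3 event triggers"),
    ManagedService("SQS", Plan.REPLACE),
    ManagedService("SNS", Plan.REPLACE),
    ManagedService("CloudFront", Plan.DUPLICATE),
]

if __name__ == "__main__":
    print(f"{len(INVENTORY)} cloud-specific managed services in use")
    for plan, count in summarise(INVENTORY).items():
        print(f"{count} to {plan.value}")
```

The services left as undecided are usually the most informative part of the output: they are the dependencies nobody has a migration answer for yet.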
What we recommend
Start with the question: what failure scenario are you actually protecting against?
If the answer is "a region going down," implement multi-region on your current provider. It will take a fraction of the time and cost, and it will solve the actual problem.
If the answer is "vendor lock-in" or "regulatory requirements" or "specific services on a different cloud," then multi-cloud may be the right path. Go in with a clear inventory of what you're moving, a realistic estimate of the migration and ongoing operational cost, and a specific list of what you're gaining in return.
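The two answers above amount to a small decision table, sketched here in code. This only restates the recommendation in this section. The `Concern` and `Recommendation` names are hypothetical, and a real decision will involve more inputs than a single enum value.

```python
from enum import Enum


class Concern(Enum):
    REGION_OUTAGE = "a region going down"
    VENDOR_LOCK_IN = "vendor lock-in"
    REGULATORY = "regulatory requirements"
    SPECIFIC_SERVICES = "specific services on a different cloud"


class Recommendation(Enum):
    MULTI_REGION = "implement multi-region on your current provider"
    CONSIDER_MULTI_CLOUD = "multi-cloud may be the right path; inventory costs and gains first"


# Concerns for which multi-cloud is worth evaluating, per the text above.
MULTI_CLOUD_CONCERNS = {
    Concern.VENDOR_LOCK_IN,
    Concern.REGULATORY,
    Concern.SPECIFIC_SERVICES,
}


def recommend(concern: Concern) -> Recommendation:
    """Map the risk you are actually protecting against to a starting point."""
    if concern in MULTI_CLOUD_CONCERNS:
        return Recommendation.CONSIDER_MULTI_CLOUD
    return Recommendation.MULTI_REGION


if __name__ == "__main__":
    for concern in Concern:
        print(f"{concern.value}: {recommend(concern).value}")
```

Note that the function takes a specific concern as input. If you cannot name one, that is the step to work on before choosing an architecture.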
Architecture decisions made out of vague anxiety tend to create complexity without creating safety. Be specific about the risk you're solving for, and the right architecture usually becomes obvious.