In Today’s AI World, Proactive Infrastructure Planning is Critical.

Written by Chris Kirk | May 22, 2026 10:29:21 AM

Capacity, Redundancy, and Risk Reduction in cloud: a practitioner's perspective

Today’s Datacentre Challenges

I’ll start with a rather blunt one-liner: on-premises infrastructure gets a bad reputation in cloud-first conversations. It's framed as legacy, the thing you're considering migrating away from, weighed down by a lifetime of technical debt. That framing is wrong, and it leads to poor architectural decisions.

On-premises compute has genuine, enduring strengths, especially when it comes to data sovereignty, predictable capital costs, ultra-low latency for workloads that demand it, and full control over the hardware stack. For certain regulated industries, certain workload profiles, and certain cost structures, running your own infrastructure isn't a throwback to the past. In those scenarios, it's the right answer for today.

The honest challenge with on-premises isn't that it's inferior. It's that it's capacity-constrained by design, because you can only run what you've bought. And right now, buying is harder and more expensive than it has been in years.

What began as a broad disruption across automotive, consumer electronics, and enterprise hardware has matured into something more structural and more concentrated: a severe shortage of the advanced AI-grade silicon and high-bandwidth memory (HBM) that modern infrastructure depends on. Samsung's memory chief warned as recently as April 2026 that significant shortages across memory products are expected to continue through at least 2027. Dell has been blunter still in its assessment: no meaningful relief is expected until 2028.

The numbers behind that assessment deserve attention. DRAM prices climbed over 300% in 2025 as data centre demand consumed around half of global memory production, up from 32% just five years earlier. DDR5 RDIMM costs in particular are projected to surge a further 100% across 2026. TSMC's 2nm fabrication capacity, which produces the most advanced AI chips on the market, is fully booked through 2028. Nvidia's latest GPU generations carry wait-lists stretching well beyond a year, with hyperscalers such as Microsoft, Google, and Meta dominating the allocation queue ahead of enterprise customers.

If you're planning to expand your on-premises AI or compute capacity right now, you're competing in that queue, at those prices, with those lead times. That's not a reason to abandon on-premises. It is, however, a compelling reason to be strategic about what you run there, and to have cloud available as a genuine, tested, elastic extension of your estate.
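To make the lead-time point concrete, here is a minimal sketch of a runway calculation. It assumes demand grows at a steady compound monthly rate, which is a simplification, and the function names, the capacity units and the example figures are all hypothetical rather than taken from any real estate.

```python
import math


def months_until_full(used: float, capacity: float, monthly_growth: float) -> float:
    """Months until compounding demand reaches installed capacity."""
    if used <= 0 or capacity <= 0:
        raise ValueError("used and capacity must be positive")
    if used >= capacity:
        return 0.0  # the estate is already full
    if monthly_growth <= 0:
        return math.inf  # flat or shrinking demand never fills the estate
    # Solve used * (1 + g)^m = capacity for m.
    return math.log(capacity / used) / math.log(1 + monthly_growth)


def order_is_overdue(used: float, capacity: float,
                     monthly_growth: float, lead_time_months: float) -> bool:
    """True when hardware ordered today would arrive after capacity runs out."""
    return months_until_full(used, capacity, monthly_growth) < lead_time_months


if __name__ == "__main__":
    # Hypothetical estate: 60 of 100 units in use, growing 4% per month.
    runway = months_until_full(used=60, capacity=100, monthly_growth=0.04)
    print(f"Runway: {runway:.1f} months")
    for lead_time in (12, 18):
        overdue = order_is_overdue(60, 100, 0.04, lead_time)
        print(f"Lead time {lead_time} months, order overdue: {overdue}")
```

If the runway is shorter than the procurement lead time, new hardware cannot arrive in time, and that gap is the one a tested cloud extension is meant to cover.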

Cloud Regional Capacity Is Not Infinite, And This Isn’t a Secret

This is no secret: at the regional level, Azure is subject to the same supply chain physics as everyone else. Behind the resources and subscriptions, it's like any other datacentre: racks, hosts, and power limits in physical buildings. And Microsoft is buying chips from the same constrained global supply as you are.

The difference, however, is that the hyperscalers do have a structural advantage: purchasing power. Microsoft, Google, and Meta are committing close to $700 billion combined in capital expenditure in 2026, the majority for AI infrastructure. They can secure allocation that smaller buyers cannot. But even that firepower has limits. TSMC's 3nm node - which powers today's most advanced AI chips including Nvidia's latest generations - has been running above 100% utilisation, with maintenance being deferred to sustain output.

The consequence for enterprise customers is real and immediate. Certain niche Azure VM SKUs - particularly GPU-accelerated instances - carry wait-lists or are quota-restricted in specific regions. Industry analysts project that 30 to 50% of planned 2026 data centre capacity will slip to 2028, driven by a combination of chip constraints, power grid connection delays, and raw material shortages including specialised gases essential to semiconductor fabrication. If your cloud strategy assumes unlimited, instant access to high-performance compute in your preferred region, the current reality is more complicated than that.

The point isn't to undermine confidence in Azure - it remains the most capable and mature enterprise cloud platform available, and Microsoft's investment commitments are genuine. The point is that cloud capacity should be understood, planned for, and architected accordingly - not assumed. Which brings us to what a well-designed hybrid estate actually looks like.

The Hybrid Model: On-Premises Where It Makes Sense, Multi-Region High-Availability Cloud Where It Doesn't

The right architecture isn't cloud-first or on-premises-first. It's workload-first. That means making a deliberate decision for each class of workload: what belongs on infrastructure you own, and what belongs on infrastructure you rent elastically.

On-premises remains the right home for workloads with consistent, predictable demand profiles; data that carries sovereignty or compliance constraints; latency-sensitive processing that cannot tolerate a network hop; and compute that you've already invested in and is running efficiently.

Cloud - specifically Azure - earns its place as the elastic layer: the capacity you reach for when demand spikes, when you need to stand up new workloads without a significant capital commitment, when you require geographic reach your datacentre can't provide, or when you need access to AI-grade GPU compute that is simply unavailable to procure on-premises right now. Given that the hardware shortage is expected to persist through 2027 and possibly 2028, cloud access to GPU capacity may be the only viable path for many organisations to run AI workloads at meaningful scale in the near term.
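The workload-first decision can be sketched as a small classifier. This is an illustration only: the `Workload` fields are hypothetical names for the criteria listed above, and the precedence order (hard constraints first, then cloud-only needs, then demand profile) is an assumption, not a rule the article prescribes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    name: str
    steady_demand: bool            # consistent, predictable demand profile
    sovereignty_constrained: bool  # data carries sovereignty or compliance limits
    latency_critical: bool         # cannot tolerate a network hop
    needs_scarce_gpu: bool         # needs GPU compute that cannot be procured on-premises
    needs_geographic_reach: bool   # needs locations the datacentre cannot provide


def place(w: Workload) -> str:
    """Return 'on-premises' or 'cloud' for a workload (assumed precedence)."""
    # Hard constraints win first: these workloads stay on owned infrastructure.
    if w.sovereignty_constrained or w.latency_critical:
        return "on-premises"
    # Needs that owned infrastructure cannot currently meet go to the elastic layer.
    if w.needs_scarce_gpu or w.needs_geographic_reach:
        return "cloud"
    # Otherwise the demand profile decides: steady stays put, spiky bursts out.
    return "on-premises" if w.steady_demand else "cloud"


if __name__ == "__main__":
    estate = [
        Workload("ledger", True, True, False, False, False),
        Workload("model-training", False, False, False, True, False),
        Workload("seasonal-web", False, False, False, False, False),
    ]
    for w in estate:
        print(f"{w.name}: {place(w)}")
```

A real assessment would weigh cost and existing investment as well. The value of writing it down is that the decision becomes explicit per workload class rather than a blanket policy.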

The critical discipline, however, is that the bursting path must be designed, architected and tested before you need it. Organisations that treat cloud as an emergency overflow can run into trouble: they hit quota limits at exactly the wrong moment and operate without the governance controls that make cloud economically rational. Build the integration intentionally. Know your quotas, and validate them periodically.
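"Know your quotas" can be turned into a periodic check. The sketch below assumes you have already exported current usage and limits from whatever quota reporting you use. The record shape, the quota names, the 20% headroom threshold and the figures are all hypothetical.

```python
def quota_shortfalls(usages, planned_burst, min_headroom=0.2):
    """Return (quota name, remaining units) for quotas a planned burst would exhaust.

    A quota is flagged when the units left after the burst fall below
    min_headroom (a fraction) of its limit.
    """
    flagged = []
    for usage in usages:
        burst = planned_burst.get(usage["name"], 0)
        remaining = usage["limit"] - usage["current"] - burst
        if remaining < min_headroom * usage["limit"]:
            flagged.append((usage["name"], remaining))
    return flagged


if __name__ == "__main__":
    # Hypothetical figures for one region, in vCPUs.
    usages = [
        {"name": "gpu-family-vcpus", "current": 40, "limit": 48},
        {"name": "general-purpose-vcpus", "current": 80, "limit": 200},
    ]
    planned_burst = {"gpu-family-vcpus": 16, "general-purpose-vcpus": 60}
    for name, remaining in quota_shortfalls(usages, planned_burst):
        print(f"{name}: {remaining} units left after burst, raise quota before relying on it")
```

Run on a schedule, a check like this surfaces a shortfall while there is still time to request an increase, not during the incident that triggers the burst.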

Matching Service Tier to Risk Tolerance

Not all workloads carry the same risk profile, and not all Azure services provide the same resilience guarantees. The mistake I see repeatedly is organisations applying a single redundancy model across their entire cloud estate - either over-engineering commodity workloads or, more dangerously, under-engineering critical ones.

Consider the three layers of Azure redundancy and how your services should align to them:

Availability Zones (AZs): Physically separate datacentres within a single Azure region, connected by low-latency, high-bandwidth fibre. Zone-redundant deployments protect against datacentre-level failure. This is the baseline for anything business-critical. Azure services including Azure SQL, Storage, and Service Bus support zone redundancy. Use it by default for production workloads, not as an optional extra.
Region Pairs: Microsoft pairs Azure regions geographically and sequences platform updates to ensure both regions in a pair are not updated simultaneously. Cross-region replication - using services like Azure Site Recovery, geo-redundant storage, or Fabric's built-in replication - should be reserved for workloads where regional failure is a risk you genuinely need to plan for, not just a theoretical one. For high-availability and critical workloads, consider cross-region load balancing across end nodes to enable active-active cross-region architectures.
Platform-managed resilience in PaaS: Services like Microsoft Fabric and Azure AI Foundry abstract much of the infrastructure concern away - but they introduce a different risk: platform dependency. If Microsoft Fabric capacity in your region is constrained or a service update introduces instability, your mitigation options are limited compared to IaaS. Understand the shared responsibility model deeply for every PaaS service you adopt. Know what Microsoft owns and what you own.
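One way to avoid a single redundancy model across the estate is to write the mapping down. The sketch below is illustrative: the three criticality tiers and the wording of each measure are assumptions layered on the redundancy options described above, not an Azure feature or an official classification.

```python
from enum import Enum


class Criticality(Enum):
    COMMODITY = "commodity"
    BUSINESS_CRITICAL = "business-critical"
    MISSION_CRITICAL = "mission-critical"


def redundancy_plan(criticality: Criticality, is_paas: bool = False) -> list:
    """Return the redundancy measures assumed appropriate for a workload tier."""
    plan = []
    if criticality is Criticality.COMMODITY:
        # Avoid over-engineering workloads that can tolerate an outage.
        plan.append("single-region deployment, zone redundancy optional")
    else:
        # Zone redundancy is the baseline for anything business-critical.
        plan.append("zone-redundant deployment across Availability Zones")
    if criticality is Criticality.MISSION_CRITICAL:
        plan.append("cross-region replication with active-active load balancing")
    if is_paas:
        # Platform dependency: be clear about what the provider owns.
        plan.append("documented shared-responsibility review for the platform service")
    return plan


if __name__ == "__main__":
    for tier in Criticality:
        print(tier.value, "->", redundancy_plan(tier, is_paas=True))
```

Keeping the mapping in code or configuration makes it reviewable, so a critical workload deployed at the commodity tier stands out.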

Europe Has More Azure Regions Than Most Customers Realise

Microsoft’s partial answer to regional constraints today is more regional choice. Azure's European footprint has expanded dramatically, but most customers are still architecting around the regional options they used before the pandemic.

The old mental model was simple but limited: North Europe (Ireland) or West Europe (Netherlands), with UK South and UK West for customers who needed in-country residency. Four regions, two obvious pairs. That model is now significantly out of date.

Azure currently operates or has recently opened regions across France (Paris), Germany (Frankfurt), Sweden, Norway, Poland, Italy (Milan), Denmark (Copenhagen), Austria (Vienna), Belgium (Brussels), Spain, Finland, and Switzerland in addition to the original European and UK locations. That's over twenty European regions, many with full Availability Zone support.

This expansion matters for redundancy in ways that go beyond simple disaster recovery. More regions mean more architectural options for decentralising workloads, reducing concentration risk, improving end-user latency, and satisfying data sovereignty requirements across different jurisdictions. Customers who are still pinning everything to West Europe and UK South are leaving resilience and performance on the table.

Sweden Central is an example worth calling out specifically. It has matured quietly from a regional option into a genuinely capable tier-1 Azure region supporting Availability Zones, a broad PaaS service catalogue including AKS, Azure SQL, Cosmos DB, Event Hubs, and Azure OpenAI, and a strong compliance posture covering ISO 27001, SOC 1/2/3, and the EU attestations that procurement teams increasingly require. For UK-based customers, Sweden Central sits at roughly 20 to 30 milliseconds round-trip latency, still well within typical latency tolerances, which makes it a realistic secondary or even co-primary location, not just a theoretical DR target.

Region	Estimated Latency back to UK
UK West	5 ms
West Europe	8 ms
North Europe	10 ms
France Central	15 ms
Germany West Central	20 ms
Switzerland North	25 ms
Sweden Central	30 ms
Norway East	38 ms

The broader European region expansion creates a genuine opportunity to rethink how workloads are distributed. Rather than a primary/DR pair selected years ago and never revisited, consider:

Distributing across multiple EU regions for GDPR-compliant workloads, selecting pairs that keep data within the EU geography while providing meaningful geographic separation. France Central, Germany West Central, Sweden Central, Poland Central, and Italy North all qualify and provide genuine architectural diversity.
Reviewing latency assumptions. The assumption that European regions outside the traditional West/North Europe pairing carry unacceptable latency is often wrong, particularly for Nordic and Central European locations, which sit closer to modern Azure capacity investments than Ireland or the Netherlands.
Understanding, for AI services such as AI Foundry workloads specifically, that Data Zone deployments, which enforce EU residency for prompt and response processing, currently anchor to Sweden Central and Germany West Central as the primary EU data zone regions. Architecting across both with graceful degradation handles region-level incidents without requiring a compliance breach to maintain availability.
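As a starting point for revisiting region choice, the latency figures in the table above can drive a simple shortlist. The sketch below reuses those estimates as data. The `in_eu` flag is an addition based on whether the host country is an EU member state, and whether a given region satisfies a particular compliance requirement is something to verify separately. The function and field names are hypothetical.

```python
from typing import NamedTuple


class Region(NamedTuple):
    name: str
    latency_ms: int  # estimated latency back to the UK, from the table
    in_eu: bool      # assumption: host country is an EU member state


REGIONS = [
    Region("UK West", 5, False),
    Region("West Europe", 8, True),
    Region("North Europe", 10, True),
    Region("France Central", 15, True),
    Region("Germany West Central", 20, True),
    Region("Switzerland North", 25, False),
    Region("Sweden Central", 30, True),
    Region("Norway East", 38, False),
]


def shortlist(max_latency_ms: int, eu_only: bool = False) -> list:
    """Return region names within the latency budget, nearest first."""
    candidates = [
        r for r in REGIONS
        if r.latency_ms <= max_latency_ms and (r.in_eu or not eu_only)
    ]
    return [r.name for r in sorted(candidates, key=lambda r: r.latency_ms)]


if __name__ == "__main__":
    print("Within 30 ms, EU only:", shortlist(30, eu_only=True))
    print("Within 10 ms, any:", shortlist(10))
```

The estimates are indicative only, so measure from your own network before committing a workload to a region on the strength of a shortlist like this.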

The Strategic Takeaway

With GPU and memory shortages expected to persist through 2027 and beyond, it's more important than ever to plan both sides of a hybrid estate deliberately.

A well-designed hybrid estate comes down to knowing which workloads belong where, and having the elastic connection to cloud capacity that lets you grow without waiting for hardware.

The hardware shortage is real, it's current, and the analysts aren't expecting meaningful relief before 2028. That's not a reason to panic. It's a reason to plan and to make sure that your on-premises estate and your Azure footprint are designed to complement each other.

To learn more about this topic and your architectural options, watch our webinar on demand, “Reducing Risk in the Cloud: Designing for Resilience, Availability and Change”, covering all of the above in depth and how Trustmarque Ultima are helping our customers every day across their hybrid cloud estates.

About the author

Chris Kirk | Hybrid Cloud & Data Professional Services Director, Trustmarque Ultima

Chris is responsible for Trustmarque Ultima delivery teams and for talking to customers about their strategy across cloud, data & AI for Azure, Microsoft Fabric, and AI Foundry, with roots in on-premises data centre design in a career spanning 20 years.
