
The key lesson here is that risk mitigation can be incremental. You can move from single-region to replicated backups, from backups to warm standby, from warm standby to partial hot standby. Each step reduces exposure. Each step has a cost. Each step must be aligned with business appetite.
When we speak about geopolitical resilience, the topic can quickly become too abstract. “War.” “Politics.” “Instability.” These words are broad and can lead to emotional decisions rather than clear ones. For architecture discussions, we need something more structured.
At one company I currently consult for, a significant part of the infrastructure is located in Taiwan. Growing geopolitical tension in the region is openly discussed in global markets. Nothing has happened yet – no outage, no physical damage, no shutdown – but the probability of disruption is on the table.
Military conflicts, political instability, sanctions, state-ordered shutdowns, and physical attacks on infrastructure are no longer rare events. They are recurring realities. These risks can directly impact cloud regions, data centers, connectivity routes, and even entire regions.
This example shows something important – compliance requirements are not static. They evolve with the world around them. An architect who only follows today’s rules without thinking about how those rules might change is taking a hidden risk.
Political risk is state action. It includes internet shutdowns, forced traffic filtering, routing controls, sanctions, export restrictions, or sudden regulatory changes that restrict cross-border data transfer. Your infrastructure may still be running. But traffic is blocked or redirected. Or you are legally not allowed to operate as before. I always clarify one point to executives – compliance risk and geopolitical risk are connected. A regulatory move can have the same operational impact as a technical outage.
Introduction
Then there are state shutdowns. In Egypt in 2011, during mass protests, the government ordered a near-total internet shutdown. Around 88% of the country’s internet disappeared from global routing tables. Large BGP route withdrawals were observed. From an application perspective, the entire country went offline. Multi-AZ or even multi-region inside one country does not protect you from decisions made at the national level.
For users, these events look exactly like cloud outages. They don’t see whether the service is unavailable because of a hardware failure, a cable cut in a conflict zone, or a government decision. They simply see that the service is down. As architects, we need to understand this difference because the recovery approach is completely different for each cause.
But today the environment has changed.
I use a simple three-category model – military risk, political risk, and hybrid risk. This is a practical tool I use during architecture reviews.
Why Geopolitical Resilience Is Now an Architecture Requirement
The first is physical damage to facilities. Data centers are physical buildings. They depend on power, cooling, staff access, and physical security. If a facility is damaged or access is restricted, even the most well-designed distributed system inside that region can be impacted. A cloud region is still a physical place.
The next level is cross-region resilience. Here, we store backups in another region, ideally in a different geopolitical zone. We run recoverability drills – real restore tests. I have seen many teams say “we have backups,” but when they test restore, they discover hidden issues – missing encryption keys, wrong permissions, outdated infrastructure code. Immutable backups become very important here, especially if ransomware or sabotage happens during a conflict. Cross-region recovery often takes longer than teams expect. This pattern reduces region-level risk but increases operational complexity. It requires discipline in testing.
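One way to make restore drills concrete is a small verification script that records a checksum at backup time and checks it again after restore. This is a minimal illustrative sketch, assuming a single file and local directories – the file names and paths are invented, and a real drill restores full systems, validates encryption keys and permissions, and records timing evidence.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path

def sha256(path: Path) -> str:
    """Checksum used as restore evidence."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def backup(src: Path, backup_dir: Path) -> None:
    backup_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, backup_dir / src.name)
    # Record the checksum at backup time - this is what the
    # restore test compares against later.
    (backup_dir / f"{src.name}.sha256").write_text(sha256(src))

def restore_and_verify(backup_dir: Path, name: str, target: Path) -> bool:
    target.mkdir(parents=True, exist_ok=True)
    restored = target / name
    shutil.copy2(backup_dir / name, restored)
    expected = (backup_dir / f"{name}.sha256").read_text()
    return sha256(restored) == expected

# Demo drill in a temporary directory (all names are illustrative).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    src = root / "ledger.db"
    src.write_bytes(b"critical banking records")
    backup(src, root / "backup")
    print(restore_and_verify(root / "backup", "ledger.db", root / "restore"))
```

The point of running this regularly, rather than once, is exactly the hidden-issue problem described above: a checksum mismatch or a missing manifest file surfaces during the drill, not during the crisis.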
At this point, many leaders ask me a direct question – “So what should we build?”
From my experience, this changes how we must approach architecture decisions. In this article, I want to explore architecture from a slightly different angle – one that is very practical and very relevant in our unstable world.
Compliance is a forcing function. It forces leadership to take digital resilience seriously and forces architecture teams to move from assumptions to evidence. But the Ukraine example also reminds us to look further: the regulations you follow today may change, and your architecture needs to be ready for that too.
Two days later, the weight of every priority changed dramatically.
Hybrid risk sits between the two. It includes cable cuts in conflict areas, cyber campaigns timed with political events, supply chain restrictions, or sanctions that indirectly affect hardware delivery and maintenance. These events are often not officially declared, but the effect is real – latency increases, spare parts are delayed, security incidents increase during political tension. Hybrid risk is dangerous because it does not look like a total outage. It looks like instability, performance degradation, or partial failure. And partial failures are harder to detect and explain.
The goal is to make risk visible, measurable, and aligned with the actual business appetite.
When I started my career, geopolitical events were treated as rare black swans. Architects focused on hardware failure, software bugs, human mistakes, and traffic spikes. If we designed for multi-AZ deployment and had backups in another region, we felt safe.
Real Events That Architects Cannot Ignore
When I joined Raiffeisen Bank Ukraine as Technical Lead of the Core Banking Platform, the plan was clear – long-term modernization and gradual migration to the cloud. We had a roadmap spanning several years, with careful evaluation of each component, proof-of-concept phases, and a strong focus on protecting customers from any unpredictable disruption of critical banking services.
Let me walk through the menu of resilience patterns from minimal to advanced.
We sometimes worked 24 hours without a break – not because of a missed deadline or a product launch, but because the alternative was losing everything.
But over the last decade, a clear pattern has emerged. Internet shutdowns. Subsea cable cuts. Military operations near critical infrastructure. Drone attacks affecting data centers. Sanctions and regulatory changes that restrict operations in specific countries.
Military risk is physical. It includes physical damage to data centers, loss of power or cooling, blocked access to facilities, or scenarios where more than one facility in the same geographic area is affected at the same time. For architects, the key question is – what happens if this physical location becomes unavailable for days or weeks? Not just one availability zone, but the entire physical area.
As architects and technology leaders, our task is to assess risk exposure. So the discussion is framed in neutral, professional language – what is the probability of large-scale disruption in this region within the next three to five years? What is the potential business impact if connectivity or facilities become unavailable? How dependent are we on this single geography?
A Simple Way to Think About Risk Categories
Speed of migration became the number one priority above everything else – above cost, above stability, above the modernization plan itself. The reason was simple and urgent – at any moment, we could face exposure of data centers physically located in Ukraine. The carefully planned multi-year roadmap became irrelevant overnight. What mattered was moving the most critical components out of physical risk as fast as possible.
This leads to an important clarification that I always raise in architecture reviews – region resilience or availability zone resilience is not the same as geopolitical resilience. Multi-AZ protects you from rack failure or localized incidents. Multi-region protects you from region-wide failure. But neither automatically protects you from country-level shutdowns, sanctions, or large-scale military operations. If all your regions are inside the same geopolitical risk zone, you may still have a single point of failure.
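A quick way to surface this hidden single point of failure in a review is to map each deployment region to a geopolitical zone and check whether the set collapses to one. The sketch below is illustrative – the region names and the zone mapping are invented examples, and a real mapping comes from your own risk assessment, not from the cloud provider's region list.

```python
# Map of cloud regions to geopolitical risk zones (example values only).
GEO_ZONE = {
    "eu-west-1": "EU",
    "eu-central-1": "EU",
    "us-east-1": "NA",
    "ap-east-1": "APAC",
}

def geopolitical_zones(regions):
    """Distinct geopolitical zones covered by a deployment."""
    return {GEO_ZONE[r] for r in regions}

def has_single_zone_risk(regions) -> bool:
    # Multi-region on paper, but still one geopolitical
    # single point of failure in practice.
    return len(geopolitical_zones(regions)) == 1

print(has_single_zone_risk(["eu-west-1", "eu-central-1"]))  # two regions, one zone
print(has_single_zone_risk(["eu-west-1", "us-east-1"]))     # two zones
```

The check is trivial, but in my experience even this level of explicitness changes the conversation: "we are multi-region" and "we span multiple geopolitical zones" stop being used as synonyms.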
For many years, architecture decisions were driven by business strategy, technical constraints, cost optimization, and many other factors. We built distributed systems to serve users faster and reduce operational risk. Multi-AZ deployment was considered enough. Backups in another region felt like a solid safety net.
This approach also gives leadership a better set of choices. Instead of a binary decision – "stay or exit" – you offer a spectrum of options. You reduce concentration risk without destabilizing operations. And you avoid reactive decisions under pressure.
The third is state actions. Governments can enforce shutdowns, traffic filtering, or routing changes. We have seen cases where almost an entire country disappeared from global routing tables. From an application point of view, it looks like a total outage, even when your infrastructure is technically healthy.
Different industries and regions have their own regulatory frameworks that directly affect architecture decisions. In Europe, DORA sets formal resilience requirements for financial institutions and their ICT providers – covering risk management, incident reporting, mandatory resilience testing, and third-party vendor risk. GDPR affects where data can be stored and transferred across borders. In the United States, frameworks like SOC 2, HIPAA for healthcare, and FedRAMP for government workloads all carry architecture implications. In other regions, data sovereignty laws increasingly restrict cross-border data flows in ways that directly limit your infrastructure options.
From my perspective, there are three modern failure modes that we must consider in architecture discussions.
Finding the Balance Between Resilience and Complexity
You do not need a war to justify preparation. You need awareness that stability is not guaranteed.
The most practical steps any team can take today are – build a dependency inventory before a crisis, run real restore tests not documentation drills, define executive thresholds for action, and treat geopolitical context as a first-class design input alongside performance, cost, and compliance.
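A dependency inventory only pays off if you can ask it questions, the most important one being: if this facility or provider fails, what is transitively impacted? The sketch below models the inventory as a simple graph and walks it in reverse – all service and region names are invented for illustration.

```python
from collections import deque

# Dependency inventory: service -> what it directly depends on.
# Names are hypothetical examples, not a recommended structure.
DEPENDS_ON = {
    "payments": ["core-db", "auth"],
    "auth": ["core-db"],
    "core-db": ["region:taiwan-dc"],
    "reporting": ["region:eu-dc"],
}

def impacted_by(failed: str) -> set:
    """Everything that transitively depends on the failed node."""
    # Build the reverse graph: dependency -> its dependents.
    reverse = {}
    for svc, deps in DEPENDS_ON.items():
        for dep in deps:
            reverse.setdefault(dep, []).append(svc)
    # Breadth-first walk from the failed node through dependents.
    seen, queue = set(), deque([failed])
    while queue:
        node = queue.popleft()
        for dependent in reverse.get(node, []):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen

print(sorted(impacted_by("region:taiwan-dc")))
```

Building this map before a crisis is the whole point: during an incident there is no time to discover that the payments service quietly depends on a database in the affected geography.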
I have seen organizations create large documentation packages – detailed risk registers, complex presentations, but with no real restore testing, no dependency maps, no validated runbooks. That is not resilience. That is paperwork.
The company has decided not to wait for a crisis. We are implementing gradual workload redistribution – not in panic, not overnight, but step by step.
These are not isolated events anymore.
Now let’s look at connectivity chokepoints. Red Sea undersea fiber cuts in 2024 and 2025 impacted major cable systems connecting Europe, Asia, and the Middle East. Microsoft warned Azure users about increased latency. Traffic was rerouted. The cloud regions themselves were operational, but connectivity paths were degraded. This is the key lesson – even if your compute and storage are healthy, your users may not reach them normally. Network topology matters. Geographic distribution must consider cable routes, not only cloud region maps.
The last option is multi-cloud or cloud-exit posture. This is usually considered when the risk is at the provider level or country level, for example, when sanctions or geopolitical shifts could affect access to a specific cloud provider. Multi-cloud is not simply “deploy the same workload twice.” It means duplicated skills, duplicated automation, duplicated observability, and often duplicated integration logic. It is expensive and increases the cognitive load on teams. That is why I say clearly – multi-cloud is often a board-level decision, not an architecture hobby. If leadership decides that provider concentration risk is unacceptable, then we design for it. But architects should not introduce multi-cloud just because it is trending.
Until now, we discussed geopolitical resilience as a strategic choice. But for many organizations, it is no longer only a choice. It is a regulatory requirement.
Case Study: Banking Under War Conditions
A good example is Ukraine. Before the invasion, the National Bank of Ukraine required all banking data to be stored exclusively inside Ukraine. Using cloud infrastructure outside the country was not allowed. From a compliance perspective, the architecture was correct. From a geopolitical resilience perspective, it was a single point of failure. When the invasion started, this regulation was changed. Cloud usage outside Ukraine became permitted, because survival of the financial system mattered more than data residency rules.
In Ukraine, significant connectivity disruptions were tracked across multiple regions from the first day of the invasion. Ukrtelecom, the national provider, suffered a major disruption linked to physical damage to fiber infrastructure. This was not a theoretical risk. It became our operational reality overnight.
Each architecture pattern reduces certain risks and increases others. The role of an architect is to present options clearly, explain trade-offs, and align architecture with real business needs.
My answer is always the same – it depends on your business outcomes, risk appetite, and budget. There is no universal correct pattern. There is a menu of options. You choose based on reality.
Architecture is no longer only about technology. It is about understanding where your technology physically and politically lives.
Geopolitical resilience is not about building the most complex system. It is about choosing the right level of protection for the real level of risk. Multi-AZ and multi-region are not the same as geopolitical resilience – if all your infrastructure is in the same political zone, you may still have a single point of failure.
Case Study: Taiwan Risk and Gradual Workload Redistribution
What actually matters in a crisis are disaster recovery test results, restore reports showing that backups were usable, clear operational runbooks for failover, and dependency maps showing which systems rely on which infrastructure and providers.
Real incidents from Egypt, Ukraine, the Red Sea, and the Middle East show that these risks affect real users, real banking systems, and real cloud services. What looks like a cloud outage to your users may be caused by a government decision or a cable cut thousands of kilometers away.
The second is connectivity chokepoints. Subsea cables and regional backbone networks are critical paths. Even if your cloud region is operational, traffic may be rerouted or degraded if key cables are damaged. Latency increases. Packet loss increases. Some regions become partially isolated.
When evaluating these risks, I focus on four simple factors. First, probability – how likely is this scenario in the next one to three years? Second, impact – what is the business consequence if it happens – revenue loss, regulatory breach, reputation damage? Third, realistic recovery time – actual recovery time if borders are closed, staff cannot reach offices, or network routes are unstable. Fourth, blast radius – does the event affect only one region, or multiple facilities in the same geopolitical zone?
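The four factors above can be turned into a simple scoring helper so that scenarios can be compared side by side in a review. This is an illustrative sketch only – the scales, the multiplicative scoring, and the example numbers are assumptions, not a standard methodology.

```python
from dataclasses import dataclass

@dataclass
class GeoRiskAssessment:
    scenario: str
    probability: float   # 0.0-1.0, likelihood within 1-3 years
    impact: int          # 1-5, business consequence if it happens
    recovery_days: int   # realistic recovery time under crisis conditions
    blast_radius: int    # facilities/regions affected together

    def score(self) -> float:
        # Multiplicative score: a probable, high-impact, slow-to-recover,
        # wide-radius event rises to the top of the review agenda.
        return (self.probability * self.impact
                * self.recovery_days * self.blast_radius)

# Example scenarios with invented numbers.
risks = [
    GeoRiskAssessment("regional cable cut", 0.3, 3, 7, 2),
    GeoRiskAssessment("country-level shutdown", 0.1, 5, 30, 4),
]
for r in sorted(risks, key=GeoRiskAssessment.score, reverse=True):
    print(f"{r.scenario}: {r.score():.1f}")
```

The numbers matter less than the discipline: once each scenario carries explicit probability, impact, recovery, and blast-radius values, the discussion stays neutral and comparable instead of emotional.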
The next option is multi-region active/active. This is powerful but expensive – not only in money, but in complexity. Active/active means data consistency challenges, latency trade-offs, careful design of distributed transactions or eventual consistency, and conflict resolution logic. I call this the “blast radius of complexity.” When we increase resilience, we also increase operational surface area. This pattern requires operational maturity. Monitoring must be mature. Incident response must be clear. Teams must understand distributed system behavior. In my experience, many organizations want active/active because it sounds safe, but not all of them are ready to operate it safely. This pattern must be justified by real business need, not by fear.
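To make "conflict resolution logic" concrete, here is a sketch of last-write-wins resolution between two active regions. LWW is only one of several strategies, and a lossy one – it is shown here purely to illustrate that active/active forces you to write this logic explicitly; the record shape and region names are invented.

```python
from dataclasses import dataclass

@dataclass
class Record:
    key: str
    value: str
    timestamp: float   # e.g. epoch seconds from a synchronized clock
    region: str

def resolve(a: Record, b: Record) -> Record:
    """Last-write-wins merge of two conflicting replicas."""
    # Prefer the later write; break exact timestamp ties
    # deterministically by region name so both sides converge
    # to the same winner.
    if a.timestamp != b.timestamp:
        return a if a.timestamp > b.timestamp else b
    return a if a.region < b.region else b

r1 = Record("balance:42", "100", 1700000000.0, "eu-west-1")
r2 = Record("balance:42", "250", 1700000005.0, "ap-east-1")
print(resolve(r1, r2).value)
```

Even this toy version exposes the hard questions: it silently discards the losing write, and it trusts clocks across regions. Those are exactly the distributed-system behaviors a team must understand before operating active/active safely.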
Compliance as a Forcing Function
And here is something interesting – compliance and geopolitical risk pull in the same direction, but sometimes they pull against each other first.
The architectural implications across all of these are similar. Backup and restore is no longer a slide in a presentation. It requires evidence – real restore results with timestamps, duration, and issues found. Vendor concentration risk must be analyzed and documented. Exit planning must exist, even if it is never used. It means documented feasibility, data portability, contract clauses, and migration scenarios.
Before we look at frameworks and patterns, I want to ground this in reality. These are real events from recent years. For each one, the question is simple – what broke, and what should architects learn?
I also have direct personal experience with this type of risk. I became Technical Lead of the Core Banking Platform at Raiffeisen Bank Ukraine just two days before the invasion started in February 2022. What began as a long-term modernization plan became an urgent mission – migrate the most critical parts of the bank's infrastructure, including sensitive client and bank data, into the cloud as fast as possible. This was no longer about efficiency or cost. It was about protecting the infrastructure from physical harm.
The baseline is same-region resilience. This means multi-AZ distribution inside one cloud region. For many companies, this is already a strong improvement over single-zone deployment. But we must be honest about what it does not cover. Multi-AZ protects from technical failure inside a region. It does not protect from country-level disruption or regional connectivity isolation. This is a baseline, not a final answer.
By Oleksiy Pototskyy
At the beginning of many projects, I see requirements to build ideal, enterprise-grade, highly available systems. In today’s world, we can technically build everything. But it will cost a lot. My approach is to classify requirements into clear categories – must have, nice to have, not really necessary. After priorities are rearranged, we collect dependencies, estimate work in hours and weeks, align with real team roadmaps, and translate everything into money. At this stage, many features are usually removed from the board and we get a more realistic picture.
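The "translate everything into money" step can be as simple as totaling estimated hours per category at a blended rate. The snippet below is a toy illustration – the backlog items, hour estimates, categories, and rate are all invented numbers, not guidance.

```python
HOURLY_RATE = 120  # assumed blended team rate, in your currency

# (feature, priority category, estimated hours) - example values only.
backlog = [
    ("cross-region backups", "must have", 160),
    ("active/active failover", "nice to have", 900),
    ("multi-cloud abstraction", "not really necessary", 1400),
]

def cost_by_category(items):
    """Total estimated cost per priority category."""
    totals = {}
    for name, category, hours in items:
        totals[category] = totals.get(category, 0) + hours * HOURLY_RATE
    return totals

for category, cost in cost_by_category(backlog).items():
    print(f"{category}: {cost:,}")
```

Seeing a price tag next to each category is usually what moves the conversation: features that looked essential as bullet points get re-examined once they are expressed in money.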
I cannot share all the details publicly, but the core lesson I want to pass on is this – priorities are not absolute. There are no universal patterns or metrics that fit every situation. The right architecture decision depends completely on the conditions at the moment you are making it. What is the correct approach in a stable environment may be the wrong approach under crisis conditions, and vice versa.
Let’s start with physical attacks. In the recent Middle East conflict, drone strikes damaged AWS data center facilities in the United Arab Emirates and Bahrain. The impact was described as localized, but recovery was prolonged. This was not a software bug or a cloud misconfiguration. It was physical damage to infrastructure. If your critical workload depends on a single geopolitical area, multi-AZ inside that area may not be enough.
This is something architects must always keep in mind. We tend to build frameworks, best practices, and reference architectures. They are useful. But they are not the answer in every situation. The real skill is knowing when the conditions have changed and having the clarity to change your approach with them.






