The AWS Outage and Joint Commission: How Business Continuity Standards Just Got Harder

The AWS outage on October 20, 2025, exposed how heavily healthcare IT depends on a single cloud provider. The 15-hour disruption, caused by a DNS failure in AWS's US-EAST-1 region, impacted over 3,500 companies globally, with healthcare organizations hit hardest. Critical systems like electronic health records (EHRs) and telehealth platforms went offline, disrupting patient care and operations. This event coincided with updated Joint Commission standards, which demand stricter business continuity plans, including better vendor risk management, redundancy strategies, and failover testing.

Key takeaways:

Healthcare's dependency on AWS: Single-region reliance led to widespread failures.
Vendor risks: Many organizations lacked proper oversight of third-party dependencies.
New Joint Commission standards: Healthcare providers must improve resilience to meet stricter requirements.

The AWS outage is a warning for healthcare organizations to strengthen their systems, improve vendor oversight, and prepare for future disruptions. Building resilience is now critical for compliance and patient care.

Figure: AWS Outage Impact on Healthcare: Key Statistics and Compliance Requirements

How the AWS Outage Revealed Weaknesses in Healthcare IT

The AWS outage on October 20, 2025, shone a harsh light on critical issues within healthcare IT. A seemingly simple DNS failure spiraled into a massive disruption, exposing vulnerabilities in infrastructure and management practices across the sector.

Healthcare IT Failures During the Outage

On that day, a healthcare provider's patient portal and internal systems went offline, cutting off patients' ability to pay bills and freezing the organization's revenue flow. The root cause? An integrated payment vendor that relied entirely on AWS without having proper backup systems in place ^[3]. This failure wasn't just about money - it paralyzed operations, forcing a return to manual processes, delaying test results, and canceling non-emergency procedures. The incident highlighted how over-reliance on a single cloud provider can lead to widespread operational chaos.

Electronic Health Record (EHR) systems were also hit hard. With these systems down, healthcare providers struggled to maintain continuity, underscoring the risks of putting all their eggs in one digital basket.

Dependence on Single Cloud Providers

The outage revealed a major problem: too many healthcare organizations depended exclusively on AWS's US-EAST-1 region for critical services. This created a single point of failure that proved disastrous when the region experienced its fourth outage in five years. Key AWS services like DynamoDB, EC2, Lambda, IAM, and routing gateways were all affected by DNS resolution failures ^[2]^[5].

Many providers had assumed that AWS’s core services - such as DNS, identity management, and data storage - were failproof. But when DynamoDB’s DNS failed, it triggered a chain reaction, taking down essential systems like EC2 orchestration and network load balancers. Even clustered failover strategies couldn’t save the day, leaving organizations unable to recover, either automatically or manually.

"This AWS outage underscores systemic cloud services provider concentration risk." - CyberCube ^[7]

This heavy reliance on a single cloud provider also put healthcare organizations at odds with updated Joint Commission standards, which demand more robust business continuity plans.

Weaknesses in Vendor and Third-Party Risk Management

The outage didn’t just highlight issues with centralized cloud dependencies - it also exposed serious flaws in vendor management. For instance, the healthcare provider that experienced the payment system failure hadn’t properly evaluated its vendor’s reliance on AWS or ensured that adequate redundancies were in place ^[3]. Unfortunately, this lack of oversight wasn’t unique. Many organizations lacked ongoing monitoring of their third-party providers’ cybersecurity and operational resilience ^[8].

Adding to the frustration, standard Service Level Agreements (SLAs) offered little relief. The nominal service credits provided by AWS were no match for the financial and operational losses incurred. As Nixon Peabody explained:

"Customers affected by this week's disruption undoubtedly faced lost revenue, diminished loyalty, decreased site traffic, and service inability to end-users. For platforms unable to process transactions, financial firms with inaccessible accounts, or educational institutions whose students couldn't submit assignments, damages surely far exceeded standard service credits." - Nixon Peabody ^[6]

Another key issue was the assumption that DNS was a "solved problem." Many organizations relied entirely on AWS’s automated DNS systems, believing they were foolproof. When those systems failed, the damage spread across multiple infrastructure layers ^[2]. Without independent monitoring tools or external traffic management systems to detect and reroute around failures, organizations were left helpless, fully dependent on AWS’s failing APIs and consoles ^[2]. This lack of proactive vendor oversight now poses a direct challenge to meeting the stricter risk management standards set by the Joint Commission.
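The kind of independent check described above can be sketched in a few lines. This is a minimal illustration, not a production monitor: the function names `resolves` and `first_reachable` are hypothetical, and it assumes the organization already runs an equivalent endpoint outside the affected region. It uses only Python's standard `socket.getaddrinfo` call, run from infrastructure that does not itself depend on the provider being monitored.

```python
import socket
from typing import Callable, Iterable, Optional


def resolves(hostname: str, resolver: Callable = socket.getaddrinfo) -> bool:
    """Return True if the hostname resolves to at least one address.

    The resolver is injectable so the check can be exercised in drills
    without touching the network.
    """
    try:
        return len(resolver(hostname, 443)) > 0
    except OSError:  # socket.gaierror (DNS failure) is a subclass of OSError
        return False


def first_reachable(
    endpoints: Iterable[str],
    check: Callable[[str], bool] = resolves,
) -> Optional[str]:
    """Return the first endpoint that passes the check, in priority order.

    Returns None when nothing passes, which is the signal to escalate
    to manual downtime procedures.
    """
    for host in endpoints:
        if check(host):
            return host
    return None
```

Called with a priority-ordered list of hostnames (primary first, secondary after), `first_reachable` returns the endpoint that traffic should be pointed at. A real deployment would also need HTTP-level health checks and a way to update routing that does not rely on the failing provider's own console or APIs.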

Updated Joint Commission Standards: New Requirements for Healthcare

What Changed in Joint Commission Standards

The latest updates to Joint Commission standards require healthcare organizations to go beyond basic disaster recovery and ensure the continuous availability of critical services. While the specifics of these guidelines are still evolving, the focus has shifted to maintaining essential clinical systems during disruptions and integrating IT systems with third-party risk management. This means healthcare providers must thoroughly map their dependencies, identifying customer-facing applications, single points of failure, and hidden connections that could lead to widespread outages ^[5]^[6].

Additionally, third-party risk management now extends beyond cybersecurity, emphasizing overall business continuity and operational resilience. This includes addressing risks like vendor concentration, which can leave organizations vulnerable if a single provider experiences issues. The recent AWS outage has highlighted these challenges, revealing gaps in preparedness and underscoring the importance of these new requirements.

How the AWS Outage Tested These Requirements

The AWS outage on October 20, 2025, served as a wake-up call for many healthcare organizations. Despite having redundancy measures in place, several entities found their failover strategies inadequate for handling the scale of this disruption. The incident exposed weaknesses in traditional safeguards, demonstrating that they may fall short when dealing with large-scale cloud service failures ^[6]. This real-world test has made it clear that healthcare providers need to rethink their resilience strategies to meet the updated standards.

Compliance Challenges After the Outage

In the wake of the outage, healthcare organizations face mounting pressure to meet stricter business continuity requirements. The event not only disrupted operations but also highlighted pre-existing vulnerabilities, pushing providers to adopt more stringent measures.

To comply, healthcare organizations must now document vendor risks in greater detail, including the potential impact on patient care during service disruptions. They are also expected to conduct regular failover tests - such as chaos engineering drills and tabletop exercises - to ensure their systems can withstand real-world scenarios ^[5]^[6].
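A failover drill of this kind can be as small as injecting one fault and checking that the fallback path actually works. The sketch below is an assumption-laden illustration: `read_with_fallback`, `failing_primary`, and the in-memory cache are hypothetical stand-ins for a real record store and its read-only replica.

```python
from typing import Callable, Optional, Tuple


def read_with_fallback(
    patient_id: str,
    primary: Callable[[str], Optional[str]],
    fallback: Callable[[str], Optional[str]],
) -> Tuple[Optional[str], str]:
    """Read from the primary store; on a connection error use the fallback.

    Returns the record together with the name of the source that served it,
    so the drill can verify which path was taken.
    """
    try:
        return primary(patient_id), "primary"
    except ConnectionError:
        return fallback(patient_id), "fallback"


def failing_primary(patient_id: str) -> Optional[str]:
    """Injected fault: simulate the primary region being unreachable."""
    raise ConnectionError("injected fault: primary region unreachable")


def run_drill() -> bool:
    """Pass only if a record is still served while the primary is down."""
    # Hypothetical read-only copy kept outside the primary region.
    cache = {"p-001": "allergy list (cached copy)"}
    record, source = read_with_fallback("p-001", failing_primary, cache.get)
    return source == "fallback" and record is not None
```

Running `run_drill()` on a schedule, and recording the result, gives documented evidence that the fallback path was exercised rather than merely designed.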

Moreover, standard cloud service agreements, which often offer minimal service credits, are proving inadequate to address the financial and operational losses from major outages. Providers are being encouraged to renegotiate these agreements so that Service Level Agreements (SLAs) better align with operational needs and offer meaningful remedies.

Developing comprehensive operational playbooks has also become essential. These playbooks should anticipate provider-side failures and include clear escalation paths, communication protocols, and criteria for activating failover procedures. This proactive approach is becoming a cornerstone of healthcare IT resilience, ensuring organizations are better equipped to handle future disruptions.

Building Resilience: A Framework for Joint Commission Compliance

Adapting to the updated Joint Commission standards requires healthcare organizations to rethink their approach to business continuity. Achieving resilience means identifying critical dependencies, strengthening vendor oversight, and implementing reliable redundancy measures.

Mapping Critical Services and Dependencies

Healthcare organizations must perform a Hazard Vulnerability Analysis (HVA) to address risks like cyberattacks. This assessment should focus on "life-critical" or "safety-critical" systems, such as electronic health records (EHRs), pharmacy and lab platforms, radiology systems, patient identification tools, biomedical devices, and telehealth technologies. Organizations should be prepared for these systems to be offline for extended periods - four weeks or longer ^[9]^[10].

"Organizations should be prepared to have life- and safety-critical technology offline for four weeks or longer" ^[10].

In addition to identifying critical systems, organizations need to uncover hidden interdependencies between cloud providers, vendors, and applications to prevent cascading failures. Forming a multidisciplinary downtime planning committee is key to this effort. This team should include representatives from IT, emergency management, operational leadership, HR, and external vendors to map and address these dependencies effectively ^[9]^[10].
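One way to make hidden interdependencies visible is to record them as a simple graph and ask which critical systems would fail if a given component went down. The sketch below assumes a hand-maintained dependency map; every system and vendor name in it is hypothetical.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical dependency map: each system lists what it directly depends on.
DEPENDENCIES: Dict[str, List[str]] = {
    "patient_portal": ["payment_vendor", "ehr"],
    "payment_vendor": ["aws_us_east_1"],
    "ehr": ["aws_us_east_1"],
    "telehealth": ["video_vendor"],
    "video_vendor": ["aws_us_east_1"],
    "lab_system": ["on_prem_datacenter"],
}


def transitive_dependencies(system: str, deps: Dict[str, List[str]]) -> Set[str]:
    """Return everything the system depends on, directly or indirectly."""
    seen: Set[str] = set()
    stack = list(deps.get(system, []))
    while stack:
        node = stack.pop()
        if node not in seen:  # the seen set also guards against cycles
            seen.add(node)
            stack.extend(deps.get(node, []))
    return seen


def impacted_by(
    component: str,
    critical_systems: Iterable[str],
    deps: Dict[str, List[str]],
) -> List[str]:
    """List the critical systems that would fail if the component failed."""
    return sorted(
        s for s in critical_systems
        if component in transitive_dependencies(s, deps)
    )
```

In this made-up map, `impacted_by("aws_us_east_1", ["patient_portal", "telehealth", "lab_system"], DEPENDENCIES)` returns the portal and telehealth systems even though neither lists the region as a direct dependency. That is the pattern behind the payment-vendor failure described earlier.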

Once these dependencies are mapped, the focus shifts to proactively managing vendor risks.

Improving Vendor Risk Management

Recent events have highlighted the dangers of relying too heavily on a single provider. A LinkedIn poll revealed that 80% of respondents anticipate more conversations about cloud concentration risks, yet few organizations have taken steps to ensure true independence ^[11]. Annual vendor assessments are no longer sufficient - continuous monitoring is now essential to track real-time changes in vendor risk, compliance, and resilience ^[5]^[8].

Vendor risk management begins with rigorous due diligence. Organizations should evaluate vendors' cybersecurity measures, classify them by risk, and review their data handling practices, certifications, and controls ^[8]. Tools like Censinet RiskOps™ simplify this process by automating evaluations, enabling ongoing monitoring, and centralizing risk-related tasks. With Censinet AI™, healthcare providers can quickly complete security questionnaires, summarize vendor evidence, and generate detailed risk reports.
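As a rough illustration of risk classification feeding a review schedule, the sketch below tiers vendors on three yes/no questions and flags overdue assessments. The criteria, tier names, and review intervals are assumptions chosen for the example. They are not drawn from any standard or from any vendor's product.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Vendor:
    name: str
    handles_patient_data: bool
    supports_critical_system: bool
    single_region: bool  # vendor runs in one cloud region with no failover
    last_assessed: date


# Hypothetical review intervals per tier, in days.
REVIEW_INTERVAL_DAYS = {"high": 30, "medium": 90, "low": 365}


def risk_tier(vendor: Vendor) -> str:
    """Classify a vendor by how many risk factors apply to it."""
    score = sum([
        vendor.handles_patient_data,
        vendor.supports_critical_system,
        vendor.single_region,
    ])
    if score >= 2:
        return "high"
    return "medium" if score == 1 else "low"


def review_overdue(vendor: Vendor, today: date) -> bool:
    """True if the last assessment is older than the tier's review interval."""
    interval = timedelta(days=REVIEW_INTERVAL_DAYS[risk_tier(vendor)])
    return today - vendor.last_assessed > interval
```

The point of the design is that the review cadence follows from the tier, so a high-risk vendor cannot quietly sit on an annual cycle.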

Contracts also play a critical role in mitigating risks. Standard cloud agreements often fall short, offering minimal service credits that don’t adequately address the financial impact of outages - especially when the average downtime cost is $7,500 per minute ^[4]. Procurement and legal teams should revise contracts to assign clear accountability, define remediation timelines, and include clauses for meaningful financial compensation based on actual losses ^[5]^[6]^[8]. Additionally, organizations should test vendors’ recovery plans through tabletop exercises and simulated outages ^[5].
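The gap between downtime losses and service credits can be made concrete with two figures cited in this article: a 15-hour outage and an average downtime cost of $7,500 per minute. The monthly cloud bill and credit percentage below are hypothetical placeholders, not actual AWS SLA terms, and the calculation assumes the organization was fully down for the whole outage.

```python
OUTAGE_HOURS = 15        # outage duration cited in this article
COST_PER_MINUTE = 7_500  # average downtime cost cited in this article (USD)

# Hypothetical values for illustration only; substitute your own contract terms.
MONTHLY_CLOUD_BILL = 50_000  # USD
CREDIT_PERCENT = 10          # percent of the monthly bill returned as credit


def downtime_loss(hours: int, cost_per_minute: int) -> int:
    """Estimated loss in USD for an outage of the given length."""
    return hours * 60 * cost_per_minute


def service_credit(monthly_bill: int, credit_percent: int) -> float:
    """Service credit in USD as a percentage of the monthly bill."""
    return monthly_bill * credit_percent / 100


loss = downtime_loss(OUTAGE_HOURS, COST_PER_MINUTE)
credit = service_credit(MONTHLY_CLOUD_BILL, CREDIT_PERCENT)

print(f"Estimated loss: ${loss:,}")
print(f"Service credit: ${credit:,.0f}")
print(f"Loss is {loss / credit:,.0f}x the credit")
```

Under these assumptions the estimated loss is $6,750,000 against a $5,000 credit. The exact ratio will vary with each contract, but the order-of-magnitude gap is what procurement and legal teams need to address.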

With vendor risks managed, the next step is ensuring systems remain operational by implementing redundancy and failover strategies.

Setting Up Redundancy and Failover Strategies

Redundancy planning requires understanding the shared responsibility model of cloud resilience. While cloud providers manage infrastructure, healthcare organizations are responsible for the resilience of their applications and architecture ^[6].

To minimize risks, organizations should consider multi-region architectures within their primary cloud provider to reduce the impact of regional outages ^[6]. For critical systems, using multiple providers for the same service can add an extra layer of protection, though this approach demands careful planning and system segmentation. Incident playbooks with clear escalation paths, communication protocols, and activation criteria are essential ^[6].
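Activation criteria are easier to audit when they are written down as explicit logic. The sketch below is one possible shape, assuming a single primary and a single secondary region and a threshold of three consecutive failed health checks. The class name and the threshold are illustrative choices, not values from any standard.

```python
from dataclasses import dataclass


@dataclass
class FailoverPolicy:
    """Switch to the secondary region after N consecutive failed checks."""

    failure_threshold: int = 3
    consecutive_failures: int = 0
    active_region: str = "primary"

    def record(self, primary_healthy: bool) -> str:
        """Record one health-check result and return the region to use."""
        if primary_healthy:
            # A single success resets the count, so brief blips do not
            # accumulate into a failover.
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.failure_threshold:
                self.active_region = "secondary"
        return self.active_region
```

Note that this policy never switches back on its own. Failback is left as a deliberate manual step in the playbook, which avoids flapping between regions while the primary is still unstable.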

Healthcare organizations must also prepare for downtime by creating offline resources, such as paper forms that mirror electronic systems and can be accessed without network connectivity ^[10]. Training all staff - not just IT teams - on downtime procedures is vital, including clinical workflows for services that are harder to deliver without electronic systems ^[9]^[10]. After any major incident, conducting a post-mortem analysis helps identify successes and areas for improvement, ensuring resilience strategies evolve ^[9]^[10].

The stakes are high: ransomware attacks cost healthcare organizations an average of $1.9 million per day and can last 17 days ^[4]. Additionally, healthcare data breaches average $10.93 million, the highest across industries for 13 consecutive years ^[10]. Investing in redundancy and failover measures is not just about compliance - it’s about safeguarding patient care and minimizing operational disruptions.

Conclusion: Turning Lessons Into Action

The AWS outage on October 20, 2025 - marking the third major disruption in the US-EAST-1 region over five years - serves as a stark reminder that no infrastructure is failproof ^[1]. For healthcare organizations, the stakes couldn't be higher. With a median cost of $2 million per hour of downtime, the impact extends beyond finances, putting patient care directly at risk ^[12].

"It's not a question of if a service is going to go down. It's a question of when. Your job as a CIO is to manage that risk with the rest of your C-suite and to come up with a plan." - John Annand, Digital Infrastructure Practice Lead, Info-Tech Research Group ^[12]

This perspective underscores the critical need to integrate resilience into every aspect of operations. The vulnerabilities highlighted earlier point to one clear takeaway: healthcare organizations must adopt updated strategies to build systems that can withstand disruptions. Resilience shouldn't just be an afterthought or a box to check for regulatory compliance - it needs to be the cornerstone of protecting patient care and financial sustainability. The focus should shift from trying to avoid every outage to being prepared for when they inevitably occur ^[13].

Now is the time to strengthen your systems and ensure you're ready for the next challenge to your resilience.

FAQs

What steps can healthcare organizations take to reduce risks from relying on a single cloud provider?

Healthcare organizations can lower the risks of depending on just one cloud provider by embracing a multi-cloud strategy or setting up multi-region architectures. These methods help keep essential services running smoothly, even if a provider or specific region faces an outage.

To further boost resilience, organizations can implement redundant systems, carry out regular disaster recovery tests, and perform thorough risk assessments. These steps not only uncover potential weak spots but also ensure systems are equipped to manage disruptions, safeguarding both patient care and day-to-day operations.

What steps can healthcare organizations take to meet the updated Joint Commission business continuity standards?

To align with the updated Joint Commission standards, healthcare organizations need to bolster their ability to handle IT disruptions effectively. Start by putting in place solid backup and disaster recovery plans to keep essential systems and data accessible, even during outages. It's also crucial to perform regular risk assessments to uncover vulnerabilities and address threats before they escalate.

Using multi-cloud or hybrid cloud strategies can help reduce dependency on a single provider, adding an extra layer of system reliability. Implementing real-time monitoring ensures that issues are spotted and addressed quickly, while well-prepared incident response playbooks provide teams with clear guidance during emergencies. These measures not only support compliance but also protect patient care and maintain smooth operations.

Why is managing vendor risks essential for healthcare IT resilience?

Managing vendor risks plays a crucial role in maintaining healthcare IT resilience. It allows organizations to pinpoint and address potential weak spots tied to third-party providers. For instance, disruptions from cloud services can expose critical vulnerabilities within IT systems.

Taking a proactive approach to these risks ensures that patient care remains uninterrupted, sensitive data stays protected, and compliance with regulatory standards is upheld. Effective strategies include using multiple cloud providers to reduce dependency, conducting thorough risk assessments, and putting strong contingency plans in place to limit the fallout from unexpected disruptions.
