CloudNative Wisdom – AWS US-EAST-1 Outage Explained
With Richard Simon (CloudTherapist) & Saim: a short forensic recap and lessons for architects, platform engineers, and SREs.

In this episode, we rewind to October 20, 2025 and unpack the major AWS us-east-1 outage: what failed, why DynamoDB (and DNS) were central to the cascade, how AWS responded, and the practical resilience trade-offs every engineering leader should consider. We dig past the headlines and give actionable guidance for teams hit by cloud outages. Two short code sketches at the end of these notes illustrate the retry feedback loop and the enactor race condition.

Chapters / Timecodes
0:00 Intro
0:31 Why DynamoDB mattered: service dependencies & ripple effects
1:02 DNS resolution failure & the retry feedback loop
2:02 Richard Simon explains the "enactors" and the race condition
3:58 How a second enactor overwrote partial DNS state – the cascade begins
5:28 AWS response, mitigations, and proposed fixes to the automation
7:44 Community reactions
10:19 Multi-region HA vs. cost, and practical options
13:34 Expectations for AWS re:Invent: what to watch for
14:45 Closing

Links & Sources
- Official AWS post-event summary (US-EAST-1, Oct 20, 2025): https://aws.amazon.com/message/101925/
- Reuters summary & timeline: https://www.reuters.com/business/retail-consumer/amazons-cloud-unit-reports-outage-several-websites-down-2025-10-20/?utm_source=chatgpt.com
- The Verge coverage of the AWS post-mortem: https://www.theverge.com/news/805904/amazon-breaks-down-the-dynamodb-dns-problem-that-took-down-aws-on-monday?utm_source=chatgpt.com
- ThousandEyes technical analysis: https://www.thousandeyes.com/blog/aws-outage-analysis-october-20-2025

If this episode helped you think differently about resilience, please like, subscribe, and leave a comment with your outage story; Richard and I may reach out and feature it on a future show. Want to collaborate or share an incident? Drop a link in the comments or DM us.
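Bonus sketch 1: the retry feedback loop (1:02). A minimal, hypothetical illustration of the mitigation pattern we discuss, capped retries with jittered exponential backoff; this is not AWS or SDK code, and the function and parameter names are ours.

```python
# Illustrative only: a generic client-side retry helper.
# Tight, unbounded retries amplify an outage; capping attempts and adding
# jittered backoff spreads load so a recovering dependency is not re-saturated.
import random
import time


def call_with_backoff(operation, max_attempts=5, base_delay=0.2, max_delay=10.0):
    """Run `operation`, retrying transient failures with capped, jittered backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ConnectionError:  # stand-in for a transient failure such as a DNS error
            if attempt == max_attempts:
                raise  # give up instead of retrying forever and feeding the storm
            delay = min(max_delay, base_delay * (2 ** (attempt - 1)))
            time.sleep(random.uniform(0, delay))  # "full jitter" sleep


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky():
        # Hypothetical call that fails twice, then succeeds.
        state["calls"] += 1
        if state["calls"] < 3:
            raise ConnectionError("name resolution failed")
        return "ok"

    print(call_with_backoff(flaky))  # prints "ok" after two backed-off retries
```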
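Bonus sketch 2: the enactor race (2:02–3:58). A toy model, not AWS's actual system: two automation workers ("enactors") apply DNS plans for the same endpoint, and without a freshness check a delayed worker can overwrite newer state with a stale plan. The class and field names are ours.

```python
# Toy illustration of last-writer-wins vs. a monotonic version guard.
class DnsPlan:
    def __init__(self, version, records):
        self.version = version    # monotonically increasing plan generation
        self.records = records    # addresses the endpoint should resolve to


class DnsStore:
    """Shared DNS state that competing enactors write to."""

    def __init__(self):
        self.applied = None

    def apply_unsafe(self, plan):
        # Last writer wins: a slow enactor holding an old plan clobbers newer state.
        self.applied = plan

    def apply_guarded(self, plan):
        # Reject any plan that is not strictly newer than what is already applied.
        if self.applied is not None and plan.version <= self.applied.version:
            return False
        self.applied = plan
        return True


if __name__ == "__main__":
    newer = DnsPlan(version=2, records=["10.0.0.2"])
    stale = DnsPlan(version=1, records=["10.0.0.1"])

    store = DnsStore()
    store.apply_unsafe(newer)
    store.apply_unsafe(stale)                      # delayed enactor wins the race
    print("unsafe:", store.applied.version)        # -> 1 (stale plan applied)

    store = DnsStore()
    store.apply_guarded(newer)
    print("stale accepted?", store.apply_guarded(stale))  # -> False
    print("guarded:", store.applied.version)              # -> 2 (newer plan preserved)
```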