AWS Certified Solutions Architect - Professional 2020


Outage: how to architect for the Well-Architected Framework (reliability, availability, resiliency) if Route 53 is down?

Yesterday Route 53 suffered an outage lasting several hours, and it was reported by many news outlets, e.g.

How would one design around that, given that Route 53 was considered the most reliable service, with 100% availability? Use DNS caches by increasing TTLs? Any comments or insights?

2 Answers

Well, a successful DDoS on DNS is a very big deal, since DNS services are highly redundant, especially one as high-profile as AWS. But as we can see, it happened. When you are the king of the mountain, it's a hacker's challenge/adventure.

Several mitigations are possible: longer TTLs, multiple DNS name servers, multiple DNS providers.
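
As a rough illustration of two of these mitigations (longer TTLs and more than one provider), here is a minimal client-side sketch in Python. It is not how Route 53 or any real resolver works internally; the class name `StaleTolerantDnsCache`, the resolver callables, and the one-hour default TTL are all assumptions made up for the example. The idea is simply: cache answers for a generous TTL, try each provider in turn on a miss, and fall back to the last known answer if every provider is unreachable.

```python
import time
from typing import Callable, Dict, List, Sequence, Tuple

# A resolver is any callable that maps a hostname to a list of addresses
# and raises OSError (socket.gaierror is a subclass) when it cannot answer.
Resolver = Callable[[str], List[str]]


class StaleTolerantDnsCache:
    """Hypothetical cache that keeps serving the last answer during an outage."""

    def __init__(
        self,
        resolvers: Sequence[Resolver],
        ttl_seconds: float = 3600.0,  # assumed "long" TTL, tune to taste
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._resolvers = list(resolvers)
        self._ttl = ttl_seconds
        self._clock = clock
        # hostname -> (addresses, expiry timestamp)
        self._entries: Dict[str, Tuple[List[str], float]] = {}

    def lookup(self, name: str) -> List[str]:
        now = self._clock()
        cached = self._entries.get(name)

        # Fresh entry: answer from cache without touching any provider.
        if cached is not None and now < cached[1]:
            return cached[0]

        # Expired or missing: try each provider in order.
        for resolve in self._resolvers:
            try:
                addresses = resolve(name)
            except OSError:
                continue  # this provider is down, try the next one
            if addresses:
                self._entries[name] = (addresses, now + self._ttl)
                return addresses

        # Every provider failed: serve the stale answer if we have one.
        if cached is not None:
            return cached[0]
        raise LookupError(f"no resolver could answer for {name!r}")
```

In practice each resolver callable would wrap a query against a different provider's name servers. The trade-off is the usual one with long TTLs: lookups survive a DNS outage, but record changes take longer to reach clients.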

Walid Shaari

Thank you Sam

Route 53 doesn’t support AXFR (zone transfer), so using multiple DNS providers is out when Route 53 is the primary. In a true enterprise environment, consider not using Route 53, at least as a primary DNS service.

The environment I currently work in uses an enterprise DNS service as primary, and we host some subdomains in Route 53, which makes separation of teams and departments easier, as different departments have their own AWS accounts. Also worth considering in a true enterprise environment: Route 53 does not currently support DNSSEC. That might not be a big deal to most people, but in our environment it was the main deal breaker.

Walid Shaari

Thank you David for your insights.

Tom Kringstad

Good post David. AWS are pretty explicit about this in their documentation as well:

Kirk Rohani

Thanks David for the insightful answer. I was not aware of DNSSEC until your answer, after which I researched further and learned a lot. Thanks again!
