Numerous reports today indicated that a localized performance event was occurring on the US East Coast. Reports pointed to an attack on DYN, which provides DNS services. DNS (Domain Name System) can be thought of as a phone book for the internet: domain names (e.g. www.mysite.com) get mapped to IP addresses (e.g. 126.96.36.199). This is one of the basic technologies that make the internet work, and if it fails a lot of people get very upset. For the most part, DNS services are very fast and robust. Many services like DYN and CDNs (content delivery networks) use DNS very dynamically to route traffic to remote points of presence (to distribute load closer to end users, in the case of a CDN) as well as to manage DNS for companies’ websites.
Our first indication of an issue was seen in the Dynatrace Internet Health Report, which showed some latency and dropped packets coming from Level 3 (a major network and CDN provider). This dashboard is available to the public at http://internethealthreport.com/ and shows peering relationships between major backbone network providers (these are the companies that move the lion’s share of traffic across the US).
Other non-Dynatrace tools also indicated that an issue was underway. Downdetector.com showed a Level 3 issue occurring in the Northeast US.
It’s ONLY with synthetic monitoring that one can quickly detect and diagnose DNS issues, both at an Internet-scale and for your individual website.
Dynatrace operates the largest network of monitoring agents in the world with dedicated hardware in major US cities and thousands of software agents running from end user machines in every state. The Dynatrace Synthetic Network Internet Health Map shows the issue impacting locations across the US East Coast.
Further investigation showed that the issue appeared as very long DNS lookup times. Typically these DNS requests complete in milliseconds; however, time banding indicates that the requests are timing out. When a request times out, it is typically retried after 2 seconds elapse, so slow lookups cluster in bands around multiples of that retry interval. This time banding indicates an issue with DNS health.
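To make the idea of time banding concrete, here is a minimal Python sketch. It is not how the Dynatrace agents work; it only illustrates the reasoning. The function names (`time_dns_lookup`, `band_lookup_times`) are hypothetical, and the 2-second retry interval is taken from the description above. Real resolvers vary in their timeout and retry settings, so treat that value as an assumption.

```python
import socket
import time
from collections import Counter

# Retry delay described in the text; real resolvers may use other values.
RETRY_INTERVAL_MS = 2000


def time_dns_lookup(hostname):
    """Return how long the system resolver takes to resolve hostname, in ms.

    Returns None if the lookup fails outright. OS-level caching can make
    repeated lookups look faster than a cold lookup.
    """
    start = time.perf_counter()
    try:
        socket.getaddrinfo(hostname, 80)
    except socket.gaierror:
        return None
    return (time.perf_counter() - start) * 1000.0


def band_lookup_times(times_ms, retry_interval_ms=RETRY_INTERVAL_MS):
    """Group lookup times by the number of timeouts they imply.

    A time just over one retry interval suggests one timed-out attempt
    followed by a successful retry, and so on for higher bands.
    """
    bands = Counter()
    for t in times_ms:
        if t is None:
            bands["failed"] += 1
        else:
            bands[int(t // retry_interval_ms)] += 1
    return bands


# Hypothetical sample of lookup times in milliseconds.
sample = [35, 48, 2040, 2110, 4075, None]
bands = band_lookup_times(sample)
```

In a healthy network nearly every measurement lands in band 0. A growing share of measurements in bands 1 and 2, or outright failures, is the pattern described above.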
Here we can see the DNS time increasing dramatically in certain cities and on specific networks.
OK, So What? DNS times are slowing down.
For example, here is today’s traffic from a major US retailer. If you look at the Visits for today compared to the same time last week you can see that the site traffic is down considerably. When traffic is down conversion count also goes down (this means the amount of revenue being generated online decreases).
The team here at Dynatrace loves real user data, and our customers also love getting real user data. However, in an event like this, when a DNS issue prevents traffic from reaching your site, real user data only indicates the absence of traffic and not the cause. That’s why Dynatrace provides the world’s most sophisticated Synthetic Monitoring Network, which proactively tests sites from end user locations for issues like this.
What can businesses do to prevent this?
- Be proactive: aggressively manage your site with Synthetic Monitoring. Let the software robots make sure that your site can be reached from end user locations and notify you the moment something is amiss. The sooner you know of the issue, the sooner you can deal with it.
- Have a DNS failover strategy. Relying on a single DNS provider is a recipe for disaster. You should have relationships with multiple vendors to allow you to switch DNS routing as soon as an issue shows up.
- Don’t replace synthetics with real user monitoring. Real user monitoring will only tell you that traffic is absent in a situation like this, while synthetics will provide more detail as to what is being seen from those end user locations. Remember that while you need synthetics, they cannot replace real user monitoring, because real user monitoring data will tell you how much the issue is costing your business. You need both!
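As a rough illustration of the DNS failover point, the Python sketch below checks whether a list of authoritative nameserver hostnames spans more than one provider. Everything here is an assumption for illustration: the function names are hypothetical, the nameserver hostnames are made-up `.example` names, and grouping by the last two labels of the hostname is a crude heuristic that breaks for multi-part suffixes such as `.co.uk`. Fetching the NS records themselves is left out; you would supply them from your registrar or from a DNS lookup tool.

```python
def provider_of(nameserver):
    """Guess the provider from a nameserver hostname.

    Heuristic only: takes the last two labels, e.g.
    'ns1.provider-a.example.' -> 'provider-a.example'.
    """
    labels = nameserver.rstrip(".").lower().split(".")
    return ".".join(labels[-2:])


def dns_providers(nameservers):
    """Return the set of distinct providers behind a list of nameservers."""
    return {provider_of(ns) for ns in nameservers}


def has_provider_redundancy(nameservers):
    """True if the nameservers appear to belong to more than one provider."""
    return len(dns_providers(nameservers)) > 1


# Hypothetical nameserver sets for illustration.
single_vendor = ["ns1.provider-a.example.", "ns2.provider-a.example."]
multi_vendor = ["ns1.provider-a.example.", "ns1.provider-b.example."]

single_ok = has_provider_redundancy(single_vendor)
multi_ok = has_provider_redundancy(multi_vendor)
```

A check like this could run alongside synthetic tests so that a configuration change that quietly drops the second provider gets flagged before the next outage.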
Shawn White, our Vice President of Digital Experience & Cloud Operations, said it best: “It’s ONLY with synthetic monitoring that one can quickly detect and diagnose DNS issues, both at an Internet-scale and for your individual website.”
The team here at Dynatrace will continue to monitor the situation and provide updates.