
What’s happening?

Numerous reports today indicated that a localized East Coast performance event was occurring. Reports pointed to an attack on DYN, which provides DNS services. DNS (Domain Name System) can be thought of as a phone book for the internet: domain names (e.g. www.mysite.com) get mapped to IP addresses (e.g. 128.122.1.101). This is one of the basic technologies that make the internet work, and if it fails a lot of people get very upset. For the most part DNS services are very fast and robust. Many services like DYN and CDNs (content delivery networks) use DNS very dynamically to route traffic to remote points of presence (in the case of a CDN, to distribute load closer to end users), and they also manage DNS for companies’ websites.
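As a minimal sketch of the phone-book idea, the Python snippet below asks the operating system's resolver for the addresses behind a name. The helper name resolve_addresses is our own choice for illustration, and www.mysite.com is only the placeholder name used above.

```python
import socket


def resolve_addresses(hostname):
    """Return the sorted, de-duplicated IP addresses the system resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    # A literal IP address is returned unchanged without any DNS query.
    # A real name such as "www.mysite.com" would trigger an actual lookup.
    print(resolve_addresses("127.0.0.1"))
```

When the DNS provider behind a name is unreachable, a call like this either takes a long time or raises socket.gaierror, which is what end users experience as a site that "won't load".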

Our first indication of an issue was seen in the Dynatrace Internet Health Report, which showed some latency and dropped packets coming from Level 3 (a major network and CDN provider). This dashboard is available to the public at http://internethealthreport.com/ and shows peering relationships between major backbone network providers (these are the companies that move the lion’s share of traffic across the US).

HealthReport

Other non-Dynatrace tools also indicated that an issue was underway. Downdetector.com showed a Level 3 issue occurring in the Northeast US.

level3

It’s ONLY with synthetic monitoring that one can quickly detect and diagnose DNS issues, both at an Internet-scale and for your individual website.

Dynatrace operates the largest network of monitoring agents in the world with dedicated hardware in major US cities and thousands of software agents running from end user machines in every state.  The Dynatrace Synthetic Network Internet Health Map shows the issue impacting locations across the US East Coast.

healthmap_dns

Further investigation showed that the issue was appearing as very long DNS lookup times. Typically DNS requests complete in milliseconds; however, time banding in the data indicates that these requests are timing out. When a request times out, there is typically a retry after 2 seconds elapse, so slow lookups cluster in bands around multiples of that interval. This time banding indicates an issue with DNS health.
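To make the time banding idea concrete, here is a rough Python sketch that groups lookup times by how many retry intervals elapsed before an answer came back. The 2-second interval follows the typical retry delay mentioned above (actual resolver settings vary), the function names are our own, and the sample timings are made up for illustration rather than taken from the Dynatrace data.

```python
from collections import Counter


def retry_band(lookup_seconds, retry_interval=2.0):
    """Number of whole retry intervals that elapsed before the lookup finished."""
    if lookup_seconds < 0:
        raise ValueError("lookup time cannot be negative")
    return int(lookup_seconds // retry_interval)


def band_counts(samples, retry_interval=2.0):
    """Count how many lookups fall into each retry band."""
    return Counter(retry_band(s, retry_interval) for s in samples)


if __name__ == "__main__":
    # Hypothetical lookup times in seconds, not real measurements.
    samples = [0.03, 0.05, 2.04, 2.06, 4.10]
    counts = band_counts(samples)
    for band in sorted(counts):
        print(f"{band} timeout(s) before an answer: {counts[band]} lookup(s)")
```

Band 0 is a healthy lookup. Lookups piling up in band 1 and above suggest that the first request went unanswered and only a retry succeeded.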

DNS_benchmark

Here we can see the DNS time increasing dramatically in certain cities and on specific networks.

DNS_by_city

OK, So What? DNS times are slowing down.

Aside from the news reports and issues accessing social media, what is the big deal? From what Dynatrace can see, this problem is not isolated to social media sites, because most other sites use social media components on their web pages. In some cases, they utilize services from those social media sites (like APIs and JavaScript frameworks), which can impact the performance of their own sites.

For example, here is today’s traffic from a major US retailer. If you compare today’s visits with the same time last week, you can see that site traffic is down considerably. When traffic is down, conversion count also goes down (this means the amount of revenue being generated online decreases).

UEM

The team here at Dynatrace loves real user data, and our customers also love getting real user data. However, in an event like this, when a DNS issue prevents traffic from reaching your site, real user data only indicates the absence of traffic and not the cause. That’s why Dynatrace provides the world’s most sophisticated Synthetic Monitoring Network, which proactively tests sites from end user locations for issues like this.
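The basic shape of a synthetic DNS check can be sketched in a few lines of Python. This is only an illustration of the idea and not how the Dynatrace agents are implemented: probe_dns, the slow_after threshold, and the injected resolver callable are all assumptions made for the example.

```python
import time


def probe_dns(hostname, resolver, slow_after=1.0):
    """Time one lookup and classify it as 'ok', 'slow', or 'failed'.

    resolver is any callable that takes a hostname and returns a list of
    addresses, raising OSError (socket.gaierror is a subclass) on failure.
    """
    start = time.perf_counter()
    try:
        addresses = resolver(hostname)
    except OSError as exc:
        elapsed = time.perf_counter() - start
        return {"hostname": hostname, "status": "failed",
                "seconds": elapsed, "detail": str(exc)}
    elapsed = time.perf_counter() - start
    if not addresses:
        return {"hostname": hostname, "status": "failed",
                "seconds": elapsed, "detail": "no addresses returned"}
    # The lookup is timed after the fact; this sketch does not enforce a timeout.
    status = "slow" if elapsed > slow_after else "ok"
    return {"hostname": hostname, "status": status,
            "seconds": elapsed, "detail": ", ".join(addresses)}
```

Run on a schedule from several locations, a check like this reports a "failed" or "slow" status even when no real users are arriving, which is the gap that real user data leaves.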

What can businesses do to prevent this?

  • Be proactive: aggressively manage your site with Synthetic Monitoring. Let the software robots make sure that your site can be reached from end user locations and notify you the moment something is amiss. The sooner you know of the issue, the sooner you can deal with it.
  • Have a DNS failover strategy. Relying on a single DNS provider is a recipe for disaster. You should have relationships with multiple vendors to allow you to switch DNS routing as soon as an issue shows up.
  • Don’t replace synthetics with real user monitoring. Real user monitoring will only tell you about the absence of traffic in a situation like this, while synthetics will provide more detail as to what is being seen from those end user locations. Remember that while you need synthetics, they cannot replace real user monitoring, because real user monitoring data will tell you how much the issue is costing your business. You need both!
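The failover recommendation above comes down to "try the next provider when one stops answering". The Python sketch below shows only that decision logic, using injected resolver callables. The provider names and the resolve_with_failover helper are hypothetical, and an actual switch between DNS vendors is normally made in your DNS and registrar configuration rather than in application code.

```python
def resolve_with_failover(hostname, providers):
    """Try each (name, resolver) pair in order.

    Returns (provider_name, addresses) for the first provider that answers,
    and raises RuntimeError if every provider fails.
    """
    errors = []
    for name, resolver in providers:
        try:
            addresses = resolver(hostname)
        except OSError as exc:
            # Record the failure and move on to the next provider.
            errors.append(f"{name}: {exc}")
            continue
        if addresses:
            return name, addresses
        errors.append(f"{name}: no addresses returned")
    raise RuntimeError("all DNS providers failed: " + "; ".join(errors))
```

Pairing this kind of ordering logic with a scheduled synthetic check is one way to notice a failing provider early enough to make the switch.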

Shawn White, our Vice President of Digital Experience & Cloud Operations, said it best: “It’s ONLY with synthetic monitoring that one can quickly detect and diagnose DNS issues, both at an Internet-scale and for your individual website.”

The team here at Dynatrace will continue to monitor the situation and provide updates.

About The Author
David Jones
Director, Sales Engineering. APM Evangelist
David Jones is the Director of Sales Engineering and APM Evangelism for Dynatrace. He has been with Dynatrace for 10 years and has 20 years’ experience working with web and mobile technologies, from the first commercial HTML editor to the latest web delivery platforms and architectures. He has worked with scores of Fortune 500 organizations, providing them with the most recent industry best practices for web and mobile application delivery. Prior to Dynatrace he worked at Gomez (Waltham), S1 Corp (Atlanta), Broadvision (Bay Area), Interleaf/Texcel (Waltham), i4i (Toronto) and SoftQuad (Toronto).