
What Caused the Fastly Internet Outage? What are CDNs and Why are they Important?

Editor, TRANSFIN.
Jun 14, 2021 11:57 AM 4 min read

Several major websites experienced outages last week.

Visitors to popular news media and social media sites were greeted with messages like “Error 503 service unavailable” or “connection failure”.

However, the problems were identified swiftly and largely fixed in about an hour.

The cause of the outage was a failure in a Content Delivery Network (CDN) run by American cloud computing services provider Fastly.

Which Platforms Were Affected?

They included social media and content platforms (Reddit, Spotify, Pinterest, Quora, Vimeo), publishers (Guardian, NYT, CNN, BBC, Verge, Financial Times), streaming services (Twitch, Hulu) and other major pages like the UK government website and Amazon.com.

Some services on other platforms were affected too. For example, users experienced difficulties in using Twitter's emoji feature or were unable to view images on the microblogging site for some time. There were also reports that people were finding it difficult to book slots for COVID-19 vaccinations in the UK.

The outage was not geographically universal. Users in Berlin, for example, reported that they could still access the affected sites.

FYI: News sites adapted to the situation. The Guardian began a live blog on Twitter while The Verge issued updates via a Google Doc. (The latter seemed like a nice idea until someone accidentally tweeted an editable link to the doc, allowing anyone online to make changes!)

Now, coming to what caused the disturbance in the force...

 

What is a CDN?

It's a geographically distributed group of proxy servers and data centers that work together to ensure faster delivery of internet content to the end user.

The idea is that since the source of internet content and the destination are often thousands of kilometres apart, there's a need to store this content geographically closer to the user so that content delivery is faster and more efficient.

Another way to look at this: Without a CDN, a request to access an NYT article in London would have to travel across the Atlantic Ocean to the NYT's web server in New York and then travel all the way back, leading to a noticeable lag. What happens instead is that a CDN server in or near London stores a cached copy of the NYT website, allowing the request to be answered faster.
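To put rough numbers on that lag, here is a back-of-the-envelope sketch. The figures are assumptions, not measurements: signals in optical fibre travel at roughly 200,000 km per second, London and New York are about 5,570 km apart, and the nearby server is placed a hypothetical 50 km away. The function name is made up for illustration.

```python
# Assumption: signal speed in optical fibre, roughly two-thirds of light speed in vacuum.
FIBRE_SPEED_KM_PER_S = 200_000


def round_trip_ms(distance_km):
    """Best-case round-trip time in milliseconds, ignoring routing and server processing."""
    return 2 * distance_km / FIBRE_SPEED_KM_PER_S * 1000


# Assumption: approximate straight-line distance between London and New York.
transatlantic_ms = round_trip_ms(5_570)
# Assumption: a hypothetical CDN server 50 km from the reader.
nearby_ms = round_trip_ms(50)

print(f"Transatlantic round trip: {transatlantic_ms:.1f} ms")
print(f"Nearby server round trip: {nearby_ms:.1f} ms")
```

A single round trip across the ocean looks small, but loading one web page usually takes many round trips, so the gap adds up quickly.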

Essentially, CDNs sit between websites and users, ferrying content and requests for content to and fro. In a world without CDNs, a reader in India would take longer to access a BBC article than, say, one in London.
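The "sitting in between" idea can be sketched in a few lines. This is a simplified illustration, not how Fastly or any real CDN is implemented: the `Origin` and `EdgeCache` classes and the `/article` path are hypothetical, and real CDNs also handle expiry, invalidation and much more.

```python
class Origin:
    """Hypothetical origin web server; counts how often it is contacted."""

    def __init__(self, pages):
        self.pages = pages
        self.hits = 0

    def fetch(self, path):
        self.hits += 1
        return self.pages[path]


class EdgeCache:
    """Hypothetical CDN server near the user that keeps copies of origin content."""

    def __init__(self, origin):
        self.origin = origin
        self.store = {}

    def get(self, path):
        # Serve the local copy if one exists.
        if path in self.store:
            return self.store[path], "HIT"
        # Otherwise make the long trip to the origin once and keep a copy.
        body = self.origin.fetch(path)
        self.store[path] = body
        return body, "MISS"


origin = Origin({"/article": "Full text of the article"})
london_edge = EdgeCache(origin)

first = london_edge.get("/article")   # first reader: goes to the origin
second = london_edge.get("/article")  # later readers: served from the local copy
print(first[1], second[1], origin.hits)
```

Only the first request travels to the origin; every later request for the same page is answered locally.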

FYI: CDNs also play a key role in load balancing by diverting traffic to other servers in times of heightened demand to ease strain and maintain free flow of data, in addition to protecting websites from denial-of-service attacks.
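As a rough illustration of the load-balancing idea, the sketch below sends each new request to the healthy server with the fewest requests in progress. The server names, the numbers and the `pick_server` helper are all hypothetical, and real CDNs use far more elaborate strategies.

```python
def pick_server(servers):
    """Return the name of the healthy server with the fewest active requests."""
    healthy = {name: info for name, info in servers.items() if info["healthy"]}
    if not healthy:
        raise RuntimeError("no healthy servers available")
    return min(healthy, key=lambda name: healthy[name]["active"])


# Hypothetical snapshot of three servers at a moment of heavy demand.
servers = {
    "edge-a": {"healthy": True, "active": 120},
    "edge-b": {"healthy": True, "active": 45},
    "edge-c": {"healthy": False, "active": 0},  # down, so it is skipped
}

chosen = pick_server(servers)
print(chosen)
```

Note that the idle server is not chosen here because it is marked as down; traffic is diverted only to servers that can actually answer.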

 

The Ides of June

Now, Fastly has said that last week’s global outage was “due to an undiscovered software bug that surfaced on June 8th when it was triggered by a valid customer configuration change”.

This bug, the company said, was inadvertently introduced by a software deployment by the company on May 12th.

BTW: Following the outage, #cyberattack began trending on Twitter. However, there is no evidence indicating there was any foul play involved.

FYI: While several data-heavy services were affected last week, it is interesting to note that Netflix largely stayed up despite commanding a meaningful share of global traffic. The most intuitive explanation is that Fastly is not its CDN provider (which is true), but there’s more to this. Netflix does not rely on any third-party CDN provider. The streaming giant initially started with Akamai, Level 3 and Limelight as its CDN partners but later built out its own CDN to have full control over content delivery and user experience. Here’s what Netflix has to say about its “in-house” CDN called Netflix Open Connect and here’s a technical (and beautiful) explainer on the same.

 

How Could This Happen?

Considering the vast, interconnected and complicated nature of the internet today, glitches in the matrix are unavoidable.

Internet outages themselves are not uncommon (although their frequency has fallen in recent years thanks to advanced data centres and improved performance). Remember when Gmail, Maps and YouTube crashed for about an hour in December? Or when Slack briefly crashed in January? Outages occur every day (here’s a live map), and most are localised and fixed quickly.

Speaking of CDN failures in particular, there have been several such occurrences in the past. A problem with Amazon Web Services (AWS) servers in 2017 brought down some of the biggest websites for several hours across the entire US East Coast. Last year, a bug at Cloudflare sparked a half-hour outage for most of the internet across Europe and the Americas.

That said, the over-concentration of the CDN market in the hands of a few big companies poses a threat that almost guarantees that such incidents will continue to happen in the future.

As of 2020, 89% of the CDN market was commanded by a grand total of three companies - Cloudflare, AWS and Akamai (Fastly was #5). This means that even a minor glitch in any of these platforms will have a domino effect, cascading to vast swathes of the net and affecting hundreds of millions of users across the world.

Basically, unless something is done to dilute this gross concentration risk, this handful of big companies will continue to play Fast-ly and loose with the internet. Unless you are Netflix, of course…

FIN.
 
