SpiderLabs Blog

No, the Internet Does Not ‘Just Work’

Written by Space Rogue | Sep 12, 2012 12:48:00 PM

The recent GoDaddy DNS outage illustrates that the Internet does not just work, and that sometimes stuff still breaks just for the sake of breaking and not because it was 'attacked'.

The parts of the Internet that just work are exceedingly fragile and prone to failure. Those failures keep tens of thousands (perhaps hundreds of thousands?) of people employed worldwide. They keep numerous large businesses in business. If things just worked, there would be a lot more people collecting public assistance.

Unfortunately, most of these people work behind the scenes in server rooms and telecom closets tucked away in the back recesses and basements of buildings everywhere. If you actually see them during your workday, you know something has gone terribly, terribly wrong. And you know what? Most of them like it that way. There is a certain satisfaction in knowing that you are the person responsible for maintaining the architecture that lets everyone else get their job done. If it's your job, you don't want to be visible; you let your network uptime speak for itself.

But in these days of iPhones, talking cars, and automatic toll collection that all just work, it is easy to forget about the infrastructure that sits behind it all and makes it appear, to you, to just work. So when it doesn't work, people automatically jump to the conclusion that something exceedingly bad has happened, and these days, with all the cyber war paranoia going around, the conclusion is often some sort of 'cyber war attack'. Of course, it doesn't help when a few completely unreliable, unconfirmed, and completely unrelated people start taking credit in under 140 characters simply for the 'lulz'.

People want to believe the outlandish and sensational. They want to believe that an amazing super hacking power has descended upon the networks. I mean, the Internet just works, right? So it must be something pretty disastrous to make it not work. People want to believe this so much that even when a real, official, confirmed explanation comes out, they still want to believe the outlandish, crazy one.

Sometimes a simple maintenance task, upgrade, or other change can greatly impact your production environment. Something as simple as adding a new VLAN can aggravate an underlying design defect that went unnoticed before. With so many different routers, OSs, commands, and syntaxes, it is easy to issue a wrong command or make a typo that causes a cascade failure across multiple nodes on your network. In large, complex networks this can be an absolute nightmare to diagnose and troubleshoot.

This is why you have contingency plans. This is why you have test and production servers. This is why you have a back-out plan. Even with all this redundancy, bad things can still happen, because, well, bad things sometimes just happen. The Internet doesn't just work. It is an incredibly fragile, complicated Rube Goldberg contraption that will fall over if you breathe on it wrong. No attack necessary.