Sometimes utilities go down…


The blogosphere went into a major panic this week when Amazon’s S3 online storage service went down for a number of hours. It seems there were around six hours of outage that had life-threatening consequences – yes, worse than hospital power supplies being wiped out, worse than widespread transportation gridlock – the little avatars that Twitter users have to animate their identities did not show up!

Sacre bleu!

I mean, people, let’s get some balance here. Sure, online storage systems are becoming more and more important, but I reiterate: they’re not generally a matter of life and death. SmugMug, an Amazon S3 user, shows a sense of balance when they say:

Our faith in Amazon, and the care they take of your priceless memories, hasn’t been shaken. Your photos and videos are safe – which is our #1 concern. Since problems in this industry are inevitable, and Amazon’s performance over the last two years has been so exceptional, we’ve been afraid [of] an outage like this. I’m sure there will be more over the next few years, too.

The important thing is that they’re few and far between, short, and handled properly. Every component SmugMug has ever used, whether it’s networking providers, datacenter providers, software, servers, storage, or even people, has let us down at one point or another.

Or, pretty much: "we do what we can, but sometimes utilities go down – get over it".

We’ve heard for a few years about the move to utility computing – when storage and access will become just another utility like water and electricity – so we should treat it as such. In a former life I worked on emergency power supplies for critical business installations – generally they’ll have multiple levels of redundancy – backup feeds, UPS systems, generators and the like. How is it that when we move to a utility model of data storage we seem to think it acceptable to rely on (usually) a single pipe to a single box?
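To make the "more than one pipe, more than one box" idea concrete, here is a minimal sketch of reading from a primary store and falling back to a backup copy when the primary fails. Everything in it is an assumption for illustration: `fetch_with_fallback`, `primary_store` and `backup_store` are hypothetical names, and the two store functions are stand-ins rather than calls to any real provider's API.

```python
from typing import Callable, Sequence


class AllStoresFailed(Exception):
    """Raised when every configured store failed to return the object."""


def fetch_with_fallback(key: str, stores: Sequence[Callable[[str], bytes]]) -> bytes:
    """Try each store in order and return the first successful result."""
    errors = []
    for store in stores:
        try:
            return store(key)
        except Exception as exc:  # a real system would catch narrower errors
            errors.append(exc)
    raise AllStoresFailed(f"{len(errors)} store(s) failed for {key!r}")


# Hypothetical stand-ins for a primary provider and a backup copy.
def primary_store(key: str) -> bytes:
    # Simulates the primary provider being down.
    raise ConnectionError("primary storage is unavailable")


def backup_store(key: str) -> bytes:
    # Simulates a second, independent copy of the same object.
    return b"avatar bytes for " + key.encode()


if __name__ == "__main__":
    print(fetch_with_fallback("avatar.png", [primary_store, backup_store]))
```

The ordering of the list is the whole design: the cheap, usual path comes first, and the backup is only touched when that path fails, much like a generator that only starts when the mains feed drops.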

Phil posts with some sense about the requirement for multiple redundancies, and Loic posts about choosing top-shelf options that have better guarantees in place. Steve Clayton tries (not overly successfully) not to take a swipe at his employer’s competitor, but makes some salient points about the changes that will happen in the coming move to the clouds:

some big names will lose along the way – or get eaten up
lawsuits will happen as SLA’s are promised and not met
the backlash as cloud computing goes over the hype curve is inevitable
a few small players will make big, big investments that are needed to make the cloud work

We live in times of disruption. By its very nature, disruption entails a degree of uncertainty and flux, and cloud computing outages are an embodiment of this flux.

2 Comments

Glen Barnes | July 22, 2008 at 9:08 am

The thing is, I have had more downtime caused by in-house IT systems than by network-provided systems over the years. Having your ‘file server’ go down once a year for 3 hours seems pretty good to me. I can’t count the number of times I would arrive at the office (it didn’t matter what company I was working at) and something would not be working. It would be another wasted hour waiting for the IT department to reboot the server or fix the email.

Let’s face it. Amazon still do a better job at keeping things running than most IT departments.

Yves Hiernaux | July 22, 2008 at 4:45 pm

Following the same idea as Glen, my biggest disappointment so far has been not finding, in any of the big blogs, any sort of comparison chart between cloud computing, non-cloud computing, in-house IT, …

Numbers are certainly hard to get because of the granularity of each failure for non-cloud systems, and it is obvious that when it comes to cloud computing, each failure is massively observed. Still… shouting into the wind doesn’t help!
