Reporting of “Cloud” Failures

12 Oct

I’ve been reading an article from Michael Krigsman today related to Virgin Blue’s “cloud” failure in Australia along with a response from Bob Warfield.  These articles raised the question in passing of whether such offerings can really be called cloud offerings and also brought back the whole issue of ‘private clouds’ and their potentially improper use as a source of FUD and protectionism.

Navitaire essentially seem to have been hosting an instance of their single-tenancy system in what appears to be positioned as a ‘private cloud’.  As other people have pointed out, if this had been a true multi-tenant cloud offering then everyone would have been affected and not just a single customer.  Presumably then – as a private cloud offering – this is more secure, more reliable, has service levels you can bet the business on and won’t go down.  Although, looking at these reports, it seems that sometimes it does.

Now I have no doubt that Navitaire are a competent, professional and committed organisation who are proud of the service they offer.  As a result I’m not really holding them up particularly as an example of bad operational practice but rather to highlight widespread current practices of repositioning ‘legacy’ offerings as ‘private cloud’ and the way in which this affects customers and the reporting of failures.

Many providers whose software or platform is not multi-tenant are aggressively positioning their offering as ‘private cloud’, both as an attempt to maintain revenues for their legacy systems and as a slightly cynical way to play on companies’ worries about sharing.  Such providers are usually traditional software or managed service providers who have no multi-tenant expertise or assets; as a result they try to brand things as cloud whilst really just delivering old software in an old hosted model.  Whilst there is still potentially a viable market in this space – i.e. moving single-tenant legacy applications from on-premise to off-premise as a way of reducing the costs of what you already have and increasing focus on core business – such offerings are really just managed services and not cloud offerings.  The ‘private’ positioning is a sweet spot for these people, however, as it simultaneously allows them to avoid the significant investment required to recreate their offerings as true cloud services, prolongs their existing business models and plays on customers’ uncertainty about security and other issues.  Whilst I understand the need to protect revenue at companies involved in such ‘cloud washing’ – and thus would stop short of calling these practices wholly cynical – it illustrates that customers do need to be aware of the underlying architecture of offerings (as Phil Wainwright correctly argued).  In reality most current ‘private cloud’ offerings are not going to deliver the levels of reliability, configurability and scale that customers associate with the promise of the cloud.  And that’s before we even get to the more business transformational issues of connectivity and specialisation.

Looking at these kinds of offerings we can see why single-tenant software and private infrastructure provided separately for each customer (or indeed internally) is more likely to suffer a large-scale failure of the kind experienced by Virgin Blue.  Essentially, developing truly resilient and failure-optimised solutions for the cloud means addressing every level of the offering stack, and realistically requires a complete re-write of software, deep integration with the underlying infrastructure and expert operations staff who understand the whole service intimately.  This is obviously cost-prohibitive without the ability to share a solution across multiple customers (remember that cloud != infrastructure and that you must design an integrated infrastructure, software and operations platform that inherently understands the structure of systems and deals with failures across all levels in an intelligent way).  Furthermore, even if cost were not a consideration, without re-development the individual parts that make up such ‘private’ solutions (i.e. infrastructure, software and operations) were not optimised from the beginning to operate seamlessly together in a cloud environment, and can be difficult to keep aligned and to manage as a whole.  As a result it’s really just putting lipstick on a pig and making the best of an architecture that combines components that were never meant to be consumed in this way.
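
To make the cost argument concrete, here is a rough back-of-the-envelope sketch in Python.  Everything in it is an assumption for illustration only: the function name, the idea of splitting cost into a fixed "resilience engineering" part and a per-tenant part, and the figures themselves are hypothetical and are not taken from Navitaire, Virgin Blue or any real provider.

```python
def per_tenant_cost(fixed_resilience_cost: float,
                    marginal_cost_per_tenant: float,
                    tenants: int) -> float:
    """Cost borne by each tenant when a fixed engineering cost is shared.

    fixed_resilience_cost: one-off cost of building a failure-tolerant stack
        (hypothetical figure, supplied by the caller).
    marginal_cost_per_tenant: extra cost of serving one more tenant.
    tenants: number of customers sharing the same platform.
    """
    if tenants < 1:
        raise ValueError("tenants must be at least 1")
    # The fixed cost is spread evenly; the marginal cost is paid by each tenant.
    return fixed_resilience_cost / tenants + marginal_cost_per_tenant


# Hypothetical figures, chosen only to show the shape of the curve.
single_tenant = per_tenant_cost(1_000_000, 10_000, tenants=1)
multi_tenant = per_tenant_cost(1_000_000, 10_000, tenants=100)
print(single_tenant, multi_tenant)
```

Under these made-up numbers a single tenant carries the whole fixed cost alone, whereas a hundred tenants each carry one hundredth of it.  The real economics will be messier, but the shape of the curve is the point: the investment only becomes affordable when it is shared.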

However much positioning companies try to do, you can’t get away from the fact that ultimately multi-tenancy at every level of a completely integrated technology stack will be a prerequisite for operating reliable, scalable, configurable and cost-effective cloud solutions.  As a result – and in defiance of the claims – the lack of multi-tenant architectures at the heart of most offerings currently positioned as ‘private cloud’ (both hardware and software related, internal and external) probably makes them less secure, less reliable, less cost-effective and less configurable (i.e. able to meet a business need) than their ‘public’ (i.e. new) counterparts.

In defiance of the current mass of positioning and marketing to the contrary, then, it could be suggested that companies like Virgin Blue would be less likely to suffer catastrophic failures in future if they seek out real, multi-tenant cloud services that share resources and thus have far greater resilience than those that have to accommodate the cost profiles of serving individual tenants using repainted legacy technologies.  This whole episode thus appears to be a failure of the notion that you can rebrand managed services as ‘private cloud’ rather than a failure of an actual cloud service.

Most ironically of all, the headlines incorrectly proclaiming such episodes as failures of cloud systems will fuel fear within many organisations and make them even more likely to fall victim to the FUD from disingenuous vendors and IT departments around ‘private cloud’.  In reality, failures such as the one discussed here may just prove that ‘private cloud’ offerings create exposure to far greater risk than adopting real cloud services, because architecting for high scale and failure tolerance across a complete stack is incompatible with architecting for the cost constraints of a single tenant.
