Backup and recovery: quantifying the cost of doing nothing

John Leonard on the art of selling a strategic approach to enterprise backup and recovery to the board

Selling a negative is never as easy as pushing a positive. This is very much the case with the long-time Cinderella of the IT world, backup and recovery, where the chief benefit is the ability to maintain rather than enhance the bottom line. When it comes to funding, poor Cinderella always seems to lose out to her pushier siblings.

Added to the image problem faced by backup and recovery, there is no easy way to quantify the financial benefits of investing in it. There are too many uncertainties in predicting the future, analysing the probabilities of various possible calamities occurring and quantifying the potential cost of doing nothing to mitigate them.

Given these difficulties, when IT heads meet the board to discuss priorities, those advocating a more integrated approach to backup and recovery will struggle to make their voices heard.

The result is that backup and recovery tends to be tacked on as an afterthought to other projects whose financial benefits are easier to calculate and which seem to be driving the business towards a brave new tomorrow.

This second-class status means that over time a hotch-potch of disconnected backup and recovery solutions evolves, each focused on a particular application or platform, and each with its own dependencies, support requirements and deficiencies when it comes to supporting newer technologies, such as virtualisation, mobile and cloud.

And since backup and recovery solutions tend to remain in place for many years, this effectively condemns the organisation to a patchy and incomplete safeguard against unexpected events.

Tales of the unexpected

A recent Computing survey of 120 IT professionals at medium to large organisations sought to find out how those charged with keeping enterprise IT systems operational quantify risk and, more importantly, how they use that information to justify spending on backup and recovery.

Unsurprisingly, all claimed to plan for obvious threats such as fire and flood, along with power cuts, hardware and software failures, malware attacks and so on. When asked for specific instances, however, the responses were a lot less predictable.

Bad smells, for one, are unlikely to feature in many disaster recovery plans. Reported incidents ranged from a pack of kippers left in a duct by a vengeful air con supplier to an operator who, on finding a decomposing and very smelly hedgehog in the works, vomited over a rack of switches and brought down an entire network.

Nor is it easy to foresee complex, multi-layered scenarios, such as a dedicated business resiliency centre being cordoned off by the police as the result of an explosion nearby. The risk of human error is also notoriously hard to quantify: in one firm, although tapes had been religiously loaded into the backup library every night, the backup routine itself had been de-scheduled. This was not discovered until staff had to recover a crucial storage array and found the backups were months out of date.

In order for all eventualities - from acts of God to rotten hedgehogs - to be adequately covered without breaking the bank, an enterprise-wide audit of assets, categorisation by importance and then some sort of cost-benefit analysis over their long-term safeguarding would seem to be in order. But wouldn't this be expensive?

While 64 per cent of survey respondents claimed to have a strategic plan and overall budget shared across the whole company, 11 per cent said that backup and recovery tools were purchased purely on a project-by-project basis, and a further seven per cent said they relied on the tools bundled with their platforms.

Going into more detail, the survey found that just 43 per cent had fully integrated backup and recovery systems able to protect all of their platforms and their line-of-business applications and to provide that protection across both real and virtualised infrastructures (figure 1).

Just under a fifth (17 per cent) said they rely on being able to recover whole servers rather than being able to bring specific applications back online should problems arise. Unfortunately, so-called bare metal recovery of complete server platforms can take a long time.

Indeed, slow and complex recovery topped the list of complaints about legacy backup systems. Compatibility and integration issues, licensing, and support and training overhead came next, in that order.

Time is money

Since slow recovery times are clearly the bane of IT departments, it raises the question of why those same departments don't do some relatively simple maths and use those lengthy recovery estimates to quantify the financial impact on the business of not investing in solutions to address the issue. This would certainly help them to sell an integrated strategy to the board without having to go through a complex cost-benefit analysis.

A server down or an application offline means lost business and the longer it takes to recover the more business will be lost. The cost of that lost business can be guessed at, if not accurately estimated, and used when bidding for the purchase of new backup and recovery solutions or when updating an existing setup to cope with technological and business changes.
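The "relatively simple maths" might look something like the following rough Python sketch. It assumes that lost business scales linearly with hours offline, which is a simplification, and every figure in it (recovery hours, revenue per hour, incidents per year) is hypothetical rather than drawn from the survey. The function name `downtime_cost` is likewise just an illustration.

```python
def downtime_cost(recovery_hours, revenue_per_hour, incidents_per_year=1.0):
    """Rough annual cost of outages.

    Assumes a linear model: hours offline x revenue lost per hour
    x expected number of incidents per year.
    """
    if recovery_hours < 0 or revenue_per_hour < 0 or incidents_per_year < 0:
        raise ValueError("inputs must be non-negative")
    return recovery_hours * revenue_per_hour * incidents_per_year


# Hypothetical figures, for illustration only.
legacy = downtime_cost(recovery_hours=12, revenue_per_hour=5000, incidents_per_year=2)
integrated = downtime_cost(recovery_hours=2, revenue_per_hour=5000, incidents_per_year=2)

print(f"Legacy setup:     {legacy:,.0f} per year")
print(f"Integrated setup: {integrated:,.0f} per year")
print(f"Avoided loss:     {legacy - integrated:,.0f} per year")
```

Even a guessed figure of this kind gives the board a number to weigh against the price of a new backup and recovery solution.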

And in resisting the drift towards tacking backup and recovery onto new projects, IT heads could draw the attention of fund-holders to another part of the survey.

Rather than new challenges such as BYOD, it is storage growth, compliance and security - issues that have plagued IT departments for decades - that continue to create the greatest pressure on backup and recovery systems (figure 2).

While new technologies and ways of working need to be included in the calculations, figure 2 suggests that organisations should not be distracted by the noise around these innovations. Simply adding recovery to the BYOD project will do nothing to alleviate the main pressures.

Fifty-four per cent of respondents identified classification and prioritisation of data as being a challenge. This shows, at the very least, that many organisations are aware that the most effective backup and recovery strategy has to start with an understanding of the resources that are to be kept up and running.

Resources such as servers, applications and data that would cause the biggest operational harm should they become unavailable for any length of time need to be afforded the highest levels of protection against downtime.

By classifying data and other resources this way, IT managers can concentrate any investment in backup and recovery on making sure those resources are adequately protected, for example by building extra redundancy into host platforms, networking and WAN services to keep business-critical systems running.
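As a minimal sketch of what such a classification could look like, the Python below sorts resources into three tiers by their estimated cost per hour offline. The tier names, the thresholds and the example resources are all assumptions made for illustration. None of them comes from the survey, and a real exercise would weigh more than a single number.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    hourly_impact: float  # estimated loss per hour offline (hypothetical)


def classify(resources, critical_threshold, important_threshold):
    """Group resource names into tiers, highest impact first within each tier."""
    tiers = {"critical": [], "important": [], "standard": []}
    for r in sorted(resources, key=lambda r: r.hourly_impact, reverse=True):
        if r.hourly_impact >= critical_threshold:
            tiers["critical"].append(r.name)
        elif r.hourly_impact >= important_threshold:
            tiers["important"].append(r.name)
        else:
            tiers["standard"].append(r.name)
    return tiers


# Hypothetical inventory, for illustration only.
inventory = [
    Resource("file archive", 50),
    Resource("order system", 8000),
    Resource("email", 900),
]
tiers = classify(inventory, critical_threshold=5000, important_threshold=500)
for tier, names in tiers.items():
    print(f"{tier}: {', '.join(names) or '-'}")
```

The critical tier is where extra redundancy and the fastest recovery options would be concentrated first.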

Classification of data to better concentrate backup resources would seem to be something all companies should do. Unfortunately that doesn't appear to be universally understood as, when asked whether they classified data resources by importance to the business, only 68 per cent of respondents to the Computing survey answered "Yes".

While a proper cost-benefit analysis may be all but impossible in this context, failure to put any numbers at all on key resources will certainly make the job of selling a strategic approach to the board all the more difficult.

@ComputingJohn