Enterprises urged to use cloud failover during disasters

DataGardens CTO describes the benefits of context-aware process protection

At the Business Continuity Management Conference in London yesterday, there was a great deal of discussion around how IT managers should best protect their infrastructure in the event of a natural disaster.

One speaker at the conference, Dr Geoffery Hayward, CTO of software vendor DataGardens, argued that the cloud and virtualisation can help enterprises limit the disadvantages associated with using either application or infrastructure protection, the traditional ways of insuring against such disasters.

Application protection, which is used to minimise downtime and data loss, focuses on the software. So, for example, a company would have an application running in its live production site, but also have it running in a standby protection site.

The problem is that this approach focuses on specific applications, meaning it doesn't scale easily across the many applications an enterprise runs. It is also costly, because it means running several live sites at the same time.

"There are a lot of costs associated with application protection," said Hayward.

"You effectively double the cost of having one production site, as your protection site is always active whether you use it or not. This sucks up memory and resources."

A second traditional approach is infrastructure protection, which is an application-agnostic, enterprise-wide solution. This focuses on servers, network and storage.

"The benefits of infrastructure protection are that it relies on a passive-protection site, so you don't fire it up until you need it. The data isn't moved across until disaster strikes," said Hayward.

"However, the recovery procedures are complex, brittle and need to be tested regularly. The process of shutting down the production site, but at the same time firing up the passive site and transferring everything over, is difficult to get right," he added.

"For example, you have to be very careful to reconfigure all the network settings correctly."

However, a third option has emerged in the past couple of years, owing to advances in virtualisation – a new type of protection called process protection.

This is where the actual business processes are protected, as opposed to the infrastructure or applications – in other words, what is going on 'in memory' as the computing functions run.

Hayward explained that to do this you have to protect the 'active state', which he described as the memory of the computers and what is going on in the processes themselves.

This has been made possible by virtualisation, which gives companies access to what is going on in memory by abstracting the virtual machine from the physical one.

"That level of abstraction allows us to transfer live processes between servers," explained Hayward.

"Process protection allows companies to evacuate during the failover or even before if there is advanced warning, which means zero downtime and data loss through most disasters. It is also agnostic to both application and infrastructure," he added.

With process protection the active state – the disk, memory and processor state – is fired across from a company's production site to its protection site in near real time. All three are bundled together and kept constantly in sync at the protection site.

This means that if a production site goes down, the protection site already has all the processes available in it, without any need to migrate.
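Conceptually, that continuous synchronisation is a tight loop: checkpoint the active state, ship it to the protection site, repeat. The Python sketch below is an illustration of the idea only; the capture and transfer functions are stand-ins rather than a real hypervisor interface or DataGardens' software.

import time
from dataclasses import dataclass

@dataclass
class ActiveState:
    """The bundle Hayward describes: disk, memory and processor state together."""
    disk_delta: bytes
    memory_pages: bytes
    cpu_registers: bytes
    timestamp: float

def capture_active_state() -> ActiveState:
    """Stand-in for a hypervisor checkpoint of the running virtual machine."""
    return ActiveState(disk_delta=b"...", memory_pages=b"...",
                       cpu_registers=b"...", timestamp=time.time())

def send_to_protection_site(state: ActiveState) -> None:
    """Stand-in for shipping the checkpoint to the standby site over the network."""
    print(f"replicated checkpoint taken at {state.timestamp:.3f}")

def replicate_continuously(interval_s: float = 0.1, cycles: int = 5) -> None:
    """Near-real-time sync: checkpoint, ship, sleep, repeat."""
    for _ in range(cycles):
        send_to_protection_site(capture_active_state())
        time.sleep(interval_s)

if __name__ == "__main__":
    replicate_continuously()

The loop also makes the drawbacks Hayward goes on to list visible: both sites are busy all the time, and every iteration has to cross the network fast enough that the standby copy never falls behind.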

"Although the transfer would be seamless, there are still limitations," said Hayward.

"Both systems have to be active all the time, which again means there is a large cost involved. You are transferring this memory and processing from one site to the other all the time, and this is expensive," he added.

"There is also an extra strain on the networks to synchronise the memory in real time with no lag. This means the production and protection site cannot be more than 100km apart."

Hayward believes most of these problems have now been solved by the cloud, which has allowed for a new protection process called context-aware process protection (CAPP) to emerge.

CAPP does not sync the memory to the protection site at all times; instead, when stress is detected on the system, virtual machines are fired up in a cloud environment and the sites are synchronised then.

Examples of stress include a power failure or a spike in demand, but Hayward insists that in most instances there would still be enough time to fire up virtual machines in the cloud and fail over.

"When you detect stress on the system, and normally disasters are preceded by some sort of stress, take advantage of that situation and evacuate your servers before they go down," said Hayward.

"With CAPP you don't have to keep the protection site active until you need to detect stress, and then you can put everything over very quickly," he added.

"This is a cheap solution – you don't have to provision virtual machines unless you need them, and there is zero downtime and data loss in most disasters."