How to measure the durability and availability of your cloud storage

Any organisation looking to make the leap to cloud storage needs to understand the risk of data degrading or being inaccessible

Even before the events of this year, there was a clear trend of organisations moving their data to the cloud, with the market for cloud storage consistently growing by over 20 per cent year-on-year. But the disruptions faced by businesses this year - leading to the switch to mass remote working - have markedly accelerated cloud storage adoption.

In making the transition to cloud storage and assessing vendors, the top priority for most organisations is being confident they can access their data whenever they need it. This, in turn, means assessing two things: the durability of their data against degradation, and the availability of that data on demand. Any organisation looking to make the leap to cloud storage needs to know how to measure both.

Measuring durability

For durability, the first thing a good data centre should do is keep multiple copies of your data. Let's imagine that there's a one in a hundred risk of a file becoming corrupted in a given year. The best thing you can do to cut your risk is to keep a second copy of the file, which you can then compare to the original to see if there's been any degradation in either. If one version of the file has degraded, you can simply use the uncorrupted version to overwrite the corrupted one. This means that the only way you can 'lose' the file is if both copies become corrupted at once - and, assuming the two copies fail independently, that cuts the risk down from one in a hundred to one in ten thousand (1/100 × 1/100).

So, the more copies we make of data, the greater the reduction in risk. It's also the case that the more often we compare copies of files, the more we'll cut down the risk of losing them. When it comes to cloud storage, the best vendors tend to do such a comparison, sometimes called an active integrity check.
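To make that arithmetic concrete, here's a rough back-of-envelope sketch in Python. It uses the illustrative one-in-a-hundred figure from above, and assumes corruption events on each copy are independent and that corrupted copies are repaired at every integrity check:

```python
# Back-of-envelope sketch: annual risk of losing a file when n independent
# copies are kept and any corrupted copy is repaired at each integrity check.
# The one-in-a-hundred per-copy risk is the illustrative figure from the text.

def annual_loss_risk(per_copy_risk: float, copies: int) -> float:
    """All copies must corrupt within the same check interval to lose the file."""
    return per_copy_risk ** copies

for n in range(1, 4):
    print(f"{n} cop{'y' if n == 1 else 'ies'}: 1 in {1 / annual_loss_risk(0.01, n):,.0f}")

# 1 copy:   1 in 100
# 2 copies: 1 in 10,000
# 3 copies: 1 in 1,000,000
```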

What does this mean in practice for the durability of files in cloud storage? You can expect a good vendor with quality hardware to store multiple copies (say five) of their customers' data with regular integrity checks. At the current state of the art, this means you can expect to lose no more than one file in one hundred billion (that's 100,000,000,000) per year. In the industry, this high benchmark is referred to as hitting 'eleven nines' of durability, because there's a 99.999999999 per cent chance a customer will not lose a given file in a given year.
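If it helps to see how the 'nines' shorthand maps onto a loss rate, this short sketch converts an annual per-file loss probability into a count of nines; the one-in-a-hundred-billion input corresponds to the benchmark above:

```python
import math

# Sketch: expressing an annual per-file loss probability as 'nines' of durability.
# 'Eleven nines' means a 99.999999999 per cent chance of keeping any given file
# through the year, i.e. an annual loss probability of one in a hundred billion.

def nines_of_durability(annual_loss_probability: float) -> int:
    """Count of leading nines in the survival probability (e.g. 0.001 loss -> 3 nines)."""
    # The tiny tolerance guards against floating-point results just below a whole number.
    return math.floor(-math.log10(annual_loss_probability) + 1e-9)

print(nines_of_durability(1e-11))  # 11 - one file lost per hundred billion per year
print(nines_of_durability(1e-3))   # 3  - 99.9 per cent, the uptime figure used later
```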

In practice, if you gave a storage company that had eleven nines of durability a million files to store, statistically it would take them 659,000 years to lose one of them. That's pretty good odds.

Tackling human error

It would be very convenient if most cases of data loss were beyond our control. This is sadly not the case - two-thirds of data loss incidents are not due to hardware failure, but are instead caused by human error, software misconfigurations, viruses or malicious actors. That's why any conversation around data reliability has to consider how to reduce the risk of human intervention destroying your data.

For this reason, I'd advocate for any company using cloud storage to consider using 'immutable buckets' from their vendors. These are storage offerings that prevent anyone from modifying or erasing a stored set of data within a particular time frame - including the administrator and the team at the data centre itself. Once you've written data into an immutable bucket, it's there until the hold time you've designated has expired. If someone tries to erase or modify an immutable file, they'll just get an error message. This radically cuts down the risk of human error - which, again, accounts for the majority of data loss incidents - destroying business-critical files or backups.
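As a rough illustration of what this looks like in practice, here is a sketch using the S3-compatible Object Lock API (via Python's boto3), which is how a number of vendors expose immutability. The endpoint URL, bucket name, object key and 90-day hold are hypothetical placeholders, not any particular vendor's settings:

```python
import boto3
from datetime import datetime, timedelta, timezone

# Illustrative sketch only: many vendors implement 'immutable buckets' through
# the S3-compatible Object Lock API. All names below are placeholders.
s3 = boto3.client("s3", endpoint_url="https://s3.example-vendor.com")

# Object Lock has to be switched on when the bucket is created.
s3.create_bucket(Bucket="example-backups", ObjectLockEnabledForBucket=True)

# Write an object that cannot be deleted until the retain-until date passes -
# not even by an administrator (COMPLIANCE mode).
s3.put_object(
    Bucket="example-backups",
    Key="backup-2020-10-01.tar.gz",
    Body=b"example backup contents",
    ObjectLockMode="COMPLIANCE",
    ObjectLockRetainUntilDate=datetime.now(timezone.utc) + timedelta(days=90),
)

# Any attempt to delete this locked version before the hold expires is rejected
# with an access-denied error, which is the behaviour described above.
```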

Securing availability

Even if your data is safe from corruption, it won't mean much if you can't access it when you need it because the data centre is offline due to power outages, internet failures on their end, or a misconfiguration. If a data centre guarantees 99.9 per cent uptime, that means it will be offline 0.1 per cent of the time, or roughly nine hours per year. Alongside these hours of downtime, there's also the small risk of a natural disaster of some sort destroying your data at the data centre.

Generally, then, it makes sense to have your data stored at multiple data centres to increase the availability of that data for your team. Let's assume you store two identical copies of your data at two data centres that each offer 99.9 per cent uptime, and that their outages are independent of one another: you've now effectively increased your uptime to 99.9999 per cent, and decreased your downtime by a factor of a thousand - from roughly nine hours to about 32 seconds per year.
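The same kind of arithmetic as for durability applies here; a minimal sketch, assuming the two sites' outages really are independent:

```python
# Sketch of the availability arithmetic above, assuming the data centres'
# outages are independent of one another.

SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000

def downtime_per_year(uptime: float, sites: int = 1) -> float:
    """Expected seconds per year during which every site is offline at once."""
    return ((1 - uptime) ** sites) * SECONDS_PER_YEAR

print(downtime_per_year(0.999, sites=1) / 3600)  # ~8.76 hours per year for one site
print(downtime_per_year(0.999, sites=2))         # ~31.5 seconds per year across two sites
```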

The advice for availability corresponds closely with the advice for durability - redundant copies radically reduce risk. When it comes to measuring the durability of your cloud offering, gold standards like eleven nines will necessarily correspond with keeping several copies of your data and performing frequent integrity checks. However, you should consider going the extra mile and setting your data to be immutable, to cut down the far greater risk of human error. If you do your due diligence and ensure your vendors offer these features, you should find cloud storage gives your organisation marked peace of mind when it comes to your data.

David Friend is CEO and co-founder of Wasabi.