System boosts storage efficiency
Sepaton plans to add data deduplication features to its virtual tape library products later this year.
Virtual tape library specialist Sepaton opened a UK sales office last month. In an exclusive interview with IT Week, Christo Conidaris, Sepaton sales director for Europe, Middle East and Africa, said the firm will release a data deduplication add-on later this year that will prevent its system from storing more than one copy of any file. Where two files differ only slightly, the system will store the original file plus only the differences needed to recreate the second version, he added.
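The "original plus differences" idea can be illustrated with a short Python sketch. It is not Sepaton's implementation, which the article does not describe; the names `make_delta`, `apply_delta` and `literal_bytes` are hypothetical, and the standard library's `difflib.SequenceMatcher` stands in for whatever differencing method a real product would use.

```python
from difflib import SequenceMatcher
from typing import List, Tuple

# A delta step is either ("copy", start, end), meaning "reuse original[start:end]",
# or ("data", payload), meaning "insert these new bytes".
DeltaOp = Tuple


def make_delta(original: bytes, modified: bytes) -> List[DeltaOp]:
    """Describe `modified` as copies from `original` plus the new bytes only."""
    matcher = SequenceMatcher(None, original, modified, autojunk=False)
    delta: List[DeltaOp] = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            delta.append(("copy", i1, i2))
        elif j2 > j1:
            # "replace" or "insert": keep only the bytes that are new.
            delta.append(("data", modified[j1:j2]))
        # "delete" needs no entry: the removed bytes are simply never copied.
    return delta


def apply_delta(original: bytes, delta: List[DeltaOp]) -> bytes:
    """Rebuild the second version from the stored original and its delta."""
    out = bytearray()
    for op in delta:
        if op[0] == "copy":
            out += original[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)


def literal_bytes(delta: List[DeltaOp]) -> int:
    """Count the new bytes that must actually be stored for the second version."""
    return sum(len(op[1]) for op in delta if op[0] == "data")


# Two nearly identical versions of the same (made-up) file.
v1 = b"Quarterly report, draft 1: revenue rose 4 percent."
v2 = b"Quarterly report, draft 2: revenue rose 5 percent."
delta = make_delta(v1, v2)
assert apply_delta(v1, delta) == v2
```

In this sketch only the `("data", ...)` payloads add to the stored volume; the copy instructions refer back to bytes the system already holds.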
This type of feature is popular because it reduces the amount of storage needed, and it is common in backup products. However, Sepaton's approach is unique in comparing files byte by byte to see whether they are the same. Other systems use hashing algorithms to decide whether two files are identical, but Conidaris said hash-based systems might be unacceptable to some customers. This is because researchers last year discovered fundamental flaws in some hashing algorithms, raising doubts about whether hashing is sufficiently reliable.
Researchers showed that a simple document stored in Adobe’s PDF format could be modified by a hacker so that a hash algorithm would be unable to distinguish it from another specific document. Although the chances of such a situation arising by accident are calculated to be extremely small, a hacker could easily get such a pair of documents into a company’s repository, for example where a firm’s IT policy mandates that all incoming email be stored for several years. Firms operating in heavily regulated industries might therefore feel that relying on hashing algorithms for data backup applications is not best practice.
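The difference between the two comparison strategies, and the risk a collision poses, can be shown with a small Python sketch. Everything here is an assumption made for illustration: `toy_digest` is a deliberately weak checksum standing in for a flawed hash (it is not one of the algorithms the researchers attacked), and `HashOnlyStore` and `ByteVerifiedStore` are hypothetical classes, not any vendor's design.

```python
from typing import Dict, List


def toy_digest(data: bytes) -> int:
    """Deliberately weak stand-in for a flawed hash: byte sum modulo 256."""
    return sum(data) % 256


class HashOnlyStore:
    """Treats two blobs as identical whenever their digests match."""

    def __init__(self) -> None:
        self._blobs: Dict[int, bytes] = {}

    def put(self, data: bytes) -> int:
        key = toy_digest(data)
        # If the digest is already present, the new blob is assumed to be a
        # duplicate and is silently discarded.
        self._blobs.setdefault(key, data)
        return key

    def get(self, key: int) -> bytes:
        return self._blobs[key]


class ByteVerifiedStore:
    """Treats two blobs as identical only if every byte matches."""

    def __init__(self) -> None:
        self._blobs: List[bytes] = []

    def put(self, data: bytes) -> int:
        # Linear scan keeps the sketch short; it is not meant to be efficient.
        for key, existing in enumerate(self._blobs):
            if existing == data:  # bytes equality is a byte-by-byte comparison
                return key
        self._blobs.append(data)
        return len(self._blobs) - 1

    def get(self, key: int) -> bytes:
        return self._blobs[key]


# Two different documents built to share a digest (same bytes, reordered).
genuine = b"pay 10 pounds"
forged = b"pay 01 pounds"

hash_store = HashOnlyStore()
hash_store.put(genuine)
forged_key = hash_store.put(forged)

byte_store = ByteVerifiedStore()
byte_store.put(genuine)
forged_index = byte_store.put(forged)
```

Here `hash_store.get(forged_key)` hands back the genuine document, because the forged one was dropped as a supposed duplicate, while `byte_store.get(forged_index)` returns the forged document intact. A real hash makes such pairs far harder to construct than this toy checksum does, but the failure mode is the same once a collision exists.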
Sepaton’s product is also interesting because its grid-based architecture enables performance to be scaled up from the entry level of 300MB/s to 9.6GB/s using a grid of 32 controllers. Prices start at about £25,000 for a 4.8TB capacity system, which is expandable to a maximum 5PB capacity with a range of throughput options.
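The quoted figures can be checked with a few lines of arithmetic. The sketch assumes that throughput scales linearly with the number of controllers and that the 300MB/s entry level corresponds to a single controller; the article does not state either point, although its two endpoints are consistent with both. The function names are hypothetical, and 1GB is taken as 1,000MB.

```python
ENTRY_THROUGHPUT_MB_S = 300   # entry-level throughput quoted in the article
MAX_CONTROLLERS = 32          # largest grid quoted in the article
ENTRY_PRICE_GBP = 25_000      # approximate starting price
ENTRY_CAPACITY_TB = 4.8       # capacity of the starting configuration


def grid_throughput_gb_s(controllers: int,
                         per_controller_mb_s: float = ENTRY_THROUGHPUT_MB_S) -> float:
    """Aggregate throughput in GB/s, assuming linear scaling (an assumption)."""
    return controllers * per_controller_mb_s / 1000


def price_per_tb(price_gbp: float, capacity_tb: float) -> float:
    """Price per terabyte in pounds for a given configuration."""
    return price_gbp / capacity_tb


max_throughput = grid_throughput_gb_s(MAX_CONTROLLERS)           # 32 x 300MB/s
entry_cost_per_tb = price_per_tb(ENTRY_PRICE_GBP, ENTRY_CAPACITY_TB)
```

Under these assumptions 32 controllers at 300MB/s each give the 9.6GB/s maximum quoted above, and the entry configuration works out at roughly £5,200 per terabyte.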