Flash-based solid-state drives (SSDs) are making headway in the datacentre, especially for read-intensive applications requiring high performance, but will continue to take second place for write-intensive applications, including many "big data" projects.
That is the opinion of Scott Dietzen, the Silicon Valley veteran who has held senior technology roles at BEA Systems and Yahoo among others, and has also been involved in start-ups Zimbra, WebLogic and Transarc at various stages of their development.
Now CEO of Flash storage array vendor Pure Storage, Dietzen claims that de-duplication and compression techniques can make Flash comparable in terms of price to mechanical disks for storage – but with performance that can be between three times and 100 times faster, depending upon the application.
Furthermore, Flash offers particular economies of scale for service providers in a shared-infrastructure environment.
"Because the de-duplication and compression algorithms only work on Flash – you couldn't deploy them on mechanical disk – you can actually get Flash cheaper in a shared form factor than you can deploying it in a direct-attached storage model," said Dietzen.
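To make the idea concrete, the following is a minimal sketch of block-level de-duplication combined with compression. It is not Pure Storage's algorithm; the class name `DedupStore`, the 4KB block size and the choice of SHA-256 and zlib are all assumptions made for illustration. Identical blocks are stored once, so many similar workloads sharing one array consume far less physical capacity than their logical size suggests. Reading a file back means fetching blocks from wherever they happen to be stored, a scattered access pattern that Flash serves without a seek penalty.

```python
import hashlib
import zlib


class DedupStore:
    """Toy content-addressed block store (hypothetical, for illustration only)."""

    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.blocks = {}        # SHA-256 digest -> compressed block bytes
        self.files = {}         # file name -> ordered list of block digests
        self.logical_bytes = 0  # bytes written by clients, before data reduction

    def write(self, name, data):
        digests = []
        for offset in range(0, len(data), self.block_size):
            block = data[offset:offset + self.block_size]
            digest = hashlib.sha256(block).digest()
            # De-duplication: a block already seen is not stored again.
            if digest not in self.blocks:
                # Compression: each unique block is stored compressed.
                self.blocks[digest] = zlib.compress(block)
            digests.append(digest)
        self.files[name] = digests
        self.logical_bytes += len(data)

    def read(self, name):
        # Reassemble the file by looking up each block by its digest.
        return b"".join(zlib.decompress(self.blocks[d]) for d in self.files[name])

    def physical_bytes(self):
        return sum(len(b) for b in self.blocks.values())

    def reduction_ratio(self):
        return self.logical_bytes / self.physical_bytes()
```

In this sketch, writing the same virtual machine image under two names leaves `physical_bytes()` unchanged after the second write, which is the effect behind the shared-infrastructure argument: the more tenants with overlapping data, the higher the reduction ratio.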
However, in the longer term, software vendors and organisations will need to rewrite many applications to take full advantage of Flash, because most applications today are written with the shortcomings of mechanical disk in mind, he added.
"There are deep assumptions inside file systems and databases in order to compress the data and sequentially lay it out so that it's easy to read when they are running on top of disks," said Dietzen.
"Deploying the same applications on Flash ultimately frees you to shed a lot of that complexity – things like a database log are really just designed to keep as much of the input-output of the database as continuous as possible, so that you can write sequentially."
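The log technique Dietzen refers to can be sketched as an append-only file: every update is written at the end of the file, so the disk head never has to move between writes, and an in-memory index records where the latest value for each key lives. The class `AppendOnlyLog` and its methods are hypothetical names for this illustration and do not reflect any particular database's implementation.

```python
import os
import tempfile


class AppendOnlyLog:
    """Toy key-value store that only ever appends to its file (illustrative)."""

    def __init__(self, path):
        self.path = path
        self.index = {}                # key -> (offset, length) of the latest value
        self.file = open(path, "ab+")  # append mode: every write lands at the end

    def put(self, key, value):
        self.file.seek(0, os.SEEK_END)
        offset = self.file.tell()
        self.file.write(value)         # sequential write, never an in-place update
        self.file.flush()
        self.index[key] = (offset, len(value))

    def get(self, key):
        offset, length = self.index[key]
        self.file.seek(offset)         # reads may jump anywhere in the file
        return self.file.read(length)

    def close(self):
        self.file.close()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "example.log")
    log = AppendOnlyLog(path)
    log.put("user:1", b"alice")
    log.put("user:1", b"alicia")       # the old value stays in the file as garbage
    print(log.get("user:1"))
    log.close()
```

The cost of this design is the bookkeeping around it: stale values accumulate and must be cleaned up, and reads still jump around the file. That is the kind of complexity Dietzen suggests becomes less necessary when random writes are no longer expensive.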
Big data applications such as Hadoop are deliberately designed to access data sequentially, effectively streaming data from disk and reducing the performance advantage of Flash over mechanical disk to just a factor of two or three – and hence undermining the economic case for Flash for such applications.
However, he added, because there is no penalty for random access, Flash is ideal for applications that require fast retrieval of randomly accessed data. Over time, applications will reflect and play to the strengths of Flash storage, he believes.
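A back-of-the-envelope model shows why the access pattern matters so much. The function names and every timing figure below are placeholder assumptions chosen for illustration, not measurements of any product: a mechanical disk is charged a seek for each random request but only one seek for a sequential run, while Flash is charged the same cost per request either way.

```python
def disk_time_ms(requests, sequential, seek_ms=8.0, transfer_ms=0.05):
    """Mechanical disk: a random request pays a seek; a sequential run pays one seek in total."""
    seeks = 1 if sequential else requests
    return seeks * seek_ms + requests * transfer_ms


def flash_time_ms(requests, sequential, per_request_ms=0.02):
    """Flash: same assumed cost per request whether or not requests are adjacent."""
    # `sequential` is accepted only to mirror disk_time_ms; it is deliberately ignored.
    return requests * per_request_ms


def speedup(requests, sequential):
    """How many times faster Flash is than disk under this toy model."""
    return disk_time_ms(requests, sequential) / flash_time_ms(requests, sequential)


if __name__ == "__main__":
    print("sequential:", speedup(10_000, sequential=True))
    print("random:    ", speedup(10_000, sequential=False))
```

Under these assumed figures, the sequential case yields only a small multiple in favour of Flash, while the random case yields a gap of orders of magnitude. The exact values depend entirely on the placeholder parameters; the shape of the result is the point.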
"You literally design your data structures differently. You don't have to care about things like alignments and block size. You don't have to care about input-output scheduling any more because it's not about trying to make the disk access as sequential as possible. It really doesn't matter," said Dietzen.
Storing data on shared Flash storage would achieve a 10-times improvement in performance compared to mechanical disk storage, while rewriting applications to take advantage of Flash would yield a further 10-times performance improvement, claimed Dietzen.
Critics have claimed that Flash storage is already running out of steam, technologically. However, Dietzen said that Flash has a long enough pipeline of technology improvements to make a shift to Flash – where appropriate – worthwhile.
"We are on the cusp of moving from MLC [multi-level cell] to TLC [triple-level cell], which will add an additional 'bit' per Flash cell. The lithography of Flash is continuing to shrink and the next generation is moving to 3D Nand. That means 3D switches of lithography so we will fill the gates pointed down. That should enable us at least a couple more generations of evolution before we start to hit the limits of what can be achieved," said Dietzen.
TLC is currently dominant in consumer Flash storage devices, whose prices have been pushed down drastically by the popularity of Apple iPods, iPhones and iPads, as well as Android devices. 3D Nand, in particular, will enable a huge leap in terms of storage capacity on Flash devices.