Almost every measure of computer performance increases exponentially. Here is one important exception: disk drives keep getting bigger, but they are not getting much faster. As a result, the number of seeks-per-second available for each gigabyte of data (seeks/second/GB) is plummeting.
Let me give a concrete example. Fifteen years ago, a typical one-gigabyte disk drive had a seek time of ten milliseconds or so. Do the math, and that’s 100 seeks per second for that one gigabyte. Today, one-terabyte drives have maybe a five-millisecond seek time. Do the math, and that’s 200 seeks per second for the whole terabyte, or just 0.2 seeks/second per gigabyte. Over the past fifteen years, your ability to access the data you own has gone down by a factor of 500.
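The arithmetic above can be checked with a few lines of Python. This is only a sketch of the back-of-the-envelope math: it counts seek time alone (no rotational latency or transfer time), assumes 1 TB = 1000 GB, and the function name is made up for illustration.

```python
def seeks_per_second_per_gb(seek_time_ms, capacity_gb):
    """Seeks available per second for each gigabyte stored on one drive."""
    seeks_per_second = 1000.0 / seek_time_ms  # 1000 ms in a second
    return seeks_per_second / capacity_gb

# Drive figures quoted in the article.
old_density = seeks_per_second_per_gb(seek_time_ms=10.0, capacity_gb=1.0)
new_density = seeks_per_second_per_gb(seek_time_ms=5.0, capacity_gb=1000.0)  # assumes 1 TB = 1000 GB

decline = old_density / new_density

print(f"15 years ago: {old_density:.1f} seeks/second/GB")
print(f"today:        {new_density:.1f} seeks/second/GB")
print(f"decline:      {decline:.0f}x")
```

Note that the drive itself got twice as fast; the factor of 500 comes entirely from dividing by a capacity that grew a thousandfold.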
That’s bad enough, but if you consider how many CPU instructions you can execute while you wait for the disk drive, things are even worse. Fifteen years ago, the Intel 486DX could do 54 MIPS (million instructions per second). Today Intel’s QX9770 can do 59,455 MIPS. For every millisecond you wait, today’s chips execute roughly 1,000 times more instructions.
Consider these facts together: From a human perspective, seeks/second/GB has gone down by a factor of five hundred, but from the CPU’s perspective, it is five hundred thousand times slower! Remember those old mainframe computers with tape drives on the front, jerking back and forth? To the CPU, tape drives back then seemed faster than disk drives today.
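Here is the same reasoning as a small Python sketch, combining the drop in seeks/second/GB with the growth in CPU speed. It uses the MIPS and drive figures quoted in the article, assumes 1 TB = 1000 GB, and the helper names are made up for illustration. With the unrounded MIPS numbers, the CPU factor comes out a little above the round 1,000 used in the text, so the combined factor lands somewhat above 500,000.

```python
# Figures quoted in the article.
OLD_MIPS = 54          # Intel 486DX
NEW_MIPS = 59_455      # Intel QX9770
OLD_SEEK_MS, OLD_CAPACITY_GB = 10.0, 1.0
NEW_SEEK_MS, NEW_CAPACITY_GB = 5.0, 1000.0  # assumes 1 TB = 1000 GB


def seeks_per_second_per_gb(seek_time_ms, capacity_gb):
    """Seeks available per second for each gigabyte on one drive."""
    return (1000.0 / seek_time_ms) / capacity_gb


def instructions_per_ms(mips):
    """Instructions a CPU rated at `mips` executes in one millisecond."""
    return mips * 1_000_000 / 1000


# Human perspective: how far seeks/second/GB has fallen.
human_factor = (
    seeks_per_second_per_gb(OLD_SEEK_MS, OLD_CAPACITY_GB)
    / seeks_per_second_per_gb(NEW_SEEK_MS, NEW_CAPACITY_GB)
)

# CPU perspective: how many more instructions fit into each millisecond of waiting.
cpu_factor = instructions_per_ms(NEW_MIPS) / instructions_per_ms(OLD_MIPS)

combined_factor = human_factor * cpu_factor

print(f"human perspective: {human_factor:,.0f}x")
print(f"CPU speedup:       {cpu_factor:,.0f}x")
print(f"CPU perspective:   {combined_factor:,.0f}x")
```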
Disks are the new tape (worse than tape used to be), so we obviously need to find a new disk. That’s where flash comes in: ten times more expensive than disk, but for random access, a hundred times faster. For now, flash is mostly used in small portable devices, or for very high performance, but if you look forward five or ten years, I predict that people will think of flash the way they think of disks today: as the fast-access storage for everyday use. And they’ll think of disks as being more like tape: a slow-access medium that’s useful for stuff you might want to look at someday, but not particularly useful for data you use all the time.
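One way to read the price and speed figures above is as a cost per unit of random-access performance. The sketch below makes two assumptions that the text does not spell out: that "ten times more expensive" means price per gigabyte, and that "a hundred times faster" means random accesses per second for the same capacity. The function name is made up for illustration.

```python
FLASH_PRICE_FACTOR = 10.0   # flash price relative to disk (assumed to be per GB)
FLASH_SPEED_FACTOR = 100.0  # flash random-access rate relative to disk (assumed, same capacity)


def relative_cost_per_random_access(price_factor, speed_factor):
    """Cost of one unit of random-access throughput, relative to disk."""
    return price_factor / speed_factor


flash_vs_disk = relative_cost_per_random_access(FLASH_PRICE_FACTOR, FLASH_SPEED_FACTOR)

print(f"flash cost per random access, relative to disk: {flash_vs_disk:.2f}")
```

Under those assumptions, flash is about ten times cheaper than disk when what you are buying is seeks rather than gigabytes, which is why it makes sense for data you use all the time.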
Flash is too expensive to replace disk right away, so first we’ll see a new generation of storage systems that combine the two: flash for performance and disk for capacity. I hesitate to compare this with the old HSM (hierarchical storage management) solutions that combined disk with tape, because those were so sucky—things that would take seconds on disk could sometimes take hours. They only worked for a limited subset of applications, and only with intricate and painful management. Fortunately, the performance ratio between flash and disk is much better than for disk and tape, so we will be able to automate the management and still get good performance. HSM without the M, if you will.
(This article is mostly about the industry in general. For more on what NetApp is doing, check here and here.)


So at the enterprise storage level, you are talking about combining flash (SSD) with typical hard drives to help speed up I/O. That is similar to what the new Storage 7210 and 7410 systems do for read and write acceleration, with a three-level DRAM, SSD, and SATA/SAS storage solution.
Sounds good.
Posted by: Brian James | November 12, 2008 at 04:07 PM
Sorry, I left out the fact that those are from Sun.
Posted by: Brian James | November 12, 2008 at 04:07 PM
Brian: What you mentioned is the Hybrid Storage Pool, under the auspices of Sun OpenStorage, that one may ponder on
http://wikis.sun.com/display/BluePrints/Deploying+Hybrid+Storage+Pools+With+Flash+Technology+and+the+Solaris+ZFS+File+System
and on blogs.sun.com/openstorage
Posted by: Sandeep B | November 18, 2008 at 12:16 PM
"Seeks per second *per gigabyte*"? Whyever is that a sensible measure of anything? I mean, why is that better than measuring your performance while *not* hitting I/O by "instructions per second per megabyte"?
If you get a whole pile more storage capacity and you remain able to seek to an arbitrary place within it at the exact same speed, then in no sense has your speed got worse. You might be disappointed that it hasn't got better in the same way as the size has, but that's not at all the same thing.
Not that I have any quarrel with the idea that mass-storage systems ought to consist of flash backed by spinning metal; that's perfectly sensible (barring some sort of breakthrough that makes flash-on-its-own good enough, which seems unlikely any time soon).
Posted by: g | November 26, 2008 at 06:53 AM
Thanks Dave for another very interesting post.
Would be interesting to learn your point of view on the 2TB SDXC which will come (e.g.
http://i.gizmodo.com/5125341/new-sd-card-spec-supports-2tb-capacities
and
http://www.sdcard.org/developers/tech/sdxc).
Posted by: Soren Mikkelsen | February 27, 2009 at 03:12 AM
>> "Seeks per second *per gigabyte*"?
>> Whyever is that a sensible measure of anything?
It's actually a very old (mainframe) concept called "access density". Assuming the application is getting the best cache performance from the database, performance is bounded by the cache miss rate, which is the disk I/O rate. If the I/O rate for a given set of data is constant, you can determine the performance based on the data set size. Going to larger-capacity drives with the same or slightly faster I/O rates means the application goes slower as you load more data on the larger drives, compared with adding more spindles of the same capacity.
http://wikibon.org/?c=wiki&m=v&title=Time_to_get_serious_about_disk_access_density
Most large scale application work today is of this workload type. Google MapReduce / Hadoop know the performance joy of more spindles.
Sorting 1PB with MapReduce
http://googleblog.blogspot.com/2008/11/sorting-1pb-with-mapreduce.html
Scaling Hadoop to 4000 nodes at Yahoo!
http://developer.yahoo.net/blogs/hadoop/2008/09/scaling_hadoop_to_4000_nodes_a.html
Posted by: peter bach | March 02, 2009 at 02:35 PM
Thanks Dave. Enjoyed the reading. One note to add. In applications that require high transactional processing, flash is not much better than 15,000 RPM HDDs. DRAM-based SSDs are the optimum choice when there are many reads and writes happening. Probably 100X faster than flash, depending on the application, block size, etc.
Posted by: Kevin Gonor | March 03, 2009 at 05:56 AM