
November 07, 2008

Comments

So at the enterprise storage level, you are talking about combining flash (SSD) with typical hard drives to speed up I/O. That is similar to how the new Storage 7210 and 7410 systems use flash for read and write acceleration in a three-level DRAM, SSD, and SATA/SAS storage solution.

Sounds good.

Sorry, I left out the fact that those are from Sun.

Brian: What you mentioned is the Hybrid Storage Pool, part of Sun OpenStorage, which you can read about at
http://wikis.sun.com/display/BluePrints/Deploying+Hybrid+Storage+Pools+With+Flash+Technology+and+the+Solaris+ZFS+File+System

and on blogs.sun.com/openstorage

"Seeks per second *per gigabyte*"? Whyever is that a sensible measure of anything? I mean, why is that better than measuring your performance while *not* hitting I/O by "instructions per second per megabyte"?

If you get a whole pile more storage capacity and you remain able to seek to an arbitrary place within it at the exact same speed, then in no sense has your speed got worse. You might be disappointed that it hasn't got better in the same way as the size has, but that's not at all the same thing.

Not that I have any quarrel with the idea that mass-storage systems ought to consist of flash backed by spinning metal; that's perfectly sensible (barring some sort of breakthrough that makes flash-on-its-own good enough, which seems unlikely any time soon).

Thanks, Dave, for another very interesting post.
It would be interesting to learn your point of view on the 2TB SDXC cards that are coming (e.g.
http://i.gizmodo.com/5125341/new-sd-card-spec-supports-2tb-capacities
and
http://www.sdcard.org/developers/tech/sdxc).

>> "Seeks per second *per gigabyte*"?
>> Whyever is that a sensible measure of anything?

It's actually a very old (mainframe) concept called "access density". Assuming the application is already getting the best cache performance it can from the database, performance is bounded by the cache miss rate, which is the disk I/O rate. If the I/O rate for a given set of data is constant, you can determine the performance from the data set size. Going to larger-capacity drives with the same or only slightly faster I/O rates means the application goes slower as you load more data onto the larger drives, compared with adding more spindles of the same capacity.

http://wikibon.org/?c=wiki&m=v&title=Time_to_get_serious_about_disk_access_density
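
To make the access density argument concrete, here is a minimal sketch. The drive capacities, the 150 IOPS per spindle, and the 4000 GB data set are illustrative assumptions, not measurements, and the function names are made up for this example. It compares the random I/O available per gigabyte on a small and a large drive with the same seek rate, then the aggregate I/O rate when the same data set is spread over many small spindles versus a few large ones.

```python
import math

def access_density(iops: float, capacity_gb: float) -> float:
    """Random I/Os per second available per gigabyte of capacity."""
    return iops / capacity_gb

def spindles_needed(dataset_gb: float, capacity_gb: float) -> int:
    """Number of drives required to hold the data set (capacity only)."""
    return math.ceil(dataset_gb / capacity_gb)

def aggregate_iops(dataset_gb: float, capacity_gb: float, iops_per_spindle: float) -> float:
    """Total random I/O rate when the data set fills the fewest drives that hold it."""
    return spindles_needed(dataset_gb, capacity_gb) * iops_per_spindle

# Assumed, illustrative figures: both drives seek at the same rate.
IOPS_PER_SPINDLE = 150.0
SMALL_GB, LARGE_GB = 250.0, 1000.0
DATASET_GB = 4000.0

small_density = access_density(IOPS_PER_SPINDLE, SMALL_GB)
large_density = access_density(IOPS_PER_SPINDLE, LARGE_GB)

small_total = aggregate_iops(DATASET_GB, SMALL_GB, IOPS_PER_SPINDLE)
large_total = aggregate_iops(DATASET_GB, LARGE_GB, IOPS_PER_SPINDLE)

print(f"access density: small={small_density:.2f} large={large_density:.2f} IOPS/GB")
print(f"aggregate IOPS for {DATASET_GB:.0f} GB: small={small_total:.0f} large={large_total:.0f}")
```

Under these assumptions, each individual drive seeks just as fast as before, but the same data set on the larger drives gets a quarter of the aggregate I/O rate, which is the sense in which the application "goes slower".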

Most large-scale application work today has this kind of workload. Google MapReduce and Hadoop know the performance joy of more spindles.

Sorting 1PB with MapReduce
http://googleblog.blogspot.com/2008/11/sorting-1pb-with-mapreduce.html


Scaling Hadoop to 4000 nodes at Yahoo!
http://developer.yahoo.net/blogs/hadoop/2008/09/scaling_hadoop_to_4000_nodes_a.html

Thanks, Dave. I enjoyed the read. One note to add: in applications that require heavy transaction processing, flash is not much better than 15,000 RPM HDDs. DRAM-based SSDs are the optimum choice when there are many reads and writes happening, probably 100X faster than flash, depending on the application, block size, etc.

© NetApp, Inc.  |  "Safe Harbor" Statement  |  Privacy Policy