September 24, 2008

Comments

This is a great write up...

But, did you really mean "just" 83TB instead of 83PB used space? (I already hear the wolves cry...)

And, I think "non duplication" is a great term to express what we can do in this area, so why doesn't it show up on the NetApp website, just like deduplication (http://www.netapp.com/us/products/platform-os/dedupe.html)? Has it been trademarked yet......?

PS - the link http://www.netapp.com/us/library/15618472.html comes back empty handed...

Aaaarrrrgghh!

Yes, 83PB not 83TB! Good spot; I'll correct and repost. Link too; worked for me when I checked it out, but I'll find another that works if you're having problems.

Non-duplication is the first technology to use, then deduplication. Why? It's free to do if you do it right. Dedupe always takes horsepower.

I was hoping that you could help me understand something. I got to this post from a link at the Register on an article on the Isilon 80% guarantee. I looked at the table above and did a bit of math. It says that the systems have 144 TB of usable space and 83 TB in use. I assume the 83 TB represents the amount of actual data. If I divide the 83 TB of used space by the 144 TB of usable space, I get 57%. Does that mean the actual NetApp utilization ratio for these systems is only 57%?

Cole; thanks for asking.

57% is the ratio of used to usable disk space. The ratio of usable to total disk space (which is what Isilon are claiming) is covered in Part 2, Space is Mind Bogglingly Big, and is 66%.

I also document the source of the data there. Utilisation rates vary by application. Database SANs tend to have low utilisation because they need a large number of spindles for performance, and disk sizes are getting larger. They can be very low; 10% or so.

File based NAS systems are far higher. Some systems in the data were near 100% of the usable.

That's the problem with Isilon using this blog as "evidence" of utilisation; it's apples to oranges, since they don't do SAN. As I noted about the data I used;

There are systems with low utilisation; particularly newer systems that are at the start of their lives. Older systems have higher percentages.

More pertinent; not shown are space saving technologies like thin provisioning and deduplication. These are technologies that can save huge amounts of space by virtualising the storage...

Because of the unique architecture of NetApp systems, some things really don't matter. SAN or NAS; who cares?

Isilon is NAS only, doesn't deduplicate data, and I'd love to see what rates they get from real systems in the field.

I'll bet Isilon's actual ratio of used space to total is very low indeed.

Alex - thanks for the explanation. I think that I am understanding what the charts are representing. Could you tell me if this is a fair way to interpret the information you provided?

Across the 7,597 systems, your customers purchased 217 PB of raw storage. Of that 217 PB of raw storage, 73 PB went to system-related overhead. The overhead is for things like spares, RAID, file system OS, aggregate reserves and other items. This results in approximately 66% of the raw storage being usable – leaving 144 PB of usable disk space to actually place data on.

Is this 66% what most people would call “utilization rate”? I believe this is consistent with utilization rates that I have heard from others regarding NetApp overhead, and that other vendors' systems tend to have lower utilization rates because they are not as efficient as NetApp's. Is that true as well?

Then if I look at your other chart, I see that the 144 PB of usable storage has 127,584 volumes on it and those systems hold about 83 PB of data. That represents the 57% ratio of used to usable disk, and about 43% or 61 PB of the usable space is unused. I think this is consistent with how you explained it earlier. I believe the point you are also making is that other vendors' systems would have lower ratios of used to usable because their systems allow for less efficient storage management when it comes to things like snapshots. Is this correct?

So, if I take the last step in this and divide the total used storage (83 PB) by the total raw storage (217 PB), I get 38%. Would this be a fair representation of the “gross utilization” of all these systems?

I have tried to study these ratios over the years and have always found it difficult to come to a common set of terms and a set of data that illustrates these concepts. I think that it is great that you are so willing to share this data. I just want to make sure that this is a fair way to define and interpret your data.
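
The arithmetic in the comment above can be checked with a short sketch. The three capacity figures are the ones quoted in this thread; the variable and function names are just illustrative, and the labels follow the commenter's own terms rather than any official definition.

```python
# Figures quoted in the comment above, in petabytes (PB).
raw_pb = 217.0     # total raw storage purchased
usable_pb = 144.0  # left after spares, RAID, reserves and other overhead
used_pb = 83.0     # actual data held on the systems


def ratio_pct(part: float, whole: float) -> float:
    """Return part / whole expressed as a percentage."""
    return 100.0 * part / whole


overhead_pb = raw_pb - usable_pb                # system-related overhead
unused_pb = usable_pb - used_pb                 # usable space not yet holding data
usable_to_raw = ratio_pct(usable_pb, raw_pb)    # the "66%" figure
used_to_usable = ratio_pct(used_pb, usable_pb)  # the "57%" figure
used_to_raw = ratio_pct(used_pb, raw_pb)        # the "38%" gross figure

print(f"overhead:       {overhead_pb:.0f} PB")
print(f"unused usable:  {unused_pb:.0f} PB")
print(f"usable / raw:   {usable_to_raw:.1f}%")
print(f"used / usable:  {used_to_usable:.1f}%")
print(f"used / raw:     {used_to_raw:.1f}%")
```

The percentages come out at roughly 66.4%, 57.6% and 38.2%, which match the 66%, 57% and 38% quoted in the thread once the decimals are dropped.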

That's a pretty good summary.

The Isilon guarantee, by the way, is not the same as the NetApp one. Isilon is guaranteeing 80% utilization of whatever they sell you. NetApp is guaranteeing that you will use 50% less storage than you use today.

That's a different animal altogether.
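
One way to see why the two guarantees are not comparable is to write each as a check. This is only a sketch of the two statements as worded in this thread, not of either vendor's actual terms: it assumes "utilization" means usable divided by raw capacity (the reading given earlier in the comments), and every capacity number in it is hypothetical.

```python
def meets_utilization_guarantee(usable_tb: float, raw_tb: float,
                                minimum: float = 0.80) -> bool:
    """Isilon-style check: is at least `minimum` of what was sold usable?"""
    return usable_tb / raw_tb >= minimum


def meets_reduction_guarantee(new_tb: float, current_tb: float,
                              reduction: float = 0.50) -> bool:
    """NetApp-style check: does the new system need at least `reduction`
    less storage than is in use today?"""
    return new_tb <= current_tb * (1.0 - reduction)


# Hypothetical capacities, for illustration only.
print(meets_utilization_guarantee(usable_tb=82.0, raw_tb=100.0))  # True
print(meets_reduction_guarantee(new_tb=60.0, current_tb=100.0))   # False
```

The first check only compares two properties of the system being sold; the second compares the new system against whatever the customer runs today, so the inputs are different in kind.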


© NetApp, Inc.  |  "Safe Harbor" Statement  |  Privacy Policy