Marketing claims recognized by independent experts
About a month ago, B&S editor Mary Jander published a blog post covering the recent Green Enterprise IT awards, which featured NetApp and seven other winning vendors. NetApp was cited for improving primary (gross) usable capacity from 40% (already above the Unix/Windows industry average of 30%) to 60% on our internal production systems, which run with all advanced data protection and replication features enabled. Note that the judges reviewed NetApp's infrastructure for this award before we released our unique FAS deduplication functionality for primary as well as secondary data sets. Our internal analysis shows that deploying FAS deduplication pushes NetApp usable capacity numbers even higher.
NetApp was also recognized by the Uptime Institute in that same study for reducing our total number of storage systems from 50 to 10, cutting our number of racks from 25.83 to 5.48, and decreasing direct power consumption by 41,184 kWh per month.
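For anyone who prefers ratios to raw counts, here is a quick back-of-the-envelope sketch of what those figures work out to. The helper name percent_reduction is mine, and the annual number assumes the monthly saving holds flat for twelve months, which the study does not state.

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage drop from `before` to `after`."""
    return (before - after) / before * 100


# Figures quoted above from the Uptime Institute recognition.
systems = percent_reduction(50, 10)       # storage systems
racks = percent_reduction(25.83, 5.48)    # racks

# Assumption: the 41,184 kWh monthly saving stays constant all year.
annual_kwh_saved = 41_184 * 12

print(f"Storage systems: {systems:.1f}% fewer")
print(f"Racks: {racks:.1f}% fewer")
print(f"Power: {annual_kwh_saved:,} kWh saved per year")
```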
It's not easy being green
However, over in the comments section of that B&S blog today, one user seemingly wasn't impressed by the resulting 60% usable capacity number, so I offered up a short explanation (comment #6). As Paul Harvey likes to say, here's the rest of the story...
Bottle rockets and Space Shuttles
While it’s tempting to apply the same basic capacity concepts we use to manage space on our desktops/laptops to enterprise storage, it’s a futile exercise in frustration and confusion akin to comparing Bottle Rockets to the NASA Space Shuttle Program. Both will let you see a lift-off if you do your homework, but with many caveats about the payload and vastly different reactions from your neighbors :-)
The formula
UCF = R(n) + I(n) + P(n) + C(n) + M(n) + D(n)
Without resorting to an elaborate LaTeX layout for this blog, that basic equation will have to do. Read on for how to decipher it...
Deceptively simple math, many possible answers
The real answer to usable capacity of enterprise storage systems is always an output function of several key input variables (R, I, P, C, M, D). Since those variables often differ from one organization to the next, the result of the usable capacity function (UCF) is correspondingly unique in many cases, which leads to the overall confusion about what seems to be a simple question.
Data protection, features and convenience vs. capacity efficiency
Some of the variables in the UCF can be directly mapped to layers of an enterprise storage array “stack”. These input variables include (but are not limited to) the following:
- (R) Disk failure protection (RAID, hot spares and disk rightsizing across multiple suppliers)
- (I) Data Integrity (Checksums)
- (P) Provisioning (Thick or Thin)
- (C) User data protection (Snapshots & Clones)
- (M) Replication (i.e. Logical mirroring inside & outside the array, locally and remotely)
- (D) Deduplication (Primary & Secondary tiers)
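To make the formula a little more concrete, here is a minimal sketch of one way to read it. Everything specific in it is an assumption on my part: the formula above never defines n or the units of each term, so the sketch treats n as raw capacity in TB and each term as the signed change in usable TB that its layer contributes (negative for overhead, positive for savings). The function name usable_capacity and every fraction below are hypothetical placeholders, not NetApp figures.

```python
from typing import Callable, Dict

# Each layer maps raw capacity n (in TB) to a signed change in usable TB.
Layer = Callable[[float], float]


def usable_capacity(raw_tb: float, layers: Dict[str, Layer]) -> float:
    """Raw capacity plus the sum of every layer's signed adjustment."""
    return raw_tb + sum(layer(raw_tb) for layer in layers.values())


# Hypothetical fractions for illustration only (NOT measured values).
example_layers: Dict[str, Layer] = {
    "R": lambda n: -0.25 * n,  # RAID, hot spares, rightsizing
    "I": lambda n: -0.10 * n,  # checksums
    "P": lambda n: 0.0,        # thin provisioning: nothing reserved up front
    "C": lambda n: -0.20 * n,  # snapshot and clone space
    "M": lambda n: 0.0,        # no mirroring inside this array
    "D": lambda n: 0.10 * n,   # space reclaimed by deduplication
}

raw = 100.0
usable = usable_capacity(raw, example_layers)
print(f"{usable:.1f} TB usable ({usable / raw:.0%} of raw)")
```

Swap in your own organization's numbers for any one layer and the answer moves, which is exactly why two shops with identical hardware can report different usable capacity.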
Atomic Perspective
Much like over-simplifying the problem, it’s also tempting to obsess about the advantages and disadvantages of just one particular layer (input variable), while ignoring the sometimes larger impact of the other related ones. Every organization needs to determine the subset of which (if not all) of these layers matter, and then treat them in an atomic manner when calculating the result of their UCF. It’s fun to get distracted by protons, electrons and nuclei (not to mention quarks, neutrinos and positrons), but in the end the atomic unit for your business requirement is all that matters (ok the puns are almost over :-)
Bottom-line, it's really Nuclear (Atomic) Science, not Rocket Science
Many storage vendors love to compete on the basis of just one of their best quarks, or throw FUD at a competitor's neutrinos. Don't fall for it - stay atomic!
Be holistic in your evaluation and review *all* configuration best-practice recommendations from your short-listed vendors beforehand. Vendors offering advanced scalable provisioning technology will also help you realize greater capacity efficiency in your environment than you would get by configuring it manually yourself, or worse yet, by renting expensive white lab-coat guys to do it for you because their storage is so radioactive they don't recommend customers actually touch it.
Postscript
The most relevant NetApp best practices for optimal capacity utilization (in this case storing enterprise apps) are available here & here. Other application examples are also available here.
Note that with the most recent versions of Data ONTAP (7.2.4 and higher), thin provisioning of snapshot space for LUNs (via FC or iSCSI) is the default recommendation. That means up-front “space reservation” or “fractional reserve” is no longer required.
I also incorrectly referenced an outdated StoreVault paper in my B&S response. The current product line is known as the NetApp "S" family (i.e. S300 and S550) and includes some cool innovations for the SME market. Users now have more control to reduce snapshot reserves on those systems from defaults of 20% for NAS and 150% for SAN/iSCSI respectively. A LUN snapshot auto-tuner is also available to reclaim unused space.
The updated NetApp StoreVault paper explaining capacity utilization on that product family in plain terms is available at this link:

Would it be possible to present all this complexity as a simple example?
Like - given a NetApp filer with 200ea. 500GB drives (100TB raw), could you show how much of that capacity is available as the maximum USABLE capacity that would be available for use by a high-performance OLTP-type database application with a 30/70% write/read ratio?
Put another way, could you really support a 60TB database on that configuration (60% utilization)?
Posted by: the storage anarchist | June 03, 2008 at 10:19 AM
Anarchist - A simple question deserves a simple answer. We can support at least 62TB usable under your given scenario, and likely 75TB usable or more, depending on how NetApp customers configure their snapshots and FlexClones.
More details here.
Posted by: Val Bercovici | June 03, 2008 at 10:44 PM
Thanks for the honest answer - but 62% utilization isn't really all that much to crow about.
For the same conditions, Symmetrix delivers just over 87 TBu out of 100TB raw - that's 40% more usable capacity than NetApp.
Plain and simple.
That would mean that NetApp Filers require more disk drives to deliver the same usable capacity as a Symmetrix. And given that disk drives are the primary consumer of power in a system, logic defines that NetApp Filers are thus less "green" than are Symmetrix DMX's...
Next you'll probably drag out some benchmark and try to convince people that you do more with less...
But the simple fact is that there's an awful lot of wasted, unusable capacity in a NetApp Filer...even BEFORE you start stressing WAFL.
Posted by: the storage anarchist | June 04, 2008 at 04:05 AM
Anarchist - until you provide independent 3rd-party validation of that 87TB number (which I highly doubt since you're proposing RAID 5, whereas your own best-practices dictate RAID 1/0) it is nothing more than a figment of your imagination.
Posted by: Val Bercovici | June 04, 2008 at 12:41 PM