
July 20, 2007

Comments

Your comments about the storage industry are very refreshing. But please have a look at the source code of this site: http://now.netapp.com. You will notice an href to http://now-devel.netapp.com/images/trans_spacer.gif.

I truly hope that the WAFL code is better!

-- Martin Mueller


--------------------------------------------------------------
Thanks for the bug report Martin! I sent it to the NOW web team and they tell me it's fixed now. They are looking into what went wrong, and how to stop it from happening again.

And yes, I hope the WAFL code is better too. As you might imagine, our test process for ONTAP releases is somewhat different than for web pages.

-- Dave Hitz

Wow, you are comparing a mistake in HTML of a web page with a rock solid file system with years of real-world usage, and powering HUGE stores of data around the world... I truly hope you were joking.

I agree. I seriously doubt that a company the size of NetApp has their WAFL software engineers doing double duty as HTML developers. But speaking of misleading benchmarks, I ran www.netapp.com through the W3C web validator http://validator.w3.org/ and found that it failed with 78 errors. I also ran www.emc.com through the same validator http://validator.w3.org/ and it too failed with 357 errors.

Following Martin's logic, NetApp's WAFL engineers are much better than EMC's.

Too funny!!!

SPEC benchmarks are very generic tests built on one predefined set of performance heat runs.

They do not reflect every customer environment, nor do they reflect how systems behave with millions of files, which is where most systems fail.

Comparing the 1ms cutoff doesn't buy any customer anything in your EMC example.

In addition, comparing a 3070 cluster with a single Titan head is also inappropriate.

-- Benchmark reader

-----------------------------------------------------------
I chose the single Titan head because I wanted to compare systems with roughly similar maximum ops. Comparing the 3070 against the 2-node Titan is also interesting. In that case, what you see is that the 3070 has 30% more ops at the 1-ms cutoff, even though the 2-node Titan has over DOUBLE the maximum ops.

I disagree that the 1ms cutoff doesn't buy any customer anything. For I/O bound applications, storage latency is the key limiter to performance. If the storage can respond in 1ms, then you get 1000 responses per second to a single thread. If the storage responds in 2ms, then you get 500. The difference between 1ms and 2ms seems small, but it cuts the performance of your application in half. Even with a 2ms cutoff, less than half of the Titan's ops are usable. For the 3070, 86% of the ops come in below the 2ms cutoff.
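
The arithmetic above can be sketched in a few lines of Python. This is only an illustration, assuming a single thread that issues one synchronous request at a time and does no other work between requests; the function name `ops_per_second_per_thread` is made up for the example.

```python
def ops_per_second_per_thread(latency_ms: float) -> float:
    """Upper bound on ops/sec for one thread that waits for each response.

    Assumption: the thread is purely I/O bound, so its rate is limited
    only by the storage response time.
    """
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return 1000.0 / latency_ms  # 1000 ms in a second


for latency_ms in (1.0, 2.0):
    rate = ops_per_second_per_thread(latency_ms)
    print(f"{latency_ms} ms -> {rate:.0f} ops/sec per thread")
```

Under that assumption, going from 1ms to 2ms halves the rate a single thread can achieve, which is the point being made.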

Of course, if you have a bazillion threads, each doing only occasional I/O, then the difference between 1ms and 2ms probably doesn't matter. I won't argue that a server that can do lots of operations slowly is never useful, just that it often isn't.
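
To put a rough number on the many-threads case, here is a small sketch. It assumes a simple closed-loop model that is not part of the original discussion: each thread waits for one I/O, then does other work for a fixed "think time" before the next one, and the storage is not saturated. The thread count and think time are hypothetical values picked for illustration.

```python
def aggregate_ops_per_second(threads: int, latency_ms: float, think_time_ms: float) -> float:
    """Total ops/sec when every thread repeats: think, then wait for one I/O.

    Assumption: latency stays constant regardless of load (no saturation).
    """
    cycle_ms = latency_ms + think_time_ms  # time for one think + I/O cycle
    return threads * 1000.0 / cycle_ms


# Hypothetical workload: 1000 threads, each pausing 1 second between I/Os.
fast = aggregate_ops_per_second(threads=1000, latency_ms=1.0, think_time_ms=1000.0)
slow = aggregate_ops_per_second(threads=1000, latency_ms=2.0, think_time_ms=1000.0)
print(f"1 ms: {fast:.1f} ops/sec, 2 ms: {slow:.1f} ops/sec")
print(f"relative slowdown: {(fast - slow) / fast:.2%}")
```

In this model the doubling of latency costs about a tenth of a percent of total throughput, while the same function with `threads=1` and `think_time_ms=0.0` shows the halving described earlier.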

By the way, I completely agree with your point that SPECsfs doesn't reflect every customer environment, or systems with millions of files. I like SPECsfs, but it certainly isn't perfect.

-- Dave Hitz

@mike jones:

Just how many WAFL engineers does EMC have? :)

As an aside, shouldn't the SPECsfs numbers take into account the price of the system too, the way the TPC benchmarks do? One can design an expensive system (say, by doing a bunch of stuff in hardware) and report huge numbers, but I would think one would want to look at price/performance as well when comparing systems.

-- Aalop Shah

--------------------------------------------------------------
Aalop,

I like the idea of including system costs. For most customers, price/performance is at least as important as overall performance. For some reason, SPEC has never managed to overcome vendor objections to including price data. I don't know the details.

Even if you got the list price, you still wouldn't have a full apples-to-apples comparison, because different companies have such different discounting policies. But I have to agree that it'd be better than nothing.

-- Dave Hitz

At least EMC stopped using RAID 0 for these kinds of benchmarks.

Still, two CX3-80s as a back end is not a real-world "customer purchasable" configuration, especially compared to a FAS3070C (4 storage processors vs. 2 storage processors).

SPEC benchmarks should also include a price/IOPS figure for the proposed configuration.

I find it interesting that you push people to look at the SPECsfs testing. Those results are not apples to apples comparisons to say the least.

Let's start out by looking at the number of spindles behind these tests, since in the end spindle speed is really the bottleneck in most tests. BlueArc was using 200 disks, NetApp was using 224, and EMC was using 300. So based on that alone I say kudos to NetApp: on the surface you are not only quicker in terms of latency, but you are also using fewer disks than EMC.

However that is just one layer below the surface of this benchmark.

The next layer to look at is load generators. It took NetApp almost 2x as many load machines to come up with those numbers.

The last layer I look at is the most telling of how to beat a benchmark. I am calling you out on it. NetApp was the ONLY vendor to run against 2 separate file systems. Both EMC and BlueArc ran a single file system and single name space.

I look forward to your response.

Benchmarking can be very misleading and you stated this in your opening:

There are three kinds of lies: lies, damned lies, and benchmarks.

You can never run a truly accurate benchmark test, even with spec-for-spec identical storage arrays and infrastructure, mainly because the result cannot be applied directly to the real world. In the real world no two customers are the same, so a benchmark can only be used as a guide.

Storage vendors, no matter who, will only use benchmark results if the results are positive for their product or technology. I'm sure EMC or Titan can release benchmark results that show their products to be superior to NetApp's.

There's only one way to run a proper benchmark: run the same-spec array from NetApp and EMC, for example, in over 100 (the more the better) different customer environments and use the statistical results as a real-world guide. Most importantly, such tests must be run independently, with no influence from any storage vendor at all.

So thank you very much for your candid report. It's very informative, but as you stated:

There are three kinds of lies: lies, damned lies, and benchmarks.

I will not use the results you provided when designing and architecting storage and infrastructure solutions for my customers.

The comments to this entry are closed.





© NetApp, Inc.