Tech Talk

April 10, 2008

Server Virtualization Trend: Just Starting or Almost Done?

Something has been bugging me about the market share numbers for server virtualization. Is the trend just getting started, or is it almost finished? The numbers I’ve seen say that under 10% of all x86 servers have been virtualized – maybe 7-8%. By that measure, the trend of converting physical servers into virtual ones seems to be quite early.

Things look very different when I look at the percentage of total servers (physical plus virtual) that are virtual. Most customers seem to run at least 8-12 virtual servers per physical, and some are pushing past 30 towards 50. Let’s use 10 as a conservative number, and do the math: For every 100 physical servers, 7 are virtualized, for a total of 70 virtual servers. Add the 93 physical servers that have not been converted, and you get 163 servers in all (70 + 93), of which almost half (about 43%) are virtual. If we are half-way converted, then the virtualization trend must be very far along, because the second half will probably convert much faster than the first half.
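Here is a quick sketch of that arithmetic in Python, in case you want to plug in your own consolidation ratio. The function name and parameters are mine, purely for illustration; the inputs are the rough figures from the paragraph above.

```python
def virtual_share(physical_total, virtualized_hosts, vms_per_host):
    """Return (virtual servers, total servers, fraction that is virtual)."""
    virtual = virtualized_hosts * vms_per_host        # servers running as VMs
    untouched = physical_total - virtualized_hosts    # physical servers not converted
    total = virtual + untouched
    return virtual, total, virtual / total


# 100 physical servers, 7 of them virtualized, 10 virtual servers on each.
virtual, total, share = virtual_share(100, 7, 10)
print(f"{virtual} virtual out of {total} total: {share:.1%}")
```

With a ratio of 30 instead of 10, the same 7 hosts would push the virtual share to roughly two thirds, which shows how sensitive this measure is to the consolidation ratio.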

It sure seems to me that looking at total servers is the right thing to do, as opposed to just counting how many physical servers are virtualized, because to a user, it shouldn’t make any difference whether their server is virtual or physical. (That’s the whole point!)

On the other hand, it doesn’t feel right that server virtualization is so far along. Most customers I talk with are just getting started. Only a few have seriously converted. My math must be busted, because there’s no way that we are half-way converted.

I think the problem is that the math assumes that there is a fixed-size pool of servers that people are converting from physical to virtual. It seems more likely that cheap and easy-to-provision virtual servers will lead to a massive increase in the total number of servers. That is always what has happened when a computing resource gets much less expensive. We didn’t just replace workstations with PCs, we gave PCs to everyone, instead of just development engineers. Likewise, the cost per gigabyte keeps dropping every year, but instead of buying fewer gigs, people keep storing more and more, and their budgets stay roughly flat.

Given the history of the computer industry, it seems unlikely that server virtualization will drive costs down in the way people think. Instead, it seems much more likely that costs will stay roughly flat, but there will be a radical proliferation in the number of virtual servers. They are just so fast, easy and cheap to deploy, it seems likely that most IT shops will hand out scads of them.

I’m not sure whether to think of this as a prediction, or a warning. I guess if you get value from all those virtual servers (just like we did from mini-computers, workstations, and then PCs), then there’s no problem. But if IT shops really want to use server virtualization to save money, then they had better be extraordinarily disciplined.

February 01, 2008

Controversy: NetApp Outperforms EMC in SAN Database Benchmark

We just released benchmark results showing that our FAS storage systems outperform EMC’s CLARiiON on SAN database workloads. For details, see Brian Pawlowski’s blog. The quick summary is that a NetApp FAS3040 beat an EMC CX3 Model 40 on SPC-1, which is an industry standard benchmark that measures OLTP (Online Transaction Processing) performance. Our system cost less, had fewer disks, and beat the EMC by 24%. With snapshots enabled on both systems, NetApp was three times faster. Here’s the chart:

[Chart: FAS vs. CX benchmark results]

Why the controversy? EMC has never posted any SPC-1 results, so we had to run the benchmark ourselves. We did follow EMC’s best practice document for the CLARiiON, and we did have the Storage Performance Council independently audit the results. But still, the fact that we ran the tests ourselves caused concern. For instance, Chuck Hollis, the VP of Technology Alliances at EMC, raised questions about the integrity of the auditor, calling him “an infamous part-time ‘administrator’ for the Storage Performance Council” and saying that “It seems to be a part time gig for him to make a few bucks.” Chuck also raised questions about why we would do this: “The only reason you'd spend the money to buy the equipment and run the tests is to put your competitor in a bad light. I think most reasonable customers would figure that part out.”

It is fair to ask why we did this, so let me share our thinking:

First, to continue improving our SAN and database credibility. Given our NAS origins in the early 1990s, many customers and analysts have been skeptical of NetApp’s SAN capabilities. Results like this show today’s reality. Steve Duplessie at the Enterprise Strategy Group commented that “Netapp appears to have legit block performance, and shouldn't be dismissed because people (like me) presume it can't be true.” Chuck Hollis is a vocal skeptic of NetApp’s ability to play in SAN and database environments, so it shouldn’t surprise him that we want to refute his claims. (See here for a brief history of our benchmarking efforts over time.)

Second, to showcase our snapshot performance. Snapshots help customers improve backups and archive old data, and writable snapshots (FlexClones) let customers completely rethink their database test and development strategy. Unfortunately, snapshots in most storage systems are unusably slow. With NetApp, performance dropped only 3% with snapshots on. With EMC, performance dropped by a factor of three. I freely admit that our goal in focusing on snapshot performance was to “put our competitor in a bad light.” I think that’s fair because EMC’s snapshots really are painfully slow in real-world use. On the other hand, we expected our non-snapshot performance to be about the same, or maybe even a bit lower, given that our system is less expensive, has fewer disks, and uses RAID-6 instead of mirroring. Winning there was a pleasant surprise.

Third, because Chuck asked us to. In his blog entry on SPC, Chuck said: “We've never done an SPC test, and probably will never do one. Anyone is free, however, to download the SPC code, lash it up to their CLARiiON, and have at it.” I don’t promise always to follow Chuck’s advice, but I think it’s important to recognize good ideas no matter where they come from!

One key takeaway from this result is that turning on a simple feature like snapshots can radically change performance. Don’t let a bad experience with EMC’s snapshots scare you away from NetApp’s.

Let me close with a final word on benchmarks. Any honest vendor will agree that benchmark results are sometimes misleading, and that you should examine the details carefully. We believe that SPC-1 effectively simulates OLTP workloads, and we used real-world configurations based on each company’s own best practices documentation. But despite our best efforts, I stand by my argument in a previous blog entry that you should admire and respect great benchmark results, but also be careful.

 

 

The following SPC-1 results have been posted at www.storageperformance.org:

NetApp FAS3040 (baseline):
http://www.storageperformance.org/results/benchmark_results_spc1#a00057


NetApp FAS3040 (with Snapshots):
http://www.storageperformance.org/results/benchmark_results_spc1#a00058


EMC CLARiiON CX3 Model 40 (baseline):
http://www.storageperformance.org/results/benchmark_results_spc1#a00059


EMC CLARiiON CX3 Model 40 (with SnapView):
http://www.storageperformance.org/results/benchmark_results_spc1#a00060


All comparisons are current as of January 29, 2008.

 

December 06, 2007

Test and Development Copy of an Oracle Database: 2400% Speed Up

Niel Armstrong is the CIO of Activision, and we gave a talk together at Oracle Open World. Niel uses pretty much the full suite of Oracle products, including the database itself, much of the middleware stack, the eBusiness Suite, and Hyperion. He also uses pretty much the full suite of NetApp products.

My favorite story was about the speed up Niel saw in making copies of his production database for test and development. Before he started working with NetApp, test and dev copies were a real pain. It took a “long week” to create a full copy. (A “long week” is when you start at close of business Friday, keep working for the whole week, and don’t finish till Sunday night the weekend after. Or maybe Monday morning.) It was so painful that they did a process re-engineering project that drove the time from roughly nine days down to four. Niel said, “We felt great about cutting the time in half. This was a big project for us, and we worked really hard at it.”

When Niel switched to NetApp storage, he started using NetApp’s copy-on-write clones (FlexClone) to make test and dev copies, and he cut the time from 4 days to 4 hours. Actually, he said it’s usually 2 hours, but he says 4 just to give himself some buffer.

This completely changed their workflow model. Niel said, “Before NetApp, we used to really fight these clones. We just didn’t want to do them. No more than one or two per large Oracle project if possible. Now we do them all the time. I’ve got over a dozen clones right now just for the UK Oracle implementation. One guy wanted a clone for a training class, and we gave it to him. We never would have done that before. It would have taken too long and used up too much storage.”

Clones are handy during normal operations, but they are especially important when you are making a major change to your environment. “Without NetApp clones, we would never have finished our upgrade in time.”

What I love most about Niel’s story is that it is almost exactly the same thing I wrote about last year after Oracle Open World (read here), except then I focused on the demo we were giving in our booth, and now I’m describing a customer who is in production and who stood on stage next to me to describe his experiences.

Let me give you a sense of Niel’s environment. Activision has game design studios all around the world. A big problem was getting reliable backups at the studios, so they implemented remote disk-to-disk copies (SnapMirror) from each studio to their disaster recovery center in Burbank, as well as replicating all of the headquarters data. A few Sundays ago, disaster recovery was “tested” by two back-to-back power failures, courtesy of SoCal Edison. They recovered in less than 4 hours, with no business interruption, which earned them attaboys from both the Chairman and the CEO.

I asked Niel whether he mostly makes clones from his DR systems, or his production systems, or what. His response: “We make clones pretty much everywhere. It’s clones all over.”

November 12, 2007

NetApp’s VTL: Best for Tape Libraries You Can Park Your Car In

We’ve been shipping our NearStore Virtual Tape Library (VTL) for about 18 months, and it’s interesting to dig into what customers actually like about it. When we first shipped, we had a good sense of the important benefits. We believed that virtual tape would help customers:

  • reduce their backup window
  • reduce their restore time
  • reduce the failure rate of restores

We were pretty close.

For instance, one of our most popular features is fast hardware-based compression, which lets the VTL run just as fast with compression as without. Without special compression hardware, you would still save space, but performance would be much lower. Saving money (by using less disk space) is always important, but meeting the backup window is so critical that compression without performance doesn’t cut it. (Stay tuned for VTL de-duplication, next year, which will further reduce storage costs.)

On the other hand, I’m surprised how important creating real tapes from the VTL has remained. Despite all the hoopla about disk-to-disk backups, eighty percent of VTL customers still rely on tape for some part of their process. One lesson here is how reluctant people are to eliminate processes in their “data protection path”. I expect that customers will be creating tapes and offsiting them for the next decade, so “work really well with my existing tape infrastructure” is a critical VTL requirement. What has changed is that customers go to tape much less often. With VTL, most customers go to tape weekly or less, instead of nightly, which lets them get much more mileage from their existing tape infrastructure.

The interaction between virtual tape and real tape is a fertile ground for innovation. The ability to support many virtual tapes at once, and then write real tapes later, means that you don’t have to schedule multiple jobs to the same physical tape drive. A problem with one thread doesn’t affect everything downstream, which makes a chaotic system much more deterministic. Staging backup data on disks also allows you to run tape drives at full speed -- avoiding the stop-starts that kill performance. This helps stretch existing tape infrastructure still further.

A final thing we’ve learned is that our VTL is so powerful that we tend to compete best in large data centers. At one point we explained to our sales force, “Your odds of winning are best if the customer has a tape library that you can park your car in.” We have introduced smaller products since then, but we need to come down even further to cover the whole market. We’re working on it.

 

October 03, 2007

Is NFS a Form of SAN? (NFS for Enterprise Apps)

I recently wrote about how both Oracle and VMware support NFS for situations that would have traditionally used block-based storage, like SAN or iSCSI.

Here is my favorite reaction to Oracle's Direct NFS Client:

This is so wrong, but I guess they're so far down the path it no longer matters.
-- Wes Felter

I disagree that it’s “wrong”, but it’s important to ask why Oracle over NFS gives some people the heebie-jeebies. Perhaps people worry that the NFS protocol might be “chatty” – somehow less efficient at transporting data across the network. It’s true that NFS has lots of fancy requests that iSCSI and Fibre Channel SAN do not, to create directories, set permissions, move files, and so on, but none of that matters when all you do is read and write blocks of data.

At the protocol level, block-reads over NFS and block-reads over iSCSI are almost identical. The main difference is that NFS asks for a certain number of bytes, starting at a given byte offset, and iSCSI asks for a certain number of blocks, starting at a given block offset. With NFS you must divide by 512 to convert bytes to blocks. So what! The other difference is that iSCSI and FC-SAN use a LUN to identify the container holding the blocks, and NFS uses a file handle. I’m over-simplifying, but you get the point. At the protocol level, for block traffic, there is almost no difference between NFS and iSCSI or SAN.
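To make the divide-by-512 point concrete, here is a small Python sketch of the translation. It is only an illustration of the arithmetic, not real protocol code: the function names are made up, and it assumes 512-byte blocks and block-aligned requests.

```python
BLOCK_SIZE = 512  # bytes per block, as in the text


def bytes_to_blocks(byte_offset, byte_count, block_size=BLOCK_SIZE):
    """Turn an NFS-style (byte offset, byte count) request into an
    iSCSI-style (starting block, block count) request."""
    if byte_offset % block_size or byte_count % block_size:
        raise ValueError("request is not block-aligned")
    return byte_offset // block_size, byte_count // block_size


def blocks_to_bytes(start_block, block_count, block_size=BLOCK_SIZE):
    """The reverse mapping: multiply instead of divide."""
    return start_block * block_size, block_count * block_size


# An 8 KB read that starts 1 MB into the container.
print(bytes_to_blocks(1_048_576, 8_192))  # (2048, 16)
```

The container itself (a file handle for NFS, a LUN for iSCSI or FC-SAN) is left out here, since that part is just a different kind of identifier.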

A LUN and a file are both just containers that hold blocks of data, so get over it. It was this similarity that originally convinced me NetApp could “unify” SAN, NAS and iSCSI into a single storage appliance. (This is the key idea behind unified storage.)

There are significant differences in how the protocols connect to the host operating system. FC-SAN and iSCSI both plug in at the block layer, while NFS plugs in at the filesystem layer. Those have very different paths through the OS. Historically, bugs in the filesystem layer slowed down NFS. For instance, NFS in Solaris used to be single-threaded, so multiprocessor systems only got one CPU worth of NFS performance. Those bugs have been fixed, but in most operating systems, the block layer is still a slightly lighter-weight interface.

This is why it is so significant that both VMware and Oracle have now built NFS directly into their software as if it were a block interface. When Oracle generates NFS packets directly, it completely bypasses the OS filesystem layer, and it reduces the difference between NFS and iSCSI to the minimal protocol differences that I described above. VMware takes block level requests out of virtual machines and converts them into NFS requests, again blurring the distinction between NFS-as-NAS and NFS-as-blocks (or SAN).

Why bother converting blocks to NFS? After all, I just argued that NFS and iSCSI are almost identical for block traffic, so why not just use iSCSI if you want Ethernet, and FC-SAN if you prefer Fibre Channel?

Those extra NFS operations link to filesystem capabilities that are valuable for managing large numbers of containers. If you only had one LUN, or just a handful, it probably wouldn’t make any difference, but if you have hundreds or thousands of LUNs, it’s very convenient to give them meaningful names, group them into directories, and back them up with filesystem tools designed to handle thousands (or even millions) of separate objects.

NFS also lets you virtualize the data path. NFS provides an abstraction of the path from Application Server to Storage that customers really like and leverage. A whole industry of Fabric virtualization products is coming to market for Fibre Channel (think IBM SVC) to solve a problem already solved by NFS and TCP/IP. Path virtualization is why NFS does so well in grids. It’s the key to simple application mobility, and enables all kinds of resilience techniques.

To be clear, I’m not arguing that NFS is always the best solution for traditional block-based applications. We have many happy customers running Fibre Channel SAN and iSCSI, and we are investing aggressively in those protocols. My point is that many people simply reject NFS, mostly based on old data and misunderstandings. NFS definitely deserves consideration. Sometimes – not always – NFS is the best solution for mission critical enterprise applications.

September 27, 2007

Driving a Tesla Electric Sports Car is a Strange Form of Virtual Reality

This week, I got a chance to drive a Tesla electric roadster on the twisty mountain roads above Silicon Valley. All electric, lithium-ion, zero-to-sixty in less than four seconds – not that I stopped at 60. (Whee!)

The electric motor feels very different from an internal combustion engine. Much smoother. In my regular car, if I’m driving at 50 mph and then floor it, I get a jerk of acceleration right away, and then a second jerk an instant later, when it downshifts, and then a surge. In the Tesla, it was just continuous, smooth acceleration. This makes the car track wonderfully through curves. The jerks in a normal car sometimes threaten to shake loose the wheels if you hit the accelerator too much at the wrong time, but in the Tesla, the power is a smooth surge no matter how quickly you jam the accelerator.

In my regular car, if I suddenly let off the gas after accelerating hard, I get a reverse jerk as engine compression braking suddenly slows the vehicle. In the Tesla, no jerk at all. The acceleration stops, and then a gentle deceleration kicks in. An induction motor has no natural braking (no cylinders to compress), but it’s nice for the car to slow down when you lift off the accelerator, so they programmed in some regenerative braking. In a regular car, the amount of deceleration is a side effect of cylinder size, compression, gearing, and so on, but in the Tesla, they get to program in exactly what feels right. Turns out that people prefer just a bit of deceleration at high speeds, but at slower speeds, for street driving, people like more deceleration.

In a sense, driving a Tesla is like a strange form of virtual reality. The sensations I’m feeling are designed by a programmer who has full control of the mapping between accelerator pedal and torque. (In fact, one goal of the drive was to get feedback to the Tesla engineers on their firmware choices.)

Another example is creep, which is the way that cars with automatic transmissions creep forward slightly when you lift the brake. In a regular car, that’s a natural side effect of how the transmission works, but in a Tesla, they decided to maintain that same behavior for safety, so that you won’t get out of the car with it turned on. Lift the brake and it “reminds” you, by creeping, that it is still running.

I shouldn’t say virtual reality, because I was in a real car feeling real acceleration, but it has that same designed-by-a-programmer feeling that virtual reality does. At first I thought that designed reality might be a better description, but an internal combustion engine is certainly designed, and the acceleration and engine braking profile is something that designers take into account. I think the real difference is the level of direct control that you have with programming, as compared to the indirect influence you have when you design with physical materials.

What enables that feeling of programmed reality in the Tesla is the flat torque curve of their electric induction motor. Its torque is almost perfectly flat from 0 to 6000 rpm and then drops linearly to 50% torque at 11,000 rpm. An induction motor can generate full torque even at 0 rpm, and can go from no power to full power in milliseconds. By contrast, an internal combustion engine can’t generate any torque at 0 rpm (it stalls), and it delivers full torque only in a narrow range.
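That torque curve is simple enough to write down as a function. This is just my reading of the numbers above (flat to 6000 rpm, then a straight line down to 50% at 11,000 rpm), not anything from Tesla’s firmware. The function name is made up, and the curve beyond 11,000 rpm is not described here, so the sketch refuses to guess.

```python
def torque_fraction(rpm):
    """Fraction of peak torque available at a given motor speed."""
    if rpm < 0 or rpm > 11_000:
        raise ValueError("curve is only described for 0 to 11,000 rpm")
    if rpm <= 6_000:
        return 1.0                       # flat region: full torque
    # Linear drop from 100% at 6,000 rpm to 50% at 11,000 rpm.
    return 1.0 - 0.5 * (rpm - 6_000) / (11_000 - 6_000)


for rpm in (0, 3_000, 6_000, 8_500, 11_000):
    print(rpm, torque_fraction(rpm))
```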

In order to generate programmed reality, you need a physical medium that is sufficiently malleable, like an induction motor or an LCD display, to give you full programmable control. Perhaps the real lesson here is that – given the flexibility of programmed systems – we should have a higher expectation of usability and design elegance for programmed reality. With great power comes great responsibility.

September 13, 2007

Why Run VMware Over NAS?

At VMworld yesterday, I was surprised how excited customers are about using NFS to access VMDKs, even for virtual machines hosting Windows. (A VMDK is a VMware Virtual Disk, and it holds the boot image for its virtual machine.)

Since a VMDK is a virtual disk, I had assumed that block-based protocols like iSCSI and Fibre Channel would make more sense than NAS, so I asked several customers why they prefer NFS.

The answer is simple: Managing .vmdk files is much easier than managing LUNs. If you have 20 or 30 virtual machines, then VMFS is great for consolidating the VMDKs into a single LUN. But NAS is much easier and more scalable if you have hundreds or thousands of virtual machines.

The big advantage is that you can use all your file management tools. Group the VMDKs for Exchange servers in one folder, SQL servers in a second, virtual desktops in a third, and so on. Instead of backing up LUNs or virtual machines individually, simply back up a directory tree of VMDKs all at once. (This is much less expensive than buying a backup license for each virtual machine, and also easier to manage.) For disaster recovery, you can replicate the data for a whole group of virtual machines as a single unit.

Some people are surprised that you can use NFS for Windows virtual machines, since Windows can’t boot from NFS. This works because VMware has built NFS into ESX’s disk virtualization layer. ESX handles the NFS protocol so that the operating system doesn’t have to. This is very similar to Oracle’s recent project to build NFS support directly into their database.

I rarely find excuses to praise Chuck Hollis from EMC, but in a recent blog he said:

I think that, in the long term, we'll find high-end NAS much more friendly for high-end VMotion / DRS farms than today's SANs. And I think that NAS has the potential to offer a few benefits that we might not find in the SAN world.

Brilliant!

VMware’s Founder Helped To Inspire WAFL

At VMworld yesterday, I got to meet with Mendel Rosenblum, one of VMware’s founders. I want to share the story of how he helped inspire WAFL.

In the early days of NetApp, when we first started developing our WAFL file system, we drew inspiration from three main file systems: FFS, Episode and LFS:

The Berkeley Fast File System (FFS) was written by Kirk McKusick. I had worked on FFS at two prior companies (MIPS and Auspex), so I was very familiar with it.

The Episode File System was developed by Transarc, which spun out of the Andrew File System (AFS) project at Carnegie Mellon. One of the architects of Episode was Mike Kazar, who joined NetApp when we acquired Spinnaker.

The Log-structured File System (LFS) was developed as part of John Ousterhout’s Sprite operating system project at Berkeley.

The graduate student who actually designed and implemented LFS was Mendel Rosenblum. It took me quite a few years to figure out that this guy whose work I admired 15 years ago was the same guy who started VMware. Imagine my surprise!

Given that a VMware founder helped inspire WAFL, it seems there’s a sort of poetic justice that so many VMware customers use it for their data.

August 21, 2007

Oracle Optimizes Its Database for NFS

NFS has become critical to data center grid environments. As a result, Oracle has optimized its code specifically for NFS. Instead of relying on the operating system, Oracle’s Direct NFS Client generates NFS requests directly from the database.

Direct NFS was inspired by experience at Oracle’s Austin Data Center. Oracle uses NFS to run its applications on tens of thousands of Linux servers accessing many petabytes of NetApp storage. In 2005 they had 12,000 Linux servers and 3 petabytes of NetApp storage. Today’s numbers aren’t public, but they are much larger.

When an operating system capability becomes sufficiently important, Oracle pulls it into the database. Memory management became critical, so Oracle said, “Just give me the raw pages, and I’ll manage them myself.” Disk caching became critical, and Oracle said, “Just give me the raw disk blocks, and I’ll cache them myself.” Now NFS has become critical, so Oracle says, “Just give me a raw TCP/IP socket, and I’ll generate NFS requests myself.” 

Steve Kleiman has argued that as Oracle becomes more sophisticated, the operating system becomes little more than a device driver framework that gives the database raw access to the hardware. That sheds new light on Oracle’s Unbreakable Linux program.

What exactly does Oracle gain from Direct NFS? The primary benefits are simplicity and performance. 

It’s simpler because you don’t have to worry about how to configure NFS. What timeouts should you use? What caching options? It doesn’t matter. Oracle looks at how you have NFS configured to figure out where the data lives, but aside from that, your settings don’t matter. Oracle takes control.

It even works with Windows. Just mount the data that Oracle needs using a CIFS share, and Oracle figures out the location of the data and accesses it via NFS. (CIFS is great for home directory sharing, but it isn’t designed for database workloads.) 

Performance is better because Oracle bypasses the operating system and generates exactly the requests it needs. Data is cached just once, in user space, which saves memory – no second copy in kernel space. Oracle also improves performance by load balancing across multiple network interfaces, if they are available.

For more technical details on Direct NFS, check out this article by Kevin Closson. He works for PolyServe, which is a NetApp competitor, but technically speaking, he talks good sense. I also recommend this article, by NetApp’s John Elliott, comparing Oracle performance over Fibre Channel, NFS and iSCSI. 

NetApp has been closely involved in Direct NFS from the very beginning. Peter Schay came up with the idea while he worked for Oracle’s “Linux Program Office”. He wanted to simplify things for Oracle customers running on Linux, many of whom were hosted on Oracle’s On-Demand environment at the Austin Data Center. He worked closely with NetApp engineers to prototype and test the idea. The Oracle ST team used his functional specification to develop the production version of Direct NFS now shipping in 11g. (Today Peter works for NetApp.)

I love how NFS has evolved over the past couple of decades. Twenty years ago, it provided file sharing to small engineering workgroups; today it provides the data backbone for some of the world’s largest data centers. What is it about NFS that has allowed it to make this transition? What is it about NFS that led Oracle to build it directly into their database? That’s the topic for another post!

July 27, 2007

Green Data Centers and Solar Villages (Change is Most Likely When Heart and Wallet Align)

The Solar Electric Light Fund (SELF) provides solar lights to poor villages in developing countries. The trouble with solar is that it doesn’t work at night. (D’oh!). Off-grid solar is impractical for most first world houses because it takes expensive batteries to run 100-watt light bulbs and big TVs. When the competition is a dim and smoky kerosene lamp, small/cheap batteries work just fine. The payback is surprisingly fast; villagers already pay $5-10 a month for kerosene. The unexpected result is that solar power today is economically feasible for poor rural villages, but not for first world homes. Just as some developing countries have gone straight to cell phones, skipping landlines, rural villages may skip the power-grid and go straight to solar.

I must be a capitalist at heart: I love that people often want to do the right thing, but I believe that large-scale change is much more likely when supported by good economics. SELF’s approach is so powerful because using solar instead of oil feels like the right thing, but they have improved the odds of success by focusing where there is a positive return on investment (ROI). Once they show the way, their approach should become a virtuous circle, spreading rapidly without more charity. SELF gets the ball rolling, gets a local industry going, and then moves on to the next country.

I believe that a similar dynamic will drive power savings in corporate data centers. In theory, corporations may want to do the right thing by running green data centers, but it’ll be the economic benefits that drive large-scale change. This has been such a hot topic lately that the EPA – under direction from Congress – is about to release a report on data center energy efficiency. In drafting the report, the EPA was interested in hearing what NetApp did to save power in our Sunnyvale data center.

Last year we did a major project to improve data center power efficiency. We increased storage capacity and performance, while achieving these results:

  • 80% reduction in power (329kW to 69kW)
  • 80% reduction in rack space (25 racks to 5.5 racks)
  • 60% improvement in storage utilization (under 40% to about 60%)
  • $1 million direct savings from reduced energy cost and PG&E rebates
  • $1.5 million additional savings expected over 18 months
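For anyone who wants to check the rounding, here is the before-and-after arithmetic for the first two bullets as a short Python sketch. The helper name is mine; the inputs are the figures in the list.

```python
def reduction(before, after):
    """Fractional reduction going from `before` to `after`."""
    return (before - after) / before


power = reduction(329, 69)    # kW
racks = reduction(25, 5.5)    # racks
print(f"power: {power:.0%}, rack space: {racks:.0%}")
```

Both come out just under the rounded 80% quoted in the list.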

How? The short answer is that we upgraded to newer more efficient hardware, and we used advanced features in Data ONTAP 7G to improve storage utilization. (For more details, see this case study, this report, and this blog.)

This was an easy project for us to justify, because it had sales and PR benefits. We were showing our customers how NetApp equipment, properly deployed, can save power. But never mind the sales benefits: the savings alone justify the project. I haven’t even mentioned savings from not having to expand our data center. We were approaching full capacity, but now we’ve got space/power/cooling to spare.

One of my frustrations with capitalism is that – on average – corporations seem much less interested in doing what’s right than individuals. (Perhaps spreadsheets and PowerPoint presentations somehow inhibit moral behavior. Topic for another blog.) But in this case, I’m confident that the right thing will happen anyway, because the economic benefits are so strong. When projects are green in the wallet sense, as well as the environmental sense, they are much more likely to get funded.

 

July 20, 2007

Lies, Damned Lies, and Benchmark Results (The Ferrari versus The School Bus)

If Mark Twain were alive today, he might revise his famous quote:

There are three kinds of lies: lies, damned lies, and benchmarks.

It’s not that benchmarks are inherently bad, any more than statistics – the subject of Twain’s original quote – are inherently bad, but for both benchmarks and statistics, you need to understand the details pretty well to discern their message. For benchmarks, understanding the interaction between latency and throughput is particularly important.

Good latency (or response time) is like a Ferrari. Never mind how many people it holds, you sure do get to your destination in a hurry.

Good throughput is like a bus. Never mind how fast it goes, you sure can take a lot of kids.

The combination of good latency and good throughput is like a jumbo jet. You don’t always have to choose between speed and capacity.

When reading benchmarks, people often jump to the “big number”, the maximum operations per second. If you care about speed, this is a big mistake! It’s like choosing a racecar based on how many kids it can hold.

I really like how the SPECsfs benchmark reports data. It runs a series of tests at increasing load levels, and measures the response time at each level. Instead of comparing the maximum ops, I focus on how many ops a system can perform at less than 1 millisecond response time. (Ten years ago, I used a 10 ms cutoff. Today that’s too slow to be useful.)

Here are a couple of typical SPECsfs results pages: one for the NetApp FAS3070A and another for the EMC Celerra NS80G. If you simply look at the maximum ops, you get one story. The EMC maxes out at 86,372, and the NetApp at 85,615 – less than 1% difference. But if you look at the peak response time, NetApp (at 2.9 ms) is almost twice as fast as EMC (5.3 ms). That’s a misleading comparison, though, because the EMC really spikes up in the last point, but it’s not too bad up till then. As I said above, I think the best metric is to compare the 1 ms cutoff: 34,277 for EMC and 42,659 for NetApp. For “fast ops”, NetApp has a 25% advantage.

Depending on how you measure, the two systems go from a tie, to EMC almost twice as slow, to NetApp doing 25% more ops. Benchmarks can make your head spin.

The difference is even more extreme if you compare against the BlueArc Titan 2200. The Titan maxes out at 98,131 ops, which is about 15% more ops than the FAS3070, but at the 1 ms cutoff, the FAS3070 does over twice as many ops. The Titan does more ops, but it does them slowly, school bus style. (If you want jumbo jet performance, check out the GX Cluster, which came in at over a million ops max, with over a third of a million at the 1 ms cutoff.)

The lesson is not that benchmarks are bad! The lesson is that to understand benchmarks, you need to understand what matters to you in your particular environment. You can learn lots from a good benchmark, but you must dig deeper than the one big number. (See also this entry.)

[NOTE: SPECsfs doesn’t report the 1 millisecond cutoff directly. I calculate it based on a linear interpolation of the point just above and just below. For apples-to-apples comparisons, I used only results for NFSv3 over TCP.]
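
For anyone who wants to repeat the calculation in the note above, here is a minimal Python sketch of the interpolation. The function name and the sample load points are made up for illustration; they are not taken from any published SPECsfs result.

```python
def ops_at_cutoff(points, cutoff_ms=1.0):
    """Estimate throughput at a response-time cutoff by linear interpolation.

    `points` is a list of (ops_per_sec, response_ms) pairs sorted by
    increasing load. Returns None if the curve never crosses the cutoff.
    """
    for (ops_lo, ms_lo), (ops_hi, ms_hi) in zip(points, points[1:]):
        # Find the pair of points just below and just above the cutoff.
        if ms_lo <= cutoff_ms <= ms_hi and ms_hi > ms_lo:
            fraction = (cutoff_ms - ms_lo) / (ms_hi - ms_lo)
            return ops_lo + fraction * (ops_hi - ops_lo)
    return None

# Hypothetical load points, NOT from a real benchmark report.
sample = [(10_000, 0.5), (20_000, 0.7), (30_000, 0.9), (40_000, 1.3), (50_000, 2.5)]
print(ops_at_cutoff(sample))  # about 32,500 ops at the 1 ms cutoff
```

In this made-up curve the "big number" is 50,000 ops, but only about 32,500 of them come in under 1 ms.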

May 23, 2007

How Data De-Duplication Fits into our Master Plan

Let me explain how our data de-duplication announcement this week fits into our long-term strategy. One blogger described our goal as making Data Domain the "next entrée on NetApp's dinner plate". Actually, de-dupe is part of a much higher-level strategy.

To summarize the announcement, we now support data de-duplication on all of our storage systems. (It takes a license.) If the same block of data is present in two different LUNs or files, the storage system spots this and saves space by keeping just one copy. For two years this functionality has been available for backups using SnapVault for NetBackup, but now people can enable de-dupe for any data on any NetApp storage system.

In some cases, like nightly backups of the same data, de-dupe can yield compression ratios as high as 50-to-1, although 10-to-1 or 20-to-1 are more common. Other cases, like user home directories, may save 40% or less. It all depends on how redundant the data is. De-dupe helps customers buy less storage, use less power, cooling, and floor space in their data centers, and – in the end – save money. (See here to understand why helping customers buy less storage is a good strategy for NetApp.)
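
To make the idea concrete, here is a toy Python sketch of block-level de-duplication. It is an illustration of the general technique, not how Data ONTAP implements it: the 4 KB block size, the SHA-256 fingerprints, and all names are my assumptions, and a real system would also compare the bytes rather than trust a fingerprint match alone.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size for this sketch

def dedupe(volumes):
    """Store each distinct block once; return (store, per-volume block maps)."""
    store = {}  # fingerprint -> block contents, kept once
    maps = {}   # volume name -> ordered list of fingerprints
    for name, data in volumes.items():
        refs = []
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            fingerprint = hashlib.sha256(block).hexdigest()
            store.setdefault(fingerprint, block)  # keep only the first copy
            refs.append(fingerprint)
        maps[name] = refs
    return store, maps

# Ten hypothetical "nightly backups" of the same 8-block data set.
dataset = b"".join(bytes([i]) * BLOCK_SIZE for i in range(8))
backups = {f"night{n}": dataset for n in range(10)}

store, maps = dedupe(backups)
logical = sum(len(refs) for refs in maps.values())
physical = len(store)
print(f"{logical} logical blocks, {physical} stored: {logical // physical}-to-1, "
      f"{100 * (1 - physical / logical):.0f}% saved")
# prints: 80 logical blocks, 8 stored: 10-to-1, 90% saved
```

The same arithmetic converts between the two ways of quoting savings: 10-to-1 is 90% saved, while 40% saved is only about 1.7-to-1.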

Buying less storage is the small picture. The big picture is that we want to help customers create a disk-based copy for all of their primary storage.

Many customers already create disk-based copies for mission critical data, to ensure business continuity in case of disaster, but we believe the trend is to create disk-based copies for everything. Tape-based backup just isn't keeping pace with improvements in disk drives. Plus, compliance and discovery for litigation are creating new requirements that tape drives could never meet.

Interesting things start to happen when you create a disk-based copy of everything. Instead of doing searches on primary storage, which could hurt performance, why not search the secondary copy? If the people running decision support systems want their own copy of a critical database, why not clone the secondary instead of paying for a whole new copy? Why not create lots of cloned copies for the test and development team preparing to upgrade to the next version of Oracle or SAP? When you create a copy of everything, and add functionality like snapshots and clones, what you end up with is a smart copy infrastructure that can completely change the way you think about data management.

This won't happen overnight. We understand that. But anything that helps people reduce the cost of creating copies helps us achieve our vision more quickly. In the short run, data de-duplication helps customers save space and save money, but what's more important is that by reducing the cost of copies, it helps us achieve our master plan.

May 17, 2007

I Was Interviewed by Frontier Journal – Hear it On-Line

I was recently interviewed by Ed Zhang at the Frontier Journal. Here is the mp3.

You can hear why venture capitalists wouldn't fund us until after we actually had paying customers. (Angel investors funded us all the way to first product ship.) You can hear about Mike Malcolm, who is the third founder of NetApp, along with James Lau and me. (Mike left in 1995, after we hired Dan Warmenhoven to replace him as CEO. Mike is a genius and is still doing interesting stuff.) I talk about my role at NetApp as "company philosopher".

They also have a collection of interviews with Steve Wozniak, Vinton Cerf, Richard Stallman, Jimmy Wales and many more, in case you don't want to listen to me.

May 11, 2007

Pop Quiz: Should You be a Late Adopter or an Early Adopter?

Technology follows a predictable adoption life-cycle. First the innovators try a new technology. They are crazy and will try anything, just for the fun of it. Then come the early adopters. They typically have a problem so hard that they must take risks to solve it. Next are the early majority, the late majority, and finally the traditionalists – the “quill pen” folks, who avoid change if at all possible.

Startups are full of people who love new technology, and they assume that customers do as well. The reality is that many companies, especially large enterprises, would rather avoid change, and for good reason. Change is disruptive. Change upsets people. Change requires new expertise. Change might fail, and it often costs more than you expect.

There are lots of good reasons to be a late adopter, so I’ve come up with two questions to help customers figure out where they should be on the adoption curve:

    Question #1: Can existing products do what I need?

If you have no problem, why change? High-tech companies always seem to have new problems. They want to simulate a bigger chip than ever before, or render more orcs on a battlefield, so they are comfortable being early adopters, and they have learned to do it well. Part of their business model is to manage the risks of early adoption.

Low-tech companies occasionally have problems that existing products can’t solve. For instance, people are considering disk-to-disk backup because tape-based backup isn’t keeping up with the growth in storage. The problem is, many companies are not used to being early adopters. It makes them uncomfortable, and they don’t have the skills to manage it well.

    Question #2: Are IT costs too high?

If existing products do what you need, and IT costs are not too high, why on earth would you change? When considering IT costs, it often makes sense to ask what percentage of total costs are IT related.

Suppose you are an oil company, and you have spent $12 billion on an oil refinery. Now you are trying to figure out whether to spend one million dollars on the IT infrastructure to support it, or ten million. Without knowing anything else about the problem, I can tell you the answer. Spend ten million; spend twenty; who cares! That is such a low percentage of the overall budget that it doesn’t matter. Just don’t ever let that refinery go down. Or chip fab. Or battleship.

On the other hand, consider Oracle’s On Demand business. Instead of buying Oracle software and running it on their own equipment, customers outsource to Oracle. Oracle buys the servers, buys the storage, and manages the software. The Austin Data Center where Oracle does this is the largest installation in the world of Dell/Linux and NetApp storage. I don’t know what percentage of their overall costs are IT related, but I know it’s huge. Existing products could solve their problems, but the cost would be exorbitant, so Oracle has been very aggressive with technologies like NFS and Linux for enterprise applications. They have led the way in making these technologies safe for others.

What’s most important is that you make a conscious decision between early and late. Sometimes early is better, and sometimes late is better, but at least you should ask the question! Use my two questions to figure out what makes sense for your business, or your particular project.

A big part of NetApp’s strategy is to enable customers to change when it’s convenient for them. Unified storage is a great example. Our storage systems support Fibre Channel SAN along with NFS and iSCSI, which means that customers can start with traditional SAN for their database applications, if that’s what makes them comfortable, but they can change to Ethernet storage whenever it makes sense. (Ethernet storage often brings significant savings.) Alternatively, they can start with NFS or iSCSI, secure in the knowledge that they can easily upgrade to Fibre Channel SAN if their requirements become more demanding.

By doing our innovation in a single, unified storage architecture, we create an environment that lets customers adopt new technology when they are ready. Early if that is appropriate, or late if that makes more sense. Either way, NetApp’s approach makes it easy, because the new technology is part of the same architecture that the customer has already installed.

In other words, NetApp enables change, but we don’t force change down our customers’ throats.

May 04, 2007

My Offensive iSCSI Blog (My Philosophy of Communication)

It seems my blog on whether iSCSI is SAN or NAS is offending people:

 

In that blog I said, “Many technical people are offended by the idea that iSCSI might be NAS.” Sure enough! Here are two technical people offended by my post. (At least I know my audience.)

In some ways, that blog was more about a philosophy of communication than about iSCSI.

When I’m communicating badly, it’s often because I don’t understand how my audience thinks. The words and ideas I’m using don’t mean the same thing to them. But if I take the time to hear and understand their worldview, then I can speak their language and communicate better. Sometimes I even change my own worldview.

That blog is the story of what happened when I took the time to understand how business people think about their storage.

When I first encountered business people who thought that iSCSI was NAS, I reacted just like Mario and Marc: I thought they were idiots and I wagged my shaming finger at them. But when I learned that their worldview is based on infrastructure, capital expenses, and organizational structure, I realized that to them, iSCSI really is more like NAS than SAN.

The idea of iSCSI as SAN is incompatible with their worldview, and if I speak that way, I will confuse them. I might accept Marc’s finger of shame if I were harming my audience, but in fact I am helping them by using words in ways that clarify and not mislead.

Does this mean that business trumps technology? That we should categorize iSCSI as NAS? No! That would be equally misleading to the technical people. iSCSI is a block-based protocol, so in terms of how you manage it, and how it interacts with applications – key parts of the technologist’s worldview – iSCSI is very much like SAN.

I consider business and technology to be equally important. Since categorizing iSCSI either way will be confusing, I refuse to categorize it at all. I simply list all three names instead of just two: SAN and NAS and iSCSI.

Plato had advice on this subject: “Why should we dispute about names when we have realities of such importance to consider?”

Sometimes I think that technical people do themselves a disservice, when talking with business people, because they focus too much on technical details that don’t matter to their audience, instead of focusing on the issues that do. Don’t waste time on irrelevant categorization; spend time on how to save money with iSCSI. In this case, audience trumps speaker.

April 27, 2007

Is Data ONTAP Based On UNIX?

A customer recently asked, “Is Data ONTAP based on UNIX?” Complicated question.

The first version of Data ONTAP borrowed lots of code from Berkeley Net/2 (one of the earliest open-source releases of UNIX), including the TCP/IP stack, system boot code, and device drivers. Since then, we’ve borrowed liberally from other open-source UNIX releases. We wrote the command line interface from scratch, but we designed it to look like UNIX, since our first market was UNIX system administrators. Clearly, ONTAP is related to UNIX.

On the other hand, ONTAP’s architecture is very different from UNIX. There is no user-space, the filesystem is completely different, the RAID and disk subsystems are completely different, and most important of all, the interaction between subsystems is very different. The key data paths from network to disk look nothing at all like UNIX.

You can imagine two completely different ways of building an appliance. You could start with UNIX and strip out, disable, or hide the pieces you don’t want. Or else you could start from scratch, inventing a new architecture optimized for the task at hand, but borrowing liberally from the UNIX code-base as appropriate. We chose the latter.

Interestingly, our advanced ONTAP GX architecture is built on top of a full UNIX release. We took Data ONTAP, including WAFL and RAID, combined it with the new code from our Spinnaker acquisition, and hosted the combined result on FreeBSD in a combination of user processes and kernel modules. For security and simplicity we have disabled and hidden many parts of FreeBSD.

Even for ONTAP GX, this isn’t quite the “start with UNIX and slim it down” approach, because most of the system lives in large kernel modules that kick UNIX out of the way. They grab control of almost all of the memory, as well as the critical device drivers, and we even re-wrote the scheduler to make sure the UNIX parts don’t get in the way. We’re still not using the normal UNIX data paths.

Why the difference? One reason is that CPUs are way more powerful now than when we started, so a little bit of extra overhead matters less. (Our first product used a 50 megahertz 486.) In addition, UNIX variants have gotten much better. When we started, there was no Linux, no FreeBSD, and AT&T and UC Berkeley were still in a legal battle over big chunks of the Berkeley Net/2 Release.

Although these two approaches feel very different to engineers developing the systems, each with its own advantages and disadvantages, there is little if any difference to the end-user.

April 19, 2007

Is iSCSI SAN or is iSCSI NAS? I Don’t Know.

Many customers wonder whether iSCSI is a type of SAN or a type of NAS. I used to know, but not any more.

The two big differences between NAS and Fibre Channel SAN are the wires and the protocols. In terms of wires, NAS runs on Ethernet, and FC-SAN runs on Fibre Channel. The protocols are also different. NAS communicates at the file level, with requests like create-file-MyHomework.doc or read-file-Budget.xls. FC-SAN communicates at the block level, with requests over the wire like read-block-thirty-four or write-block-five-thousand-and-two.

If you think the protocol is more important, then iSCSI is like SAN; if the wire is more important, then iSCSI is like NAS.

Technical people know that the protocol is more important; it determines how the compute server talks with the storage. With FC-SAN, a filesystem like UFS, VxFS or ZFS runs on the host and converts file requests into the block requests that are sent over the wire. With NAS, the host sends file requests over the wire, so a filesystem must run in the storage system. There are advantages and disadvantages to both approaches, but the point is that from an architectural perspective, iSCSI looks just like FC-SAN. The filesystem runs on the host and sends block requests over the wire. (Many technical people are offended by the idea that iSCSI might be NAS.)
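
Here is a toy Python sketch of that architectural difference. Everything in it (class names, the 4-byte blocks, the request strings) is invented for illustration; the point is only to show which kind of request crosses the wire when the filesystem runs on the host versus in the storage system.

```python
BLOCK_SIZE = 4  # tiny blocks so the example is easy to read

class BlockStorage:
    """Storage system that understands only numbered blocks (SAN-style)."""
    def __init__(self):
        self.blocks = {}
        self.wire_log = []  # requests seen "over the wire"

    def write_block(self, number, data):
        self.wire_log.append(f"write-block-{number}")
        self.blocks[number] = data

    def read_block(self, number):
        self.wire_log.append(f"read-block-{number}")
        return self.blocks[number]

class ToyFilesystem:
    """Maps file names to block numbers on top of a BlockStorage."""
    def __init__(self, storage):
        self.storage = storage
        self.file_table = {}  # file name -> list of block numbers
        self.next_free = 0

    def write_file(self, name, data):
        numbers = []
        for offset in range(0, len(data), BLOCK_SIZE):
            self.storage.write_block(self.next_free, data[offset:offset + BLOCK_SIZE])
            numbers.append(self.next_free)
            self.next_free += 1
        self.file_table[name] = numbers

    def read_file(self, name):
        return b"".join(self.storage.read_block(n) for n in self.file_table[name])

class FileStorage:
    """Storage system that understands files (NAS-style): the filesystem runs inside it."""
    def __init__(self):
        self.fs = ToyFilesystem(BlockStorage())
        self.wire_log = []

    def write_file(self, name, data):
        self.wire_log.append(f"write-file-{name}")
        self.fs.write_file(name, data)

    def read_file(self, name):
        self.wire_log.append(f"read-file-{name}")
        return self.fs.read_file(name)

# FC-SAN or iSCSI: the host runs the filesystem, so blocks cross the wire.
san = BlockStorage()
host_fs = ToyFilesystem(san)
host_fs.write_file("Budget.xls", b"12345678")
host_fs.read_file("Budget.xls")
print(san.wire_log)
# ['write-block-0', 'write-block-1', 'read-block-0', 'read-block-1']

# NAS: the host sends file requests; the filesystem lives in the storage system.
nas = FileStorage()
nas.write_file("Budget.xls", b"12345678")
nas.read_file("Budget.xls")
print(nas.wire_log)
# ['write-file-Budget.xls', 'read-file-Budget.xls']
```

Swap Fibre Channel for Ethernet under the first half and nothing in the code changes, which is why iSCSI looks like SAN from this angle.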

Business people focus on infrastructure, budgets, and org charts, so they worry about wires. Choosing NAS over SAN for Oracle, Exchange, or SAP affects the capital budget for Ethernet versus Fibre Channel, and it can even affect organizational structure. Sometimes an “Apps/Servers/Storage Group” owns Fibre Channel, while Ethernet belongs to a “Distributed Infrastructure Group”. Is the Apps group allowed to buy and manage their own Ethernet switches if they decide to run Oracle over NAS? They may argue, “We should own the switches between server and storage.” The Distributed Infrastructure group may argue, “We own all TCP/IP networking,” but the corporate network may not offer the bandwidth or quality of service required for Oracle-on-NAS. I’ve seen CIOs do reorgs over these issues. Business-wise, iSCSI looks just like NAS, so business people often assume that iSCSI is a form of NAS.

I used to know iSCSI was SAN, and I lectured business people about why – technically speaking – they were wrong. But if I accept that technical and business perspectives are equally important, then I really can’t categorize iSCSI. Plus, many people still think of NetApp as “the NAS company”. Gartner’s SAN Magic Quadrant has us tied with EMC for first place in SAN, but perception is what matters, so it’s great for NetApp if customers think iSCSI is NAS. Why argue?

Our President Tom Mendoza taught me this: “Don’t waste time in sales calls trying to convince customers that they are wrong.”

Now, when I want to talk about the whole market, I just say: SAN, NAS and iSCSI. I don’t know whether iSCSI is SAN or NAS, so I list all three. If a customer presses me, I’ll say, “From a technical perspective, iSCSI looks more like SAN, but in terms of business issues, it looks more like NAS.” I’m no longer interested in debating the issue.

EMC’s Celerra Simulator (I Eat My Words)

A while ago I bragged about NetApp’s simulator, how useful it is for customers, and how no other storage vendor has anything like it.

I received a message from Chad Sakac at EMC who told me, “We have a Celerra simulator, and it’s great for all the same reasons that your simulator is.”

As Chad and I got to chatting, I said that I thought EMC was a great competitor, and that NetApp and EMC were probably good for each other, because competing kept us on our toes. Here was his response:

    For what it’s worth – personally, certainly I don’t necessarily represent EMC in this context – I appreciate having as formidable a competitor as Network Appliance. With other competitors, we compete with them, and it’s a slow waltz, but with you folks, it’s a tango. It makes us better, win or lose with any individual customer, and in the end, it’s good for the industry and the customers.

I completely agree. I feel the same way about EMC.

Anyway, I replied that if the simulator was easily available to customers on the web, like ours is, then I’d retract my words and post a link. That shut him up for quite a while, because it wasn’t easily available anywhere, but I think my taunting gave him ammo to get things moving at EMC. He just sent another message saying that the Celerra simulator is now available on the web.

The URL is only accessible to EMC employees and partners, but Chad says that they are allowed to give the simulator to customers, so if you want it, and your EMC contact can’t find it, give them this link:

http://powerlink.emc.com/km/appmanager/km/secureDesktop?_nfpb=true&_pageLabel=query1&internalId=0b01406680224db3&_irrt=true

Helping make EMC’s customers happier with EMC equipment isn’t normally the goal of my blog, but if that’s an unintended consequence here, then so be it. To really be of service to EMC’s customers, I suppose I’m now on the hook to find simulators for DMX, CLARiiON, and Centera. :-)

March 22, 2007

Power in the Data Center: To Put a Watt In, I Must Take a Watt Out

It’s interesting talking to customers about power in the data center, because they have such wildly varying perspectives. Some say that power doesn’t really matter at all, and they wonder what the big fuss is. Others say power is the single most critical issue in their data center, and they are surprised that anybody might not agree.

The folks who say power doesn’t matter either haven’t thought about it much, or else they tell me that they’ve done the math, and the cost of keeping their spindles spinning just isn’t that high compared to the cost of buying and managing the storage.

The ones who say power is critical typically can’t put any more power into their data center. In some cases they literally can’t get more power – the power company won’t sell them any more. In other cases, they’ve hit limits on the wiring or the cooling in their data center.

A financial customer in New York explained it best: “We’re at 100% of power capacity today. For every new watt I bring in, I’ve got to figure out how to take one out.” He was very interested in upgrading to new storage systems that consume fewer watts-per-terabyte.

He was also interested in VMware, since that often drives large power savings. (See this blog on how one customer used VMware to reduce power by 450 kW/month.) In most data centers, servers consume more power than storage, so most people start there, but consolidating storage is the obvious next step.

There are many ways that storage companies can help you reduce power, but – surprisingly – more efficient hardware is low on the list. We all take about the same power to keep a spindle spinning, because we all use pretty much the same disks, power supplies, processors and so on.

On the other hand, it takes roughly the same power to run a 144 gigabyte FC drive as a 750 gigabyte ATA drive, so using the largest drives possible is a great way to save. To use ATA drives for mission critical data, you’ll want a RAID that protects against double disk failures. Any feature that improves utilization will also reduce power. Use RAID instead of mirroring. Use thin provisioning. Use clones or snapshots instead of full copies. (For details, check this paper which has point-by-point recommendations for reducing storage power consumption. This paper describes what NetApp’s own IT team did to save power.)
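
The drive-size point is easy to check with a few lines of Python. This is only a sketch: the 15 watts per drive is a made-up placeholder, and the one real assumption (from the paragraph above) is that both drive types draw roughly the same power.

```python
# Assumption: every drive draws the same power. 15 W is a made-up figure
# used only to show the ratio, not a measured value.
WATTS_PER_DRIVE = 15.0

def watts_per_terabyte(drive_gb, watts=WATTS_PER_DRIVE):
    """Power needed per terabyte of raw capacity for a given drive size."""
    return watts / (drive_gb / 1000.0)

fc = watts_per_terabyte(144)   # 144 GB FC drive
ata = watts_per_terabyte(750)  # 750 GB ATA drive
print(f"FC: {fc:.1f} W/TB, ATA: {ata:.1f} W/TB, ratio: {fc / ata:.1f}x")
# prints: FC: 104.2 W/TB, ATA: 20.0 W/TB, ratio: 5.2x
```

Whatever the true wattage is, it cancels out of the ratio: the larger drive needs about one fifth of the power per raw terabyte.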

To summarize, the biggest savings don’t come from hardware, but from software features that improve storage efficiency and storage utilization.

What I love about all of this is that self-interest actually drives customers to a greener data center. One of my frustrations with corporations is that economics often seem to trump “good citizenship”, so I love it when economics actually drive companies to do the right thing.

March 16, 2007

Analyst Day Vision Themes: "Application Integration" and "Smart Copies"

We had our annual analyst day in New York this week. That's when we bring in several hundred financial analysts and industry analysts and share our progress, vision and strategy. My focus was on our vision for the future—how we can direct innovation in a way that matters to our customers. I had two main themes.

The first theme was Application Integration. CIOs care much more about the applications that run their business than they do about their storage, so the best way to be relevant to the CIO is to provide the best possible data management environment for their apps. To put it another way: If the application is King, how can we make the King look good? You'll have a pretty good sense of what I talked about if you read blogs like Booth Duty at Oracle Open World: FlexClone is the Big Hit, Using Simple Pictures to Control Data Protection Policies, and Data Management and Automated Teller Machines.

The second theme was Smart Copies. This is a new layer of storage that customers create when they make a second copy of their data—usually as part of a disk-to-disk-to-tape backup scheme—and then use features like snapshots and cloning to get more business value from the second copy. Examples include long-term archives for compliance or clones to accelerate test and development for SAP and Oracle. Many of our customers are starting to create a "smart copy infrastructure" containing copies of almost everything in their primary storage.

Part of what makes our copies "smart" is that we make them so easy to create. For business continuance with mission critical data, create a synchronous copy that exactly replicates your primary storage. For less critical data, save money by putting the copy on inexpensive ATA drives, and by updating the copy at night when bandwidth is cheaper. Or update once an hour if you want. It's completely flexible.

We often brag that our unified architecture makes it easy for customers to choose between SAN, NAS and iSCSI, depending on what's best for the app, but I think it's equally important to offer a wide variety of data protection capabilities. With most storage arrays, the only option for replication is from one storage system to another that's just the same: like-to-like. With NetApp, it's easy to replicate from high-end SAN with 72 GB Fibre Channel drives to a much less expensive iSCSI system with 750 GB ATA drives. We even have tools (see here and here) to bring data from other vendors' primary storage into our smart copy infrastructure.

The other thing that makes our copies "smart" is features like snapshots and cloning. Backup or DR may be the reason you created the copy, but once you have the copy you can clone it to let more people access the data. A clone is a "virtual copy" that takes very little space, so it's fast and easy to create as many clones as you want. Or you can use snapshots to keep data for a long time. For compliance, make the snapshots tamperproof to ensure they can't be changed.

Clones are especially valuable for development and test in SAP and Oracle environments. Clones speed up test and dev in two ways. First, clones speed up the test cycle itself. Copying a multi-terabyte database is slow, but creating a new clone is instant. You can quickly create a clone, run a test, and check the result. If the test fails, fix the bug and try again. Second, you can afford to create lots of clones, since they don't take any extra space until you write to them. Real copies of a big database are expensive, so people have to share. With clones, it's cheap to create a copy for everyone. People are faster and more efficient when they can work in parallel.
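
A toy Python sketch shows why a clone can be instant and nearly free. This is the general copy-on-write idea, not NetApp's implementation; the class names are invented, and the sketch assumes the parent volume is not modified after the clone is made.

```python
class Volume:
    """Toy volume: a map from block number to block contents."""
    def __init__(self, blocks=None):
        self.blocks = dict(blocks or {})

    def read(self, number):
        return self.blocks[number]

    def clone(self):
        return Clone(self)

class Clone:
    """Toy clone: shares the parent's blocks and stores only its own writes."""
    def __init__(self, parent):
        self.parent = parent
        self.own_blocks = {}  # blocks written since the clone was created

    def read(self, number):
        if number in self.own_blocks:
            return self.own_blocks[number]
        return self.parent.read(number)  # unchanged blocks come from the parent

    def write(self, number, data):
        self.own_blocks[number] = data  # the parent is never touched

    def extra_space(self):
        return len(self.own_blocks)

database = Volume({n: f"row-{n}" for n in range(1000)})
test_copy = database.clone()    # instant: nothing is copied
print(test_copy.extra_space())  # 0
test_copy.write(7, "row-7-upgraded")
print(test_copy.extra_space())  # 1
print(database.read(7), test_copy.read(7))  # row-7 row-7-upgraded
```

Give every developer a clone like this and each one pays only for the blocks they change.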

In a way, you could say that application integration is all about making the first copy of data better (primary storage), and smart copies are all about making the additional copies better (secondary storage). When you put the two together, you get a very powerful model of data management.

March 08, 2007

Admire and Respect Great Benchmark Results, But Also Be Careful

I'm proud of our new midrange systems, the FAS3040 and FAS3070. Both have benchmark results that blow away the competition. (For detailed results, see this press release on the 3040 and this one on the 3070.)

From this position of strength, I believe it is an excellent time to acknowledge the downsides of benchmarks. Good benchmark results are valuable. High numbers indicate strong hardware and carefully tuned software. Increases within a single architecture (like the 3020 to the 3040) usually indicate real improvement. Still, real-world results can be different from what benchmarks predict, so customers must evaluate performance in other ways as well.

Here's an example. Years ago, NetApp and Sun did a performance bake-off at a large software development company, using their actual application. The results were fascinating. The SPECsfs benchmark result for Sun was ten times faster, but for this customer's workload, NetApp was four times faster. The benchmark was wrong by a factor of 40.

Sun sent in a team of Sales Engineers, and after a week of tuning they doubled the performance—still half as fast as NetApp. Then Sun called in the "big guns". One of their key NFS developers came in, and after another week of tuning, he matched NetApp's performance. The numbers matched, but it was a win for NetApp because we delivered the result on day-one with no tuning. The customer appreciated Sun's effort, but said, "Realistically speaking, they aren't going to send those guys out every time I install a new system, so I won't see that performance in my data center."

How could a benchmark be so wrong? SPECsfs is actually quite good, but there are two main reasons that benchmarks differ from the real world:

  1. Benchmark configurations don't always match your configuration.
  2. Benchmark workloads don't always match your application workload.

In this case, both were true. Sun had benchmarked an absolutely enormous config, which isn't what the customer got. And the customer's workload was very different from what SPECsfs measures.

Typically after any vendor announces good benchmark results, you'll see a series of he-said-she-said arguments about exactly these issues. Examples from the FAS3070 launch are here and here.

NetApp mitigates the first issue by benchmarking "realistic" configurations. We benchmark commonly-purchased hardware with normal features enabled, like Snapshots, RAID-DP, and FlexVols. Even though we test configs that many customers buy, it's not necessarily the config you will buy, so your mileage will still vary.

The fact that benchmark workloads don't match real-life workloads is harder to fix. One approach is to demand a vendor bake-off in your own environment, but few customers have the resources to simulate their full production workload. Alternatively, you can press the vendor for case studies or references from customers like you, running the same application at roughly the same scale.

It's important to follow vendor best practices for your app. We've got folks in our lab who know how to configure EMC systems to run really slow, and they have folks in their lab who know the same for NetApp. Pay no attention! Focus on results from configurations the vendor recommends. (Do check that the recommended config has the features you plan to use. Many features can hurt performance.)

Benchmarks are valuable, despite some flaws, but you must read between the lines to understand the true message. Are commonly used features enabled? Is data protection turned on? Are LUNs created in unusual ways? One trick I've seen is to create LUNs that span many disks, using just a small sliver of each one, with no RAID protection enabled. Nobody would ever configure a real-world system that way. In other words, poke at how the benchmark config differs from what you plan to buy.

In conclusion, having established my dispassionate honesty by taking the high road and acknowledging that benchmarks aren't perfect, let me summarize by saying: The FAS3040 and the FAS3070 really scream. Check them out!

March 02, 2007

Using Simple Pictures to Control Data Protection Policies

In Data Management and Automated Teller Machines, I described a vision of data management. The gist was that application administrators ought to be able to provision and manage data themselves, without bothering a storage admin, just as I can get cash from an ATM myself, without waiting for a bank teller.

ATMs are only safe because banks have policies that detect problems and determine how much cash I can withdraw at a given point in time. Likewise, our ATM vision of data management requires tools to let storage admins easily define data management policies.

Our new Protection Manager focuses on policies for data protection. A policy is a rule that describes how to protect the data. The idea is to let storage admins reflect the corporate rules, guidelines or SLAs (service level agreements) independent of specific NetApp technology. A policy can say "make copies every week and keep them for at least a year" or "retain undeletable copies for seven years." Our automation engine evaluates which technologies are available (has the customer licensed SnapVault? SnapMirror? SnapLock?) and connects the plumbing in a way that satisfies the policy's goals. Over time, the engine monitors whether the data conforms to the policy's goals. The key point is that you can tell the Protection Manager your goals and let it figure out the details.

Protection Manager lets you define policies in a graphical, intuitive way. A simple picture represents the policy. An icon on the left side represents the primary storage, and one or more icons on the right represent copies of the data. Arrows between primary and copy show the type of copy. Click the diagram to edit how and when the transfers should happen. Should a mirror update once an hour, or just at midnight? Is the backup window open all day, or only at night? How many primary copies should be retained and how many backup copies? The tool isn't just about backups and snapshots. Our plan is to also support the undeletable and unalterable copies required to comply with government regulations.

After you have defined exactly how the policy works, you can give it a name. Maybe "Gold" means an offsite mirrored copy updated throughout the day plus a year's worth of backup copies, "Bronze" means one backup a day at midnight kept for just one week, and "SEC-17A" means unalterable and undeletable copies kept for 7 years.

You can apply a policy to a single volume or LUN, but you can also apply it to a user-defined group called a dataset. If you have a large number of LUNs that all support the same application, you can group them together in a dataset and apply the policy to the dataset as a whole.

The idea is that instead of worrying about hundreds or thousands of mirroring relationships for hundreds or thousands of LUNs and volumes, you can define a handful of policies and group your data into a much smaller number of datasets, each of which gets the appropriate policy. Another benefit is that defining standard policies makes it easier to deliver storage broadly as a service within a company. Formalized policies lay the foundation for consistent execution and predictability.
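To make the policy-and-dataset idea concrete, here is a minimal Python sketch. It is not Protection Manager's API: the class names, the fields, and the LUN paths are all hypothetical, and the retention numbers simply restate the "Gold", "Bronze" and "SEC-17A" examples above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Policy:
    """Hypothetical stand-in for a named protection policy."""
    name: str
    backup_retention_days: int      # how long backup copies are kept
    offsite_mirror: bool = False    # keep a mirrored copy at another site
    immutable: bool = False         # copies cannot be altered or deleted


@dataclass
class Dataset:
    """A user-defined group of volumes/LUNs that share one policy."""
    name: str
    policy: Policy
    members: list = field(default_factory=list)


# Named policies, restating the examples in the text.
GOLD = Policy("Gold", backup_retention_days=365, offsite_mirror=True)
BRONZE = Policy("Bronze", backup_retention_days=7)
SEC_17A = Policy("SEC-17A", backup_retention_days=7 * 365, immutable=True)


def policy_for(member, datasets):
    """Return the policy that covers a volume or LUN, or None if unprotected."""
    for ds in datasets:
        if member in ds.members:
            return ds.policy
    return None


# Hypothetical LUN paths: one policy assignment covers the whole application.
billing = Dataset("billing-db", GOLD, ["/vol/bill/lun0", "/vol/bill/lun1"])
scratch = Dataset("dev-scratch", BRONZE, ["/vol/dev/lun0"])
datasets = [billing, scratch]
```

The point of the sketch is the shape, not the details: the admin touches a few policies and a few datasets, and every LUN inherits its protection from the dataset it belongs to.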

We don't yet allow application admins to set protection policies on their own, but that is the next step. Our plan is to add these features to our own application integration tools, like SnapManager for Oracle, but we understand that not everyone uses those tools, so we are also offering APIs so that these capabilities can be incorporated into frameworks like Oracle Fusion, Microsoft .NET, or SAP NetWeaver.

We haven't yet achieved the full vision—to be honest not even close—but I think we are ahead of most vendors. Others have talked about this kind of model for data management, but we have a big advantage because we have a unified architecture that spans our whole product line: primary to secondary, high-end to low-end, and SAN to NAS to iSCSI. Our storage management team can focus on cool new features instead of on how to make incompatible architectures—like DMX, Clariion, Centera and Celerra—look more or less the same.

December 21, 2006

The Pain of Tape-Based Backup—Disruptive Technology and the Red Queen

I love Clayton Christensen's book, The Innovator's Dilemma. I discussed his theory of "disruptive technology" in a blog entry about whether iSCSI is disruptive. (It isn't now but could be eventually.)

Anyway, I recently noticed an odd sort of exception to some of Christensen's rules in the backup market.

Christensen's first rule is that in most markets, user requirements go up a bit every year, but not too fast. If you buy a new car, you hope it's a bit better than your current one—maybe a bit faster, a bit better mileage, or more airbags to make it safer. But mostly you buy a new car to solve the same problems as the old car, so the requirements are about the same.

Christensen's second rule is that technology improves faster than customer requirements, especially in high tech. Ten years ago, my PC was slow and the disk was always full. Today my laptop has CPU to spare and 33 GB of free space. Christensen calls this a goodness oversupply. There's nothing wrong with a goodness oversupply except that customers won't pay for it. I don't want a faster laptop. I'd rather have it lighter and more efficient so that the batteries last longer and it doesn't light my lap on fire.

Summary: (1) User requirements go up slowly; (2) Technology improves quickly.

Together these rules set the stage for a disruption. That's when a low-end technology gets good enough to attack the higher-end: UNIX computers attack mainframes or PCs attack UNIX. I first became interested in Christensen's theory when I saw that it applied to NAS and SAN. Since then, his observations have become a key foundation for my strategic thinking.

What's so interesting about the backup market is that tape-based backup technology is not keeping pace with customer requirements. People struggle to meet backup windows they used to hit. Why are Christensen's rules failing?

The trick is that human behavior usually drives requirements, and humans just don't change that fast. But sometimes technology trends drive requirements, and then requirements rise just as fast as technology. In the case of storage, the capacity you get keeps doubling even if the corporate budget for storage remains flat. That, in turn, doubles backup requirements—twice as much data to move in the same backup window.

You could argue that human behavior must be changing in order to fill all that new capacity, but I believe technology trends are the ultimate driver. Faster computers generate more data. High-res cameras generate more data. Faster networks carry bigger e-mail attachments. Storage also has the full closet problem. My closet—however large or small—is always full. I never bought a new house because of closet space, but when I get a bigger one, it quickly fills. My behavior didn't change at all—I throw stuff away when my closet gets full—but now I have more stuff. Disks are the same. If disks stopped growing, corporate storage budgets would not double every year—not for long. So you can see, technology is what's changing, not behavior.

This is my update to Christensen's rule: When human behavior drives requirements, then technology improves faster than requirements. But when technology itself drives requirements, then you only keep even.

This reminds me of the Red Queen in Through the Looking-Glass:
"In our country," said Alice, still panting a little, "you'd generally get to somewhere else—if you run very fast for a long time, as we've been doing."

"A slow sort of country!" said the Queen. "Now, here, you see, it takes all the running you can do, to keep in the same place. If you want to get somewhere else, you must run at least twice as fast as that!"
The strange country of tape-based backup is even worse; you lose ground while running as fast as you can. That's because disks get bigger faster than tapes get faster. Switching to disk-based backup helps because now you stay even rather than losing ground, but to actually make forward progress requires serious innovation.
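Here is a rough back-of-the-envelope sketch of that race in Python. The starting numbers and growth rates are made-up assumptions chosen only to show the shape of the problem (data doubling each year, backup throughput improving more slowly). They are not measurements.

```python
def backup_hours(data_tb, throughput_tb_per_hour):
    """Hours needed to move data_tb at a steady throughput."""
    return data_tb / throughput_tb_per_hour


def project(years, data_tb, throughput, data_growth, throughput_growth):
    """Return the backup duration in hours for year 0 through year `years`."""
    hours = []
    for _ in range(years + 1):
        hours.append(backup_hours(data_tb, throughput))
        data_tb *= data_growth
        throughput *= throughput_growth
    return hours


# Assumed: 10 TB today at 2 TB/hour, so a 5-hour backup in year 0.
# Data doubles yearly; backup throughput grows only 1.5x yearly (assumption).
losing_ground = project(3, 10.0, 2.0, data_growth=2.0, throughput_growth=1.5)

# If throughput kept pace with capacity, the window would merely stay the same.
staying_even = project(3, 10.0, 2.0, data_growth=2.0, throughput_growth=2.0)

for year, (a, b) in enumerate(zip(losing_ground, staying_even)):
    print(f"year {year}: {a:.1f} h vs {b:.1f} h")
```

Under these assumptions the first series grows every year while the second stays flat at 5 hours. Shrinking the window takes throughput that grows faster than the data does.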

I think these observations explain why backup and data protection have been such hot areas for innovation. Based on the observations in this blog, I expect that to continue for the foreseeable future.

November 30, 2006

Simulate NetApp Storage on Linux (My Boss Won't Buy Hardware for Me to Play)

Several years ago I started hearing this complaint from system administrators who used NetApp equipment:
I'd like to try out the features in your new ONTAP release, but all my systems are in production, and my boss won't buy me hardware just so I can experiment.
To solve this, we released the "ONTAP Simulator". The simulator is a complete version of ONTAP that runs as a process under Linux. Instead of using real disk drives, it opens files with names like disk.1, disk.2 and so on. Instead of having a real network card, it hijacks the Linux Ethernet driver to grab the raw packets. (Any customer can download the simulator for free from the Simulate ONTAP web page.)

This is a great tool for learning about ONTAP. Try a new feature, or try different features in combination to make sure they work as expected. You can even delete one of the disk files to see what happens when a disk fails, or edit the disk file with a binary editor to see what happens when there is random disk corruption.

We also use the simulator in our training classes. We no longer need thirty systems for thirty students. We use a couple of real systems—for things like swapping a disk drive, where you absolutely need the real hardware—but mostly students use the simulator. That cuts down on noise in the classroom, saves power, and reduces shipping costs for offsite classes.

At first people just played around, but some customers have gone way beyond that. They use the simulator to verify complex configurations involving multiple systems with clustering, remote mirroring and long-term data vaulting—all with multiple ONTAP simulators running on a single Linux PC. Customers tell me that it's much faster and easier to find configuration errors with the simulator than it would be with real hardware, and they can do it before they buy and install the real hardware. People also use it to get comfortable with a new release before upgrading, to verify that it works in their network environment and to test out their management scripts. Customers have even found ONTAP bugs. Except for low-level drivers, the simulator runs exactly the same code base as production ONTAP, so you see exactly how a real system will behave. (There are some differences: the simulator is slower, it has a hard limit on disk capacity, and we don't have simulated Fibre Channel drivers, so for block storage you have to use iSCSI.)

The simulator is as old as ONTAP itself. As a small startup, we created the simulator before we had any real hardware. Later, engineers found that it was often faster and easier to fire up a simulator than to download a new OS to a real system. Also, the debugging tools were better for a local process than for a real system. We used the simulator in engineering for almost 10 years before anyone figured out that it would also be a great tool for customers.

At this point, many thousands of customers have downloaded the simulator. Maybe we've lost a few system sales as a result, but I'm sure that most of those people wouldn't have bought extra hardware just to "play around". Besides, I believe that anything we do to help system administrators to get comfortable with our storage systems is a benefit in the long run, even if we do lose a few sales in the short run.

As far as I know, NetApp is the only storage vendor with anything like this.

November 14, 2006

Follow-up on VMware: Both Better and Worse Than I Described

This is a follow-up to my recent blog on How VMware is Revolutionizing Data Centers. Two weeks ago at Oracle OpenWorld, I talked with one customer who detailed exactly what he's accomplished so far with VMware. A second customer explained why VMware won't help his application at all.

The first customer, who asked me to describe his company as "a large UK based Telco", used VMware to consolidate 502 Windows systems onto 25 blades. He freed 173 racks' worth of space. He cut power draw by almost 450 kW and reduced his power bill by $50,000 per month, not to mention reduced service and support costs. In all, he expects to clear 345 racks and replace them with 20 racks, with a full return on investment in less than a year.

He runs a variety of applications: some web servers, some internal databases, some billing applications, all sorts of different stuff, mostly on Windows. On average, the systems were very lightly loaded, many at 5-10% CPU utilization or less. Some of his test and development environments run as many as 50 virtual machines on a single Windows server. Production environments may run 5 or fewer. It all averages out to about 20. What holds the consolidation ratio down in production environments is fear of how many users a failure would affect. He expects the ratio to go up as VMware business continuance tools mature and as application administrators gain confidence.
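The numbers in this story hang together, which a few lines of Python can check. The 502 systems, 25 blades, 450 kW and $50,000 figures come from the customer. The 730 hours per month, and the assumption that the saved load would otherwise run around the clock, are mine.

```python
physical_before = 502      # Windows systems, from the text
blades_after = 25          # from the text

ratio = physical_before / blades_after
print(f"consolidation ratio: {ratio:.1f} to 1")

power_saved_kw = 450       # "almost 450", from the text
bill_saved_usd = 50_000    # per month, from the text
hours_per_month = 730      # assumption: 8,760 hours per year / 12

# Energy is power multiplied by time; the bill is charged on energy (kWh).
energy_saved_kwh = power_saved_kw * hours_per_month
implied_price = bill_saved_usd / energy_saved_kwh
print(f"implied electricity price: ${implied_price:.3f} per kWh")
```

The ratio comes out at about 20 to 1, which matches the average the customer quoted, and the implied rate works out to roughly 15 cents per kWh under the assumptions above.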

That is the success story. The other customer explained why VMware won't help him at all. He runs a large internet site with hundreds of web servers—lots of Linux and Apache. He has a load balancing methodology that lets him saturate his servers. (In his case, the systems are actually memory limited, because they cache the most commonly used data. In other environments, CPU is the limiting factor.) He argues—correctly in my view—that VMware wouldn't help him consolidate at all, because he has no spare capacity. VMware's management capabilities could be of some use, but he already solved those problems, so he isn't looking for any help there. (Remember Tom Mendoza's rule of sales: "Customers don't open their wallets unless they are in pain." Wise salesmen save their own time, as well as their customer's.)

Server virtualization is like thin provisioning. Both allow you to hand out resources that you don't really have. With one, you hand out ten one-terabyte LUNs, even though you don't have that much real disk space. With the other, you hand out ten virtual servers even though you don't have ten real servers. (See this blog entry where I argue that thin provisioning is like writing bad checks.) Both tools help you drive up utilization, but that's only useful if utilization was previously low! If utilization was already high, there is no gain to be had. If your users immediately fill their LUNs, then you'd better have the real storage. If your applications peg their servers, then you'd better have real servers.
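A toy model makes the "bad check" point easy to see. This is a minimal Python sketch, not how any real storage system implements thin provisioning. The `ThinPool` class and the 4 TB of physical disk are illustrative assumptions; the ten one-terabyte LUNs come from the example above.

```python
class ThinPool:
    """Toy model of thin provisioning: promise more space than physically exists."""

    def __init__(self, physical_tb):
        self.physical_tb = physical_tb
        self.promised_tb = {}   # LUN name -> advertised size
        self.used_tb = {}       # LUN name -> space actually written

    def create_lun(self, name, size_tb):
        # Creating a LUN consumes no real space; it is only a promise.
        self.promised_tb[name] = size_tb
        self.used_tb[name] = 0.0

    def write(self, name, tb):
        if self.used_tb[name] + tb > self.promised_tb[name]:
            raise ValueError(f"{name} is full")
        if self.total_used() + tb > self.physical_tb:
            # The "bad check" bounces: the promises exceeded the real disk.
            raise RuntimeError("pool is out of physical space")
        self.used_tb[name] += tb

    def total_used(self):
        return sum(self.used_tb.values())

    def overcommit(self):
        """Ratio of promised capacity to physical capacity."""
        return sum(self.promised_tb.values()) / self.physical_tb


pool = ThinPool(physical_tb=4)          # assumed: only 4 TB of real disk
for i in range(10):
    pool.create_lun(f"lun{i}", 1.0)     # ten one-terabyte LUNs, as in the text

for i in range(10):
    pool.write(f"lun{i}", 0.3)          # lightly used: about 3 TB total, fits
```

As long as each LUN stays lightly used, ten terabytes of promises fit comfortably on four terabytes of disk. If the users start filling their LUNs, `write` raises before every LUN gets its share, which is exactly the case where you'd better have the real storage.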

Thin provisioning, like VMware, performs especially well in data centers with many lightly loaded Windows servers. Admins often have no idea how much storage these low-utilization apps really need, so to avoid the hassle of expanding capacity later, they request plenty extra. This leads to wasted space and low utilization. The reason customers keep talking to me about VMware is that they keep noticing the synergy between server consolidation and storage consolidation.

Here's an interesting postscript on the "large web site" customer. Two weeks ago he told me about the application that VMware won't help, but last week at a second meeting, I learned that he also has hundreds of lightly loaded Windows servers running random internal business apps. He is just starting to look into server consolidation. Ironic. My prototypical example of a customer that VMware won't help may soon be running it in a different part of his data center.

November 03, 2006

Oracle and Red Hat

I was in the audience when Larry Ellison announced that Oracle will be offering enterprise-class support for Red Hat Linux. Oracle, of course, has made a big bet on Linux as the preferred platform for running Oracle. (Oracle's big data center in Austin has over 20,000 nodes of Linux. It's also the single largest NetApp installation in the world.)

Larry argued that three key issues are slowing Linux adoption:
  1. No "true enterprise support" available.
  2. Support costs too much.
  3. Threat of lawsuits from SCO.
Larry's solution: (1) Oracle will support Red Hat Linux, (2) at much lower prices than Red Hat, plus (3) they'll indemnify customers against SCO lawsuits.

This clearly makes strategic sense for Oracle. When NetApp was small, one of our biggest challenges was convincing customers—especially large customers—that we could support them effectively. Multi-billion dollar companies like to do business with multi-billion dollar companies. Red Hat is still under half a billion in revenue. I know that support from Oracle will feel safer, especially for giant customers running their business on Oracle, so I think Larry is right that this will speed Linux adoption.

It also gives Oracle more complete control of the software stack all the way down to the hardware. One audience member asked, "Should we expect to get a complete stack from Oracle now, from the operating system all the way up through applications?" Larry's answer: "Absolutely."

To avoid confusing the Linux market by fragmenting the code base, Oracle will patch Red Hat code, but will resync after major Red Hat releases. I saw a potential tension here: Oracle will depend on Red Hat for distributions, but at the same time they are undercutting Red Hat's prices, which could obviously hurt Red Hat. So I asked, "What happens to Red Hat? Is killing them an unintended side effect, or do you have a plan to help keep them alive?" In retrospect, it was a stupid question, because Oracle can always just buy Red Hat if they want to keep them alive.

Larry's answer was interesting: "This is capitalism. We're competing. We're trying to offer a better product at a lower price." On the other hand, he also said, "I don't