
November 2006

November 30, 2006

Simulate NetApp Storage on Linux (My Boss Won't Buy Hardware for Me to Play)

Several years ago I started hearing this complaint from system administrators who used NetApp equipment:
"I'd like to try out the features in your new ONTAP release, but all my systems are in production, and my boss won't buy me hardware just so I can experiment."
To solve this, we released the "ONTAP Simulator". The simulator is a complete version of ONTAP that runs as a process under Linux. Instead of using real disk drives, it opens files with names like disk.1, disk.2 and so on. Instead of having a real network card, it hijacks the Linux Ethernet driver to grab the raw packets. (Any customer can download the simulator for free from the Simulate ONTAP web page.)

This is a great tool for learning about ONTAP. Try a new feature, or try different features in combination to make sure they work as expected. You can even delete one of the disk files to see what happens when a disk fails, or edit the disk file with a binary editor to see what happens when there is random disk corruption.
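
The file-backed disk idea can be sketched in a few lines of Python. This is only a toy illustration, not ONTAP code: the `FileDisk` class, its method names, and the 4 KB block size are all assumptions made up for the example. It shows how a "disk" that is really a file named disk.1 can be written, read, and then made to fail by deleting the file.

```python
import os
import tempfile

BLOCK_SIZE = 4096  # assumed block size for this toy example


class FileDisk:
    """A toy 'disk' backed by an ordinary file (hypothetical, not ONTAP code)."""

    def __init__(self, path, num_blocks):
        self.path = path
        self.num_blocks = num_blocks
        if not os.path.exists(path):
            with open(path, "wb") as f:
                f.truncate(num_blocks * BLOCK_SIZE)  # zero-filled backing file

    def _check(self, index):
        if not 0 <= index < self.num_blocks:
            raise IndexError("block index out of range")

    def write_block(self, index, data):
        self._check(index)
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block")
        with open(self.path, "r+b") as f:
            f.seek(index * BLOCK_SIZE)
            f.write(data)

    def read_block(self, index):
        self._check(index)
        with open(self.path, "rb") as f:
            f.seek(index * BLOCK_SIZE)
            return f.read(BLOCK_SIZE)


with tempfile.TemporaryDirectory() as tmp:
    disk = FileDisk(os.path.join(tmp, "disk.1"), num_blocks=8)
    disk.write_block(3, b"\xab" * BLOCK_SIZE)
    print(disk.read_block(3) == b"\xab" * BLOCK_SIZE)

    os.remove(disk.path)  # "pull" the disk by deleting its backing file
    try:
        disk.read_block(3)
    except FileNotFoundError:
        print("disk.1 has failed")
```

In this sketch a deleted file surfaces as an ordinary `FileNotFoundError`; how the real simulator detects and reports a missing disk file is not described here.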

We also use the simulator in our training classes. We no longer need thirty systems for thirty students. We use a couple of real systems—for things like swapping a disk drive, where you absolutely need the real hardware—but mostly students use the simulator. That cuts down on noise in the classroom, saves power, and reduces shipping costs for offsite classes.

At first people just played around, but some customers have gone way beyond that. They use the simulator to verify complex configurations involving multiple systems with clustering, remote mirroring and long-term data vaulting—all with multiple ONTAP simulators running on a single Linux PC. Customers tell me that it's much faster and easier to find configuration errors with the simulator than it would be with real hardware, and they can do it before they buy and install the real hardware. People also use it to get comfortable with a new release before upgrading, to verify that it works in their network environment and to test out their management scripts. Customers have even found ONTAP bugs. Except for low-level drivers, the simulator runs exactly the same code base as production ONTAP, so you see exactly how a real system will behave. (There are some differences: the simulator is slower, it has a hard limit on disk capacity, and we don't have simulated Fibre Channel drivers, so for block storage you have to use iSCSI.)

The simulator is as old as ONTAP itself. As a small startup, we created the simulator before we had any real hardware. Later, engineers found that it was often faster and easier to fire up a simulator than to download a new OS to a real system. Also, the debugging tools were better for a local process than for real systems. We used the simulator in engineering for almost 10 years before anyone figured out that it would also be a great tool for customers.

At this point, many thousands of customers have downloaded the simulator. Maybe we've lost a few system sales as a result, but I'm sure that most of those people wouldn't have bought extra hardware just to "play around". Besides, I believe that anything we do to help system administrators to get comfortable with our storage systems is a benefit in the long run, even if we do lose a few sales in the short run.

As far as I know, NetApp is the only storage vendor with anything like this.

November 22, 2006

Why NetApp's Earnings Results Last Quarter Frustrated Me

There's an old saying in computer science:
"Fast, cheap, reliable: choose any two."
The point is that these goals are in conflict. Improving one tends to hurt the others. Sometimes a new technology paradigm lets you improve all three at once, but within the new paradigm there will still be a conflict between the three.

There is also a conflict between different goals when you design a business model. In this blog, I'll describe the key conflict in NetApp's business model, and how we've resolved it.

If you round everything to the nearest 5%, here is what NetApp's business model generally looks like. For every dollar of revenue we receive from customers, we spend 40 cents manufacturing the products we ship and another 45 cents running the company, which leaves about 15 cents in operating profit. To put this into business terminology, the "operating stack" looks very roughly like this:
40% COGS (Cost of Goods Sold: components, labor, overhead, etc.)
45% Operating Expenses (sales, marketing, engineering, salaries, etc.)
15% Operating Income (profit from running the business, before taxes)
(For the real numbers, see the transcript of our Q2 earnings call.)

The key conflict is long-term growth versus profits this quarter. To maintain growth, you have to invest for the future. If you want to sell more next year than you are selling today, then you'd better hire more sales people. If you expect to have more customers next year, then you'd better hire more customer service engineers to support them, more manufacturing people to build products, and so on.

But investments increase operating expenses and drive down profits.

Let's focus on hiring new people. Employees usually aren't very productive for their first few months, so at first, hiring new people raises expenses without improving revenue. The faster you grow, the more people you have who aren't yet pulling their own weight. I call this the growth tax. At high growth rates, the penalty can be significant. Suppose it takes people 3 months to get up to speed, and suppose that all operating expenses are proportional to head count. At a growth rate of 35%, our growth rate last quarter, the growth tax is about 3.5% of the overall stack. (Here's my math: (1.35^(3/12) - 1) × 45% ≈ 3.5%, where 1.35^(3/12) - 1 is the head-count growth over three months and 45% is the operating-expense share of revenue.)

That might not seem like a big percentage, but I can assure you that stock market analysts focus very closely on profitability. That 3.5% growth tax would reduce an 18.5% profit to 15%, which is definitely significant. At 50% growth, based on these same dramatically oversimplified assumptions, the tax would be almost 5%.
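
The growth-tax arithmetic is easy to reproduce. The following is a minimal sketch under the same oversimplified assumptions stated above (a 3-month ramp-up, operating expenses at 45% of revenue and proportional to head count); the function name `growth_tax` and its parameters are just labels chosen for this example.

```python
def growth_tax(annual_growth, ramp_months=3, opex_share=0.45):
    """Share of revenue spent on staff who are still ramping up.

    Assumes all operating expenses scale with head count and that
    new hires contribute nothing during the ramp-up period.
    """
    # Head-count growth over the ramp-up period, compounded from the annual rate.
    ramp_growth = (1 + annual_growth) ** (ramp_months / 12) - 1
    return ramp_growth * opex_share


for rate in (0.35, 0.50):
    print(f"{rate:.0%} growth -> growth tax of {growth_tax(rate):.1%}")
# 35% growth -> growth tax of 3.5%
# 50% growth -> growth tax of 4.8%
```

Changing `ramp_months` shows how sensitive the result is to the ramp-up assumption: a longer ramp-up raises the tax at any given growth rate.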

[Aside: I enjoy applying engineering-style thinking to the operation of the company as a whole. What are we optimizing? Can we model the result? And so on.]

Summary: To grow you must hire, but hiring drives down profits.

The management team at NetApp has decided to optimize for long-term growth, as opposed to optimizing for short-term profits. We believe that optimizing for profit would actually be damaging to the company, because it would doom us to being the perpetual underdog.

That's why I'm frustrated with our earnings results last quarter (FY2007 Q2). By many measures, Q2 was wonderful. It was our largest quarter ever, with $652 million in revenue. That was a 35% increase over Q2 a year ago, which is an awesome growth rate for a company our size. The profit level was also unusually high. We had a non-GAAP operating profit of 18.2%.

It's that operating profit that frustrated me. We believe that the optimal trade-off between growth and profitability occurs at roughly 16% operating profit. (The range we target is 15.8% to 16.4%.) Generating higher profit meant that we invested less. At $652 million in revenue, that 2.2% difference comes to almost $15 million that we could have invested in future growth but did not.
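
As a quick check on that figure, here is a small sketch that computes the gap between actual and target operating profit for the midpoint and both ends of the target range. The revenue and margin numbers come from the post; the function name `uninvested` is only an illustrative label.

```python
REVENUE_MUSD = 652      # FY2007 Q2 revenue, in millions of dollars
ACTUAL_MARGIN = 0.182   # non-GAAP operating profit for the quarter


def uninvested(target_margin, revenue=REVENUE_MUSD, actual=ACTUAL_MARGIN):
    """Millions of dollars of operating profit above the target margin."""
    return revenue * (actual - target_margin)


for target in (0.158, 0.16, 0.164):
    print(f"target {target:.1%}: ${uninvested(target):.1f} million not invested")
```

At the 16% midpoint the gap works out to roughly $14.3 million, which is consistent with the "almost $15 million" estimate.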

At our annual analyst day conference a couple of years ago, we were describing our business model to investors, and we got an interesting question. Laura Conigliaro, an analyst at Goldman Sachs, said, "If you are targeting a 16% profit model so that you can invest in the future, does that mean it would be bad news if your profit level were to rise? Would that mean that you had run out of new ideas to invest in, and that growth was going to slow?"

We're certainly not out of ideas. In the earnings call, Dan summarized it like this: "We missed an opportunity to get more aggressive this past quarter, and I'd like to see that not happen again."

November 14, 2006

Follow-up on VMware: Both Better and Worse Than I Described

This is a follow-up to my recent blog on How VMware is Revolutionizing Data Centers. Two weeks ago at Oracle OpenWorld, I talked with one customer who detailed exactly what he's accomplished so far with VMware. A second customer explained why VMware won't help his application at all.

The first customer (who asked me to describe his company as "a large UK-based Telco") used VMware to consolidate 502 Windows systems onto 25 blades. He freed 173 racks' worth of space. He cut power draw by almost 450 kW and reduced his power bill by $50,000 per month, not to mention reduced service and support. In all, he expects to clear 345 racks and replace them with 20 racks, with a full return on investment in less than a year.

He runs a variety of applications: some web servers, some internal databases, some billing applications, all sorts of different stuff, mostly on Windows. On average, the systems were very lightly loaded, many at 5-10% CPU utilization or less. Some of his test and development environments run as many as 50 virtual machines on a single Windows server. Production environments may run 5 or fewer. It all averages out to about 20. What holds the consolidation ratio down in production environments is fear of how many users a failure would affect. He expects the ratio to go up as VMware business continuance tools mature and as application administrators gain confidence.

That is the success story. The other customer explained why VMware won't help him at all. He runs a large internet site with hundreds of web servers—lots of Linux and Apache. He has a load balancing methodology that lets him saturate his servers. (In his case, the systems are actually memory limited, because they cache the most commonly used data. In other environments, CPU is the limiting factor.) He argues—correctly in my view—that VMware wouldn't help him consolidate at all, because he has no spare capacity. VMware's management capabilities could be of some use, but he already solved those problems, so he isn't looking for any help there. (Remember Tom Mendoza's rule of sales: "Customers don't open their wallets unless they are in pain." Wise salesmen save their own time, as well as their customer's.)

Server virtualization is like thin provisioning. Both allow you to hand out resources that you don't really have. With one, you hand out ten one-terabyte LUNs, even though you don't have that much real disk space. With the other, you hand out ten virtual servers even though you don't have ten real servers. (See this blog entry where I argue that thin provisioning is like writing bad checks.) Both tools help you drive up utilization, but that's only useful if utilization was previously low! If utilization was already high, there is no gain to be had. If your users immediately fill their LUNs, then you'd better have the real storage. If your applications peg their servers, then you'd better have real servers.
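
The overcommitment idea can be made concrete with a small model. This is a hypothetical sketch, not how any real storage system implements thin provisioning: the `ThinPool` class, its fields, and the 4 TB pool size are assumptions chosen for illustration. It hands out ten one-terabyte LUNs against 4 TB of real disk, which works while the LUNs are lightly used and fails once users start filling them.

```python
from dataclasses import dataclass, field


@dataclass
class ThinPool:
    """Toy model of a thin-provisioned storage pool (illustrative only)."""
    physical_tb: float
    promised_tb: dict = field(default_factory=dict)
    used_tb: dict = field(default_factory=dict)

    def provision(self, name, size_tb):
        # Promise capacity without reserving any real disk space.
        self.promised_tb[name] = size_tb
        self.used_tb[name] = 0.0

    def write(self, name, tb):
        if self.used_tb[name] + tb > self.promised_tb[name]:
            raise ValueError(f"{name} is full")
        if sum(self.used_tb.values()) + tb > self.physical_tb:
            raise RuntimeError("pool is out of real disk space")
        self.used_tb[name] += tb

    @property
    def overcommit(self):
        # Ratio of promised capacity to real capacity.
        return sum(self.promised_tb.values()) / self.physical_tb


pool = ThinPool(physical_tb=4.0)
for i in range(10):
    pool.provision(f"lun{i}", 1.0)
print(f"overcommit ratio: {pool.overcommit}")

# Lightly used LUNs fit comfortably: 10 x 0.25 TB = 2.5 TB of real data.
for i in range(10):
    pool.write(f"lun{i}", 0.25)

# If users fill their LUNs, the promises can no longer be honored.
try:
    for i in range(10):
        pool.write(f"lun{i}", 0.75)
except RuntimeError as err:
    print(f"lun{i}: {err}")
```

The same model applies to server virtualization if terabytes are read as CPU or memory capacity: overcommitment only pays off while actual use stays well below what was promised.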

Thin provisioning, like VMware, performs especially well in data centers with many lightly loaded Windows servers. Admins often have no idea how much storage these low-utilization apps really need, so to avoid the hassle of expanding capacity later, they request plenty of extra capacity. This leads to wasted space and low utilization. The reason customers keep talking to me about VMware is that they keep noticing the synergy between server consolidation and storage consolidation.

Here's an interesting postscript on the "large web site" customer. Two weeks ago he told me about the application that VMware won't help, but last week at a second meeting, I learned that he also has hundreds of lightly loaded Windows servers running random internal business apps. He is just starting to look into server consolidation. Ironic. My prototypical example of a customer that VMware won't help may soon be running it in a different part of his data center.

November 09, 2006

Data and Ethics (Who Owns My Medical Records?)

Yesterday I testified to the FTC (Federal Trade Commission) as part of a panel on the effect of technology on consumers. My talk was basically a summary of this editorial that I wrote for the Financial Times.

An interesting topic during the open discussion was this: Who owns data about you? Perhaps more importantly, who should own that data? Should Amazon own the record of all the books you have ever bought, or should you? Medical records are even more personal. Should your doctor own your medical records, or should you?

Professor Deirdre Mulligan (at U.C. Berkeley, Boalt Hall School of Law) was also on the panel, and I thought she had the deepest insights on the legal and policy issues around personal data. She argued that ownership is the wrong mental model. I certainly have a strong interest in my own medical records, but my doctor has an equally strong interest in the collection of medical records that he has created over the years. From his point of view, it is the record of his career. In the case of a malpractice suit, the records may be required to demonstrate his competence. In some cases, a medical center as a whole comes under regulatory scrutiny, and the records of all doctors and patients at the center may be required to understand the patterns of care.

Mulligan argued that instead of ownership, it is better to think in terms of rights and responsibilities associated with the data. As a patient, I have many rights with respect to my own medical records. I can get access to them, transfer them to a new doctor, and so on. My doctor and the medical center have rights as well, but also responsibilities. They are required to keep the records safe and private. If they are electronic, then HIPAA regulations require them to keep that data for the rest of my life. (You can be sure that the storage implications of these regulations are not lost on us.)

I used medical records as an example, but financial records have the same issues. I think they are my records, but my bank and my broker have lots of reasonable reasons—including legal reasons—to think they are their records.

In summary, if I heard Mulligan's arguments correctly, she was saying that ownership just isn't the right model for thinking about this fuzzy, blended combination of rights and responsibilities that are shared between my doctor, my medical center, my bank and me.

On the other hand, Mulligan used quite a different line of thinking for data that I create that nobody else has any rights to, like my personal calendar or my diary. She argued that it is a very odd artifact of our legal system that my rights are dramatically different depending on whether I store my calendar on my own PC or on a remote server at Yahoo! or Google. If the data is on my own PC, in my own possession, then it belongs to me just as if it were on paper. If the government wants to look at my PC, they can subpoena me, and maybe they'll win, but they certainly can't see it without my knowledge. On the other hand, if my personal calendar is stored on a remote server, the government can read it without me finding out until much later, if ever. In this case, the concept of ownership does seem more appropriate; I created the data by myself, for myself. Unfortunately, the distinction that the legal system makes based on where data is stored doesn't match people's intuition about data that they think they own no matter where they keep it.

What's exciting to me about all of this is that it dramatically broadens the scope for NetApp. Increasingly the data we store is information about people, or information that consumers believe belongs to them personally. As a result, data management is becoming entangled with philosophical issues of ownership, rights, responsibilities and ethics. None of the technical issues of data management go away, but to really help our customers solve their problems, we also have to focus on "data rights and responsibilities management". Maybe "data stewardship" is the best term to capture this idea. I've always loved working on hard technical problems, but it's especially rewarding to me that our work also matters at a higher societal level.

[Note: The panel discussion jumped around, and the sections I'm describing were quite short. As a result, this note is partly what Prof. Mulligan actually said, but largely my attempt to understand and flesh out the points I thought she was making. If I've screwed it up, that's my fault, not hers.]

November 03, 2006

Oracle and Red Hat

I was in the audience when Larry Ellison announced that Oracle will be offering enterprise-class support for Red Hat Linux. Oracle, of course, has made a big bet on Linux as the preferred platform for running Oracle. (Oracle's big data center in Austin has over 20,000 nodes of Linux. It's also the single largest NetApp installation in the world.)

Larry argued that three key issues are slowing Linux adoption:
  1. No "true enterprise support" available.
  2. Support costs too much.
  3. Threat of lawsuits from SCO.
Larry's solution: (1) Oracle will support Red Hat Linux, (2) at much lower prices than Red Hat, plus (3) they'll indemnify customers against SCO lawsuits.

This clearly makes strategic sense for Oracle. When NetApp was small, one of our biggest challenges was convincing customers—especially large customers—that we could support them effectively. Multi-billion dollar companies like to do business with multi-billion dollar companies. Red Hat is still under half a billion in revenue. I know that support from Oracle will feel safer, especially for giant customers running their business on Oracle, so I think Larry is right that this will speed Linux adoption.

It also gives Oracle more complete control of the software stack all the way down to the hardware. One audience member asked, "Should we expect to get a complete stack from Oracle now, from the operating system all the way up through applications?" Larry's answer: "Absolutely."

To avoid confusing the Linux market by fragmenting the code base, Oracle will patch Red Hat code, but will resync after major Red Hat releases. I saw a potential tension here: Oracle will depend on Red Hat for distributions, but at the same time they are undercutting Red Hat's prices, which could obviously hurt them. So I asked, "What happens to Red Hat? Is killing them an unintended side effect, or do you have a plan to help keep them alive?" In retrospect, it was a stupid question, because Oracle can always just buy Red Hat if they want to keep them alive.

Larry's answer was interesting: "This is capitalism. We're competing. We're trying to offer a better product at a lower price." On the other hand, he also said, "I don't think that Red Hat is going to be killed. I expect that Red Hat is going to compete very, very aggressively." He was clear that his real goal was to make Linux better, which helps Oracle because they've bet heavily on Linux. Red Hat's response is that they will indeed compete aggressively.

From my perspective, the most interesting audience question was this: "What about incorporating some storage systems, like Network Appliance, maybe?"

Larry's response was: "Well, the great... uh, you know, there's always next year." He kind of stumbled with the response, which gave me the sense that it wasn't something he had thought much about. I found that reassuring.

© NetApp, Inc.  |  "Safe Harbor" Statement