
October 2006

October 30, 2006

Booth Duty at Oracle OpenWorld: FlexClone is the Big Hit

I spent several hours at the NetApp booth at Oracle OpenWorld this week, so I got to talk with lots of Oracle database administrators. For DBAs, FlexClone is the feature that attracted the most attention, especially for test and development environments. When we started the show, we had six different stations showing six different features. By the end, we had four of the six demoing FlexClone.

A FlexClone is a virtual copy. It looks exactly like a real copy, except that it doesn't consume any disk space. The clone uses magic with pointers so that the "copy" refers to all the same blocks on disk as the original. Of course, if you write to a clone, then it does use extra space. The write has to go to a new location on disk to avoid changing the original.
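To make the pointer idea concrete, here is a toy sketch in Python. It is not how FlexClone is actually implemented; the `BlockStore` and `Volume` classes are hypothetical names invented for illustration, and a real system would share the pointer structure itself rather than copying a list. The point is only that a clone shares blocks with the original until someone writes.

```python
class BlockStore:
    """Shared pool of disk blocks, addressed by integer block number."""

    def __init__(self):
        self.blocks = []

    def allocate(self, data):
        # Store the data in a brand-new block and return its number.
        self.blocks.append(data)
        return len(self.blocks) - 1


class Volume:
    """A volume is a list of pointers into the shared block store."""

    def __init__(self, store, pointers):
        self.store = store
        self.pointers = pointers

    @classmethod
    def create(cls, store, contents):
        return cls(store, [store.allocate(data) for data in contents])

    def clone(self):
        # Copy only the pointers; no data blocks are copied.
        return Volume(self.store, list(self.pointers))

    def read(self, index):
        return self.store.blocks[self.pointers[index]]

    def write(self, index, data):
        # Never overwrite in place: write to a new block, then repoint.
        self.pointers[index] = self.store.allocate(data)


store = BlockStore()
original = Volume.create(store, ["a", "b", "c", "d"])
clone = original.clone()
blocks_after_clone = len(store.blocks)  # cloning allocated nothing new

clone.write(2, "C")  # the only step that consumes a new block
blocks_after_write = len(store.blocks)

print(blocks_after_clone, blocks_after_write)
print(original.read(2), clone.read(2))
```

Cloning leaves the block count unchanged, and the single write adds exactly one block while the original still reads its old data.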

In test and dev, cloning is useful for two reasons.

Reason #1: Physical storage is expensive. If you have a 1 TB database, creating ten copies will cost you 10 TB. Creating ten clones takes no space at all. Of course, writing to the clones will consume space, but most tests overwrite only a small percentage of the database. With clones, you can create more copies, and run more tests in parallel, than if you had to pay for the storage.
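The arithmetic is easy to sketch. In the snippet below, the 1 TB database and the ten copies come from the example above; the 5 percent overwrite figure is purely an assumption to stand in for "a small percentage", and I count 1 TB as 1000 GB to keep the numbers round.

```python
def full_copy_gb(db_gb, copies):
    """Space consumed by real copies: every copy stores every block."""
    return db_gb * copies


def clone_gb(db_gb, clones, overwrite_percent):
    """Space consumed by clones: only the overwritten blocks cost anything."""
    return db_gb * clones * overwrite_percent // 100


DB_GB = 1000             # the 1 TB database from the example
COPIES = 10
OVERWRITE_PERCENT = 5    # assumption: each test overwrites 5% of the data

copies_cost = full_copy_gb(DB_GB, COPIES)
clones_cost = clone_gb(DB_GB, COPIES, OVERWRITE_PERCENT)

print(f"ten full copies: {copies_cost} GB")
print(f"ten clones:      {clones_cost} GB")
```

Under that assumption, ten clones cost about half a terabyte instead of ten. Your own overwrite percentage will differ, so plug in a number measured from your tests.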

I'm talking as if the benefit is that you can create more copies, but the reality is that many people run their tests against small artificial data sets. You could argue that the real benefit is not that you get more copies, but that you can run all of your tests on full copies of the live data.

Reason #2: It's fast and easy to return to a known good state. Because no data is copied, creating a clone takes almost no time at all, even for a very large database. This makes it easy to run a test, check the result, make a small change, and then run the test again on the exact same data. Just run each test on a new clone, and blow away the clone when the test is done.

The folks at Oracle University use cloning for their classes. They have 500 classes per week, with a total of 6,500 students, and they have lots of hands-on labs. Before they started using NetApp clones, it took 36 hours to build all of the database instances for the week. With FlexClone, they reduced that to one hour. They just delete the used clones from last week and create new clones for next week. You can imagine a massively parallel test environment using the same approach.

People also use FlexClone in production. There is always some risk when you modify a production environment. It's safer to create a clone and make changes there. When you are convinced that everything worked, then you can put the clone into production. If something went wrong, delete the clone and try again.

I'm sorry if this reads more like a NetApp advertisement than most of my posts. I'm just telling you, after talking with people all afternoon at our booth, this is the thing that really got them excited.

If you want more details, check out this paper on IBM's website. (Now that IBM is a NetApp OEM, they put whitepapers about us on their website.) In the paper, they do all the steps manually, but we have a tool called SnapManager for Oracle that automates everything for DBAs who would rather not mess with the storage system themselves.

October 20, 2006

How VMware is Revolutionizing Data Centers

VMware may be the acquisition of the decade. Brilliant move by EMC.

From the beginning I understood that VMware had really cool technology, but it took me longer to understand the potential that VMware has to revolutionize big enterprise data centers.

My initial (naïve) view was that VMware would be a great tool for software engineers. Instead of buying one PC for Windows and a second for Linux, I can create a virtual machine on my PC and run Linux in that. To test my software on different versions of Windows, I create more virtual machines. Cheaper and more convenient than buying lots of PCs.

What I didn't understand at first was why people would deploy VMware as part of a data center compute infrastructure. I have a UNIX background. In that world, you buy a big server and run lots of applications on it—same as the mainframe model. I didn't understand the value that VMware would bring in running data center apps.

What I wasn't taking into account is that, on Windows, people run only one application per server. In theory you could run an instance of SQL Server and an instance of Exchange on the same system, but in reality people have discovered that Windows crashes less if you only run one app. You can install just the right patch levels, and you don't have to worry about bad registry interactions. Management tools work better too. Figure out the exact right configuration for each application and stick with it. Some people call their best-practice configuration the gold disk. If you want another SQL Server, buy another system and copy the gold disk onto it.

Given my UNIX background, this way of thinking wasn't intuitive to me, but it's the reality in the Windows world. I am the executive sponsor for a large telephone company and they have hundreds of Wintel servers, each running a single application, many of them using less than 5 or 10 percent of the CPU. Their plan is to use VMware to consolidate these servers by at least a factor of 10, maybe even a factor of 20, depending on how the pilot program goes. This will fundamentally change the economics of their data center: less floor space, less power and cooling, easier management, and, of course, much less money spent on servers.
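Here is a back-of-the-envelope sketch of where a factor of 10 or 20 can come from. Everything in it is an assumption for illustration: the fleet of 200 servers, the 75 percent target utilization, and the simplification that each consolidated host has the same CPU capacity as one old server. It also ignores memory, I/O, and virtualization overhead, which a real pilot would have to measure.

```python
import math


def hosts_needed(cpu_percents, target_percent):
    """Minimum hosts, counting CPU only, if each host may run at target_percent."""
    total = sum(cpu_percents)
    return max(1, math.ceil(total / target_percent))


# Hypothetical fleet: 100 servers at 5% CPU and 100 servers at 10% CPU.
servers = [5] * 100 + [10] * 100
TARGET_PERCENT = 75  # assumption: leave headroom rather than running hosts flat out

hosts = hosts_needed(servers, TARGET_PERCENT)
ratio = len(servers) / hosts

print(f"{len(servers)} servers -> {hosts} hosts (factor of {ratio:.0f})")
```

With these made-up numbers the fleet shrinks by a factor of 10. If every server sat at 5 percent, the same calculation would give a factor of 15, and newer, faster hosts would push it higher.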

Going forward, VMware offers even more interesting visions: rapid provisioning of new applications, transparent migration, virtual data centers that provide DR protection for multiple physical data centers. But that's not what I see people excited about today. Today's excitement comes entirely from how much you save when you reduce servers by a factor of 5 or 10 or even 20.

VMware-based innovation in the compute infrastructure creates a parallel opportunity to innovate in the storage infrastructure. In fact, a solid data management strategy is a key success factor for large server consolidations. The requirements align nicely with NetApp's strengths.

For one thing, the goal is to save money, which often pushes people to Ethernet attached storage—either iSCSI or NAS—where NetApp is the market leader. A few apps may require Fibre Channel performance, so storage that supports both Ethernet and Fibre Channel is perfect.

Our data management capabilities fit nicely as well. For instance, cloning is valuable because it lets you make many copies of the gold disk without consuming any additional physical storage space. With thin provisioning, supplying storage to a virtual machine is just as simple as creating the virtual machine. I could go on, but if you want details, you'd be better off reading this paper.

October 13, 2006

How The Speed of My Eyeball Affects Computer Design

There is an interesting relationship between the bandwidth of the human optic nerve and the evolution of computer architectures.

Early timeshare systems had way too little bandwidth between the computer and the user. Way, way less than the bandwidth of my eye. As a kid, I remember playing the computer game Star Trek on a 300 baud modem. The "long range scan" was painfully slow, even though all it printed was a 10 by 10 array of letters showing the current sector in space. (E is for Enterprise; K is for Klingon.) Later, when I got my first programming job, my terminal was 1200 baud—only senior engineers rated 2400 or 4800 baud terminals. Still painfully slow.
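To put numbers on "painfully slow", here is a rough calculation. It assumes the usual asynchronous serial framing of 10 bits per character (one start bit, eight data bits, one stop bit), treats baud as bits per second, and assumes each of the 10 rows ends with a carriage return and line feed. Those details are my assumptions, not a record of the actual game.

```python
BITS_PER_CHAR = 10  # assumption: 1 start bit + 8 data bits + 1 stop bit


def seconds_to_print(chars, bits_per_second):
    """Time to send `chars` characters over a serial line."""
    return chars * BITS_PER_CHAR / bits_per_second


# 10 rows of 10 letters, plus an assumed CR and LF at the end of each row.
scan_chars = 10 * (10 + 2)

times = {rate: seconds_to_print(scan_chars, rate) for rate in (300, 1200, 2400, 4800)}
for rate, seconds in times.items():
    print(f"{rate:>4} baud: {seconds:.2f} s")
```

Under those assumptions a single scan takes about four seconds at 300 baud and about one second at 1200 baud, for a display that carries only 100 letters of information.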

Networks and terminal cables were so slow that the only way to get quality graphics was to put the CPU right in front of the user's face, hence the invention of the workstation. That was a big improvement, but computer graphics were still much slower and blockier than real life. Computer games provide good intuition about this. Think back to the flight simulators of ten years ago. Today, CPUs and graphics cards have improved so much that further gains are marginal. The latest generation of games has slightly better flames and smoke, and I can see individual hairs on the characters, but it doesn't really make a difference.

Once the CPU and graphics card have bandwidth that matches my optic nerve, further improvements simply don't matter.

Likewise, once the network bandwidth approaches the bandwidth of the optic nerve, there is no longer any reason to keep the CPU close to the user's face. For most applications, we've already reached this point, which is why CPUs are starting to recentralize. One of our customers told me that he is moving most of his users to Citrix thin clients, with the PC itself centralized on blade servers, probably running VMware. He won't be moving his power users at first, but as networks get faster, there is no reason he shouldn't.

It is interesting to consider Moore's law in the context of fundamental human limits like optic nerve bandwidth. I think everyone understands that CPU performance is getting less and less important for most applications. That's why I care more about weight and battery life in my new laptop computer than I do about CPU speed. But as computer and networking performance passes the physiological limits of people, it becomes clear that applications where performance matters will become more and more esoteric.

I certainly don't mean to suggest that it's time to stop making faster computers and networks. Plenty of scientific and corporate applications need more speed. However, considering human limits, we can see that for the majority of applications, it's going to be more important to focus our innovation on cost and convenience.

October 06, 2006

What's Up With NearStore?

I'm on a 747 flying home from Salzburg, Austria where we hosted executive forums with customers, resellers and distributors. One customer asked, "What's up with your NearStore products? Isn't the NearStore R200 getting a little long in the tooth?"

For those who don't have our product line memorized, the NearStore R200 is a storage system based on slow and cheap ATA disks that are less reliable than the Fibre Channel drives we use for primary storage. The system is optimized for data retention and data protection. We call it near-line storage because it is faster and more reliable than off-line storage like tapes, but much less expensive than primary storage. Less expensive, and nearly as good. We introduced this concept several years ago, and it has been wildly successful. Last quarter, 60% of the terabytes we shipped were ATA drives. (See this blog post for an explanation of why you really need double-parity RAID, aka RAID 6, to make ATA drives safe to use.)

Some people do use NearStore as primary storage for less critical applications, just to save money, but the more common use is to hold secondary copies of data—data that is also stored elsewhere on a primary storage system. These secondary copies can be used for D2D2T (disk-to-disk-to-tape backup), long-term archiving, and regulatory compliance.

So why haven't we introduced an R300? The reason is that our current generation of storage platforms supports both ATA and Fibre Channel. If you want near-line storage on the latest and greatest platforms, just buy a FAS3000 or a FAS6000 series and order it with ATA drives. (We call this NearStore on FAS.)

In addition, when we launched our VTL (Virtual Tape Library) product, we called it NearStore VTL, because disk-based virtual tape is another case where you want near-line storage that is better than tape but not as expensive as primary storage.

I'd like to share an argument that I had years ago with Jerry Lopatin, who was the driving force behind our first NearStore products. (Jerry has since left NetApp and runs engineering and manufacturing at ONStor.)

My view was that NearStore was all about making storage cheaper, since ATA drives are cheaper than Fibre Channel. Jerry disagreed. He felt that, in the long run, the software that we developed to use these inexpensive storage systems would be much more interesting. He argued that if we could drive the cost of disk-based storage down, we would have the opportunity to create whole new ways of thinking about data protection and data retention based on disks instead of tape, and—in the end—the software we developed to do this would be what mattered most.

Today we have several families of software that use disks in new ways, including data protection, data archiving, and regulatory compliance. As I said above, some people do use ATA-based systems for primary storage, and I expect that will become more and more common over time, but what's much more exciting to me is that, along with EMC's Centera product, we are helping to create a whole new market based on clever ways of managing secondary copies of data.

Summary: Jerry was right; I was wrong.
