August 21, 2007

Oracle Optimizes Its Database for NFS

NFS has become critical to data center grid environments. As a result, Oracle has optimized its code specifically for NFS. Instead of relying on the operating system, Oracle’s Direct NFS Client generates NFS requests directly from the database.

Direct NFS was inspired by experience at Oracle’s Austin Data Center. Oracle uses NFS to run its applications on tens of thousands of Linux servers accessing many petabytes of NetApp storage. In 2005 they had 12,000 Linux servers and 3 petabytes of NetApp storage. Today’s numbers aren’t public, but they are much larger.

When an operating system capability becomes sufficiently important, Oracle pulls it into the database. Memory management became critical, so Oracle said, “Just give me the raw pages, and I’ll manage them myself.” Disk caching became critical, and Oracle said, “Just give me the raw disk blocks, and I’ll cache them myself.” Now NFS has become critical, so Oracle says, “Just give me a raw TCP/IP socket, and I’ll generate NFS requests myself.” 
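To make "a raw TCP/IP socket" concrete, here is a minimal sketch of what generating an NFS request by hand involves. It is not Oracle's code. It only builds the simplest possible NFSv3 message, a NULL (ping) call, using the standard ONC RPC framing; the function name `build_null_call` and the sequential XID counter are assumptions made for illustration.

```python
import itertools
import struct

# Constants from the ONC RPC and NFSv3 specifications.
NFS_PROGRAM = 100003    # RPC program number assigned to NFS
NFS_VERSION = 3
NFSPROC3_NULL = 0       # no-op procedure, useful as a ping
RPC_CALL = 0            # message type: CALL
RPC_VERSION = 2
AUTH_NONE = 0           # no credentials
LAST_FRAGMENT = 0x80000000  # high bit of the TCP record marker

# Hypothetical XID source: each request gets a unique transaction ID
# so that replies can be matched to outstanding calls.
_xids = itertools.count(1)


def next_xid():
    return next(_xids)


def build_null_call(xid):
    """Return the bytes of an NFSv3 NULL call, framed for TCP."""
    body = struct.pack(
        ">10I",                     # ten big-endian 32-bit words (XDR)
        xid, RPC_CALL, RPC_VERSION,
        NFS_PROGRAM, NFS_VERSION, NFSPROC3_NULL,
        AUTH_NONE, 0,               # credential: flavor, zero-length body
        AUTH_NONE, 0,               # verifier: flavor, zero-length body
    )
    # Record marking: one final fragment whose length is len(body).
    marker = struct.pack(">I", LAST_FRAGMENT | len(body))
    return marker + body


if __name__ == "__main__":
    msg = build_null_call(next_xid())
    print(len(msg), msg.hex())
```

Sending these bytes over a connected TCP socket to an NFS server (conventionally port 2049) would be the next step; real read and write calls add file handles, offsets, and credentials on top of the same framing.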

Steve Kleiman has argued that as Oracle becomes more sophisticated, the operating system becomes little more than a device driver framework that gives the database raw access to the hardware. That sheds new light on Oracle’s Unbreakable Linux program.

What exactly does Oracle gain from Direct NFS? The primary benefits are simplicity and performance. 

It’s simpler because you don’t have to worry about how to configure NFS. What timeouts should you use? What caching options? It doesn’t matter. Oracle looks at how you have NFS configured to figure out where the data lives, but aside from that, your settings don’t matter. Oracle takes control.

It even works with Windows. Just mount the data that Oracle needs using a CIFS share, and Oracle figures out the location of the data and accesses it via NFS. (CIFS is great for home directory sharing, but it isn’t designed for database workloads.) 

Performance is better because Oracle bypasses the operating system and generates exactly the requests it needs. Data is cached just once, in user space, which saves memory – no second copy in kernel space. Oracle also improves performance by load balancing across multiple network interfaces, if they are available.
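As a rough illustration of load balancing across network interfaces, the sketch below hands out storage paths in round-robin order. The post does not describe Oracle's actual policy, so round-robin is an assumption, and the class name `PathBalancer` and the example addresses are hypothetical.

```python
import itertools


class PathBalancer:
    """Rotate requests across a fixed list of network paths (round-robin)."""

    def __init__(self, paths):
        if not paths:
            raise ValueError("at least one path is required")
        self._cycle = itertools.cycle(list(paths))

    def next_path(self):
        # Each call returns the next path in order, wrapping around.
        return next(self._cycle)


if __name__ == "__main__":
    # Hypothetical addresses of two interfaces on the same storage system.
    balancer = PathBalancer(["192.0.2.10", "192.0.2.11"])
    for _ in range(4):
        print(balancer.next_path())
```

A production client would also need to detect a failed path and stop sending requests to it, which this sketch leaves out.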

For more technical details on Direct NFS, check out this article by Kevin Closson. He works for PolyServe, which is a NetApp competitor, but technically speaking, he talks good sense. I also recommend this article, by NetApp’s John Elliott, comparing Oracle performance over Fibre Channel, NFS and iSCSI. 

NetApp has been closely involved in Direct NFS from the very beginning. Peter Schay came up with the idea while he worked for Oracle’s “Linux Program Office”. He wanted to simplify things for Oracle customers running on Linux, many of whom were hosted on Oracle’s On-Demand environment at the Austin Data Center. He worked closely with NetApp engineers to prototype and test the idea. The Oracle ST team used his functional specification to develop the production version of Direct NFS now shipping in 11g. (Today Peter works for NetApp.)

I love how NFS has evolved over the past couple of decades. Twenty years ago, it provided file sharing to small engineering workgroups; today it provides the data backbone for some of the world’s largest data centers. What is it about NFS that has allowed it to make this transition? What is it about NFS that led Oracle to build it directly into their database? That’s the topic for another post!

Comments

Dave,

You're a touch of class! I appreciate you jumping across "competitive boundaries" like you did with your reference to one of my blog entries.

As a side note, I'm not PolyServe these days, since HP bought us. And, in fact, I'm not HP much longer either as I'll be taking a role in the Oracle Server Technologies Group after Labor Day weekend.

By the way, say "Hi" to Pete for me. I wondered where he landed...

You've been able to do direct IO on NFS for a while now (even 2.4 kernels), so with an ordinary NFS mount you could do IO directly to userspace and do all the caching there. So the caching thing is irrelevant (perhaps not so for Windows platforms...)

Direct NFS Client doesn't seem to bypass the OS completely, because it still needs to use the OS TCP stack. In any case, it looks like it does bypass the Linux NFS and RPC layers in the client (underneath the VFS), and that's a major gain. For example, see the NFS client perf figures at:

http://gelato.unsw.edu.au/IA64wiki/NFSPerformance

Shehir,

DNFS eliminates the overhead of entering the kernel with libC or libaio calls that must vector to RPC via the VFS layering. In short, DNFS is RPC. Oracle generates and tracks their own XIDs and just shoots RPC straight from the server. You can see the overhead reduction if you go to the URL Dave provided to my site and get the paper I wrote with Oracle on the matter.

After reading up on this on technet.oracle.com last month, I was surprised not to see anything from NetApp about this exciting development until now (only an HP press release seemed to mention it).

Oracle's benchmarks seemed a bit misguided (focusing far too heavily on interface load balancing instead of the real meat of the technology); how soon could we hope to see a NetApp TR to dive into this? In particular it'd be interesting to see how this skews the results from TR3496.

Though certainly not NetApp's problem, it was disappointing to see this not done under NFSv4. Particularly in a RAC environment one would think directly exposing the database to v4 delegations could be huge...

Folks,

The last post by "Kevin" (8/30/07 12:46PM) wasn't me. I've gotten a barrage of email from folks asking me about NFSv4. To be perfectly honest, I'm not expecting much gain out of NFSv4 vis a vis Oracle throughput. Folks, Oracle is a seek, read/write workload. It all comes down to payload on the wire...pure grunt work. It is quite simple to get Oracle driving I/O at GbE wire capacity. NFSv4 can't make the wire fatter.


I'm using 2 NICs, one connected to the Ethernet for application requests, and the other connected directly to the SAN switch just for mounting the database file systems.
The performance is great!!!

© NetApp, Inc.  |  "Safe Harbor" Statement