August 14, 2008

Part III: Since NFSv4 is Stateful It Must Be Less Robust, Right?

This should conclude my series on this topic, but obviously it's my blog and like any content provider with no self-respect, I am free to make as many sequels as I want in order to milk the topic for all it is worth.

I am going to compare the impact on an application when the NFS client it is using recovers from an NFSv3 server restart versus an NFSv4 server restart.

An application has open files. Some of these files might have byte range locks. When the application runs over an NFSv3 file system, the NFSv3 client is aware of which files have opens and which have locks. The NFSv3 server is not aware of which files have opens, but is aware of which files have locks. When the NFSv3 server restarts, whether due to a crash or a planned reboot, the NFSv3 client will reclaim its locks (assuming the server successfully notifies the client; this is not guaranteed, even by Data ONTAP, though in Data ONTAP 7.2.1 and beyond, a much better effort at notification is made). Until the NFSv3 server's grace period (45 seconds by default in Data ONTAP) expires, the client, as well as other NFSv3 clients, will be prevented from acquiring new locks, whether on the same set of files it had locks for or on different files.
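For readers who have not taken a byte range lock by hand, here is a minimal sketch of what the application side looks like on a UNIX-like system, using Python's standard `fcntl.lockf`. The helper names (`lock_byte_range`, `unlock_byte_range`, `demo`) and the temporary file are my own illustration, not anything specific to NFS or Data ONTAP. On an NFS mount it is the NFS client, not the application, that turns calls like these into lock requests to the server and reclaims them after a restart.

```python
import fcntl
import os
import tempfile


def lock_byte_range(fd, start, length):
    """Take an exclusive, non-blocking lock on bytes [start, start + length)."""
    # Raises OSError if another process already holds a conflicting lock.
    fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, length, start, os.SEEK_SET)


def unlock_byte_range(fd, start, length):
    """Release the lock on bytes [start, start + length)."""
    fcntl.lockf(fd, fcntl.LOCK_UN, length, start, os.SEEK_SET)


def demo():
    # A local temporary file stands in for a file on an NFS mount.
    with tempfile.NamedTemporaryFile() as f:
        f.write(b"x" * 4096)
        f.flush()
        lock_byte_range(f.fileno(), 0, 1024)    # lock the first 1 KiB
        unlock_byte_range(f.fileno(), 0, 1024)  # and release it again
        return True
```

The application's code is the same whether the file lives on a local disk, an NFSv3 mount, or an NFSv4 mount; what differs is how much of this state the server knows about.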

Now let's examine the NFSv4 experience. The same application has open files. Some of these files might have byte range locks. When the application runs over an NFSv4 file system, the NFSv4 client is aware of which files have opens and which have locks. The NFSv4 server is also aware of which files have opens and which files have locks. When the NFSv4 server restarts, whether due to a crash or a planned reboot, the NFSv4 client will reclaim its opens and locks (this is guaranteed, assuming the client conforms to the NFSv4 protocol). Until the NFSv4 server's grace period (45 seconds by default in Data ONTAP) expires, the client, as well as other NFSv4 clients, will be prevented from acquiring new opens and locks, whether on the same set of files it had opens and locks for or on different files.
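The two recovery stories can be put side by side in a toy model. This is only a sketch under my own assumptions: `ToyServer`, its `tracks_opens` flag, and the `request` method are hypothetical names, time is passed in as plain numbers, and conflict checking is left out entirely. The only figure taken from the text is the 45 second grace period.

```python
GRACE_SECONDS = 45  # Data ONTAP default quoted above


class ToyServer:
    """Toy model: tracks_opens=False behaves like NFSv3, True like NFSv4."""

    def __init__(self, tracks_opens):
        self.tracks_opens = tracks_opens
        self.boot_time = 0.0
        self.state = set()  # (client, kind, path) tuples the server knows about

    def restart(self, now):
        # A restart loses all open and lock state and starts a new grace period.
        self.state.clear()
        self.boot_time = now

    def in_grace(self, now):
        return now - self.boot_time < GRACE_SECONDS

    def request(self, client, kind, path, now, reclaim=False):
        """Return True if the server grants the open or lock (kind is 'open' or 'lock')."""
        if kind == "open" and not self.tracks_opens:
            return True  # NFSv3-like: opens are never recorded, so nothing to reclaim
        if self.in_grace(now) and not reclaim:
            return False  # during grace only reclaims are granted
        self.state.add((client, kind, path))
        return True
```

With a restart at time 100, a reclaim at time 110 is granted while a brand new open or lock at time 110 is refused; the same new request at time 146 goes through because the grace period is over.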

What if an NFS client fails to reclaim its locks before the grace period expires? Regardless of whether this is NFSv3 or NFSv4, if another client holds a conflicting lock by then, the reclaim is denied.
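What "conflicting" means for byte range locks can be sketched in a few lines. The `ByteRangeLock` type and the `conflicts` and `late_reclaim` functions below are hypothetical names of my own; the rule they encode is the usual one, where two locks from different owners conflict when their ranges overlap and at least one of them is exclusive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ByteRangeLock:
    owner: str
    start: int
    length: int
    exclusive: bool

    @property
    def end(self):
        return self.start + self.length


def conflicts(a, b):
    """Two locks conflict if owners differ, ranges overlap, and one is exclusive."""
    if a.owner == b.owner:
        return False
    overlap = a.start < b.end and b.start < a.end
    return overlap and (a.exclusive or b.exclusive)


def late_reclaim(held, wanted):
    """A reclaim arriving after grace succeeds only if nothing held conflicts."""
    return not any(conflicts(wanted, h) for h in held)
```

If client B grabbed bytes 50 to 59 exclusively once the grace period ended, client A's late attempt to get back an exclusive lock on bytes 0 to 99 is refused; had B locked bytes 200 to 209 instead, A's request would still go through.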

 

But wait, NFSv4 now has open state. NFSv3 did not have open state and NFSv4 does, so does failure to reclaim an open before the grace period expires represent a regression for the application? Won't the application experience a failure that it would not have had with NFSv3?

The short answer is no, there is no regression in robustness. But the answer requires a longer explanation of how NFSv4 OPEN operations work.

The OPEN operation allows a client to open a file for read, write, or read-write (mirroring what operating systems like Linux and UNIX support in their open() APIs). The operation also allows a client to specify a deny mode: read, write, read-write, or none. The concept of deny modes comes from Windows, and was included in NFSv4 in order to better support Windows semantics. UNIX-based NFSv4 clients will typically send OPEN operations with a deny mode of none, which means that the client does not want to prevent another client from opening the same file (assuming the permissions on the file would allow the other client to open it). Windows-based NFSv4 clients will typically use the deny mode that was passed from the Windows API for file opens. A deny mode of write says to deny any client that wants to open the file for write or read-write. A deny mode of read says to deny any client that wants to open the file for read or read-write. A deny mode of read-write says to deny any client that wants to open the file.
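The access and deny rules above boil down to a two-way bit test. Here is a minimal sketch; the `Share` enum and `open_conflicts` function are my own names, and the numeric values (read = 1, write = 2, read-write = 3, none = 0) are chosen to match the share access and share deny constants in the NFSv4 specification.

```python
from enum import IntFlag


class Share(IntFlag):
    NONE = 0
    READ = 1
    WRITE = 2
    BOTH = 3  # READ | WRITE


def open_conflicts(existing_access, existing_deny, new_access, new_deny):
    """True if a new OPEN clashes with an OPEN the server already holds.

    The new open is refused if it asks for access the existing open denies,
    or if it tries to deny access the existing open already has.
    """
    return bool(new_access & existing_deny) or bool(existing_access & new_deny)
```

For example, if a UNIX client has the file open for write with a deny mode of none, a second open for read with a deny mode of write conflicts, because it tries to deny access the first client already holds. Two opens that both use a deny mode of none never conflict.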

OK, so this means that technically the use of deny modes could prevent a client from successfully reclaiming an open. The scenario is this: client A had the file opened with a deny mode of none, and the server restarts. The grace period expires before client A has reclaimed its open, and client B then opens the file with a deny mode that conflicts with the open client A had. Client A attempts to reclaim after the grace period and is denied. The next time the application attempts an I/O on the file, the I/O fails.
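The ordering is what matters in this scenario, and a small sketch makes that visible. `RestartedServer` and `share_conflict` are hypothetical names, the grace period itself is not modeled, and the server simply starts with no open state, as it would right after a restart. Whoever gets their open in first wins.

```python
READ, WRITE = 1, 2  # access and deny bits; 0 means a deny mode of none


def share_conflict(held, wanted):
    """held and wanted are (access, deny) pairs from two different clients."""
    (h_access, h_deny), (w_access, w_deny) = held, wanted
    return bool(w_access & h_deny) or bool(h_access & w_deny)


class RestartedServer:
    """Open state after a restart: empty until clients reclaim or open."""

    def __init__(self):
        self.opens = {}  # client name -> (access, deny)

    def open(self, client, access, deny):
        """Grant the open (or late reclaim) unless another client's open conflicts."""
        wanted = (access, deny)
        for other, held in self.opens.items():
            if other != client and share_conflict(held, wanted):
                return False
        self.opens[client] = wanted
        return True
```

If client B's open for read with a deny mode of write arrives first, client A's late reclaim of its read-write open is refused. If client A reclaims in time, it is client B's open that is refused.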

A regression compared to NFSv3, right?

No, at least not from the perspective of Data ONTAP. Before NetApp shipped an NFSv4 server, it had, and continues to have, a CIFS server. CIFS of course implements deny modes in its open operation, and thus the only way to use deny modes in Data ONTAP before there was NFSv4 was to use CIFS. Let's repeat the above scenario with client A being an NFSv3 client, and client B being a CIFS client.

NFSv3 client A had the file opened, and the server restarts (actually it makes no difference if the server restarts, but we are trying to compare apples to apples). Because this is NFSv3, no open reclaim is needed. CIFS client B opens the file with a deny mode that conflicts with the open() system call that was executed on client A. The next time the application attempts an I/O on the file, the I/O fails.

The path to the end result is different, but the end result is the same. Whether the Windows client is using CIFS or NFSv4, the use of deny modes can cause I/Os for NFSv3 or NFSv4 clients to fail.

But here is the NFSv4 bonus. The above scenario mentions that it makes no difference if the server restarts; the I/O from the NFSv3 client fails. Let's tweak the scenario for NFSv4 so that the server does not restart.

NFSv4 client A has the file opened with a deny mode of none. CIFS client B (or NFSv4 client B) attempts to open the file with a deny mode that conflicts with the open client A had. Client B's open fails, because client A already has a conflicting open. The next time the application attempts an I/O on the file, the I/O succeeds.
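The no-restart comparison can be sketched as follows. This is a toy model, not how Data ONTAP is implemented: `Filer`, `nfsv3_open`, `stateful_open`, `io`, and `scenario` are all names I made up, and the only behavior modeled is whether the server knows about client A's open when client B shows up with a deny mode.

```python
READ, WRITE = 1, 2  # access and deny bits; 0 means a deny mode of none


class Filer:
    """Toy server: opens it knows about are (client, access, deny) tuples."""

    def __init__(self):
        self.opens = []

    def nfsv3_open(self, client, access):
        # NFSv3 has no OPEN operation: open() stays on the client, nothing is recorded.
        return True

    def stateful_open(self, client, access, deny=0):
        # NFSv4 or CIFS open: checked against, then added to, the server's open state.
        for other, o_access, o_deny in self.opens:
            if other != client and ((access & o_deny) or (o_access & deny)):
                return False
        self.opens.append((client, access, deny))
        return True

    def io(self, client, access):
        # I/O fails if another client's open denies this kind of access.
        return not any(other != client and (access & o_deny)
                       for other, _, o_deny in self.opens)


def scenario(a_uses_nfsv4):
    """Return (did B's open succeed, did A's next write succeed)."""
    filer = Filer()
    if a_uses_nfsv4:
        filer.stateful_open("A", WRITE)
    else:
        filer.nfsv3_open("A", WRITE)
    b_ok = filer.stateful_open("B", READ, deny=WRITE)
    return b_ok, filer.io("A", WRITE)
```

In this model `scenario(False)` yields a successful open for client B and a failed write for client A, while `scenario(True)` yields the reverse: client B is turned away and client A's write goes through.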

So I didn't tell the whole truth with my short answer. Not only is there no regression in robustness for applications using NFSv4 instead of NFSv3, but applications can also see improved robustness.


© NetApp, Inc.