July 15, 2008

Part II: Since NFSv4 is Stateful It Must Be Less Robust, Right?

It turns out I'm going to make this at least a three-part series, because I've received questions about CIFS that warrant separate treatment. This part compares state recovery in NFSv4 and CIFS.

Let's start with a logic error:

Jane's arms are not strong enough to pull her weight. Jane cannot climb ropes. Jill's arms are also not strong enough to pull her weight. Therefore Jill cannot climb ropes either.

Obviously, lack of arm strength could make it harder to climb ropes, but if one understands how many people climb a rope hung from the ceiling of a gymnasium, one knows that arm strength has little to do with it.

Now let's replace Jane with CIFS, Jill with NFSv4, "arms are not strong enough to pull her weight" with "has state", and "climb ropes" with "recover from network or server failure":

CIFS has state. CIFS cannot recover from network or server failure. NFSv4 also has state. Therefore NFSv4 cannot recover from network or server failure either.

The above is false, but throw in the words "state" and "recover", and some people cannot immediately see the illogic.

Let's revisit Jane and Jill, and change the passage a bit:

Jane's arms are not strong enough to pull her weight, and she does not wear gym shoes. Jane cannot climb ropes. Jill's arms are also not strong enough to pull her weight, but she wears gym shoes. Jill can climb ropes.

No logic errors this time. Jane is probably trying to pull herself up using just her arms. Like Jane, I could not in junior high school, and still cannot as an adult, vertically pull myself up the full length of a 30-foot rope attached to a ceiling. In my gym class there was one kid named Wade who could do it. Wade was heavy into weight training, had arms as thick as the thighs of most men, and could easily lift weights totaling more than his own weight. And by the time he pulled himself up to the top of the Samuel Crowther School's gym ceiling he was red-faced and exhausted, and I'm surprised he had any stamina left to safely climb back down. Impressive nonetheless.

But the rest of us 90-pound weaklings managed to climb by using our shoes. Gym shoes have soles with great traction, designed for running but also great for trapping the rope between the bottom of one shoe and the top of the other. By holding the rope with the hands first, bending the legs to grab and trap the rope between the feet, and unbending the legs, one can inch up the rope with no more effort than it takes to climb stairs. One web site calls this the brake and squat method, http://www.powerathletesmag.com/archives/Girevik/Second/articleropeclimbing.htm . The web site says to wear pants when climbing this way, but Samuel Crowther's gym teachers would send us to detention if we didn't have gym shorts, so I guess we suffered or something.

So let's replace "wear gym shoes" with "possess state recovery":

CIFS has state, and it does not possess state recovery. CIFS cannot recover from network or server failure. NFSv4 also has state, but it does possess state recovery. NFSv4 can recover from network or server failure.

Again no logic errors.

Indeed, CIFS ties its state to the continued existence of the TCP connection between the CIFS client and server. Break the TCP connection, and all the state objects (locks, opens, sessions, etc.) disappear. That is zero state recovery. Break the TCP connection in NFS, regardless of whether it is NFSv2, v3, v4, v4.1, ... and the NFS client just re-connects. State is not lost just because a TCP connection went away.
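
To make the distinction concrete, here is a toy sketch in Python. It is not either wire protocol: the class names, the `client_id` string, and the file path are all made up for illustration. It only models the one design choice described above, namely whether the server hangs its state off the connection or off something that outlives the connection. Real NFSv4 adds more machinery around this (leases, for example), which the sketch leaves out.

```python
import itertools


class ConnectionBoundServer:
    """Toy model: state lives and dies with the connection
    (the behavior this post attributes to CIFS)."""

    def __init__(self):
        self._next_conn = itertools.count(1)
        self.locks_by_conn = {}

    def connect(self):
        conn_id = next(self._next_conn)
        self.locks_by_conn[conn_id] = set()
        return conn_id

    def lock(self, conn_id, path):
        self.locks_by_conn[conn_id].add(path)

    def disconnect(self, conn_id):
        # Dropping the connection discards everything attached to it.
        self.locks_by_conn.pop(conn_id, None)

    def held_locks(self, conn_id):
        return self.locks_by_conn.get(conn_id, set())


class ClientBoundServer:
    """Toy model: state is keyed by a client identifier that
    outlives any single connection (hypothetical stand-in for
    the NFSv4 behavior described in this post)."""

    def __init__(self):
        self._next_conn = itertools.count(1)
        self.conn_to_client = {}
        self.locks_by_client = {}

    def connect(self, client_id):
        conn_id = next(self._next_conn)
        self.conn_to_client[conn_id] = client_id
        # Existing state for this client, if any, is left untouched.
        self.locks_by_client.setdefault(client_id, set())
        return conn_id

    def lock(self, conn_id, path):
        self.locks_by_client[self.conn_to_client[conn_id]].add(path)

    def disconnect(self, conn_id):
        # Only the connection record goes away; the locks stay.
        self.conn_to_client.pop(conn_id, None)

    def held_locks(self, client_id):
        return self.locks_by_client.get(client_id, set())


def demo():
    """Take a lock, break the connection, reconnect, and report
    which locks each toy server still remembers."""
    a = ConnectionBoundServer()
    conn = a.connect()
    a.lock(conn, "/export/report.txt")
    a.disconnect(conn)
    conn_bound = a.held_locks(a.connect())

    b = ClientBoundServer()
    conn = b.connect("client-A")
    b.lock(conn, "/export/report.txt")
    b.disconnect(conn)
    b.connect("client-A")
    client_bound = b.held_locks("client-A")

    return conn_bound, client_bound
```

Calling `demo()` returns the locks each toy server remembers after the reconnect: an empty set for the connection-bound server, and the original lock for the client-bound one. Both servers are "stateful"; only one of them wears gym shoes.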

So using CIFS as evidence that NFSv4 cannot recover from failure is false logic. Part I of this series explained how NFSv4 state recovery from server failure works. Part III will explore the impact on applications.

Comments

So the moral of the story is .... Albertan school gym teachers are sadistic? :)

Seriously though - this post is actually a great public service for NAS admins familiar only with CIFS.

There is hope for (intelligent) stateful protocols!

Hi,

This is a very interesting topic to me. Thanks for addressing it. I have been following the stateless-or-stateful argument for a while now (see http://osdir.com/ml/web.services.general/2004-11/msg00018.html for a historical summary of the argument).

Forgetting about CIFS and NFS for a moment and looking at the problem conceptually, I'm wondering if the "lease" convention is a suitable compromise that can allow capabilities that require state (i.e., reliable delivery, secure conversations, transactions, etc.) to be implemented in a simple and natural way exploiting "session" semantics without causing massive scalability problems. With explicit leases, only truly active clients need to be supported.

I'd be very interested in hearing your views.

Regards,
Ganesh Prasad

© NetApp, Inc.  |  "Safe Harbor" Statement