This past week a customer gave a presentation to a large audience of NetApp's senior engineers and commented that his NFS workload (the distribution of NFS operations) differed markedly from the SPEC SFS 3.0 workload. This is going to be the case for nearly 100% of NFS users in the world. A SPEC SFS workload merely reflects an average of the real-world data that was available at the time the benchmark was developed.
I noted in a previous blog post that the SPEC SFS 2008 workload updates the SPEC SFS 3.0 workload, using an average of real-world data from customers as gathered by NetApp and other companies on the SPEC SFS subcommittee.
Here is a comparison of the SPEC SFS 3.0 workload, the customer's workload, and the SPEC SFS 2008 workload. I've rounded the customer's numbers to the nearest percentage point, so they may not add up to 100%, nor are all the operations he quoted part of the SPEC SFS benchmark.
| NFSv3 Operation | SPEC SFS 3.0 | Customer | SPEC SFS 2008 |
| --- | --- | --- | --- |
| LOOKUP | 27% | 9% | 24% |
| READ | 18% | 4% | 18% |
| WRITE | 9% | 3% | 10% |
| GETATTR | 11% | 64% | 26% |
| READLINK | 7% | <1% | 1% |
| READDIR | 2% | 1% | 1% |
| CREATE | 1% | <1% | 1% |
| REMOVE | 1% | <1% | 1% |
| FSSTAT | 1% | 5% | 1% |
| SETATTR | 1% | 1% | 4% |
| READDIRPLUS | 9% | 1% | 2% |
| ACCESS | 7% | 12% | 11% |
| COMMIT | 5% | 0% | N/A |
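To make the differences between the columns easier to see, here is a minimal Python sketch that ranks operations by how far two mixes are apart, in percentage points. The numbers are copied from the table; treating the "less than" entries as 0.5 and "N/A" as missing are assumptions made only for illustration, and the names `OP_MIX`, `deltas`, and `ranked` are hypothetical.

```python
# op: (SPEC SFS 3.0, customer, SPEC SFS 2008), in percent.
# Assumptions: "less than" entries in the customer column are taken as 0.5,
# and the "N/A" entry is stored as None.
OP_MIX = {
    "LOOKUP": (27, 9, 24),
    "READ": (18, 4, 18),
    "WRITE": (9, 3, 10),
    "GETATTR": (11, 64, 26),
    "READLINK": (7, 0.5, 1),
    "READDIR": (2, 1, 1),
    "CREATE": (1, 0.5, 1),
    "REMOVE": (1, 0.5, 1),
    "FSSTAT": (1, 5, 1),
    "SETATTR": (1, 1, 4),
    "READDIRPLUS": (9, 1, 2),
    "ACCESS": (7, 12, 11),
    "COMMIT": (5, 0, None),
}
SFS30, CUSTOMER, SFS2008 = 0, 1, 2


def deltas(col_a, col_b):
    """Return {op: b - a} in percentage points, skipping ops missing from either column."""
    out = {}
    for op, row in OP_MIX.items():
        a, b = row[col_a], row[col_b]
        if a is not None and b is not None:
            out[op] = b - a
    return out


def ranked(col_a, col_b):
    """Return (op, delta) pairs sorted by the size of the change, largest first."""
    return sorted(deltas(col_a, col_b).items(), key=lambda kv: abs(kv[1]), reverse=True)


if __name__ == "__main__":
    comparisons = (
        ("SPEC SFS 3.0 -> SPEC SFS 2008", SFS30, SFS2008),
        ("SPEC SFS 2008 -> customer", SFS2008, CUSTOMER),
    )
    for label, a, b in comparisons:
        print(label)
        for op, delta in ranked(a, b)[:4]:
            print(f"  {op:<12}{delta:+.1f}")
```

In both comparisons GETATTR comes out on top, which is why it is the first operation discussed.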
It is worthwhile discussing the rows in the table I've highlighted that show either great change between the old and new benchmarks, or customer variance from the new benchmark.
I'm going to look at GETATTR first because it has the most change between the two benchmarks and the most variance with the customer. Why do NFS clients send GETATTRs? One reason is that the client uses GETATTR to revalidate its caches (file contents and directory contents). Another is that software builds, which remain a popular application for NFS, and certainly for this customer, use the stat() system call to decide whether a component needs to be re-compiled or re-linked by looking at its dependencies. The GETATTR response contains the current modification time, and this is used to decide whether the cache is valid or the dependency has changed. What has happened in the 15 years since SPEC SFS 3.0 appeared is that clients have gotten better at caching, and the greater share of GETATTR in the NFS mix reflects that. Indeed, it was this recognition of how GETATTR is involved in caching that motivated delegations in NFSv4: while a delegation is in force, the client does not need to revalidate its cache.
But this customer has far more GETATTRs than the benchmark. This is either because he's done a great job of making sure his working sets fit in his NFS clients' DRAM or because he does a lot of incremental software builds (builds where the developer has previously done a full build, has made a minor change, and is rebuilding). Or both apply. Indeed, NetApp has seen customers with perfect tuning of the caches and working set on their NFS clients, and the FAS server reports 100% GETATTRs. Unfortunately, in many cases this drives FAS server CPU utilization to 100%. Such customers often look at our FlexCache product for relief, but I think NFSv4 delegations are a great way to go since they will result in zero GETATTRs being emitted from NFSv4 clients that hold delegations.
LOOKUP: as you can see, the new benchmark reduces LOOKUP somewhat, but not dramatically, whereas the customer has dramatically fewer LOOKUPs. I suspect this reflects both a customer tuning the name lookup caches on his NFS clients and a very high percentage of GETATTRs.
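The dependency check that a build tool performs can be sketched in a few lines of Python. This is an illustration of the general make-style technique, not the logic of any particular build tool, and `needs_rebuild` is a hypothetical name. On an NFS mount, each `os.stat` call is either answered from the client's attribute cache or turned into a GETATTR to the server, depending on the client and its cache settings.

```python
import os


def needs_rebuild(target, dependencies):
    """Return True if target is missing or older than any of its dependencies.

    Every os.stat() here is a candidate GETATTR when the files live on NFS.
    """
    try:
        target_mtime = os.stat(target).st_mtime_ns
    except FileNotFoundError:
        # No target yet, so it must be built.
        return True
    # One stat per dependency: compare modification times only.
    return any(os.stat(dep).st_mtime_ns > target_mtime for dep in dependencies)
```

An incremental build over thousands of source files runs this kind of check for every target while reading and writing very little data, which is consistent with a mix dominated by GETATTR.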
READ: The customer's variance from SPEC SFS 2008 is more evidence that the customer's working sets are highly cached and/or software builds are mostly incremental.
WRITE: The customer's variance from SPEC SFS 2008 supports the incremental software build theory.
READLINK: The variance between the benchmarks is very interesting. This suggests either that customers are using fewer symbolic links, or that symbolic link caching, together with directory caching, is getting much better in NFS clients.
FSSTAT: This really shocked the customer's audience. Really, we thought we'd seen it all when it comes to a specific customer's NFS workload, and this was a first. The customer said this was an artifact of the automounter they are using. In talking to other NFS engineers, our suspicion is that this is really FSINFO, which is typically invoked as part of mounting NFS file systems. If so, this suggests a couple of things. First, NFS file systems are being frequently mounted and unmounted. Aside from the stress this can put on the MOUNT service, on many NFS clients unmounting, or even attempting to unmount, causes cached data to be tossed. My blog post on automounter tuning explains this. Second, this is an opportunity for automounter implementations to cache the results of FSINFO, which typically never change. In other words, if an automounter is re-mounting the same path name from an NFS server, it probably doesn't need to re-send FSINFO every time, which would buy this customer as much as 5% additional FAS server performance.
SETATTR: It is curious that SPEC SFS 2008 is increasing SETATTR fourfold (unlike the customer). Given the dominance of the Linux NFS client today, that's an area that needs to be investigated. I can't begin to guess why collectively the NFS vendor community is seeing higher SETATTRs.
READDIRPLUS: the dramatic drop in SPEC SFS 2008 versus SPEC SFS 3.0 is a reflection of the excellent work client implementers have done in directory caching. Note that NFSv4.1 will specify directory delegations, and so READDIRPLUS and GETATTR counts will be reduced even further once NFSv4.1 is commonly used.
ACCESS: I suspect the change in SPEC SFS 2008 mostly reflects the shift downward in other metadata operations, though I worry about whether NFS clients are doing a good job at caching results from ACCESS. Trond Myklebust tells me that in Linux 2.6.5 and earlier, at most one ACCESS result could be cached per file. In later kernels, every result will be cached, with least recently used results evicted when there is memory pressure (which is how Solaris has been doing it).
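The "cache every result, evict the least recently used" policy can be sketched as follows. This is a toy model and not the Linux or Solaris code: the class name `AccessCache`, the `(filehandle, uid)` key, and the fixed `capacity` (standing in for memory pressure) are all assumptions, and it omits the invalidation a real client needs when a file's attributes change.

```python
from collections import OrderedDict


class AccessCache:
    """Toy LRU cache of ACCESS results, keyed by (filehandle, uid)."""

    def __init__(self, capacity):
        self._capacity = capacity      # stand-in for "memory pressure"
        self._entries = OrderedDict()  # (filehandle, uid) -> access bitmask

    def lookup(self, filehandle, uid):
        """Return the cached mask, or None if the client would have to send ACCESS."""
        key = (filehandle, uid)
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key]

    def store(self, filehandle, uid, mask):
        """Remember a server reply, evicting the least recently used entries if full."""
        key = (filehandle, uid)
        self._entries[key] = mask
        self._entries.move_to_end(key)
        while len(self._entries) > self._capacity:
            self._entries.popitem(last=False)
```

With a capacity of one entry per file this degenerates into the older behavior described above, where a second user touching the same file forces a fresh ACCESS call.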
COMMIT: COMMIT is not part of the new benchmark. As SPEC says:
Also note that COMMITs are no longer included in the op mix and are not counted as completed operations in the benchmark result. When required due to UNSTABLE write responses from the server, COMMITs will be issued and the time required to complete the COMMIT will be included in the response time measurement for the logical write which required it. In effect, COMMIT operations are ‘overhead’ for which no credit is given in situations when they are required by the nature of the server response.
At any rate, NetApp FAS servers will usually report zero COMMITs because all our responses to the WRITE operation will indicate the WRITE was saved to stable storage regardless of what the NFS client requested. And NetApp FAS servers always write to stable storage because every modifying operation is logged to NVRAM in case the FAS server restarts before the operation is made consistent in the WAFL file system via a consistency point.
While I'd like to see the SFS workload continue to evolve to match the average of all sites that use NFS, readers are cautioned that SFS is just a tool. Your real workload won't ever match the benchmark. There's no substitute for analyzing real situations, which often leads to questions and answers that result in a better environment.
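The client-side rule behind this can be sketched in Python. The enum values are the NFSv3 `stable_how` constants from RFC 1813; the function name `commit_needed` and the idea of passing in a list of `committed` values from WRITE replies are assumptions made for illustration, not any client's actual code.

```python
from enum import IntEnum


class StableHow(IntEnum):
    """NFSv3 stable_how values (RFC 1813)."""
    UNSTABLE = 0
    DATA_SYNC = 1
    FILE_SYNC = 2


def commit_needed(committed_values):
    """Return True if any WRITE reply reported the data as UNSTABLE.

    committed_values holds the 'committed' field of each WRITE reply.
    The server may report a stronger level than the client asked for.
    """
    return any(value == StableHow.UNSTABLE for value in committed_values)
```

A server that answers every WRITE with `FILE_SYNC`, whatever the client requested, never gives the client a reason to send COMMIT.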

From your posting:
"Indeed, NetApp has seen customers with perfect tuning of the caches and working set on their NFS clients, and the FAS server reports 100% GETATTRs. Unfortunately in many cases this drives FAS server CPU utilization to 100%."
You do NOT need perfect client side caching
to see very high, aka near exhaustion, of
the cpu on your NetApp filer. At many large
sites this is the dominant op as per a
tcpdump of the wire. Hence one suspects
that NetApp's software appears to be
somewhat single-threaded with respect
to GETATTR operations.
Care to comment on your implementation
of GETATTR? We have heard that only
ONE cpu handles GETATTR operations.
Posted by: Sterling Marshall | June 11, 2008 at 02:55 PM