I've written a couple of Internet-Drafts (documents that are not RFCs, but eventually might end up as RFCs) that propose some ideas for de-duplication and metadata striping.
What does NFS have to do with de-duplication? After all, it isn't as if a client is affected if the server's storage array finds that 90% or so of its storage is redundant and acts accordingly. The answer is that if an NFS client is caching files, knowing that two cached files have a block of data in common accomplishes two things:
- Only one block has to be cached
- Only one NFS READ request has to be sent
Thus, just as de-duplication on the server lowers the requirements for physical space and energy, making NFS clients aware of de-duplication lets them use memory and network links more efficiently (potentially reducing capital costs).
The classic use case for de-duplication awareness is a hypervisor. A hypervisor that is switching among hundreds of guest operating systems, each cloned from the same template operating system install image, obviously has a very significant de-duplication factor (the percentage of data that is common among all the guests). I am aware that at least one hypervisor de-duplicates its cache by scanning the cache for duplicate blocks and de-allocating the redundant data. This does provide better cache utilization, but at the cost of sending redundant READ operations.
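To make the two savings concrete, here is a minimal sketch of a client cache keyed by shared-block identity rather than by file and offset. It is not the mechanism from the drafts: the `block_ids` map, the `DedupAwareCache` class, and the `read_block` callback are all hypothetical names, and the sketch simply assumes the server has already told the client which (file, block) pairs share data.

```python
from typing import Callable, Dict, Tuple

BlockKey = Tuple[str, int]  # (file handle, block index)


class DedupAwareCache:
    """Hypothetical client cache that stores each shared block once."""

    def __init__(
        self,
        block_ids: Dict[BlockKey, str],
        read_block: Callable[[str, int], bytes],
    ) -> None:
        # block_ids is an assumed server-supplied map from (file, block)
        # to an identifier that is equal for blocks with identical data.
        self.block_ids = block_ids
        # read_block stands in for sending an NFS READ to the server.
        self.read_block = read_block
        self.blocks: Dict[Tuple[str, ...], bytes] = {}
        self.reads_sent = 0

    def read(self, fh: str, index: int) -> bytes:
        shared_id = self.block_ids.get((fh, index))
        if shared_id is not None:
            key: Tuple[str, ...] = ("shared", shared_id)
        else:
            # No de-duplication information: fall back to a per-file key.
            key = ("private", fh, str(index))
        if key not in self.blocks:
            self.reads_sent += 1  # only a cache miss costs a READ
            self.blocks[key] = self.read_block(fh, index)
        return self.blocks[key]
```

With two guests whose block 0 maps to the same identifier, reading that block from both guests sends one READ and occupies one cache slot. A cache that de-duplicates by scanning after the fact would save the memory but would still have sent both READs.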
The logic for metadata striping follows from the logic for data striping as introduced by pNFS. The pNFS protocol that is pending at the IETF today only specifies data striping. Metadata striping provides two benefits:
- Greater efficiencies due to spreading metadata like directories across several storage nodes.
- Reduced latency by telling NFS clients where to send metadata operations like LOOKUP, OPEN, CREATE, and READDIR, versus having a node in a cluster forward the operation to another node. As with pNFS for data striping, moving the switch to the client provides the most benefit.
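The idea of "moving the switch to the client" can be sketched as follows. This is only an illustration under assumed details: the `MetadataLayout` class, the node names, and the choice of hashing the file name with CRC32 are mine, not the striping pattern defined in the draft.

```python
import zlib
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class MetadataLayout:
    """Hypothetical layout: the nodes a directory's metadata is striped over."""

    nodes: Tuple[str, ...]

    def node_for(self, name: str) -> str:
        # Assumed placement rule: hash the entry name onto one node.
        # Any deterministic rule shared by client and servers would do.
        return self.nodes[zlib.crc32(name.encode("utf-8")) % len(self.nodes)]

    def route(self, op: str, name: str) -> Tuple[str, str, str]:
        # The client picks the destination itself, so no node in the
        # cluster has to forward the operation on its behalf.
        return (self.node_for(name), op, name)


layout = MetadataLayout(nodes=("mds-a", "mds-b", "mds-c"))
node, op, name = layout.route("LOOKUP", "vmlinuz")
```

Because the client evaluates the placement rule locally, a LOOKUP or OPEN goes straight to the node that owns the entry, which is the latency saving the second bullet describes.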
With both of these proposals, the part I am pleased about is that thus far I see no reason to revise the NFSv4 protocol to support either metadata striping or de-duplication awareness. Instead, both proposals use the pNFS framework. The pNFS protocol has the concept of a "layout" as its foundation. The three types of layouts the NFSv4 working group is standardizing are striping patterns for storage clusters accessed via NFSv4.1, SCSI (iSCSI and FC), and OSD. The NFSv4.1 protocol allows additional layout types to be specified. The metadata striping and de-duplication proposals are expressed as new layout types.
