I had to chuckle when I saw the concurrent announcements from Permabit and Storage Switzerland that dedupe 2.0 had arrived. Apparently the acquisition of Data Domain by EMC signaled the demise of dedupe 1.0, even though I failed to see any technology transformation as a result of that merger. Data Domain is still doing the same thing they always did, and EMC is still doing what they do. There was no transformative event here, guys.
Anyway, the people at Permabit seem to believe that hashing and deduplicating large archival stores is ushering in a new era of deduplication and leaving point products behind. But, er, isn't Permabit another point product? Last time I looked in a data center, there was more than archival data behind the blinking lights on those storage arrays. In my estimation, deep archival consumes somewhere around 1% of the world's data storage.
Then there's Storage Switzerland. I guess the Switzerland part is meant to have one believe that they are a neutral party in this vendor-against-vendor world. But does anyone out there really think they are neutral? I'll let my readers form their own conclusions. Anyway, SS says that the core of dedupe 2.0 will be a foundational repository where all previously optimized data will come to rest from primary and secondary storage. Hmmm, sounds strangely like an archival system to me. Wait, doesn't Permabit sell archival systems, and aren't they also talking about dedupe 2.0? Could there be a connection here? Say it ain't so, Storage Switzerland!
OK, let's move back towards reality. You know when dedupe 2.0 will really arrive? When people stop talking about dedupe altogether. Like many technologies that came before it, such as RAID, SCSI, snapshots, and dozens of others, dedupe will become ubiquitous and acknowledged as a must-have feature from every serious storage vendor. Inline or post-process, source or target, local or global - it doesn't matter as long as it dedupes. Dedupe will run silently in the background, eradicating those pesky duplicate data objects. NetApp has proven that dedupe can run quite well on a unified platform that services all tiers of storage (including archival), uses all standard protocols, and serves a wide breadth of applications. Once a concept is proven, the next steps are constant refinement and optimization, and that's exactly what we are doing. I can't tell you the exact day dedupe 2.0 will arrive, but I am pretty sure that NetApp will be among the frontrunners.
DrDedupe
