It’s been quite a ride here at NetApp since we launched deduplication 2 years ago. Thought I’d take some time to reflect on what I’ve witnessed in this short period of time. As I said in an earlier blog, no one on the original dedupe launch team (at that time we were still calling it A-SIS) could have envisioned that today we’d be sitting here looking at an installed base of over 37,000 systems. Our customers are running deduplication in virtually all industry segments, across scores of workloads (including primary production workloads), with use cases offering tremendous space savings. There are very few times, if any, in one’s career where something you are responsible for becomes an "overnight" success, but this was one of those cases. Let’s look back and see how this all happened.
NetApp deduplication development began back in 2004, when our technical directors realized that we could create a catalog of “fingerprints” and use those fingerprints to identify duplicate data blocks. The rest was easy – once the duplicates were identified, we could use our existing system of multiple block referencing (used in all our snap features) to create many pointers to a single data block, while releasing all the duplicate blocks back to the system for re-use. NetApp deduplication was born.
Over the next year or so, the development work got under way. The coding was actually pretty quick; what took far greater time was the QA testing. You see, when you store some of the world’s most valuable data, you had better be very careful that you are not going to break anything with a new feature, especially one that removes blocks of data. One thing I am grateful for is that the early designers insisted that we do a full byte-level comparison before any deduplication occurred. That turned out to be a very wise decision indeed.
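For readers who like to see the idea in code, here is a minimal sketch of the approach described above: build a catalog of fingerprints, confirm each candidate duplicate with a full byte-level comparison, repoint references at a single kept block, and release the rest. This is a toy illustration, not NetApp's actual implementation. The function names, the use of SHA-256 as the fingerprint, and the in-memory dictionary standing in for a volume are all assumptions made for the example.

```python
import hashlib


def fingerprint(block: bytes) -> bytes:
    # Assumption: SHA-256 stands in for whatever fingerprint the real system uses.
    return hashlib.sha256(block).digest()


def deduplicate(pointers: list, blocks: dict) -> int:
    """Toy post-process deduplication pass (hypothetical helper).

    pointers: logical block index -> physical block id
    blocks:   physical block id -> block contents (bytes)
    Both are modified in place. Returns the number of physical blocks freed.
    """
    catalog = {}  # fingerprint -> physical ids of blocks we keep
    remap = {}    # every physical id -> the id that survives

    for phys in sorted(blocks):
        data = blocks[phys]
        fp = fingerprint(data)
        # A matching fingerprint is only a hint; a full byte-level
        # comparison decides whether the blocks are truly identical.
        keeper = next(
            (k for k in catalog.get(fp, []) if blocks[k] == data), None
        )
        if keeper is None:
            catalog.setdefault(fp, []).append(phys)
            remap[phys] = phys
        else:
            remap[phys] = keeper

    # Many pointers now reference a single kept block.
    for i, phys in enumerate(pointers):
        pointers[i] = remap[phys]

    # Release the duplicate blocks for re-use.
    freed = [p for p, k in remap.items() if p != k]
    for p in freed:
        del blocks[p]
    return len(freed)


# Usage: four logical blocks, three of which hold identical data.
volume = {0: b"A" * 4096, 1: b"B" * 4096, 2: b"A" * 4096, 3: b"A" * 4096}
refs = [0, 1, 2, 3]
print(deduplicate(refs, volume), refs)  # 2 [0, 1, 0, 0]
```

The byte-level check is what makes this safe: two different blocks that happened to share a fingerprint would each be kept rather than merged.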
Finally, in October of 2006, we were ready for field tests. Luckily, time was on our side – deduplication was still nascent and people weren’t clamoring for it (yet). So we decided to conduct an “early release” program with 25 of our closest friends, er, customers. Everyone on the development team held their breath and waited for the results to come in. Were we going to see system crashes? Panics? Data corruption? As it turned out, over this 6-month period not a single bug report was filed. I was astounded. Having been around the block a few times at other companies, I knew how these things usually go. Marketing wants the widget before Engineering is ready to give it up, and there are constant “I told you so” meetings during this rough-and-ready period. But none of that happened in this case – a Marketeer’s dream.
So in March of 2007, we released NetApp deduplication to the world. What I failed to understand in those early days was our customers’ propensity to take risks and ignore our advice. “Just use it for archival data,” we said, figuring that was safe enough – we wanted to walk before we ran. Our customers, however, seemed to be running the 100-yard deduplication dash. They immediately began to use deduplication on user files, virtual machines, and all those other areas we cautioned them against – and on their primary production volumes, no less! During this period, I got the feeling I was raising another teenage son, telling him not to do something and then watching him do exactly what I told him not to do.
Time for my second moment of astonishment. NetApp deduplication worked on these applications. And worked. And worked. To my amazement, I heard over and over again that our customers were reporting significant space savings and no performance impact. So I did what any good marketing man would do. I promoted these use cases, both within NetApp and to our other users. You should have seen the icy stares I got in engineering meetings. But I persisted and eventually won over the mob – but it really wasn’t me, it was a very well designed feature that I was lucky enough to inherit at just the right time.
Now let’s fast forward to today’s scene. Anyone who does anything in the data storage industry knows that deduplication is hot, white hot. NetApp has proven that deduplication can be used anywhere, including on primary production data. Newcomers and established storage vendors alike are making a lot of noise today, trying to grab a chunk of the dedupe pie. Through all this clamoring, let’s not forget one fact – NetApp was in the right place at the right time, and changed the way that people view deduplication and storage efficiency. Come to think of it, maybe deduplication really is the teenager of the IT family: bordering on maturity, but still with plenty of years to grow. Time will tell… but I like our position at the table today.