
September 08, 2009

Comments

Hmmm

The B-52's sang "Love Shack" at MEC (what used to be the Microsoft Exchange Conference) 1999 in Atlanta... They were the entertainment for event night. Is there a deeper trend here?

John

I think the Exchange version of Single Instance Storage means something very different than what you think it means...

It was a cool feature to reduce I/O peaks (write once, instead of 10x), but the "less space" thing was often temporary and usually went away over time anyway as people marked things as read, replied, etc...

"Real world" Exchange SIS space savings are somewhere in the sub-10% neighborhood...

SIS, or single instance storage, means just that: an object is stored once and referenced many times within an Exchange store. See: http://support.microsoft.com/kb/175481/

What it means is that, in Exchange 2007 and prior, tables like the message table or attachments table were global. You used secondary indexes to create things like folder views. Multiple secondary indexes could point to the same entries in a table. If I sent a message to you and you are on the same store, then there's only one copy in the message table, and we both have secondary indexes or mailbox views that point to it: mine in my Sent Items folder and yours in your Inbox.
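To make that concrete, here is a minimal Python sketch of the idea. It is a toy model, not the actual Exchange schema: the `Store` class, its `messages` and `folders` fields, and the `deliver` method are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    """Toy model of a store with one global message table (hypothetical)."""
    messages: dict = field(default_factory=dict)  # message_id -> body, kept once
    folders: dict = field(default_factory=dict)   # (mailbox, folder) -> [message_id]
    next_id: int = 1

    def deliver(self, sender, recipients, body):
        # Write the body once into the global table.
        msg_id = self.next_id
        self.next_id += 1
        self.messages[msg_id] = body
        # Each folder view only records a reference to the shared row.
        self.folders.setdefault((sender, "Sent Items"), []).append(msg_id)
        for rcpt in recipients:
            self.folders.setdefault((rcpt, "Inbox"), []).append(msg_id)
        return msg_id

    def stored_bytes(self):
        # Bytes actually held in the global table.
        return sum(len(body) for body in self.messages.values())

    def logical_bytes(self):
        # Bytes the mailboxes would hold if every view had its own copy.
        return sum(len(self.messages[m])
                   for ids in self.folders.values() for m in ids)


store = Store()
store.deliver("me", ["you"], b"x" * 1000)
print(store.stored_bytes(), store.logical_bytes())
```

The two folder views share one row, so the store holds half of what the mailboxes appear to hold. With per-mailbox tables instead of a global one, each mailbox would carry its own copy and the two numbers would be equal.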

Creating those secondary indexes was I/O intensive. Flattening the schema, by creating mailbox-level tables instead of global or store-level tables, reduced the I/O dramatically. A consequence of flattening the schema is that SIS is gone.

John Fullbright

Your example isn't the best either:

If I sent an mp3 to 10 people in the same store then it'd be stored once and you'd have a saving through SIS.

However, if, as in your example, those 10 people each send the same email on to another person, I'm afraid SIS wouldn't help you: that mail is an entirely new mail (though largely the same as the previous one) and would consume more storage.

A-SIS, however, could potentially spot the duplicated blocks that make up the mp3 and so save space.
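A rough Python sketch of the difference follows. It is not how A-SIS is implemented: the 4096-byte block size, the SHA-256 fingerprint, and the `unique_blocks` helper are assumptions made for illustration, and it assumes every copy of the attachment starts on a block boundary, which a real mail store would not guarantee.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def unique_blocks(objects, block_size=BLOCK_SIZE):
    """Return (total_blocks, unique_blocks) across all stored objects."""
    seen = set()
    total = 0
    for data in objects:
        for off in range(0, len(data), block_size):
            block = data[off:off + block_size]
            # Identical blocks produce identical fingerprints.
            seen.add(hashlib.sha256(block).digest())
            total += 1
    return total, len(seen)


# Stand-in for the mp3: 8 blocks with different contents.
attachment = b"".join(bytes([i]) * BLOCK_SIZE for i in range(8))

# Ten forwards: each is a new message, so message-level SIS keeps ten copies.
forwards = [attachment for _ in range(10)]

total, unique = unique_blocks(forwards)
print(total, unique)
```

Message-level SIS sees ten separate messages and stores all of the blocks, while fingerprinting at the block level finds only one attachment's worth of distinct blocks among them.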

My point is basically that Microsoft Exchange SIS was never really a feature worth counting on to begin with, so its disappearance in the name of I/O is, and should be viewed as, a very good thing...

Besides, it's really nowhere near as exciting as true block-level dedupe (A-SIS), or even just the concept of thin provisioning, since with Exchange on DAS (or whatever you're comparing to) you'd still have all that blank space for growth pre-allocated...

Microsoft wants people to go to direct attached storage anyway. DAS is much cheaper than centralized storage. I'm not sure what economic value, if any, centralized storage offers over DAS. Justification is much tougher.

@Shan,

The FAS 2000 series starts at under $8000. How much is an HP MSA or Dell PowerVault with similar space and performance?

John



© NetApp, Inc.