My youngest daughter is going out with a professional soldier. That's tough for her; soldiers in the UK Armed Forces are often on active duty, and he's served several long tours of duty in Afghanistan already, something that she finds difficult to handle emotionally when he's away.
During his last tour, she wanted to send him a bluey. It's the Services equivalent of a telegram; you write your bluey online, it gets sent and printed locally, and most if not all blueys are delivered in 24 hours, even to servicemen right on the front line. It's good for morale, and the service is excellent.
Her first look at the website resulted in tears. When she looked at the information to get the letter to him, she discovered that all she knew was his name, his rank, and that he was in Afghanistan. No other information, and although she thought she knew his regiment, she wasn't sure.
Dad to the rescue. The boyfriend's name is unusual. Very unusual indeed, with a first name from Greek antiquity, and a surname that appears four times in the local telephone directory -- and they're all family members. This guy has a name you'd never forget; a name like Roman Rock (not his real name, because he's still on active service). If there's another person in the UK with the same name, far less in the Army, I'll eat my shorts. So I calmed her down, and helped her address her bluey to Roman Rock. That was all the information we could fill in.
It got to him in less than 12 hours. We didn't know where he was located, but the Army did, because he had a unique name.
Chuck Hollis had an interesting blog on filling out blueys. Well, sort of; on what he sees as the death of the filesystem. The topic struck me as interesting; The Future Doesn't Have A File System. As usual, but with good reason, I disagree.
Object-based information stores are different than filesystems in several important ways.
First, you use a token or other uniform identifier to get your information. File systems imply location, tokens don't -- no such thing as a broken link or a moved file system. Not to mention, tokens can uniquely identify gazillions of information objects.
Disagree; filenames don't imply location. They're just names; metadata that allows you to uniquely identify the data you wish to access. This is a problem that the internet created, and that the internet solved; see later.
Second, they have the ability to associate all sorts of metadata with the object itself. As the information object goes, its metadata travels with it. A very useful property indeed.
And that differs how? You can associate metadata with a filename -- if you can do it for an "object", you can do it with a file. (Even old-fashioned filesystems keep things like last access time and size, and the name itself is often used to give further, human-readable clarification, such as .html or .doc; although it may lie, and we may choose not to use it that way, that doesn't change the point of associating metadata with the name-as-a-handle.)
Third, the ability to hang metadata off the object gives us the ability to create all sorts of useful policies and services around the information without having to put everything in some sort of database or repository.
Double eh? If this (it's a GUID, or globally unique identifier);
c2f41010-65b3-11d1-a29f-00aa00c14882
doesn't require a repository or a database for its metadata, I'll eat my shorts. Again.
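To make that concrete, here's a minimal sketch in Python. The standard uuid module will happily parse the token, but all it can tell you is how the token was generated (version, variant, node). Anything about the object itself has to be looked up somewhere else. The dict below is a hypothetical stand-in for that repository, and the metadata values in it are invented purely for illustration.

```python
import uuid

token = uuid.UUID("c2f41010-65b3-11d1-a29f-00aa00c14882")

# Everything the token carries by itself: facts about how it was generated,
# nothing about the object it names.
intrinsic = {
    "version": token.version,
    "variant": token.variant,
    "node": f"{token.node:012x}",
}

# Hypothetical repository: the metadata has to live somewhere outside the token.
# These values are made up for the example.
repository = {
    token: {"owner": "example-user", "content_type": "text/html"},
}


def lookup_metadata(tok):
    """Return the object's metadata, or None if the repository has no entry."""
    return repository.get(tok)


print(intrinsic)
print(lookup_metadata(token))
```

Take the repository away and the token still parses, but it tells you nothing useful about the thing it names.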
The internet solved this problem a good while ago. See RFC1737, RFC2141 and RFC3986. The essence of the RFCs; part is a name (like Roman Rock) and part is a location (like Afghanistan). The part that is the name doesn't tell you where the file is located; but the part that describes the location can be completely absent.
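A rough sketch of that split, using Python's standard urllib.parse. Treating the authority component (the host) as "the location part" is my simplification of the RFCs, not their wording, and the URN below is a made-up identifier in the "example" namespace, used only for illustration.

```python
from urllib.parse import urlsplit


def name_and_location(uri):
    """Split a URI into (scheme, location, name).

    'location' is the authority component, or None when the URI has none.
    This is a simplification of RFC 3986 for the sake of the argument.
    """
    parts = urlsplit(uri)
    location = parts.netloc or None
    return parts.scheme, location, parts.path


# A URL: has both a location part and a name part.
url = name_and_location(
    "http://blogs.netapp.com/shadeofblue/2009/08/poetry-corner.html"
)

# A URN (hypothetical identifier): all name, no location at all.
urn = name_and_location("urn:example:roman-rock")

print(url)
print(urn)
```

The second call is the bluey case: a name and nothing else, with the job of finding the location left to whoever resolves it.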
An example is in order. To demonstrate how far things have changed since Chuck banged away with Unix pipes and vi in the 1900s, this returns a file;
http://blogs.netapp.com/shadeofblue/2009/08/poetry-corner.html
Interestingly, the file doesn't exist until you ask for it, because it's dynamically generated from parts. There's no directory shadeofblue, 2009, or 08, and no file poetry-corner.html either. And (here's a clue how far this goes) it doesn't live on a server at netapp.com either. It's all name and no location, and it works across the entire internet, not just inside a single object store.
Filesystems aren't the problem here; it's an attempt to make Atmos relevant. I think I'll send a bluey to Chuck and let him know; The Future Doesn't Have an Atmos.
I'm off to VMworld next week in San Francisco. It's my first time at a VMworld conference, and I'm not quite sure what to expect. But I do hope to meet lots of interesting people I've never met before while I'm out there. If you recognize me (yes, I look like the photograph), please introduce yourself. And no, I don't bite!
This is based on deduplicating 4K blocks, and it works on SAN block-based and NAS file-based data too. And because NetApp systems use pointers to blocks, reading and writing data in deduplicated volumes is simple. Read a block; get the block pointed to. Update a block or write a new block; write it, point to it, and the system will deduplicate at leisure later.
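The read/write/deduplicate-later pattern can be sketched as a toy model in Python. This is emphatically not NetApp's implementation; the class, its method names, and the use of SHA-256 as a block fingerprint are all my assumptions, chosen only to show how pointers keep reads and writes simple while deduplication happens as a separate pass.

```python
import hashlib

BLOCK_SIZE = 4096  # 4K blocks, as described above


class ToyVolume:
    """Toy model: logical block numbers point at physical blocks."""

    def __init__(self):
        self.blocks = {}    # physical block id -> block contents
        self.pointers = {}  # logical block number -> physical block id
        self._next_id = 0

    def write(self, logical, data):
        # Write a fresh physical block and point at it; no dedupe work here.
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block")
        pid = self._next_id
        self._next_id += 1
        self.blocks[pid] = data
        self.pointers[logical] = pid

    def read(self, logical):
        # Read a block: just follow the pointer.
        return self.blocks[self.pointers[logical]]

    def deduplicate(self):
        # Later, at leisure: repoint duplicates at one shared block.
        seen = {}  # fingerprint -> physical block id that is kept
        for logical, pid in sorted(self.pointers.items()):
            data = self.blocks[pid]
            fingerprint = hashlib.sha256(data).digest()
            keeper = seen.setdefault(fingerprint, pid)
            # Compare the bytes too, so a fingerprint collision is harmless.
            if keeper != pid and self.blocks[keeper] == data:
                self.pointers[logical] = keeper
        # Free physical blocks that nothing points at any more.
        live = set(self.pointers.values())
        for pid in list(self.blocks):
            if pid not in live:
                del self.blocks[pid]


vol = ToyVolume()
vol.write(0, b"a" * BLOCK_SIZE)
vol.write(1, b"a" * BLOCK_SIZE)  # duplicate content, stored separately for now
vol.deduplicate()                # both logical blocks now share one physical block
```

Note that updating a shared block never touches the shared copy; the write lands in a new physical block and only that one pointer moves.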