October 18, 2005

Typing and Talking for the Rest of My Life

It's getting harder and harder to fill up a disk drive.

If I type for the rest of my life, I won't come close to filling a disk drive. Let's optimistically say I live 50 more years, and let's say that I type 12 hours a day at 60 words a minute. At roughly six bytes per word, that comes to about 4.7 GB of data, which barely puts a dent in a large ATA disk drive these days.
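For anyone who wants to check the typing arithmetic, here is a minimal sketch. The six bytes per word (five letters plus a space), the 365-day year and the decimal gigabyte are assumptions chosen because they reproduce the 4.7 GB figure; the function name is made up for illustration.

```python
# Back-of-the-envelope check of the lifetime typing estimate.
# Assumptions (not stated in the post): 6 bytes per word,
# 365-day years, 1 GB = 10**9 bytes.

def typed_bytes(years=50, hours_per_day=12, words_per_minute=60, bytes_per_word=6):
    """Return the number of bytes produced by typing at a steady rate."""
    minutes = years * 365 * hours_per_day * 60
    return minutes * words_per_minute * bytes_per_word

if __name__ == "__main__":
    print(f"{typed_bytes() / 1e9:.2f} GB")  # prints "4.73 GB"
```

Change `bytes_per_word` to see how sensitive the total is; even at 10 bytes per word the result stays under 8 GB.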

If I talk for the rest of my life, the stored audio won't even fill a disk drive. If I use a 4 kbit/s codec for telephone-quality audio, then the same 50 years of 12-hour days yields 394 GB. Still doesn't fill Seagate's largest drive.
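The talking estimate can be checked the same way. This is a rough sketch that reads "4kbs" as 4,000 bits per second and again assumes 365-day years and decimal gigabytes; the function name is hypothetical.

```python
# Back-of-the-envelope check of the lifetime audio estimate.
# Assumptions (not stated in the post): the codec runs at 4,000 bits
# per second, 365-day years, 1 GB = 10**9 bytes.

def audio_bytes(years=50, hours_per_day=12, bits_per_second=4000):
    """Return the bytes needed to store speech recorded at a constant bitrate."""
    seconds = years * 365 * hours_per_day * 3600
    return seconds * bits_per_second // 8

if __name__ == "__main__":
    print(f"{audio_bytes() / 1e9:.1f} GB")  # prints "394.2 GB"
```

At that bitrate a second of speech costs 500 bytes, so the whole 50 years of talking is only about 80 times larger than the 50 years of typing.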

So how do big storage users go about filling racks of disk drives? I have a theory that there are only three ways to generate "Really Big Data". They are:

  1. Generate data by computer.
    • People can't type that fast, but computers sure can. Good examples in this category are computer-aided design and Hollywood animation and special effects. Compilers are also a good example. Type in the smallest program you can think of, and then check out how big an executable the compiler spits out.

  2. Get millions of people all typing at once.
    • One person can't type that fast, but a million can. Yahoo!'s e-mail is a good example. Last I heard, Yahoo! had 750 million e-mail accounts. (My engineering background compels me to admit that only 250 million of those are active accounts; the others have apparently been abandoned.) Other examples would be the transaction records of lots of ATMs or the access logs of an active web site.

  3. Sample the real world.
    • Typing and talking are slow, but start snapping high-quality digital photos or shooting digital movies and you can chew up disk space fast. Commercial examples include seismic data for oil and gas exploration, medical imaging and satellite data.

For really big data, combine more than one of these techniques. Most cash machines these days take photos of people as they withdraw their money: sample the world times millions of people. Oil exploration is another good example. Oil companies start with a seismic image of the ground (sample the world) and then blow it up to many times that size as they analyze the data with seismic processing tools from companies like Landmark Graphics.

I'd be curious to hear an example of "Really Big Data" that doesn't fall into one of these categories, but so far I haven't found one.

© NetApp, Inc.