January 11, 2009

R is for Recovery?

The requirement is always to provide a method of recovery for the corporate data.

But what does that really mean?

What, primarily, are we protecting the data from?

Failure of the server, storage, or network in the data centre?

Data loss through accidental deletion, database corruption due to user error, or application failure?

Whatever the reason, the requirement is to get the data back into a format that can be used by the business and its customers. FAST.

There are differences, however, between restoring data from a local backup, recovering remotely after a disaster (disaster recovery), and the high availability (business continuance) model: just keep running no matter what happens, go live at the remote site, stay online.

Traditional recovery is tape based, so the Recovery Time Objective has to be measured in hours, often many hours, just to find the required data to restore.

Enter the world of Snapshot!

As someone who used this to protect a large amount of the company's data, I can say this is an awesome recovery tool. Restore a database to its previous point, before an application change, in minutes! It doesn't get any better than that.

The biggest problem I see, and it existed when I managed the technical infrastructure for storage, is that the application teams don't get that the world has moved on from tape, and don't seem to get that there is a whole new world of recovery options to make the admin's life easier, deliver a better service to the customers, and preserve the integrity of the data for the business.

Just food for thought.

October 09, 2008

How Green is Green?

Green I.T

How Green is it really?

Well, I couldn't help myself. After a rational discussion with an analyst recently, I found myself trying to answer the question posed.

Is it about power, saving energy, reducing carbon dioxide generation, and reducing the hole in the Ozone Layer?

Is it making the plastics from recycled products, or making them from products that can be recycled? Do you recover the gold from the circuit boards? Is the copper recycled from cables?

It's a concept, it's a goal, it's a strategic way of thinking.

Assume you use a smaller disk. It takes less energy to spin than its larger cousin and therefore must generate less heat, so power is saved going into the device and power is saved dealing with the heat it generates. Cooling energy costs can also be reduced.

But what about moving the datacentre from the sunny climes of California, or Arizona to the cooler climes of Washington State? No disrespect intended, but the climate is much cooler for most of the year.

Why do this? Imagine taking in the cold air from the north: it doesn't need much actual cooling, which provides a natural energy saving!

That's Thinking Green I.T.!!

September 28, 2008

What is in a Name?

"[Virtualisation is] a technique for hiding the physical characteristics of computing resources from the way in which other systems, applications, or end users interact with those resources. This includes making a single physical resource (such as a server, an operating system, an application, or storage device) appear to function as multiple logical resources; or it can include making multiple physical resources (such as storage devices or servers) appear as a single logical resource."

Simulated; performing the functions of something that isn't really there.

Twice as many Australian businesses are using virtualisation as in the rest of the world, according to analysts.

Now that we understand that it is magic, the rest should be easy, huh?

Well I am sure the myriad of companies that have sprung up around consulting on "virtualisation" would beg to differ, but as a cynic one could assume that it is financially motivated.

I guess I like the idea of performing a function of something that isn't really there, but then much has been said about storage management for years. Perhaps in the new world of virtual storage we should aim for "virtual storage management"; it will make it easier to manage. While we are at it, perhaps we should back up the virtual environments. That would become "virtual tape backup" or, for the trendsetters in the industry, just "virtual backup".

But it can't be abbreviated - VB is already taken. In California it is software, in Melbourne it's a beer. You work out which is more popular!

End result is- if we (Australians) are twice as far down the virtualisation path- we will be drinking the VB first.

Game on!

September 14, 2008

Time Warp

Blogging is an interesting challenge: to be short (text wise), sharp and to the point, usually. This is hard to do when laid out with viral influenza, and therefore full of enthusiasm and joy for the world in general and blogging in particular. I find myself now (almost) fully recovered and sitting in an airport awaiting the convenience of the airline, ground staff, engineers or whoever has caused me to sit here for an extra 3 hours of my life, unrewarded.

If the Absolute Architect would lend me the Flux Capacitor, then perhaps I could amend the tardiness causing me to blog for your reading pleasure and I could actually complete my trip to represent the company.

Life is grand, it will all happen eventually. Some of us have a stopwatch on eventually, and not a calendar!

August 07, 2008

When is an Archive not an Archive....

Perhaps when it is actually a backup kept for archive reasons.

For most of my time in the storage arena, this has been one of the most commonly confused issues, and one of the hardest to get a clear understanding on.

A backup is a copy of the data that can be used to recover lost files or data. It is nothing else. The long-term retention of data from a particular point in time, for legal or compliance reasons, is an archival or compliance copy. They are not related, other than that they both originated from the same source.

In this day and age of data explosion, we really need to rethink the ways of the past. We have for the last 3 decades been slavishly shrinking the backup window, and competing in the daily "Race to Sunrise", using the same philosophies as we used in the mainframe data centre over 30 years ago, but are they right for the world we live in today? Back then a disk enclosure held 315 Mbytes; they did fail, and we did need to recover them more often than we care to remember. There was no RAID, or even mirroring of data.

Today we have 1 Tbyte drives, lots of them, in RAID enclosures, often capable of sustaining 2 disk failures without data loss! We have already changed the way we manage and handle data, so why do we insist on going backwards with recovery? Probably because we didn't tell the "Backup guy" that we had changed the way we do what we do.

So if we can now protect against a double disk failure, and we could take 4 snapshots or more a day, what does that do to recovery? Makes it nearly instant, that's what it does!! And if I can now do that remotely, i.e. to another physical location, then we have achieved a DR capability as well.

So if the technologies to do this exist, and many have done for four to five years, why do we still deploy complex software applications and tape libraries to back up data? Traditional Thinking!

Current economics would indicate that, because we only snapshot the changed blocks and not the complete file, we save greatly on space. Compared to tape over a number of years, the savings are huge.
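To put rough numbers on the changed-block argument, here is a back-of-envelope sketch in Python. Every figure in it is made up for illustration (1,000 GB of data, 30 retained copies, 2% of blocks changing between copies), the function names are mine, and it assumes the changed blocks never overlap from one snapshot to the next, which overstates the snapshot side if anything.

```python
def full_copy_storage(dataset_gb, copies):
    """Space used when every retained copy is a complete copy of the data."""
    return dataset_gb * copies


def snapshot_storage(dataset_gb, copies, change_rate):
    """Space used by the live data plus only the changed blocks per snapshot.

    change_rate is the assumed fraction of blocks that change between
    snapshots; overlap between snapshots is ignored for simplicity.
    """
    return dataset_gb + dataset_gb * change_rate * copies


# Hypothetical figures, not measurements from any real system.
full = full_copy_storage(1000, 30)
snap = snapshot_storage(1000, 30, 0.02)
print(f"full copies: {full:.0f} GB, snapshots: {snap:.0f} GB")
```

Change the retention count or the change rate to suit your own shop; the gap only closes when most of the data changes between every snapshot.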

Maybe it's the archive component.... But wait, if we archive data to tape, to be kept for many years, then we will have to write routines to exercise the tape, to stop it getting stuck to itself and magnetically "bleeding" into the next loop. Then periodically we have to test that we can read the data too. Then we have to keep a copy of the software that put the data there, the application that reads the data, and the tape drives themselves to recover the archived data in the future. That means at least two generations of technology to manage a seven year archive cycle, plus maintenance and spare parts for discontinued hardware. Plus the engineer who knew how they worked 7 years ago!!

What if we actually took the server, application and data, made a virtual image and stored it on disk? If that reporting period is required, we run it up on the virtual environment, access the data, deliver the required information, then shut it down again. Ideal for those applications that are superseded, where we still have a requirement to keep the data for legal or compliance reasons.

So an archive would live just as well on lower cost protected disk, and because it isn't changing (it's static data), the backup to disk is only really required once!

Maybe we keep deploying the complex backup software, because the software people still sell it!

Food for thought.

July 27, 2008

Haven't I seen this, before??

Okay, so it's hard to admit, but I have two and a half decades in IT, most of which has been in Storage. I must have started very young - NOT!

Beginning with mainframe, seeing that transition into "open systems" and "mid range" and eventually to the Intel layer, has been an interesting but somewhat repetitive exercise.

Large centralized processing environments, partitioned into multiple LPARs, images or domains, running products such as VM (Virtual Machines) or MDF (Multiple Domain Facility) - it depended on which vendor's hardware you had! Today the big trend is to virtualization: take a large processor and divide it into multiple systems, running more efficiently. Hmm... Familiar.

Centralized processing gave way to distributed processing, and departmental computing. Having reached a point where environmentals, management and costs are now out of context, we see the Virtualization trend, driving us back to centralizing management of the data processing, and significant cost savings.

HSM was the original mainframe management tool for migrating data to lower cost storage pools, based on access age. Back then we had 2 tiers of storage, disk and tape, excluding SSD or solid state disk, which had its own personality in the tiering game.
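For those who never met it, the age-based idea can be sketched in a few lines of Python. This is only an illustration of the principle, not how any real HSM product was implemented: the 90-day threshold, the function names and the dataset names are all invented for the example.

```python
from datetime import datetime, timedelta

# Hypothetical policy: data untouched for 90 days is migrated to tape.
MIGRATE_AFTER = timedelta(days=90)


def choose_tier(last_access, now):
    """Return "tape" once the access age reaches the threshold, else "disk"."""
    return "tape" if now - last_access >= MIGRATE_AFTER else "disk"


def plan_migration(catalog, now):
    """Map each dataset name to the tier its access age calls for."""
    return {name: choose_tier(ts, now) for name, ts in catalog.items()}


# Made-up catalog of dataset names and last-access dates.
now = datetime(2008, 7, 27)
catalog = {
    "PAYROLL.MASTER": datetime(2008, 7, 20),
    "SALES.Y2006": datetime(2007, 1, 15),
}
plan = plan_migration(catalog, now)
print(plan)
```

The point is that the only input is when the data was last touched. Nobody has to sit down and classify anything.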

Today SATA drives open the gate to greater tiering capability, both within Primary and Secondary storage. HSM has been replaced (?) by almost 20 different products that almost do the same thing.

Today Solid State or Flash storage is making an entrance. We used that in my early days as a "drum storage emulator", 2305 for those that remember. It was a great place to put database indexes for IMS, or Model 204. Now I hear some bright sparks have had an idea that this can be done using Solid State Disk to speed up access times!! Brilliant!

I could go on, and probably will in another post, but I think you can see where I am headed with this.

Like the song says- "Everything Old is New Again"

July 18, 2008

Whatever happened to ILM?

I recall clearly, sitting in a darkened room, overhead projector humming, listening to how storage costs will be reduced and efficiency increased by ILM. Now the geek in me wondered what Industrial Light and Magic had to do with this, and of course it was nothing! It was all about Information Life Cycle Management.

Over the next few years a troop of experts from companies around the globe came and told any who would listen about the value and beauty of ILM. But no-one could tell me, with any degree of conviction, where it had been done and who was benefiting from it.

Imagine a domestic environment, let's use “Chez Geekazoid”, where we have between 4 and 6 terabytes of data. Ask the owners of that data to go through the almost 1 million files that this represents and classify the data into one of eight or nine groups, representing 600 to 800 different classifications, and have their data moved and stored accordingly - NOT Likely!!

At the time I was the storage manager for a large operation, with approx 12-14 Petabytes of data, 15,000+ servers and 1200 applications. And you want me to classify all that?? Same response as from the Geekazoid household – NOT Likely!!

The primary premise in ILM was that data becomes less important over time. Wrong! If the data was important when we wrote it, it must still be important, else why did we write it?

Where were the tools to assess data? Every time I asked, I heard “Every organisation has their own needs, it has to be decided internally”. Well, with 1200 application owners, I could see that consensus was just around the corner, probably sharing a cubicle with Nirvana!

Over time it became apparent that the tools to classify data were at best clumsy and bespoke in their utilisation, and that the time taken to actually go through data, groom it, classify it, and report on it was bigger than running the operation of any reasonable sized data centre. The tools to manage the movement of the classified data to its resting place of choice/classification were also not fully developed and functional. And at the time the tiering of storage was not particularly well suited either, as many believed that tape would be the final resting place for data.

I remember, not very well but I do remember, a thing called HSM. It was in my mainframe days; it moved data from one place to another, based on the age of the data. Very clever it was!! The system knew where to find it again too, when it was required.

That reminds me, on the mainframe we had a thing called VM or virtual machines, it let you run many machines inside a single physical entity. That was very clever!

But maybe that’s a topic for another day.

Hmmm- I wonder what happened to HSM?
