November 03, 2008

The Future of Fibre Channel is Ethernet

I worked for Brocade, a Fibre Channel switch company, from 2000 to 2005. About my third week on the job, Nishan ran full-page ads in the Wall Street Journal announcing Storage over IP (SoIP) and the imminent death of Fibre Channel. I suspect the ads cost Nishan more than the entire lifetime revenue of the company, but they did kick off a war that lasted 6 years - the iSCSI vs. Fibre Channel war.

In the beginning, the debate was like CNN Hardball - lots of dogmatic arguments and very little listening. When iSCSI products came to market and matured, it became clear that iSCSI would dominate the market for low-cost block network storage, and Fibre Channel would remain king in the high end. The problem with this for customers is that it forced them to make a major choice in storage switching infrastructure based on some pretty subtle differences. Lots of money was wasted putting low-end application servers on expensive Fibre Channel networks.

Never bet against Ethernet in the long run (remember token ring?). The Fibre Channel community - including Brocade, Emulex, QLogic, and Cisco - has come together with the Ethernet community and defined two new standards that will allow a graceful migration of Fibre Channel networks to 10G Ethernet over the next several years. Data Center Bridging (DCB) is a set of extensions to 10G Ethernet that add the flow control and traffic prioritization that made Fibre Channel well suited to storage traffic. Fibre Channel over Ethernet (FCoE) makes the Fibre Channel framing and management protocols work over layer 2 Ethernet. It can't be directly routed over WANs since it does not use the IP layer, but then neither could Fibre Channel. FCoE also leverages many of the management tools and much of the host-side driver work in the new Converged Network Adaptors (CNAs) that attach to the DCB 10GE network.

DCB is not just for FCoE. The “lossless” characteristics will also help other services such as NFS, CIFS, and iSCSI. All of these can run alongside each other on the same physical network. This is the real win for end users.

It will take a little while for all the standards to settle out. FCoE should be final by the middle of 2009 and DCB by the end of 2009, but there are first generation products out now. NetApp will ship FCoE target connectivity in our FAS systems around the end of 2008.

The net effect is twofold: Customers looking at building new Data Centers in the 2010 and after timeframe can choose to use a unified fabric technology - 10GE with DCB - for all of their server-server and server-storage needs. This kind of volume adoption will drive cost and prices down - something which the duopolistic nature of the Fibre Channel industry could never achieve.

In the near term, Fibre Channel customers can extend their fabrics using switches that bridge 10GE to Fibre Channel like the Cisco Nexus systems. New servers can be attached to 10GE using CNAs and access the Fibre Channel attached storage already in place. So customers can migrate gradually, or do it all at once with a new facility.

So does this mean the death of Fibre Channel? Not any time soon since there is so much of it out there. But I would bet that the generation beyond 8G FC will never see much adoption. By the time it might be available, 10GE adoption will be well along and 40Gbit Ethernet will be on the horizon.

I have also been asked if this means the death of iSCSI. Absolutely not. First, customers can run iSCSI and FCoE over the same 10G Ethernet DCB fabric, with some servers using iSCSI and some using FCoE, depending on their needs and history. The physical network - the real investment - is the same. Second, iSCSI will continue to be the only block data protocol running over 1G Ethernet, which will be around in the Data Center for a decade or more.

The future is set. The only question is how fast it gets here. I believe that half the applications using Fibre Channel attachment today will be migrated to Ethernet within 5 years - the end of 2013. Virtualization will lower the absolute number of servers and ports, but by that time the trend will be unstoppable.   

So what should IT managers do? If you are not planning a new storage fabric, there is no rush. If you are adding to your Fibre Channel fabric a few ports at a time, keep doing that since it works. Let the early adopters get some experience with the FCoE adaptors and DCB switches over the next year. If you are building a new Data Center or storage fabric for deployment in 2010 or later, you need to understand the 10GE option. It will most likely save you money. It is definitely the way the industry will go in the long run.  I would hate to be the guy who put in the LAST new Fibre Channel network.  

September 22, 2008

NetApp and Brocade's Encryption Partnership

Back in June, I had the fortune of attending Game 4 of the NBA finals between the Lakers and the Celtics courtesy of a good NetApp partner, Insight Investments. I also had the misfortune that night of having my briefcase stolen from the rental car in the parking lot.   

That night gave me a personal glimpse into the importance and complexity of key management.

If your laptop is like mine, you have all kinds of website passwords stored on it for the convenience of not having to remember them when you travel. As I flew home, my level of panic grew as I calculated the financial havoc the thief could inflict if they broke through the top-line login. I got home at midnight and spent the next few hours changing logins and passwords on dozens of financial, storefront, and other sites. In doing this, I realized I had used the same two or three passwords for everything because it was easy for me. Which made it easy for the thief. This prompted me to develop a more secure method of creating, using, and remembering personal passwords for the diversity of digital domains in which I dwell. My "system" is separate from my laptop or desktop so I can use it with either device, and avoid the problem of someone stealing it along with my data. I put my "system" in more than one place to protect against physical loss. I also thought about what a pain it was and how it would not scale if I added more than the few dozen sites I use now.

I'll get back to this in a minute.

NetApp and Brocade announced a data security partnership today. Brocade has new blindingly fast Fibre Channel switches and director blades that integrate almost 100 GB/s of encrypting bandwidth. We worked with Brocade to ensure that the encryption/decryption capability of this switch is compatible with the NetApp DataFort, and NetApp will resell the Brocade products as our next generation FC DataFort. We always expected that encryption would become a feature of storage devices, tape drives, and fabric switches and this was our strategic intent when we acquired Decru 3 years ago.

This kind of interchangeability of encryption devices depends on centralized, strong key management. NetApp’s Lifetime Key Manager was designed to support multiple encrypting devices. It supports DataForts, Oracle Advanced Security Option (come see this at Oracle Open World in San Francisco this week), and now Brocade. It also enables millions of keys to be shared between multiple locations. Keys can be automatically restored to a device that has been replaced, and are protected in a strongly secured system built to the FIPS 140-2 Level 3 standard.

Encrypting data solves a broad class of risks of unauthorized access. Encryption requires keys. Unless a company decides to use the same key for all the data it encrypts (which has about as much security as Sarah Palin's email), it needs to manage those keys. And change them. And be able to move them to DR sites. And be able to recover them. It is not a trivial task.

Unlike my little system for keeping track of passwords, it is certainly not something that you can do manually.  The NetApp Lifetime Key Management (LKM) system will do all of this for you across a range of encryption devices.
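To make the list of chores above concrete, here is a minimal sketch of a key store that gives each volume its own key, rotates it, keeps older versions so existing data stays readable, and copies everything to a second site. It is a toy under stated assumptions, not how LKM or any DataFort works: the class `KeyStore`, its method names, and the volume names are all hypothetical, and the keys sit unprotected in memory, which a real system would never allow.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class KeyStore:
    """Toy in-memory key store (hypothetical): one versioned key history per volume."""

    # volume name -> list of keys, oldest first, newest last
    keys: dict = field(default_factory=dict)

    def rotate(self, volume: str) -> int:
        """Create a fresh random 256-bit key for the volume; return its version number."""
        history = self.keys.setdefault(volume, [])
        history.append(secrets.token_bytes(32))
        return len(history)

    def current(self, volume: str) -> bytes:
        """Return the newest key for the volume."""
        return self.keys[volume][-1]

    def lookup(self, volume: str, version: int) -> bytes:
        """Return an older key version so data written under it can still be read."""
        return self.keys[volume][version - 1]

    def replicate_to(self, other: "KeyStore") -> None:
        """Copy every key history to another store, for example one at a DR site."""
        for volume, history in self.keys.items():
            other.keys[volume] = list(history)


if __name__ == "__main__":
    primary, dr_site = KeyStore(), KeyStore()
    primary.rotate("finance-vol")      # version 1
    primary.rotate("finance-vol")      # version 2 replaces it as current
    primary.replicate_to(dr_site)      # keys are now recoverable at the DR site
    print(dr_site.current("finance-vol") == primary.current("finance-vol"))
```

Even this toy shows why the job does not scale by hand: every volume, every rotation, and every site multiplies the number of keys someone has to track.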

There are several thousand DataFort systems installed now at companies like Iron Mountain, Qualcomm, CNL Financial, and Regulus Group.   There are hundreds of thousands of disk volumes and tapes encrypted with DataForts using keys stored in LKMs. The combination of Brocade's new fabric-based encryption with NetApp Lifetime Key Management will advance the state of the industry in making data in enterprise datacenters more secure.    

July 21, 2008

Flash Forward

A friend of mine asked whether he could string together a bunch of flash-based iPods and build an enterprise storage array. While this seems like a crazy idea (imagine the mountain of discarded white ear buds) there is no question that flash memory in some form will become a big part of enterprise storage. Flash is fast – it is somewhere between DRAM and 15k rpm disk in terms of IOPS. It is also expensive – at least 10x the cost/GB of the fastest disk, but the prices are falling fast. The performance assures that flash will be of great benefit when used to store the most active data on an array. The cost will determine just how much of the disk market ultimately converts to flash. No matter what, it will be there in a very important way.

Flash will emerge in several forms in enterprise storage. Enterprise quality SSDs are becoming available now as an alternative to 15k rpm disks in storage shelves. While performance varies greatly across vendors and read/write mix, they are very fast – 5000 IOPs and up vs. about 300 IOPs for a 15k FC disk. This means a few SSDs will deliver more IOPs than a full shelf of partially filled 15k drives. They take less power and less space too. NetApp is in the process of certifying enterprise-grade SSDs that you can use in our existing storage shelves.
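The arithmetic behind that comparison is easy to check. The sketch below uses the two per-device figures quoted above; the shelf size of 14 drives is my assumption for illustration only, and real numbers will vary with vendor and read/write mix.

```python
import math

# Per-device figures quoted in the post (approximate, workload-dependent).
SSD_IOPS = 5000    # "5000 IOPs and up" per enterprise SSD
DISK_IOPS = 300    # "about 300 IOPs" per 15k rpm FC disk

# Assumption for illustration only: a shelf that holds 14 drives.
DRIVES_PER_SHELF = 14


def disks_matched_by(ssd_count: int) -> float:
    """Number of 15k disks whose combined IOPS equals that of ssd_count SSDs."""
    return ssd_count * SSD_IOPS / DISK_IOPS


def ssds_to_match_shelf(drives: int = DRIVES_PER_SHELF) -> int:
    """Smallest number of SSDs that meets or beats a shelf of 15k disks."""
    return math.ceil(drives * DISK_IOPS / SSD_IOPS)


if __name__ == "__main__":
    print(f"Shelf of {DRIVES_PER_SHELF} disks: {DRIVES_PER_SHELF * DISK_IOPS} IOPS")
    print(f"SSDs needed to match it: {ssds_to_match_shelf()}")
    print(f"15k disks matched by 3 SSDs: {disks_matched_by(3):.0f}")
```

Under these assumptions a 14-drive shelf tops out around 4200 IOPS, so even one SSD at the low end of the quoted range is in the same territory.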

But flash is memory and is fast enough to be a layer of cache in a storage system. Imagine having a terabyte or more of very fast and low-latency cache to hold the most frequently accessed data in your array. NetApp is shipping a plug-in DRAM cache card today and we will offer a version using flash chips next year.

One compelling advantage of using a cache approach is that you don’t have to manage another “Tier” of storage – the system automatically puts your most active data blocks into the fast flash storage.

This makes many disk data placement science projects unnecessary since the most active data will remain in the large flash cache. Not just the data you ‘think’ will be hot – actually the data that is hot. Manually planning disk data placement for performance reasons was fun in the 80s, but customers I talk to seem to care much more about saving time and increasing the agility of their infrastructure than mastering the eccentricities of their storage systems.

In addition, that cache can be deduped so that it won’t fill up with identical blocks from multiple VMware images (NetApp does this today). If you define a policy that certain data volumes are more important, they can either be pre-loaded in cache, or designated to never be kicked out of cache. Or you can pin metadata in cache ahead of data. Lots of ways to optimize here using policy, not people.
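A rough sketch of how those ideas fit together: a block cache that keeps the most recently used blocks, stores identical blocks only once, and never evicts blocks that a policy has pinned. This is a toy to illustrate the concepts, not a description of how NetApp's cache is built; the class `DedupedBlockCache`, its methods, and the block addresses are all hypothetical.

```python
import hashlib
from collections import OrderedDict


class DedupedBlockCache:
    """Toy LRU block cache (hypothetical): identical blocks share one slot,
    and pinned addresses are never evicted."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity       # target maximum number of unique blocks
        self.blocks = OrderedDict()    # content digest -> data, least recent first
        self.index = {}                # block address -> content digest
        self.pinned = set()            # addresses that policy says must stay

    def put(self, address: str, data: bytes, pin: bool = False) -> None:
        digest = hashlib.sha256(data).hexdigest()
        self.index[address] = digest
        if pin:
            self.pinned.add(address)
        self.blocks[digest] = data          # same content, same slot (dedupe)
        self.blocks.move_to_end(digest)     # mark as most recently used
        self._evict()

    def get(self, address: str):
        digest = self.index.get(address)
        if digest is None or digest not in self.blocks:
            return None                     # cache miss
        self.blocks.move_to_end(digest)     # a hit keeps hot data in the cache
        return self.blocks[digest]

    def _evict(self) -> None:
        """Drop least recently used unpinned blocks until back under capacity.
        If everything is pinned, the cache is allowed to exceed capacity."""
        pinned_digests = {self.index[a] for a in self.pinned if a in self.index}
        for digest in list(self.blocks):    # iterate least recent first
            if len(self.blocks) <= self.capacity:
                break
            if digest in pinned_digests:
                continue
            del self.blocks[digest]
            # Forget addresses that pointed at the evicted block.
            self.index = {a: d for a, d in self.index.items() if d != digest}


if __name__ == "__main__":
    cache = DedupedBlockCache(capacity=2)
    cache.put("vm1:block0", b"guest-os-image")
    cache.put("vm2:block0", b"guest-os-image")       # identical content, one slot
    cache.put("vol1:metadata", b"inode-table", pin=True)
    cache.put("vm1:block1", b"application-data")     # forces an eviction
    print(len(cache.blocks), cache.get("vol1:metadata") is not None)
```

Nobody decided in advance which blocks were hot. Access patterns and a one-line pin policy did the work, which is the point of optimizing with policy rather than people.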

For the next few years, you won’t be using a lot of flash capacity in your systems, and not just because of the cost. At 10x or more the IOP rate of hard disks, it only takes a small number of SSDs in disk slots to saturate the performance of the array controller. It’s like trying to fly a model airplane in your living room – you’ll run into a system performance wall long before you hit capacity limits. This is another reason that flash as cache is economically efficient – it puts the necessarily small amount of very fast storage at a point in the architecture where you can best exploit the performance.

Flash is hot. While there is probably more smoke than fire right now, it will definitely produce significant improvements in enterprise storage and application performance. SSDs will be the first wave and will be easy to plug in. But the real innovation will be in how enterprise array designs adapt to embrace flash. Then the fun starts.

June 10, 2008

Transforming the Engineering Data Center

We announced several new products today at NetApp that I am especially excited about: a new midrange storage system, the FAS3100, and a pair of new caching technologies called the Storage Acceleration Appliance (which is an NFS cache appliance that uses our FlexCache software) and the Performance Acceleration Module (which lets you expand the memory cache inside of the NetApp storage controllers).

Many of our customers use NetApp storage systems to hold the data for engineering, HPC, scientific and other "technical" applications. This includes software development, seismic analysis, research, genomics, semiconductor design, computer animation, and many more. These are the applications that drive the revenue of these companies - the top line, as opposed to the applications that drive efficiency of operations - the bottom line. Anything that can be done to make these revenue producing applications run faster has a meaningful impact on the revenue growth of these companies.

Most of these applications need a lot of compute, but most are also constrained by the speed of their storage.    Compute speeds have grown much faster than disk drive IOPS over the past several years so anything that can be done at reasonable cost  to speed up the delivery of data to the app is a good thing.

The announcements NetApp made today do just this. The Storage Acceleration Appliance is an easy-to-manage caching appliance that allows a lot more application servers to get access to the same set of files by spreading copies of them across more storage controllers and more drives. These appliances can also be deployed around the world to deliver high performance for distributed workgroups. NetApp uses them in our own software engineering groups. Since they are caches, the data on them does not need to be managed. All backup and other management is done on the 'source' system that feeds the caches.

The Performance Acceleration Module expands the size of the DRAM cache inside the NetApp storage controllers in a smart way. Since DRAM is several orders of magnitude faster than spinning disk, an application request for data in cache will be serviced with much lower latency. More cache means more data blocks will be served from memory. You can also choose to cache just metadata or both data and metadata. A test of an average NFS workload doubled IOPS at constant latency. Nice. Plus, it can be added to many of the existing NetApp systems installed in the field and we have a software tool to verify in advance that a bigger cache will help. Even nicer.
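Why more cache helps comes down to a weighted average: the expected latency of a read is the hit ratio times the cache latency plus the miss ratio times the disk latency. The sketch below works that out. The latencies (0.1 ms for a cache hit, 5 ms for a disk read) and the hit ratios are round numbers I picked for illustration, not measurements from any NetApp system.

```python
def average_latency_ms(hit_ratio: float, cache_ms: float, disk_ms: float) -> float:
    """Expected read latency when a fraction hit_ratio of reads is served from cache."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * cache_ms + (1.0 - hit_ratio) * disk_ms


if __name__ == "__main__":
    # Assumed, illustrative latencies: 0.1 ms for a cache hit, 5 ms for a disk read.
    CACHE_MS, DISK_MS = 0.1, 5.0
    for hit_ratio in (0.50, 0.75, 0.90):
        avg = average_latency_ms(hit_ratio, CACHE_MS, DISK_MS)
        print(f"hit ratio {hit_ratio:.0%}: average latency {avg:.2f} ms")
```

Because the disk term dominates, each additional point of hit ratio removes a slice of the slow path, which is why it pays to check in advance whether a bigger cache would actually raise the hit ratio for your workload.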

I love working with the customers who run these types of apps. They are working on the frontier of knowledge and are building the future. They are also constantly pushing the envelope of what their storage and compute systems must do in the quest for faster product development, faster data analysis or better science. By doing a better job for them, we do a better job for all of our customers.

Go further, faster.    

May 29, 2008

Marketing With Integrity

Steve Duplessie referred to me as "a marketing guy" in a recent post.   Kind of understandable given my CMO title.    But I also bristle at the label a bit since it feels confining.

My wife is a teacher.   I think we have stayed married for almost 25 years because we both enjoy seeing the light go on in someone's eyes when they realize they have learned something new.    She got to do that with kids in the classroom.  I get to do it every now and then when I talk with customers.    Helping people understand how to solve a problem is what really excites me.     I get to learn from them as they describe what they are trying to do, and sometimes I can describe a new product or technology that will help them get it done.    It is a win-win.

While there are many aspects to the function of marketing, I believe the essence of marketing is about teaching.   Good marketing focuses on how customers learn, and how to reach them with just the right information at the right time to address the challenge they are trying to solve.      Marketing often gets a bad rap because the timing of problem and solution don't align, and customers find themselves barraged with information they can't use at that time.    Reminds me of 10th grade biology where I was buried in information that had no relevance to me finding a date for that Friday night.   

I have done "marketing" roles for most of my career, but those roles included technical teaching, product planning, and conducting technical customer councils.   I also did some software engineering management, and was even a Chief Technology Officer at Brocade.   Lots of time talking technology with practitioners of IT.    I think of marketing as the function that translates between customers with business and operational needs and engineering teams anxious to make a difference by solving them.    

Remember the teacher that really lit up your mind at some point in grade school?   For many people, it was a single teacher that set them on a career path by finding the link between their passion and the real world.     But also remember how hard that teacher had to work to simply get and hold the attention of the class?   He or she was "marketing" learning to you - sometimes you were buying and sometimes you were not.     A great deal of marketing activity is simply to try to catch your attention. 

The analogy is not perfect.  As a child, you had required courses to take.    And the teacher did not starve if you did not learn.   But the principles were similar.    If marketing is the mutually profitable exchange of knowledge between consumer and provider with the goal of stimulating informed action, then I am proud to be a "marketing guy."

May 14, 2008

Scattered Clouds

I have been hearing more and more about “cloud computing” and how it will become the nexus of everything new in computing. This trend reminds me of an alien abduction – when an innocent victim is grabbed from their bed by beings from a different world, poked and probed, and returned with a vague notion of having been violated.

Amazon, Google and IBM were the first to really put cloud computing out in the public eye. Amazon targeted small businesses with their S3 and EC2 services offering compute and storage for a metered rate. Google came out with the idea of allowing 3rd parties to run massively parallel applications on the Google compute cloud. IBM jumped in with their Blue Cloud initiative targeted at a similar audience of people looking to build new types of applications that require public access to a massive shared grid of compute nodes. Many companies in research and commercial industries have built grids of compute nodes, but these are all private facilities that run that company’s applications. Amazon, Google and IBM have a new idea – or at least are talking about a new way of democratizing the technology – and it deserves a new term. Cloud is cool.

Like any new term that catches the imagination of the market, a lot of companies tend to pile on and abduct it. They then use the new buzzword as an umbrella term for a wide range of things that already exist, may never exist, or never deserved to exist in the first place. I wonder if the guys at Google, Amazon, and IBM are feeling that vague sense of having been violated.

I have heard cloud computing, Web 2.0, Software-as-a-Service (SaaS), Online backup, Enterprise grids, and even email and messaging all be jumbled up in conversations with usually savvy people who have just been confused by the blizzard of abstractions. In a recent Business Week interview, Shane Robison of HP refused to define cloud computing, which is probably wise – it is too new an idea to constrain yet. However, I believe there are several things being called cloud that definitely are not new, and should not be used to pollute what cloud will become.

Software-as-a-Service (SaaS) – SaaS is application software that runs on the service provider’s shared infrastructure that you can use via a web interface. Salesforce.com is probably the best example. Google apps are an emerging example. It is different from application hosting, where a client company rents computing infrastructure and applications that are dedicated to them. Hosting has been around for years. Oracle and SAP offer large enterprise application hosting services. SaaS is new, and is a new way to offer software, but is different from cloud computing in that the applications available are chosen by the service provider, not the client customer.

Enterprise Clouds – I have heard people refer to their company’s compute grid as a “cloud” which may be conceptually accurate, but it is a very different animal than a “public cloud.” We’ve called these ‘grids’ for some time and people generally know what that means. Why fuzzy it up?

Managed Backup Service – A number of companies are offering online storage capacity for a metered fee. There are scads of consumer or SMB-oriented companies offering this service, and a handful of enterprise level services from companies like Iron Mountain or Symantec have also emerged. This is a pretty well understood idea. Definitely not new. Definitely not “cloud.”

Storage-as-a-Service - This idea goes back a long way, with StorageNetworks being the most spectacular failure in the enterprise segment of this market. The idea of putting the data in the network accessed by applications either on premise or in another network just seems to add more complexity than is needed. Either remote it all (application hosting) or keep it all (enterprise computing). Amazon’s S3 is this type of service, with the expectation that only applications that can live with the service level and performance will use S3. I would bet that most S3 customers also pick up the companion offering for compute, called EC2. The combination of S3 and EC2 definitely qualifies as “cloud computing.” On its own, S3 is a stretch to be called cloud computing.

Web 2.0 – This is a great example of coining a generic term and then allowing the definition to evolve. To most people, Web 2.0 is web services which support interaction and collaboration over the internet, as opposed to Web 1.0 which was either reference content or commerce. Facebook, MySpace, photo sharing, instant messaging, even email would count in this very broad umbrella term. Ultimately, Web 2.0 and cloud computing may come together if a new class of cloud-based social networking applications emerges, but that is not here yet.

I think what contributes to the confusion is that all of these ideas depend on common infrastructure technologies to deliver. All of them need scalable compute. All of them need scale-out storage to support the large data requirements. This encourages the infrastructure vendors to generalize and call them all by the hot new cloud computing term. To the vendors, the distinction between these services is inconvenient. To the providers and users of these services, the distinction is everything. New ideas need some room to be different and establish a unique added value. Most of all, they need to be developed and defined by the people delivering them, not the vendors supporting them.

It’s sort of like “Green” computing. I am all for environmental awareness and reduction in resource consumption. But we’ve all seen some pretty routine activities lumped under a company’s “green” initiatives. It’s like the marketing groups woke up one day and suddenly had to tell their story with a “green” filter. Perhaps they had been brainwashed in the night. Perhaps they had alien visitors.

Perhaps they were little green men….

© NetApp, Inc.