Storage Meat: June 2009

When HDS first introduced virtualization on the USP, it had an immediate impact on the ability to perform migrations between dissimilar storage arrays. I no longer had to implement third-party devices or host-based software to perform these migrations. Now I could simply virtualize the array and move the data either to internal storage or to another virtualized array. The process was simple and straightforward. The ease of migration and the benefits of virtualization soon led to very large virtualized environments, many with hundreds of terabytes and hundreds of hosts.

The challenge came years later when it was time to migrate from the USP to a USP V. How do you move all of the virtualized storage and all of those hosts to the USP V with minimal disruption? Unfortunately, there wasn’t a good answer. No matter what combination of array-based tools I used (replication, virtualization, Tiered Storage Manager), I would still have an outage and would need a significant amount of target capacity. It wasn’t terrible, but it wasn’t great either, and it required a lot of planning and attention to detail.

Hitachi High Availability Manager solves this problem, at least on the USP V and VM, by allowing you to cluster USP V arrays. Virtualized storage can be attached to both arrays in the cluster, and the hosts can be nondisruptively migrated to the front-end ports on the new array. This is a feature HDS has been promising for some time, and it is finally here. HDS has made a pretty good case that migrations have been a major problem for large organizations, and I would agree; we see it every day. So as it applies to migrations, High Availability Manager was a much-needed improvement. But what about clustering USP Vs for increased availability?

At first glance, it is a little more difficult to see the benefits of this configuration. HDS enterprise storage is designed for 100% uptime; the guarantee is right there on a single page. Unlike athletes, storage arrays can’t give you “110%”. How do you tell a potential customer “The USP V is designed for 100% uptime, it will never go down… but you might want to cluster them”?

But if you start to think about it, there may be some reasons to do this. First of all, this feature is initially targeted at the Fortune 100 and at environments with the most extreme availability requirements. Although a single USP V is rock solid and the chances of an outage are minimal, the possibility is still there. Clustering USP Vs just adds one more layer of protection. It also silences critics who routinely point to the controller-based virtualization solution as a single point of failure.

Consider maintenance operations. Any time maintenance is performed, you introduce the human element: “Oops, I pulled two cache boards!” With a clustered pair, you can unclench your jaw and relax while these operations are performed.

The other thing to consider is that clustering is apparently not an all-or-nothing proposition. According to Claus Mikkelsen, I can elect to cluster only the ports and data associated with certain critical applications. If an environment already has one or more USP V arrays, you could cluster just the mission-critical applications. This seems to me to be much more feasible than simply pairing entire arrays.

From my view, I don’t see the ability to cluster the arrays as having the same impact as the introduction of virtualization did. It isn’t as revolutionary; rather, it is another incremental step that furthers Hitachi’s lead in the enterprise storage space.

Now for the question of the day: how soon will they have support for host-based clusters? If my applications are that critical, they’re probably clustered. Even if I don’t intend to run production with a clustered pair of USP Vs, it sure would be nice to be able to migrate those clustered hosts nondisruptively!

I imagine most people would be upset if they found out their U.S. bank statements were actually quoted in Canadian dollars. (As I write this, the exchange rate is 1.15 CAD to each USD.) What looks like $100 in accrued interest turns out to be just under $87 when you spend it in the US.

Likewise, I imagine many consumers of storage wonder what happens to all of those terabytes when they “spend” them. What is the difference between “raw” storage (i.e., the bank statement) and “usable” storage (i.e., how much you have when you spend it)?
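To make the raw-versus-usable gap concrete, here is a rough sketch of how a quoted capacity can shrink. Every number in it is an illustrative assumption, not a figure from any real quote: a 100 TB raw configuration, a 6D+2P RAID-6 layout, and no capacity set aside for spares. The function names are hypothetical, and real arrays lose additional capacity to formatting and system areas that this sketch ignores.

```python
def usable_tb(raw_tb, data_disks, parity_disks, spare_fraction=0.0):
    """Estimate usable capacity after spares and RAID parity.

    spare_fraction is the share of raw capacity held back as hot spares
    (an assumed input, not a vendor-published figure).
    """
    raid_efficiency = data_disks / (data_disks + parity_disks)
    return raw_tb * (1 - spare_fraction) * raid_efficiency


def tb_to_tib(tb):
    """Convert decimal terabytes (10**12 bytes) to binary tebibytes (2**40 bytes)."""
    return tb * 10**12 / 2**40


raw = 100                              # hypothetical quoted raw capacity, in TB
after_raid = usable_tb(raw, 6, 2)      # 6 data + 2 parity disks per group
as_seen_by_host = tb_to_tib(after_raid)
print(after_raid, round(as_seen_by_host, 2))
```

Under these assumptions, parity alone takes a quarter of the raw figure, and the decimal-to-binary conversion that most operating systems apply takes roughly another 9% off what is left.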

From my perspective, if we’re not talking about storage capacity in terms of how we spend it, we’re having the wrong discussion. Yet every day, storage consumers accept quotes with storage capacities they'll never see.

Obviously, storage capacity is not the sole determining factor when specifying the need. In most cases, performance requirements, replication requirements, backup windows, data usage methods and patterns, and, perhaps most importantly, the intrinsic value of the data all contribute to the total solution.

This blog series will meander through these topics over the coming weeks.

Saturday, June 27, 2009

Hitachi High Availability Manager – USP V Clustering

Thursday, June 25, 2009

If Banking Were Like Storage