Storage Meat: July 2011

Monday, July 25, 2011

Demonstration of Hitachi Dynamic Tiering

On the latest model of the Hitachi Data Systems enterprise storage array, the Virtual Storage Platform (VSP), HDS has included the capability to dynamically tier data at a sub-LUN (page) level. Hitachi Data Systems Dynamic Tiering (HDT) is a technology that enables optimization of an HDS Dynamically Provisioned (HDP) Pool by allocating highly referenced pages to higher tiers of storage. HDT will periodically move pages up or down the tiers depending upon access patterns.

For the purpose of this exercise, we will create a multi-tiered HDP pool. Within the HDP pool, there will be two tiers, one SAS and one SATA. From this pool, we present three volumes to a Windows server and populate each of these volumes with static data. We will initially populate all three volumes with static data, filling Tier1 (SAS) first, with the remainder of data spilling over to Tier2 (SATA). We will confirm our Tier1 is filled with static data by viewing the Tier Properties. Next, we will create what we dub, “active data” with IOMeter and generate I/O to test files on each of the three volumes. After several HDT cycles, we will revisit the Tier Properties to observe the affect that HDT has on highly referenced pages.

Storage tiering is not a new concept and there are a variety of ways to tier storage. Prior to HDT, HDS provided the capability to tier at the LUN level with Tiered Storage Manager. Other ways of tiering include file archiving and virtualizing storage. However, HDT can automatically tier data at a finer granularity without the need to set up policies or classify the data.

Monday, July 18, 2011

Hitachi Dynamic Provisioning (HDP) in practice

We've talked about HDP on the blog a few times before (here and here, for example). And with the advent of the VSP, we've moved into a world where all LUN provisioning from HDS arrays should be done using HDP.

In brief, HDP brings three different things to the table:

Wide striping - data from each LUN in an HDP pool is evenly distributed across the drives in the pool.
Thin provisioning - space is only consumed from an HDP pool when data is written from the host. In addition, through Zero Page Reclaim (ZPR), you can recover unused capacity.
Faster allocation - In a non-HDP environment there were two options. You could either have predetermined LUN sizes and format the array ahead of time, or you could create custom LUNs on-demand and wait for the format. With HDP you are able to create custom-sized LUNs and begin using them immediately.

Most of our customers move to HDP as part of an array refresh. Whether it's going from an AMS 1000 to an AMS 2500 or a USP-V to a VSP, they get the benefits of both HDP and newer technology. While this is great from an overall performance perspective it makes it difficult to quantify how much of the performance gain is from HDP vs. how much is from using newer hardware.

We do have one customer with a USP-VM that moved from non-HDP over to HDP, though, and I thought it was worth sharing a couple of performance metrics both pre- and post-HDP. Full disclosure - the HDP pool does have more disks than the traditional layout did (80 vs. 64), and we added 8 GB of data cache as well. So, it's not apples-to-apples, but is as close as I've been able to get.

First we have Parity Group utilization:

As you can see, back-end utilization completely changed on 12/26 when we did the cut over. Prior to the move to HDP parity group utilization was uneven, with groups 1-1 and 1-3 being especially busy. After the move utilization across the groups is even and the average utilization is greatly reduced.

Second we have Write Pending - this metric represents data in cache that needs to be written to disk:

Here you see results similar to the parity group utilization. From cutover on 12/26 until 1/9 write pending is basically negligible. From 1/9 to 1/16 there was monthly processing, corresponding to the peak from 12/12 to 12/19 in the previous month, but as you can see write pending is greatly reduced.

The peak in write pending between 12/19 and 12/26 is due to the migration from non-HDP volumes to HDP volumes. In this case we were also changing LUN sizes, and used VERITAS Volume Manager to perform that piece of the migration.

The difference pre- and post-HDP is compelling, especially when you consider that it's the same workload against the same array. If you're on an array that doesn't support wide striping, or if you're just not using it today then there's an opportunity to "do more with less."

Monday, July 11, 2011

ESX Site Recovery Manager with NetApp Storage

I'm happy to report that configuring SRM using NetApp storage and SnapMirror is a relatively straightforward operation. That is to say, not any surprises and things pretty much work like you'd expect. The nifty thing about NetApp is that it doesn't require identical arrays at each site; in fact, you can have small regional VM farms (running on something like a small workgroup FAS2040) and SRM those back to your core datacenter running a big-dog FAS6280. I don't have that kind of horsepower to play with in the lab, but down below I'll show you how I protected a FAS270 from yester-year up to a larger 3020 array. And for fun, I did it across a T-1 line. Your mileage may vary, and your co-workers will likely be peeved when you hog up all the bandwidth during that initial sync (I know mine were). Incremental sync jobs afterwards didn't produce hardly any complaints, by the way.The video doesn't detail the initial software installation. Suffice to say, you'll need the SRM installalable from VMware and the NetApp SRA, which is helpfully called the NetApp Disaster Recovery Adapter (NOW login required to download) if you're searching for it. Both should be installed on a dedicated SRM systems, one at the primary site and one at the recovery site.
Other things you will need:

A NetApp head at each site, running at least OnTAP 7.2.4

A SnapMirror license installed at each site

A SnapMirror relationship defined and established for your primary datastore

A FlexClone license (required only to enable the test failover function, as demonstrated in the video)

There's a couple of 'gotchas' when planning this configuration too, at least with 1.4 version of the SRA:

The datastore FlexVols can only have a single SnapMirror relationship, which is to the secondary location. No daisy-chains. This also limits the ability to have multiple recovery sites for a single primary site.

Replication should be done with plain-old Volume SnapMirror. (Qtree-SnapMirror might work and isn't explicitly unsupported, but would be an unwise plan).

SyncMirror however is explicitly unsupported in conjunction with SRM. That should present less of an issue. If you're lucky enough to have SyncMirror, your single ESX cluster should probably span the two sites. So no SRM required. You can still run regular SnapMirror along with SyncMirror to get the VMs off to a third more distant location.

Monday, July 4, 2011

Storage Performance Concepts Entry 5

Cache Architecture - Part 1

The last area we want to cover is cache. Cache or Cache Memory is just that – memory / DIMMs installed in an array to serve as a high speed buffer between the disks and the hosts. Most storage arrays are what are referred to as cache centric, meaning that all reads and writes are done through cache not directly to disk. In addition to user data / host IO, cache can be used to store configuration information and tables for snapshots, replication or other advanced features that need a high speed storage location. The data in cache must be protected and this is most commonly done with mirroring. In some cases all of the data is mirrored, in others the write IOs are mirrored while reads are not, since the read IOs already exist on the disk.

A common question is “in an array with 16GB of cache how much is really available for user / host IO?”

The exact details depend on the array and the configuration you have but the following concepts should be fairly constant. For example I am using an HDS AMS 2000 Series array.

· 16GB of Cache (8GB per controller)

· A percentage of cache is dedicated to the system area. This varies depending on the hardware configuration and whether or not features that use cache such as replication or Copy on Write Snapshot are enabled. Assuming that replication and Copy on write are not in use 3,370MB total or 1,452MB per controller will be dedicated to the system area leaving 13,480MB or 6,740MB per controller.

· Next each controller mirrors its cache to its partner, 13,480MB becomes 6,740MB or 3,370MB per controller.

· The last calculation depends on the type of IO. All arrays deploy some mechanism to keep the cache from being overrun with write IO requests. How this works is, a threshold is set that when met tells the array to begin throttling back incoming host write requests. In the case of the AMS 2000 series that threshold is 70%. Note that this is for write IO not reads. In a worst case scenario when performing 100% writes the available cache is limited to 70% of the 6,740MB number - 4,718MB total or 2,359MB per controller.

Looking at these numbers many are initially surprised by how little cache is actually available for user IO. It’s interesting to note that we rarely have cache related performance issues with these arrays. The reason has to do with the way that cache operates in a modern storage array. The following diagram created by our CTO and illustrates the relationship between cache and the physical disk drives.

Cache to Disk Relationship

The cache is wide, it provides a lot of IOPS but it is shallow – there isn’t much capacity. The disks are deep, they hold a lot of capacity but individually they aren’t particularly fast. What seems to be impacting the relationship most significantly is wide striping. Wide Striping allows you to pool multiple array groups and more effectively distribute IO across more disks. The result is that writes in cache are flushed to disk more quickly, keeping cache available for incoming IOs. Referring back to our funnel diagram we are essentially widening the bottom of the funnel. Wide striping is not unique to HDS, it is a common feature on many arrays and it just one example of how storage vendors attempt to balance an array. In our next entry we will take a look at the role of cache with various workloads.