Monday, August 30, 2010

RedHat Linux BMR Boot Server in vSphere

Symantec's Bare Metal Restore (BMR) feature in NetBackup 6.5 supports recovery of RedHat Enterprise Linux 5.x, but it takes a little effort. The nice thing is that the BMR boot server can be a VM so long as it is located on the same subnet as your BMR clients (strictly speaking this is not required but simplifies the installation, especially in existing networks where modifying DHCP settings is frowned upon or just plain not likely to happen).

NOTE: These instructions require an EEB for Symantec Bug ID 1999920.
  1. Ensure your BMR boot server and clients have forward and reverse DNS records. BMR in NBU 6.5 comes with the standard client, but you may have to get a free license key from Symantec support.
  2. After installing RedHat add the compat-libstdc++, tftp-server, and dhcp packages.
  3. Enable TFTP by setting disable = no in /etc/xinetd.d/tftp, then restart xinetd with /etc/init.d/xinetd restart
  4. In /etc/dhcpd.conf:
    1. Add ddns-update-style ad-hoc;
    2. Add subnet w.x.y.z netmask a.b.c.d { default-lease-time 600; max-lease-time 7200; option domain-name "domain.com"; option broadcast-address p.q.r.s; option domain-name-servers ip1, ip2; option routers ip; }
    3. Restart DHCP with /etc/init.d/dhcpd restart (a worked example of steps 3 and 4 follows this list)
  5. Install the NetBackup client software. You could install a media server instead, but for simplicity I recommend sticking with the NBU client software.
  6. If not already installed, install NetBackup BMR onto the master server and run bmrsetupmaster.
  7. Install the NetBackup BMR Boot Server for Linux (CD4) onto your client and run bmrsetupboot.
  8. Install the EEB mentioned above (this requires a call to Symantec support).
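For steps 3 and 4, here is a minimal sketch of the TFTP and DHCP setup. The subnet, addresses, and domain below are placeholders, and the sed pattern assumes the stock RHEL 5 xinetd file, so substitute your own values:

# Enable TFTP (flip disable = yes to no in /etc/xinetd.d/tftp), then restart xinetd
sed -i 's/disable[[:space:]]*= yes/disable = no/' /etc/xinetd.d/tftp
/etc/init.d/xinetd restart

# Append an example scope to /etc/dhcpd.conf -- every value below is a placeholder
cat >> /etc/dhcpd.conf << 'EOF'
ddns-update-style ad-hoc;
subnet 192.168.10.0 netmask 255.255.255.0 {
    default-lease-time 600;
    max-lease-time 7200;
    option domain-name "domain.com";
    option broadcast-address 192.168.10.255;
    option domain-name-servers 192.168.10.5, 192.168.10.6;
    option routers 192.168.10.1;
}
EOF
/etc/init.d/dhcpd restart
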
Now, it's time to create a shared resource tree (SRT), which requires a little trickery. Here, I'm using the 64-bit DVD image. There are several issues we have to resolve a priori:
  1. V-125-380 /rhel54dvd/.discinfo(2): expected "Red Hat Enterprise Linux Server 5", got "Red Hat Enterprise Linux Server 5.4"
    The loaded media is not correct ... please try again.
  2. V-125-380 /rhel54/.discinfo(4): expected "1", got "1,2,3,4,5,6"
    The loaded media is not correct ... please try again.
Now, for the procedure:
  1. Loopback mount the RHEL 5.4 DVD to a directory, say /rhel54dvd.
  2. Loopback mount the NetBackup BMR Third-Party Products CD (3PPCD) to say, /3ppcd
  3. Loopback mount the NetBackup client CD to say, /dvd1
  4. Create another directory, say /rhel54 and cp /rhel54dvd/.discinfo /rhel54
  5. Edit the .discinfo file and change the 2nd line ("Red Hat Enterprise Linux Server 5.4") to read "...Server 5", and change the 4th line ("1,2,3,4,5,6") to read just "1"
  6. Now, create a soft link to the files in the actual DVD image, as in: cd /rhel54; for f in ../rhel54dvd/*; do ln -s $f; done (steps 4-6 are consolidated in the sketch after this list)
  7. Create the SRT using bmrsrtadm. Create a new SRT with option 1. Provide a name and description; enter the version number (5) and architecture; provide a directory for the SRT.
  8. When prompted for the media, point to the directory created in Step 4 (ie, /rhel54).
  9. When prompted for the Third Party Products CD, point to the mounted image from Step 2.
  10. When prompted for the NBU client media, point to the mounted image in Step 3. Run through the client install script as if you were installing the client (it is actually being installed into the SRT).
  11. Install any required NetBackup client maintenance packs to bring the SRT NBU client to the same version as your BMR boot server and master server.
  12. Stop and start the NetBackup daemons on the boot server.
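For reference, here is a consolidated sketch of steps 4 through 6, using the same mount points and directory names as above; adjust the paths to match your environment:

mkdir /rhel54
cp /rhel54dvd/.discinfo /rhel54

# Patch the .discinfo so bmrsrtadm accepts the DVD media:
#   line 2: "Red Hat Enterprise Linux Server 5.4" -> "Red Hat Enterprise Linux Server 5"
#   line 4: "1,2,3,4,5,6" -> "1"
sed -i '2s/Server 5\.4/Server 5/' /rhel54/.discinfo
sed -i '4s/.*/1/' /rhel54/.discinfo

# Link the rest of the DVD contents into the patched directory
cd /rhel54
for f in ../rhel54dvd/*; do ln -s "$f"; done
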
Next, perform a full backup of your RedHat client, ensuring that the "Collect disaster recovery information for Bare Metal Restore" checkbox is enabled in the policy before you start the backup.

After the backup completes, navigate to the "Bare Metal Restore Management" node in the NetBackup Admin Console, right-click on the client and select "Prepare to Restore..." Ensure the patched SRT is selected and click OK.

In my case my BMR client was also a vSphere VM. I found I had to modify the BMR parameters slightly in order to successfully restore to a VM. Specifically, I had to remove the vmxnet module entry from the scsiLoadOrder in the BMR .info file (located in /tftpboot/bmr on the boot server).
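As a rough sketch of that edit (the client file name and the exact formatting of the scsiLoadOrder line here are hypothetical, so inspect your own .info file first and keep a backup):

cd /tftpboot/bmr
grep scsiLoadOrder myclient.info            # inspect the current load order first
cp myclient.info myclient.info.bak          # keep a copy of the original
sed -i '/scsiLoadOrder/s/vmxnet//' myclient.info   # drop the vmxnet entry; tidy any leftover delimiter by hand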

Friday, August 27, 2010

Fun with PowerShell OnTap

What is PowerShell?
PowerShell is a command-line shell and scripting language provided by Microsoft that extends the capabilities of its predecessor, cmd.exe. Most commands that previously worked in cmd.exe also work in PowerShell. In addition, PowerShell 2 provides 129 cmdlets sharing a common verb-noun syntax. For example, the command to list the contents of a directory is Get-ChildItem:

PS C:\scripts> get-childitem

    Directory: C:\scripts

Mode                LastWriteTime     Length Name
----                -------------     ------ ----
-a---         8/18/2010  10:50 AM       6446 fun_with_netapp.ps1
-a---         8/17/2010   5:40 PM         87 netapp.ps1
-a---         8/17/2010   6:14 PM       1376 netapp_credentials.enc.xml
-a---         8/17/2010   9:37 PM       1376 share_credentials.enc.xml

Built-in aliases are available for this command, and others. In this case both "dir" and "ls" will work in the PowerShell.

PowerShell is an object-oriented scripting language with a C#-like syntax. It has access to the .NET Framework, which extends its capabilities considerably. Preferences can be stored in profiles, a concept that will be familiar to administrators who run shell scripts on other platforms.

Extensive help is available for any cmdlet via the Get-Help cmdlet, and tab completion lets the scripter cycle through the available commands.

PowerShell OnTap
PowerShell OnTap is a module that adds over 100 cmdlets exposing the API of NetApp filers. After importing the module and connecting to a filer, the cmdlets can be used to create and remove aggregates, volumes, shares and much more. NetApp community support is strong, and PowerShell OnTap is now known as the Data OnTap PowerShell Toolkit.

Here are some examples using PowerShell and the Data OnTap Toolkit:

Importing the module makes the DataOnTap cmdlets available to PowerShell
Import-Module DataOnTap

Logging into the NetApp
PS C:\scripts> Connect-NaController netapp1.lumenate.com -Credential $Credential

OntapiMajorVersion : 1
OntapiMinorVersion : 12
Protocol : HTTPS
Vfiler :
Name : netapp1.lumenate.com
Address : 172.16.8.160
Port : 443
Credentials : System.Net.NetworkCredential
ValidateIncoming : False
ValidateOutgoing : False
Trace : False

Creating a flexvol
PS C:\scripts> New-NaVol myvol aggr0 100g

ChecksumStyle : block
CloneChildren :
CloneParent :
ContainingAggregate : aggr0
DiskCount : 6
FilesTotal : 3112959
FilesUsed : 100
IsChecksumEnabled : True
IsInconsistent : False
IsSnaplock : False
IsUnrecoverable : False
MirrorStatus : unmirrored
Name : myvol
PercentageUsed : 0
PlexCount : 1
Plexes : {/aggr0/plex0}
QuotaInit : 0
RaidSize : 16
RaidStatus : raid_dp
Reserve : 0
ReserveRequired : 0
ReserveUsed : 0
Sis :
SizeAvailable : 85899231232
SizeTotal : 85899345920
SizeUsed : 114688
SnaplockType :
SpaceReserve : volume
SpaceReserveEnabled : True
State : online
Type : flex
Uuid : 3e242ae8-ac8e-11df-ac68-00a09802a2c0
IsUnrecoverableSpecified : True
PercentageUsedSpecified : True
SpaceReserveEnabledSpecified : True

Creating a CIFS share for the new volume
PS C:\scripts> Add-NaCifsShare myvol /vol/myvol

Caching :
Description :
DirUmask :
FileUmask :
Forcegroup :
IsSymlinkStrictSecurity :
IsVolOffline :
IsVscan :
IsVscanread :
IsWidelink :
Maxusers :
MountPoint : /vol/myvol
ShareName : myvol
Umask :
DirUmaskSpecified : False
FileUmaskSpecified : False
IsSymlinkStrictSecuritySpecified : False
IsVolOfflineSpecified : False
IsVscanSpecified : False
IsVscanreadSpecified : False
IsWidelinkSpecified : False
MaxusersSpecified : False
UmaskSpecified : False

After creating the CIFS share, we can continue to automate the procedure by mapping the share to our host in PowerShell
PS C:\scripts> $net = $(New-Object -ComObject WScript.Network)
PS C:\scripts> $net.MapNetworkDrive("Z:", "\\netapp1.lumenate.com\myvol", $false, $Username, $Password)

PowerShell also supports functions, allowing the administrator to write and schedule scripts that automate many Windows and NetApp tasks.

Useful Links

Downloading PowerShell 2.0
http://technet.microsoft.com/en-us/scriptcenter/powershell.aspx

To run PowerShell, you will need the .NET framework
http://www.microsoft.com/downloads/details.aspx?FamilyID=9cfb2d51-5ff4-4491-b0e5-b386f32c0992&displaylang=en

Finally, to download the Data OnTap PowerShell Toolkit
http://communities.netapp.com/community/interfaces_and_tools/data_ontap_powershell_toolkit/data_ontap_powershell_toolkit_downloads
You will need a working NetApp NOW account to access this link.

The Data OnTap PowerShell Toolkit community provides downloads, sample scripts and community support
http://communities.netapp.com/community/interfaces_and_tools/data_ontap_powershell_toolkit

Monday, August 23, 2010

Bluearc and HNAS Dynamic Write Balancing

Starting with Bluearc OS version 6.1, the Bluearc Titan, Mercury and equivalent HDS solutions support WFS2. WFS2 is the latest file system and includes several enhancements over WFS1, including Dynamic Write Balancing (DWB). Dynamic Write Balancing improves performance by ensuring that write operations occur in parallel across multiple stripesets. Prior to the DWB feature, writes were performed to one stripeset or one system drive at a time.

WFS2 is now the recommended file system for all Bluearc/HNAS implementations, and in almost every case DWB should be enabled. Recently, however, we ran into a situation where we needed to disable DWB in order to resolve a problem with a 100TB HNAS file system that was nearly full. Seeing that the file system would soon exhaust all of its available free space, the customer had initiated a delete operation that would free up approximately 40TB. In Bluearc/HNAS environments the actual work of freeing deleted blocks and returning them to the free pool is performed by the background truncator; the idea is to minimize the impact delete operations have on production workloads and to free up the client that requested the delete to perform other work. In our case, tuning the background-truncate-chunk-size parameter resulted in a maximum delete rate of about 1.5TB per hour, per node. While the delete operations were running, 50 or more hosts continued to write to the file system, albeit very slowly, since each write had to wait for free space to become available and the free space was likely highly fragmented at this point.

In an effort to alleviate the problem we added an additional 20TB of space to the file system. To our surprise, performance increased only slightly. Walking through the architecture again, we came to believe the problem was related to DWB. Essentially, DWB was attempting to distribute writes across all of the stripesets supporting the file system. Writes to the 20TB we had just added, which was completely free, completed very quickly; the original stripesets, however, were nearly full, with only small fragments being freed as the background delete operations continued, resulting in thrashing, waits and very poor performance. To test the theory we disabled DWB and retried the operations, and performance increased dramatically.

As I said, in almost every case you should use WFS2 and leave DWB enabled, but it’s nice that you can turn it off if need be.

Wednesday, August 18, 2010

Expanding VMFS datastore

A customer called today and wanted to know if they could expand their VMFS datastore which lives on Hitachi HDP (Hitachi Dynamic Provisioning) LUNs. I know this is possible, but until I've done it myself I can't just say "sure, go for it" when it comes to their production platform. Fortunately at Lumenate, we have an extensive lab environment so this was pretty easy to knock out of the park.

First, see that we have a 5GB (4.75GB formatted) LUN in our environment (pictures cropped to save space).
I take the liberty of expanding the storage on the array. I did not capture this because it may be different for your storage vendor of choice. In my environment, vSphere picked up the LUN size change.
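If your environment does not pick up the size change on its own, a rescan from the vSphere client (or from the service console, as sketched below) should do it; vmhba1 here is just an example adapter name:

# Rescan the HBA so ESX re-reads the LUN capacity
esxcfg-rescan vmhba1

# List SCSI devices in compact form and confirm the new size is reported
esxcfg-scsidevs -c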

Select the datastore to expand, right click and go to properties.
Click the Increase button near the middle top of the dialog box.

Now we want to select the LUN that our datastore is on. In this case, LUN 0. Note the informational message at the bottom of the dialog box. "The datastore already occupies one or more extents on this device. Selecting one of these extents on the device will expand it in the datastore. Selecting anything else on the device will add a new extent to the datastore."
Select Next at the bottom to continue.

Note here that we have our 5GB primary partition which is formatted as VMFS, and we're going to use the adjacent free space to expand the VMFS volume.
If this is what you want, go ahead and select next.

We need to tell vSphere how large to expand the datastore.
Take a minute to verify our work. Here I'm checking that I still have the right 5GB datastore, and that I'm expanding to the full capacity of the presented LUN.
Selecting finish at this step commits the changes.

The expansion happens quickly in this case, and completed by the time I got the screen shot ready. You should see some messages like this in your Tasks view.
Now I want to see my expanded datastore in my storage view, and verify that it was grown vs. having an extent added to it.
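You can double-check the same thing from the service console with vmkfstools; a single entry under the partitions-spanned section at the new capacity means the existing extent was grown in place (the datastore name below is an example):

# Query the VMFS volume: capacity, free space, and backing extents
vmkfstools -P /vmfs/volumes/my_datastore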


Happy Expanding!




Monday, August 16, 2010

Commvault Exchange 2007 Backup Agent Considerations

I was testing backups of Exchange with Commvault 8 a while back and realized there was a little more to it than just running the standard install of the agent. My CommServe/Media Agent was running Windows 2008 64-bit, as was my Exchange 2007 VM. I installed the Commvault Exchange DB agent and the required Windows File System agent with ease and was backing up and recovering within minutes. I was ready to proceed with the installation of the Mailbox and Public Folder agents when I remembered, “These agents are 32-bit, so I need to run the server in mixed mode!” Again, no big deal: I set the registry setting (http://documentation.commvault.com/commvault/release_8_0_0/books_online_1/english_us/search/search.htm) and “Voila!”, the 32-bit binaries showed up. I realized at this point that these 32-bit agents would require the 32-bit Windows File System agent as well. That’s 2 file system agent licenses required for a single VM. I thought, “What about Exchange clusters?” Sifting through Commvault’s documentation, I found a diagram describing the Exchange agents required in an x64 cluster:

Hmmm… that’s a total of 6 Windows File System agents for a single two-node clustered Exchange instance!

Running the slower 32-bit MAPI-based Exchange backups may be sufficient if you have a large enough backup window. If not, using the 64-bit Exchange database backups in conjunction with the Offline Recovery Tool for Exchange (for individual mailbox recoveries) would be the way to go. For larger backup environments, Commvault’s capacity-based license would make this a moot point.


Update:

Rowan pointed out another way to utilize a combination of Exchange Database and Mailbox agents without needing to install both 32 and 64-bit file system agents on the Exchange 2007 server. The 32-bit mailbox and file system agent can be installed separately on a 32-bit off-host proxy media agent. This not only eliminates the need to install the agents in mixed-mode, but also takes the backup load off of the Exchange 2007 server. Thanks, Rowan!



Tuesday, August 3, 2010

Geeking Out with VAAI

***NOTE: This blog post was edited 10/4/2010 to incorporate changes suggested in the comments.

According to this public HDS announcement there are three primary advantages to the VMware vStorage API for Array Integration (VAAI) delivered with the 0893/B AMS 2000 firmware and vSphere v4.1:
  • Hardware-assisted Locking: Provides an alternative means of protecting the VMFS cluster file system’s metadata
  • Full Copy: Enables the storage arrays to make full copies of data within the array without the VMware vSphere host reading and writing the data
  • Block Zeroing: Enables storage arrays to zero out a large number of blocks to enhance the deployment of large-scale VMs.
Hardware-Assisted Locking
Hardware-assisted locking in the context of VAAI implies that metadata changes to VMFS will be applied serially and atomically to preserve file system integrity. According to this presentation by Ed Walsh of EMC, hardware-assisted locking implements the SCSI Compare-And-Swap (CAS) command. Computer science enthusiasts might enjoy the escapade into the scalability of compare and swap versus test and set methods; if you know a good reference for rusty CS majors, "I have a friend who’d like to know about it."

Prior to the hardware-assisted method, the vSphere host had to acquire a SCSI reservation on the VMFS’ LUN(s); that is, issue a SCSI command to acquire the reservation, potentially wait and retry the command (perhaps multiple times) until the reservation is acquired, issue the SCSI write command to modify the filesystem metadata, and then issue a SCSI command to release the reservation. In the meantime, other hosts may be contending for the same lock even though the metadata changes may be unrelated.

It’s understandable that the SCSI reservation lock mechanism could lead to potential slowdowns in VMFS metadata updates. Typically I would resolve these during the architecture phase by reducing the opportunity for VMFS metadata changes in large clusters (ie, thick provisioning of VMDKs, avoiding use of linked clones, limiting workloads to certain hosts, etc).

Now, vSphere 4.1 hosts can issue a single SCSI command and the array applies the writes atomically. Additionally, the CAS command applies at the block level (not the LUN level) making parallel VMFS updates possible.

Hardware-assisted locking is a huge benefit in meta-data intensive VMFS environments.

So the question arises, what if I have vSphere v4.0 and v4.1 hosts accessing the same VMFS file system? As I understand it, the AMS controller will fail an atomic write issued to a LUN with a SCSI reservation, and vSphere 4.1 will fall back to using SCSI reservations. It’s unclear to me at this point if the vSphere host would automatically attempt to use atomic writes again, or if it would “stick” with SCSI reservations. One can assume it’s probably going to be best practice to use atomic writes only when a cluster consists entirely of v4.1 hosts.

Full Copy
The actual implementation of “Full Copy” is less clear to me, though obviously the array offloads the copy with little ESX host intervention (ie, I/O read and write requests from the host). Here's a video I put together which demonstrates the full copy and block zeroing features of VAAI. (It may also demonstrate why I shouldn't try to roll my own online demos and/or that I shouldn't quit my day job...)



Block Zeroing
This feature implements the SCSI “write same” instruction, which one might presume writes the same data across a range of logical block addresses.

Prior to this feature, writing 60GB of zeroes required 60GB of writes. I think we can all agree… if the data can be expressed exactly in a single bit, why are we consuming prodigious amounts of I/O to achieve the desired result?

Users of Hitachi Dynamic Provisioning (HDP)... I had hoped the new firmware/vSphere integration would "zero out" blocks automatically when a VMDK is removed from disk so they can be reclaimed using the array’s Zero Page Reclaim (ZPR) function. Unfortunately, my testing shows ZPR does not reclaim space after VMDKs are deleted from VMFS, implying vSphere does not "block zero" the old VMDK. But do not fear, this process is still greatly improved with VAAI.

I used vmkfstools to create an “eagerzeroedthick” VMDK to consume most of the free space in VMFS (vmkfstools -c [size] -d eagerzeroedthick). The operation took 138 seconds (1.27GB/s) in vSphere 4.1. In vSphere 4.0, the exact same operation to the exact same VMFS took 855 seconds (6x longer). As a point of comparison, the dd command in the vSphere service console operated at approximately 44MB/s.
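For reference, this is roughly the test; the size and datastore name below are examples, so pick a size that fits your free space:

# Create an eager-zeroed thick VMDK and time how long the zeroing takes
cd /vmfs/volumes/my_datastore
time vmkfstools -c 100G -d eagerzeroedthick zerofill.vmdk

# Remove the dummy disk when you are done with it
vmkfstools -U zerofill.vmdk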

After creating the dummy VMDK, I removed it and performed a ZPR (technically, you can perform the ZPR without deleting the VMDK, but then you might forget the VMDK is there and risk maxing out your VMFS free space very quickly).

Getting the Benefit of VAAI on HDS AMS 2000 Arrays
There are three host group options (two, sir!) which must be set to take advantage of these features (make sure you use the “VMware” platform type). I did not have to reboot or rescan VMFS to get the new features working (though honestly I have no good way of testing the atomic writes).
  • Hardware-Assisted Locking – According to HDS' best practices guide for AMS2000 in vSphere 4.1 environments, no changes are required to use hardware-assisted locking. DO NOT use with firmware versions prior to 0893/B.
  • Unique Extended COPY Mode – This option controls the “Full Copy” copy mode
  • Unique Write Same Mode – This option enables the “Block Zeroing” mode
Other Tidbits
Finally, vSphere storage admins will notice that VMware vSphere 4.1 contains two additional columns in the host storage configuration view:
  • Storage I/O Control – based on the VI Client help description, enabling this makes ESX(i) monitor datastore latency and adjust the I/O load to the datastore. It uses a “Congestion Threshold” as the upper limit of latency allowed before Storage I/O Control assigns importance to VMs based on their shares.
  • Hardware Acceleration – lists each datastore's support of VAAI as “unknown”, “Not supported”, or “Supported”


Monday, August 2, 2010

Join Lumenate on Twitter and Facebook !

Join Lumenate on Twitter and Facebook and you could be our next winner! We are sharing company news, highlighting product and partner features, running contests and much more. Follow Lumenate for quick updates on the latest breaking news in the storage industry. We are excited to have you as a follower, and we look forward to chatting with you.

Twitter: Follow us on Twitter!
Facebook: Like us on Facebook!

Adventures Installing Linux on a Thumper

The other day I needed to do some testing that required RHEL 5.4 or CentOS 5.4 installed on an x86-64 server. The only available server in the lab was a Sun X4500 (Thumper), Sun’s answer to a high-density storage server, housing 48 SATA drives in a 4U form factor. Great, I thought: lots of disk space for my testing. I went through the Linux install with no problem. The installer selected /dev/sda for the installation; no problem, I clicked on through the menus. Upon reboot, I found myself back at the previously installed Solaris.

Digging a little deeper, I found that the disk drives are configured with 8 drives on 6 different PCI busses. The BIOS can only boot from 2 of the 48 available drives. Unfortunately, Sun designed the system such that the bootable drives are not enumerated first. The bootable drives are located on PCI Bus 6, targets 0 & 4. Since I was booted back into Solaris, I ran the format command to determine which drive was the boot drive. It was the 25th drive listed by format.
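If you want to see that enumeration without stepping through format interactively, a quick trick (assuming you are root on the Solaris side) is to feed it an empty response and just read the disk list:

# Print format's AVAILABLE DISK SELECTIONS list and exit without selecting a disk
echo | format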

So I decided to do another Linux install and select /dev/sdy (#25 in the enumeration) as the install drive. I deselected all other drives in the list. Next, I selected the “Edit Boot Loader Options” check box. I saw that /dev/sda was still selected for the location to install the MBR. I selected the option to change the drive order and pushed /dev/sdy to the top of the list. Now the location to install the MBR was /dev/sdy. After the install completed, the Thumper successfully booted CentOS.
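If you want to sanity-check where the boot loader actually landed before rebooting, something like this works (the device names are specific to this box):

# Look for GRUB's signature in the first sector of the intended boot disk;
# an empty result suggests the MBR went somewhere else
dd if=/dev/sdy bs=512 count=1 2>/dev/null | strings | grep -i grub

# For comparison, check the first drive the installer offered by default
dd if=/dev/sda bs=512 count=1 2>/dev/null | strings | grep -i grub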

In the end, I wound up installing RHEL 5.4 to satisfy the software requirements for my test. I performed the same install process for RHEL 5.4 and it installed and booted successfully.