Monday, September 19, 2011

Recovering an NTFS Boot Sector in Symantec Storage Foundation for Windows

One evening, I received a call from a customer who had run into an issue with Storage Foundation for Windows (SFW). They hit an SFW bug while trying to shrink volumes for a disk space recovery project. FYI, unlike on UNIX, SF volumes on Windows need to be offline before they can be shrunk. Long story short, the shrink process ended up corrupting the NTFS boot record on a 1.3TB volume. Even though the vxprint output showed the volumes ENABLED and ACTIVE with the correct volume boundaries, the volume showed up as RAW rather than NTFS in both the VEA console and DISKPART.


Monday, September 12, 2011

Integrating DB2 Advanced Copy Services with NetApp Snapshots

IBM provides integration with NetApp snapshots via their Advanced Copy Services (ACS). ACS is available in DB2 release 9.5 and newer. ACS supports NAS or SAN attached NetApp subsystems when running under AIX. Only NAS attachment is supported under Linux. ACS allows the DBA to issue backup and restore directives from the DB2 environment without involving the storage administrator. These backup and restore directives manipulate snapshots for the appropriate volumes on the NetApp filer. One caveat to using ACS is the two-snapshot (backup) limit if you do not own a full TSM license. A full TSM license can be very pricey. Due to the two-snapshot limit, I decided to use a hybrid approach where ACS provides two daily backups for quick recovery, in conjunction with the traditional file-based backup for longer retention periods.

Setup of ACS is fairly easy in either a Linux or AIX environment. ACS uses RSH to communicate with the NetApp filer, so the RSH service must be enabled on the filer (options rsh.enable on). The database server can be included in the /etc/hosts.equiv file on the filer so that a password is not needed. ACS is intelligent enough to correlate the database and log file systems/volume groups on the server to the corresponding volumes/LUNs on the filer.
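Before running the ACS setup, it can help to confirm that passwordless RSH to the filer actually works from the database server. The function below is only a sketch: the filer name filer01 and the RSH_CMD override are hypothetical, and it assumes that running the Data ONTAP "version" command over rsh is an acceptable harmless probe and that rsh exits non-zero when the connection is refused.

```shell
# Sketch: probe passwordless rsh access to a NetApp filer.
# "filer01" and RSH_CMD are illustrative assumptions, not part of ACS.
check_filer_rsh() {
    filer=$1
    rsh_cmd=${RSH_CMD:-rsh}      # override to dry-run without a real filer
    # Run a harmless command on the filer; discard its output.
    if "$rsh_cmd" "$filer" version >/dev/null 2>&1; then
        echo "rsh to $filer: ok"
    else
        echo "rsh to $filer: FAILED"
        return 1
    fi
}

# Example: check_filer_rsh filer01
```

If the check fails, the usual suspects are the rsh.enable option and the /etc/hosts.equiv entry on the filer.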

During testing under AIX, I found that the setup script failed on its check for the acscim program. The acscim module is used to communicate with IBM storage such as the DS series and requires supporting libraries that were not available on my system. I commented out the acscim check and the setup script completed normally. The setup script needs to be executed as root, but from the database instance user's home directory. The database instance user in the example below is db2int2.

ACS is installed under the database instance user's home directory. Go into the ACS directory and edit the script to comment out the acscim binary check.

bash-3.00# cd /home/db2int2/sqllib/acs

bash-3.00# grep cim

checkbin ${INST_DIR}/acs/acscim

enableSUID ${INST_DIR}/acs/acscim

bash-3.00# vi

bash-3.00# grep cim

# checkbin ${INST_DIR}/acs/acscim

enableSUID ${INST_DIR}/acs/acscim

Execute the script to provide the necessary parameters to configure ACS. I chose defaults for most of the questions. The ACS_REPOSITORY needs to be set to the desired directory path, which will be created by the script. The COPYSERVICES_HARDWARE_TYPE is either NAS_NSERIES or SAN_NSERIES under AIX, or NAS_NSERIES under Linux.
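The platform rule for COPYSERVICES_HARDWARE_TYPE can be written down as a small check to run before answering the setup prompts. This is a sketch of the rule as stated above, not something ACS ships; the function name valid_hw_type is hypothetical, and the OS strings are what uname typically prints on AIX and Linux.

```shell
# Sketch: validate a COPYSERVICES_HARDWARE_TYPE value for a given OS.
# valid_hw_type is a hypothetical helper, not part of ACS.
valid_hw_type() {
    os=$1    # e.g. the output of `uname`
    hw=$2    # NAS_NSERIES or SAN_NSERIES
    case "$os:$hw" in
        AIX:NAS_NSERIES|AIX:SAN_NSERIES|Linux:NAS_NSERIES)
            echo "ok" ;;
        *)
            echo "unsupported"
            return 1 ;;
    esac
}

# Example: valid_hw_type "$(uname)" SAN_NSERIES
```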

bash-3.00# pwd


bash-3.00# ./

checking /home/db2int2/sqllib/acs/acsnnas ...


checking /home/db2int2/sqllib/acs/acsnsan ...


Do you have a full TSM license to enable all features of TSM for ACS ?[y/n]


****** Profile parameters for section GLOBAL: ******

ACS_DIR [/home/db2int2/sqllib/acs ]

ACSD [57329 ] 57328


****** Profile parameters for section ACSD: ******

ACS_REPOSITORY [/home/db2int2/sqllib/acs/acsrepository ]

****** Profile parameters for section CLIENT: ******





****** Profile parameters for section STANDARD: ******





The profile has beeen successfully created.

Do you want to continue by specifying passwords for the defined devices? [y/n]


Please specify the passwords for the following profile sections:



Creating password file at /home/db2int2/sqllib/acs/shared/pwd.acsd.

A copy of this file needs to be available to all components that connect to acsd.

BKI1555I: Profile successfully created. Performing additional checks. Make sure to restart all ACS components to reload the profile.

After setup is complete, check to see if the daemons are configured to start in /etc/inittab. Note: acsnnas is for NetApp NAS volumes and acsnsan is for NetApp SAN volumes.

bash-3.00# grep acs /etc/inittab

ac00:2345:respawn:/home/db2int2/sqllib/acs/acsd

ac00:2345:respawn:/home/db2int2/sqllib/acs/acsnnas -D


ac00:2345:respawn:/home/db2int2/sqllib/acs/acsnsan -D
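The same inittab check can be scripted so that it reports each daemon separately. This is a minimal sketch; the function name acs_inittab_daemons is hypothetical, and it simply looks for a command path ending in /acs/<daemon> anywhere in the file without parsing the inittab fields.

```shell
# Sketch: report which ACS daemons have an entry in an inittab-style file.
# acs_inittab_daemons is a hypothetical helper name.
acs_inittab_daemons() {
    inittab=${1:-/etc/inittab}
    for d in acsd acsnnas acsnsan; do
        # Match the daemon name at the end of the line or before its arguments.
        if grep -Eq "/acs/${d}( |\$)" "$inittab"; then
            echo "$d: configured"
        else
            echo "$d: missing"
        fi
    done
}

# Example: acs_inittab_daemons /etc/inittab
```

Only the daemon matching your attachment type (acsnnas or acsnsan) needs to be present alongside acsd.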

Check to see if the daemons are running:

bash-3.00# ps -ef | grep acs

root 12255442 6225980 0 16:25:07 pts/2 0:00 grep acs

db2int2 12451872 1 0 16:24:50 - 0:00 /home/db2int2/sqllib/acs/acsd

db2int2 12451873 1 0 16:26:35 - 0:00 /home/db2int2/sqllib/acs/acsnsan -D


Now that ACS is configured, we can perform snapshot backups and restores. As the database instance user, execute the following commands to take backups, list backups, or restore the database.

Execute the following to take an offline backup:

bash-3.00$ db2 backup db mydb use snapshot

You can specify the "online" parameter to take an online backup of the database:

bash-3.00$ db2 backup db mydb online use snapshot

List the backups of the database as follows:

bash-3.00$ db2acsutil query

To restore the latest backup:

bash-3.00$ db2 restore db mydb use snapshot
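For scheduling the two daily snapshots (from cron, for example), the backup commands above could be wrapped in a small helper. This is only a sketch: the function name snapshot_backup_cmd is hypothetical, and it prints the command rather than running it so it can be reviewed first.

```shell
# Sketch: build the db2 snapshot backup command for a database.
# snapshot_backup_cmd is a hypothetical helper; it only prints the command.
snapshot_backup_cmd() {
    db=$1
    mode=${2:-offline}           # "offline" (default) or "online"
    case "$mode" in
        online)  echo "db2 backup db $db online use snapshot" ;;
        offline) echo "db2 backup db $db use snapshot" ;;
        *)       echo "usage: snapshot_backup_cmd <db> [online|offline]" >&2
                 return 2 ;;
    esac
}

# Example (as the instance user): $(snapshot_backup_cmd mydb online)
```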


Monday, September 5, 2011

"Unfortunately this has not been documented very well." Fun with VERITAS Cluster Server

So Justin and I are wrapping up a large refresh project for a client where we're moving them from an existing configuration running Oracle on Sun 6800s with VCS over to new M5000s.  As you'd expect this includes a migration to Solaris 10 as well as upgrades to VERITAS Volume Manager, File System, and NetBackup (that's Symantec Storage Foundation and NetBackup to some of you).

The application team went through their testing over the last month or so and we completed our VCS test matrix in preparation for cutover.  During the cutover, though, we noticed the following message in the alert log:

WARNING:Oracle instance running on a system with low open file descriptor
        limit. Tune your system to increase this limit to avoid
        severe performance degradation.

Thinking that we'd missed a resource control setting somewhere, we went through the process of validating those settings.  Then, seeing that they looked correct, we asked the DBA to stop and restart the database manually, only to find that the error message above didn't appear.  Using VCS to stop and start the database would generate this error every time, though.
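A quick way to compare environments in a case like this is to check the open file descriptor limit that a process actually inherits. The sketch below checks the soft limit of the current shell against a threshold; the function name check_nofile is hypothetical, and 8192 is simply the value we ended up using. On Solaris, plimit can show the limits of an already running process.

```shell
# Sketch: compare the current soft open-file limit against a minimum.
# check_nofile is a hypothetical helper name.
check_nofile() {
    want=$1
    have=$(ulimit -n)            # soft limit inherited by this shell
    if [ "$have" = "unlimited" ] || [ "$have" -ge "$want" ]; then
        echo "ok: nofile=$have"
    else
        echo "low: nofile=$have (want >= $want)"
        return 1
    fi
}

# Example: check_nofile 8192
```

Running the same check from a manual login and from whatever starts the database under the cluster makes a difference in inherited limits visible right away.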

We opened a case with Symantec and started to troubleshoot.  Thankfully, before we ran out of window for the cutover, we found that in VCS 5.1 SP1 Symantec added a file called vcsenv that hardcodes limits for CPU time, core file size, data segment size, file size, and the number of open file descriptors.

The location and contents of the file are shown below, including where we set the number of file descriptors to 8192.

bash-3.00# cd /opt/VRTSvcs/bin
bash-3.00# more vcsenv
# $Id: vcsenv,v 2.8 2010/09/30 05:45:29 ptyagi Exp $ #
# $Copyrights: Copyright (c) 2010 Symantec Corporation.
# All rights reserved.
# The Licensed Software and Documentation are deemed to be commercial
# computer software as defined in FAR 12.212 and subject to restricted
# rights as defined in FAR Section 52.227-19 "Commercial Computer
# Software - Restricted Rights" and DFARS 227.7202, "Rights in
# Commercial Computer Software or Commercial Computer Software
# Documentation", as applicable, and any successor regulations. Any use,
# modification, reproduction release, performance, display or disclosure
# of the Licensed Software and Documentation by the U.S. Government
# shall be solely in accordance with the terms of this Agreement.  $

# This is just a sample as to how you can specify various environment
# variables you need to set.  Uncomment/add/modify the values as per
# your requirement.

# Specify the default language in which you want to bring up the
# VCS agents.

# LANG=C; export LANG


#This is required for agents which use dynamic VCSAPI libraries

# Setting ulimit.
# Common For Linux, HP-UX, SunOS & AIX
ulimit -t unlimited     # CPU Time
ulimit -c unlimited     # Core File Size
ulimit -d unlimited     # Data Seg Size
ulimit -f unlimited     # File Size
ulimit -n 8192          # File Descriptor

if [ `uname` = "AIX" ];then
        export RT_GRQ