Sunday, February 20, 2011

Solaris 10 dfstab bug

I'll admit, it's been a while since I have had to present NFS shares on Solaris 10 that were not ZFS file systems. ZFS makes this pretty easy, but hey, it's not like it was terribly difficult to edit the /etc/dfs/dfstab file. But after a recent head-banging experience where my dfstab file would get magically overwritten during reboots, with my shares commented out, I started to wonder if I really need to hang up this IT thing and open the donut shop.

It turns out that Oracle has a little bug in Solaris 10 (6941744): some incorrect code in libsharecore.c rewrites dfstab at boot and comments out share lines behind a "# Error: Syntax:" prefix.

Based on this thread it doesn't sound like a fix is here yet, so I put this together quickly to get past the issue. While I have no reason to suspect this script will cause issues, I provide it to you with the understanding that you will ensure it is the right solution for your environment. I placed this script at /etc/init.d/local, and linked it to rc2.d and rc3.d as necessary.

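#!/sbin/sh
# This is a cleanup script due to an issue in Solaris 10 update 9 and previous
# SUN/Oracle bug 6941744
# From /etc/dfs/dfstab
# --------------------
# Use the sharemgr(1m) command for all share management
# This file is reconstructed and only maintained for backward
# compatibility. Configuration lines could be lost.
# --------------------
# created by Justin Richards, Lumenate
# Version 1.0 Date 02202011

# The variable definitions were missing from the post as published.
# These are the stock Solaris 10 paths; SHARE=shareall is an assumption.
CAT=/usr/bin/cat
SED=/usr/bin/sed
CP=/usr/bin/cp
SHARE=/usr/sbin/shareall

case "$1" in
'start'|'restart')
    # Strip the "# Error: Syntax: " prefix from share lines, then re-share
    $CAT /etc/dfs/dfstab | $SED '/^# Error: Syntax: /s/# Error: Syntax: //' > /etc/dfs/dfstab.tmp
    $CP /etc/dfs/dfstab.tmp /etc/dfs/dfstab
    $SHARE
    ;;
*)
    echo "Usage: $0 {start|restart}"
    exit 1
    ;;
esac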
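
If you want to see what the sed expression does before letting it near a real dfstab, here is a small sketch that runs it against a scratch file. The sample file contents and the clean_dfstab function name are made up for illustration; only the sed expression comes from the script.

```shell
#!/bin/sh
# Demonstration only: works on a scratch file, never on /etc/dfs/dfstab.

# clean_dfstab FILE: print FILE with the "# Error: Syntax: " prefix removed
# from any line that starts with it. All other lines pass through unchanged.
clean_dfstab() {
    sed '/^# Error: Syntax: /s/# Error: Syntax: //' "$1"
}

# Hypothetical dfstab contents after an affected reboot
sample=${TMPDIR:-/tmp}/dfstab.sample.$$
cat > "$sample" <<'EOF'
# Place share(1M) commands here for automatic execution
# Error: Syntax: share -F nfs -o rw /export/data
share -F nfs -o ro /export/docs
EOF

clean_dfstab "$sample"
rm -f "$sample"
```

The ordinary comment on the first line and the untouched share line come through as they were; only the line carrying the error prefix is restored to a plain share command.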


Thursday, February 17, 2011

That Looks Bad

Ever been caught in a situation like this? Last year we documented various recovery procedures for one of our clients to assist them with full system recovery without the assistance of Symantec’s Bare Metal Restore (BMR).

We took care to document only procedures that a systems administrator could follow without the aid of the NetBackup admins, so I told the admin, “Hey, since you are the target audience for these procedures, I’ll just look over your shoulder as you follow them.”

Enter Mr. Murphy. Obviously, one critical step in the restore process is restoring the C:\ drive. When it came time to select the recovery point in time, the NetBackup Backup, Archive, and Restore (BAR) application popped up the following dialog:

“That’s funny. I don’t find that on my procedures,” said the admin with a grin. Yeah, I know.

I knew the server had good backups, so why was NetBackup giving me the stiff arm? I checked the policy configuration on NetBackup and realized NetBackup 7 no longer has an “MS-Windows-NT” policy, but “MS-Windows” instead.

Let’s face it, I understand why Symantec changed the name of this policy type. One could argue this change should’ve been made a long time ago, but it’s a great example of how seemingly innocent changes can have unintended consequences.

Since BAR can restore from any policy type, we attempted to configure it for “MS-Windows” instead of “MS-Windows-NT”, but BAR knew nothing of “MS-Windows” because it was a NetBackup 6.5.x client.

Fortunately, there was a way out. We completed our restore procedures using server-directed restores (thanks to Frank for the great suggestion). I simply initiated the restore from the Java admin GUI. Now I’ve got some screen shots with which to update the documentation.

Here were the key components for a successful restore:
  • Install NetBackup to a drive other than C:\
  • Remove the system from a domain (if this applies) prior to attempting the System_State restore.
  • Add the system to DNS or ensure forward/reverse name lookups function.
  • Remove any invalid SERVER entries from the client’s SERVER list; I found some that prevented a successful restore.
  • Check available disk space before the restore.
  • Delete the Windows page file (or reduce it greatly) prior to restoring the system state.
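
The name-lookup item in the list above is easy to check ahead of time. Here is a rough sketch, assuming you run it from a UNIX host that has getent available (for example a UNIX NetBackup master server, which may not match your setup). The check_lookup function name and the host name in the example comment are made up for illustration.

```shell
#!/bin/sh
# check_lookup HOST: confirm that HOST resolves to an address and that
# the address resolves back to a name. Returns non-zero on either failure.
check_lookup() {
    host=$1

    # Forward lookup: first field of the first getent line is the address
    ip=$(getent hosts "$host" | awk 'NR==1 {print $1}')
    if [ -z "$ip" ]; then
        echo "forward lookup failed for $host"
        return 1
    fi

    # Reverse lookup: second field is the canonical name for that address
    back=$(getent hosts "$ip" | awk 'NR==1 {print $2}')
    if [ -z "$back" ]; then
        echo "reverse lookup failed for $ip"
        return 1
    fi

    echo "$host -> $ip -> $back"
}

# Example (hypothetical client name):
#   check_lookup winclient01.example.com
```

This only proves the lookups work from the host where you run it. The client being restored needs the same answers, so repeat the equivalent check there once it is back on the network.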