Storage Meat: Backup Performance Issues

When you’re experiencing performance issues with your backups, you might think you need a faster tape drive or perhaps more tape drives. That may be the case, but as often as not the tape drive isn’t the issue. Here are the native performance figures for some common tape drives.

Technology	LTO 2	LTO 3	LTO 4	LTO 5	T10000B
Native MBps	40	80	120	140	120

Note that these are the native numbers and don’t include compression. So how do you know if the tape drive is the issue?

1. Do your backups perform at a rate near the native capabilities of the tape technology? If not, chances are that the tape drive isn’t the problem – for whatever reason you’re not getting the data to the drive fast enough.

2. Are some backups really fast and others really slow? Again this indicates that certain backup jobs aren’t sending data fast enough.
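As a rough way to run the first check, the sketch below compares a job’s average throughput with the native rates from the table above. The function names and the 90% threshold are assumptions made for illustration, not something taken from any backup product.

```python
# Native (uncompressed) rates in MB/s, copied from the table above.
NATIVE_MBPS = {"LTO 2": 40, "LTO 3": 80, "LTO 4": 120, "LTO 5": 140, "T10000B": 120}


def throughput_mbps(bytes_written, elapsed_seconds):
    """Average job throughput in MB/s (1 MB = 10**6 bytes)."""
    return bytes_written / 1e6 / elapsed_seconds


def drive_is_likely_bottleneck(job_mbps, drive, threshold=0.9):
    """True if the job runs at or above `threshold` of the drive's native rate.

    The 0.9 cutoff is an arbitrary illustration, not a vendor figure.
    """
    return job_mbps >= threshold * NATIVE_MBPS[drive]


# Example: 500 GB written in 4 hours to an LTO 4 drive.
rate = throughput_mbps(500e9, 4 * 3600)
print(f"{rate:.1f} MB/s", drive_is_likely_bottleneck(rate, "LTO 4"))  # 34.7 MB/s False
```

A job running at about 35 MB/s against a 120 MB/s drive is being starved of data somewhere upstream, so the drive is not where to look first.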

In these cases implementing a faster tape drive is unlikely to result in any significant performance improvement, because it’s not the bottleneck. So, what could the problem be? Here are some things to consider.

A single Gigabit Ethernet link tops out at roughly 125 MB/s before protocol overhead, so it can drive at most a single tape drive. In the case of LTO 4 or greater it won’t even do that. If you are doing network-based backups, make sure you have enough network bandwidth to drive the tape drives. You may need to add network interfaces to the backup server or move to 10Gb Ethernet.
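The arithmetic behind this can be sketched as follows. The 0.9 efficiency factor is an assumption standing in for protocol overhead; measure your own network rather than trusting it.

```python
import math


def usable_mbps(link_gbps=1.0, efficiency=0.9):
    """Usable MB/s on an Ethernet link.

    1 Gb/s is 125 MB/s on the wire; `efficiency` is an assumed allowance
    for protocol overhead, not a measured value.
    """
    return link_gbps * 1000 / 8 * efficiency


def links_needed(drive_mbps, drives, link_gbps=1.0, efficiency=0.9):
    """Links required to keep `drives` tape drives streaming at native rate."""
    return math.ceil(drive_mbps * drives / usable_mbps(link_gbps, efficiency))


print(links_needed(80, 1))        # one LTO 3 drive over 1GbE  -> 1
print(links_needed(120, 1))       # one LTO 4 drive over 1GbE  -> 2
print(links_needed(120, 4, 10))   # four LTO 4 drives over 10GbE -> 1
```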

The client being backed up makes a big difference in terms of performance. You need to understand the client’s hardware, the network interfaces, characteristics of the hard drives and the processing load during backups. If a host is performing poorly during backups you may want to take a look at the utilization and statistics from these components to determine if one or more might be causing the issue. Is the client’s hard drive too slow, is the processor pegged during backups, is it connecting at 100Mb instead of 1Gb?
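For the last of those questions, a Linux client reports the negotiated link speed in Mb/s under /sys/class/net/<interface>/speed. The sketch below reads that file; the function names and the 1000 Mb/s expectation are assumptions for illustration, and other operating systems need a different method.

```python
from pathlib import Path


def link_speed_mbps(interface, sysfs_root="/sys/class/net"):
    """Return the negotiated link speed in Mb/s, or None if it is unavailable."""
    try:
        value = int((Path(sysfs_root) / interface / "speed").read_text().strip())
    except (OSError, ValueError):
        return None  # interface missing, link down, or not a Linux host
    return value if value > 0 else None  # the kernel reports -1 for "unknown"


def slower_than_expected(speed_mbps, expected_mbps=1000):
    """True when a known link speed is below what the client should negotiate."""
    return speed_mbps is not None and speed_mbps < expected_mbps


# "eth0" is a placeholder interface name; substitute the client's own.
print(slower_than_expected(link_speed_mbps("eth0")))
```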

The data being backed up is a significant factor in the performance you are able to achieve. Large files, such as those associated with databases, will back up faster than small files. This is due in part to the fact that the backup software has to catalog each file being backed up. The more small files you have, the slower the backups will run. Most backup applications have a block-based client or option that bypasses the file system and minimizes the impact associated with high file counts. This is not a perfect solution. Typically this client will back up the whole hard drive, free space and all. The client still has to capture the file system metadata so that individual restores can be performed, and this results in a delay before the backup of the data actually begins. This process also impacts restore operations, since the files are not necessarily stored contiguously at the backup target. Even with these limitations this is often the best option, but it should be tested using an evaluation license before purchasing it.
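To see whether a host falls into the many-small-files category, it helps to count the files and look at the average size. This is a minimal sketch using only the Python standard library; `profile_tree` is a name made up here, and on a very large file system the walk itself will take a while.

```python
import os


def profile_tree(root):
    """Walk `root` and return (file_count, total_bytes, average_bytes)."""
    count = 0
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                # lstat so symbolic links are counted without being followed
                total += os.lstat(os.path.join(dirpath, name)).st_size
                count += 1
            except OSError:
                continue  # file vanished or is unreadable; skip it
    average = total / count if count else 0.0
    return count, total, average


count, total, average = profile_tree(".")
print(f"{count} files, {total / 1e9:.2f} GB, average {average / 1e3:.1f} KB")
```

A host with millions of files and an average size of a few kilobytes is a candidate for the block-based option described above.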

Fragmentation can have a serious impact on performance. A system’s hard drive is usually the slowest component in the IO path. When a drive is seriously fragmented, say 4 or 5 fragments per file, the hard drive spends a lot of time seeking out the individual pieces of each file, which dramatically decreases performance. If the files are small and there are millions of them, the effect is multiplied. Check the fragmentation level of hosts with a lot of files to ensure this isn’t an issue. It may also be worthwhile to invest in a 3rd party defrag tool.
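A back-of-the-envelope estimate shows how quickly this adds up. The model below assumes one seek per fragment and a 10 ms average seek, which is only an order-of-magnitude figure for a spinning disk; real drives, caches and read-ahead will change the numbers.

```python
def seek_overhead_hours(file_count, fragments_per_file, seek_ms=10.0):
    """Hours spent on seeks alone, assuming one seek per fragment.

    seek_ms is an assumed average per seek, not a measured value.
    """
    return file_count * fragments_per_file * seek_ms / 1000 / 3600


# 2 million files, unfragmented versus 5 fragments per file.
print(f"{seek_overhead_hours(2_000_000, 1):.1f} hours")  # 5.6 hours
print(f"{seek_overhead_hours(2_000_000, 5):.1f} hours")  # 27.8 hours
```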

All backup solutions have configuration settings on the servers and clients that impact performance. These are not one size fits all and can be tweaked to provide the best performance in your environment. Consult the vendor documentation on the configuration settings and test them out on a host with known performance characteristics. We recommend testing with full backups, not incrementals, so that you get a more accurate picture of how the changes are impacting performance.

Once you have reviewed the environment thoroughly, made corrections where possible and tuned the environment you still may not get the performance you need. At this point you have several options.

1. You can back up less data using a number of techniques.

Archiving. Archiving solutions allow you to migrate off infrequently accessed files and take them out of the nightly backup rotation. If you can take a host with 2 million files down to 500 thousand, your backup times will improve greatly.
Synthetic backups. The concept behind synthetic backups is that you back up an initial full to disk, then run only incremental backups from then on. New full backups are created by combining the previous full backup on disk with the incremental backups taken since.
Source-based deduplication. Source-based deduplication minimizes the data that must be backed up from each host by comparing its blocks of data against those that have already been backed up. If a file or part of a file already exists, it is not backed up again; instead a pointer is inserted in its place to where the data already exists in the backup repository.
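The last two techniques can be illustrated with a toy model. This is a sketch under simplifying assumptions, not how any particular product implements them: backups are plain dictionaries, blocks are fixed size, and the names `synthesize_full` and `DedupStore` are invented for the example.

```python
import hashlib


def synthesize_full(previous_full, incrementals):
    """Build a new full from the last full plus incrementals, oldest first.

    Each backup is a dict {path: content}; a value of None marks a deletion.
    """
    full = dict(previous_full)
    for incremental in incrementals:
        for path, content in incremental.items():
            if content is None:
                full.pop(path, None)
            else:
                full[path] = content
    return full


class DedupStore:
    """Toy repository that stores each unique fixed-size block only once."""

    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.blocks = {}      # digest -> block bytes
        self.bytes_sent = 0   # bytes the client actually had to transfer

    def backup(self, data):
        """Return the list of block digests (the pointers) for `data`."""
        pointers = []
        for start in range(0, len(data), self.block_size):
            block = data[start:start + self.block_size]
            digest = hashlib.sha256(block).hexdigest()
            if digest not in self.blocks:
                self.blocks[digest] = block
                self.bytes_sent += len(block)
            pointers.append(digest)
        return pointers

    def restore(self, pointers):
        """Reassemble the original data from its pointers."""
        return b"".join(self.blocks[digest] for digest in pointers)


store = DedupStore(block_size=4)
first = store.backup(b"aaaabbbbaaaa")   # two unique blocks, 8 bytes sent
second = store.backup(b"aaaabbbbaaaa")  # nothing new, still 8 bytes sent
print(store.bytes_sent, store.restore(second) == b"aaaabbbbaaaa")  # 8 True
```

Real products generally use variable-length chunking and a persistent index, but the pointer idea is the same.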

2. You can also speed up the backup process using multiple techniques.

Disk based backups. Adding a disk backup target has a number of benefits. Disk is more reliable than tape, it allows for faster restores and it allows more backups to be performed in parallel. The disk component can be added as part of a SAN, direct attached, as a backup appliance or VTL. Appliances and VTL solutions provide additional functionality such as deduplication, replication and the ability to write directly to tape without involving the backup server.
LANless backups. In this scenario you take your hosts with large amounts of data and replace the network backup client with a client that allows them to directly access the backup target, either disk or tape. In most cases the host only backs up itself, but if you have the appropriate licensing it could back up other servers as well. Know that this will only help if the network is the bottleneck. The host will need access to the backup target, so an HBA will need to be added to the client and the tape library must be attached to the SAN. Additional licenses may be required from the backup application vendor to enable this functionality.
Serverless backups. This name is a bit confusing because in the vast majority of cases a server is still involved. In this scenario you make a point-in-time copy of the data you want to back up, mount it to a backup server and perform the backup from the backup server to disk or tape. The point-in-time copy can be a snapshot or a full clone of the original data. In order to do this you need the following components.

Shared Storage. The Backup server must have access to the point in time copy.
A method to create the point in time copy. 99% of the time this is done with software built into the shared storage array.
The backup server must be able to read the backup client’s volumes and file systems. Unless you are using something like Symantec’s Storage Foundation, this means that the backup server and client must be running the same operating system.
You must be able to quiesce the application associated with the data you want to back up. Quiescing the application ensures that datafiles, indexes and logs are all in sync, allowing you to perform a clean restore without the need to perform a roll back or other recovery processes.
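The order of operations for a serverless backup can be sketched as below. Every object and method here (`app.quiesce`, `array.create_snapshot`, `backup_server.mount` and so on) is hypothetical and stands in for whatever your application, storage array and backup software actually provide. The point is the sequence: the application is held quiesced only while the point-in-time copy is taken, and cleanup happens even if the backup fails.

```python
from contextlib import contextmanager


@contextmanager
def quiesced(app):
    """Keep the application quiesced only for the duration of the block."""
    app.quiesce()
    try:
        yield
    finally:
        app.resume()  # always resume, even if the snapshot fails


def serverless_backup(app, array, backup_server, volume):
    """Snapshot `volume` while quiesced, then back it up from the backup server."""
    with quiesced(app):
        snapshot = array.create_snapshot(volume)  # hypothetical array call
    # The application is running again; the slow part happens off the client.
    mount_point = backup_server.mount(snapshot)
    try:
        backup_server.backup(mount_point)
    finally:
        backup_server.unmount(mount_point)
        array.delete_snapshot(snapshot)
```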

I have tremendous respect for those responsible for backups. Backups are complicated and require a lot of attention. When backups or restores go well no one seems to notice, but the first time you are unable to perform a restore or it takes longer than expected everyone is up in arms. As with everything in IT there are multiple ways to do things and if you truly understand how the various options work you can create a solution to meet just about any need. The other nice thing is that in most cases you can test out your solution without a lot of investment. It takes a lot of work upfront but it’s better than spending a lot of money on a solution that doesn’t meet your needs.

Monday, October 11, 2010
