Monday, August 23, 2010

Bluearc and HNAS Dynamic Write Balancing

Starting with Bluearc OS version 6.1, the Bluearc Titan, Mercury, and equivalent HDS solutions support WFS2. WFS2 is the latest file system and includes several enhancements over WFS1, including Dynamic Write Balancing (DWB). DWB improves performance by ensuring that write operations occur in parallel across multiple stripesets. Prior to DWB, writes were performed to one stripeset or one system drive at a time.

WFS2 is now the recommended file system for all Bluearc/HNAS implementations, and in almost every case DWB should be enabled. Recently, however, we ran into a situation where we needed to disable DWB to resolve a problem with a 100TB HNAS file system that was nearly full. Seeing that the file system was about to exhaust all of its available free space, the customer had initiated a delete operation that would free up approximately 40TB. In Bluearc/HNAS environments, the actual work of freeing deleted blocks and returning them to the free pool is performed by the background truncator. The idea is to minimize the impact that delete operations have on production workloads and to free up the client that requested the delete so it can perform other operations. In our case, tuning the background-truncate-chunk-size parameter resulted in a maximum delete rate of about 1.5TB per hour (per node). While the deletes were running, 50 or more hosts continued to write to the file system, albeit very slowly, since each write needed to wait for free space to become available and the free space was likely highly fragmented at this point.
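To put that delete rate in perspective, here is a rough back-of-the-envelope calculation in Python. It is only a sketch: the function name and the `nodes` parameter are my own, and the assumption that the rate scales linearly with the number of nodes working on the delete is exactly that, an assumption.

```python
def hours_to_free(space_tb, rate_tb_per_hour_per_node, nodes=1):
    """Estimate how long the background truncator needs to free space_tb.

    Assumes a constant delete rate and (hypothetically) linear scaling
    across nodes; real rates will vary with load and fragmentation.
    """
    if space_tb < 0 or rate_tb_per_hour_per_node <= 0 or nodes <= 0:
        raise ValueError("space must be >= 0; rate and nodes must be > 0")
    return space_tb / (rate_tb_per_hour_per_node * nodes)


# Figures from this post: about 40TB to free at about 1.5TB per hour per node.
print(round(hours_to_free(40, 1.5), 1))  # 26.7
```

In other words, even at the best rate we could tune for, a single node would need more than a day to return the 40TB to the free pool, and the hosts kept writing the whole time.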

In an effort to alleviate the problem, we added 20TB of space to the file system. To our surprise, performance increased only slightly. Walking through the architecture again, we came to believe that the problem was related to DWB. Essentially, DWB was attempting to distribute writes across all of the stripesets supporting the file system. The 20TB we added was completely free, so writes to it completed very quickly. The original stripesets, however, were nearly full, with only small fragments being freed as the background delete operations continued. The result was thrashing, waits, and very poor performance. To test the theory, we disabled DWB and retried all of the operations. At that point, performance increased dramatically.
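The reasoning above can be illustrated with a toy model in Python. To be clear, this is not how SiliconFS or DWB is actually implemented; the function names, the cost formula (a write gets slower as the stripeset's free fraction shrinks), and the numbers are all hypothetical and chosen only to show the shape of the problem.

```python
def balanced_time(free_fractions, n_writes, base_cost=1.0):
    """Toy model of DWB on: writes are spread evenly over all stripesets
    and proceed in parallel, so the batch finishes when the slowest
    stripeset finishes. Cost per write is assumed to be base_cost / free."""
    per_set = n_writes / len(free_fractions)
    return max(per_set * base_cost / free for free in free_fractions)


def single_target_time(free_fractions, n_writes, base_cost=1.0):
    """Toy model of DWB off: all writes go, one at a time, to the
    stripeset with the most free space."""
    return n_writes * base_cost / max(free_fractions)


# Hypothetical layout: five original stripesets with 1% free space,
# plus one newly added stripeset that is completely free.
nearly_full = [0.01] * 5 + [1.0]
# For comparison: six stripesets that are all completely free.
all_fresh = [1.0] * 6

for name, layout in [("nearly full", nearly_full), ("all fresh", all_fresh)]:
    print(name,
          "balanced:", balanced_time(layout, 600),
          "single target:", single_target_time(layout, 600))
```

In this model, balancing wins comfortably when every stripeset has room, which matches the general advice to leave DWB enabled. Once most stripesets are nearly full, the balanced batch is held back by its slowest member and sending everything to the empty stripeset comes out ahead, which is consistent with what we observed.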

As I said, in almost every case you should use WFS2 and leave DWB enabled, but it’s nice that you can turn it off if need be.

2 comments:

  1. Can you turn it off live, with no disruption to current I/O?

  2. Steve,
    Thanks for the question. Unfortunately, you need to unmount the file system to change the DWB setting. I would point out, however, that turning off DWB is not something that should normally be required. In fact, it should almost never be done. In our case it was necessary simply because of how we were using the storage and the fact that we had filled up the file system. Bluearc recently released version 7.0 of SiliconFS, which changes the delete process from a sequential operation to a parallel one. We are still waiting to see just how much faster deletes are, but in our case it might have prevented the issue altogether. I will be posting about some of the new 7.0 features shortly.
