Cache Architecture - Part 1
The last area we want to cover is cache. Cache, or cache memory, is just that: memory / DIMMs installed in an array to serve as a high speed buffer between the disks and the hosts. Most storage arrays are what is referred to as cache centric, meaning that all reads and writes are done through cache, not directly to disk. In addition to user data / host IO, cache can be used to store configuration information and tables for snapshots, replication, or other advanced features that need a high speed storage location. The data in cache must be protected, and this is most commonly done with mirroring. In some cases all of the data is mirrored; in others the write IOs are mirrored while reads are not, since the read data already exists on the disks.
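To make that read/write asymmetry concrete, here is a minimal Python sketch of a cache-centric write path. The class and method names are my own illustration, not any vendor's actual firmware design: writes land in both controllers' cache before they are acknowledged, while a read miss is staged into only one.

    # Minimal sketch of mirrored write cache vs. unmirrored read cache.
    # Structure and names are illustrative assumptions only.

    class Controller:
        def __init__(self, name):
            self.name = name
            self.cache = {}  # block id -> data held in this controller's cache

    class CacheCentricArray:
        def __init__(self):
            self.controllers = (Controller("CTL0"), Controller("CTL1"))
            self.disk = {}   # backing store: block id -> data

        def write(self, block, data):
            # A host write lands in cache on BOTH controllers before it is
            # acknowledged, so a controller failure cannot lose dirty data.
            for ctl in self.controllers:
                ctl.cache[block] = data

        def read(self, block):
            # A read miss is staged from disk into ONE controller's cache;
            # no mirror copy is needed, the data already exists on disk.
            ctl = self.controllers[0]
            if block not in ctl.cache:
                ctl.cache[block] = self.disk[block]
            return ctl.cache[block]

        def destage(self):
            # Later, dirty blocks are flushed to disk, freeing both copies.
            for block, data in self.controllers[0].cache.items():
                self.disk[block] = data
            for ctl in self.controllers:
                ctl.cache.clear()

    array = CacheCentricArray()
    array.write(42, b"host data")   # cached twice (mirrored)
    array.destage()
    print(array.read(42))           # staged back into one cache only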
A common question is “in an array with 16GB of cache how much is really available for user / host IO?”
The exact details depend on the array and the configuration you have, but the following concepts should be fairly constant. For this example I am using an HDS AMS 2000 Series array.
· 16GB of Cache (8GB per controller)
· A percentage of cache is dedicated to the system area. This varies depending on the hardware configuration and whether or not features that use cache, such as replication or Copy-on-Write Snapshot, are enabled. Assuming that replication and Copy-on-Write are not in use, 2,904MB total (1,452MB per controller) is dedicated to the system area, leaving 13,480MB, or 6,740MB per controller.
· Next, each controller mirrors its cache to its partner, so the 13,480MB of user cache becomes 6,740MB, or 3,370MB per controller.
· The last calculation depends on the type of IO. Every array deploys some mechanism to keep cache from being overrun with write requests: a threshold is set that, when reached, tells the array to begin throttling incoming host writes. On the AMS 2000 series that threshold is 70%, and it applies to write IO, not reads. In the worst case scenario of 100% writes, the available cache is limited to 70% of the 6,740MB figure: 4,718MB total, or 2,359MB per controller.
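Putting those steps together, here is the same arithmetic as a short Python snippet. The figures come from the AMS 2000 example above; the variable names are just for illustration:

    # Worked example of the usable-cache calculation described above.

    TOTAL_CACHE_MB = 16 * 1024          # 16GB installed, 8GB per controller
    SYSTEM_AREA_MB = 1452 * 2           # per-controller system area x 2
    WRITE_PENDING_LIMIT = 0.70          # write throttling threshold

    user_area = TOTAL_CACHE_MB - SYSTEM_AREA_MB              # 13,480MB
    after_mirroring = user_area // 2                         # 6,740MB
    worst_case_write = round(after_mirroring * WRITE_PENDING_LIMIT)  # 4,718MB

    print(f"user area:             {user_area:,} MB")
    print(f"after mirroring:       {after_mirroring:,} MB")
    print(f"usable at 100% writes: {worst_case_write:,} MB")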
Looking at these numbers, many people are initially surprised by how little cache is actually available for user IO. It's interesting to note that we rarely see cache related performance issues with these arrays. The reason has to do with the way cache operates in a modern storage array. The following diagram, created by our CTO, illustrates the relationship between cache and the physical disk drives.
Cache to Disk Relationship
The cache is wide: it provides a lot of IOPS, but it is shallow, without much capacity. The disks are deep: they hold a lot of capacity, but individually they aren't particularly fast. The feature that changes this relationship most significantly is wide striping. Wide striping allows you to pool multiple array groups and distribute IO across more disks, so writes in cache are flushed to disk more quickly, keeping cache available for incoming IOs. Referring back to our funnel diagram, we are essentially widening the bottom of the funnel. Wide striping is not unique to HDS; it is a common feature on many arrays and just one example of how storage vendors attempt to balance an array.
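As a rough illustration of why that matters, the following Python sketch estimates how long it would take to destage a full write cache as the pool grows. The per-disk IOPS figure, IO size, and pool sizes are assumed values for illustration only, not measured numbers:

    # Back-of-the-envelope sketch of the wide striping effect.
    # All figures below are assumptions for illustration.

    DISK_WRITE_IOPS = 150     # rough figure for a single 15K RPM spindle
    IO_SIZE_KB = 8            # assumed host write size
    DIRTY_CACHE_MB = 2359     # worst-case per-controller write cache from above

    def seconds_to_destage(disks_in_pool):
        # Destage bandwidth scales with the number of spindles the writes
        # are striped across -- the "bottom of the funnel".
        kb_per_sec = disks_in_pool * DISK_WRITE_IOPS * IO_SIZE_KB
        return DIRTY_CACHE_MB * 1024 / kb_per_sec

    for disks in (8, 32, 128):
        print(f"{disks:3d} disks: {seconds_to_destage(disks):6.1f}s to flush a full write cache")

In our next entry we will take a look at the role of cache with various workloads.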