In the previous three entries on this topic we discussed several key storage performance concepts.
The physical disk capabilities. Fibre Channel and SAS drives can handle more IOPS than SATA drives, making them a good choice for applications that generate a lot of random IO. From an MB/s standpoint SATA isn't quite as fast as Fibre Channel and SAS, but the delta is much smaller, making SATA acceptable for workloads that mainly generate sequential IO.
The common RAID implementations and their impact on storage performance. RAID 10 provides the best performance for random workloads. RAID 5 and 6 provide good performance for sequential workloads; in some cases RAID 5 may actually be faster than RAID 10, although this isn't the norm.
The workload. The mix of random and sequential reads and writes has a major impact on performance, with writes putting the biggest load on the disk drives.
We also showed how this formula can be used to determine the number of array groups needed to meet an application's IOPS requirement, based on the disk drives and RAID level you choose.
(TOTAL IOPS × % READ) + ((TOTAL IOPS × % WRITE) × RAID Penalty)
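The formula can be expressed as a small Python helper (the function name and the 2,500 IOPS / 75% read example values below are just for illustration; the RAID penalties of 4 for RAID 5 and 2 for RAID 10 are the ones used throughout this series):

```python
def backend_iops(total_iops, read_pct, write_pct, raid_penalty):
    """Backend (disk-facing) IOPS generated by a host workload.

    Reads pass through unchanged; each host write costs
    `raid_penalty` disk operations (4 for RAID 5, 2 for RAID 10).
    """
    return total_iops * read_pct + total_iops * write_pct * raid_penalty

# 2,500 host IOPS at 75% read / 25% write:
print(backend_iops(2500, 0.75, 0.25, 4))  # RAID 5  -> 4375.0
print(backend_iops(2500, 0.75, 0.25, 2))  # RAID 10 -> 3125.0
```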
We left off pointing out that while this information is valuable, it leaves out some of the challenges we face when architecting solutions in the real world. Two factors we have not considered are capacity and cost. The majority of the time we start building our solution from the capacity requirements.
For example, an organization might need 10TB of capacity to support a new application with a random workload consisting of 75% reads and 25% writes with a peak IOPS load of 2,500. The capacity will be added to an existing array that supports both SAS and SATA drives.
Since this is a random workload we will be recommending SAS drives, but we aren't yet sure whether this needs to be a RAID 10 or a RAID 5 configuration. We could use RAID 6, but since we will be using 450GB drives and our array has multiple hot spares, we think RAID 5 will provide suitable protection for the data.
First we will look at the capacity requirements for each RAID level.
| Capacity | RAID 5 | RAID 10 |
| --- | --- | --- |
| RAID Group Size | 8+1 | 4+4 |
| Usable Capacity per Group (GB) | 3,600 | 1,800 |
| Required RAID Groups | 3 | 6 |
| Total Capacity (GB) | 10,800 | 10,800 |
Using 450GB drives, we need twice as many RAID 10 groups as RAID 5 groups to meet the capacity requirement.
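The group counts above can be reproduced with a quick sketch (the function name is illustrative, and it assumes the 10TB target is treated as 10,000 GB):

```python
import math

def raid_groups_needed(required_gb, drive_gb, data_drives_per_group):
    """Number of RAID groups needed to reach a usable-capacity target."""
    usable_per_group = drive_gb * data_drives_per_group
    return math.ceil(required_gb / usable_per_group)

# 10 TB target with 450 GB drives:
print(raid_groups_needed(10_000, 450, 8))  # RAID 5 (8+1)  -> 3
print(raid_groups_needed(10_000, 450, 4))  # RAID 10 (4+4) -> 6
```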
Now we will take a look at the cost. We are using the same size drives for each configuration, so the cost per drive is constant, but we need almost twice as many drives for the RAID 10 configuration. In addition, the number of drives in the RAID 10 configuration will require additional drive trays to be added to the array. In our case we are assuming that the drives are $1,500 apiece and that each tray holds 15 drives, at a cost of $10,000 per tray.
| Cost | RAID 5 | RAID 10 |
| --- | --- | --- |
| RAID Group Size | 8+1 | 4+4 |
| Required RAID Groups | 3 | 6 |
| Total Disks Required | 27 | 48 |
| Trays Required | 2 | 4 |
| Cost of Disks | $40,500 | $72,000 |
| Cost of Trays | $20,000 | $40,000 |
| Total Cost | $60,500 | $112,000 |
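The cost rows can be checked with another small sketch, using the assumed prices from above ($1,500 per drive, 15-slot trays at $10,000 each; the function name is illustrative):

```python
import math

def config_cost(groups, drives_per_group, drive_cost=1500,
                tray_slots=15, tray_cost=10_000):
    """Total cost of disks plus the trays needed to house them."""
    drives = groups * drives_per_group
    trays = math.ceil(drives / tray_slots)
    return drives * drive_cost + trays * tray_cost

print(config_cost(3, 9))  # RAID 5:  27 drives, 2 trays -> 60500
print(config_cost(6, 8))  # RAID 10: 48 drives, 4 trays -> 112000
```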
As you would expect, the cost of RAID 10 is almost twice that of RAID 5. What may not be as obvious are the performance differences between the two configurations. In the past we focused on comparing a single RAID group of each type, keeping the number of drives constant. In this case two things have changed.
1. I'm using an 8+1 array group rather than a 7+1. 8+1 is the RAID 5 configuration recommended by the manufacturer because of the way it aligns with the array's caching mechanisms. In addition, an 8+1 provides sufficient availability and rebuild times while making better use of the raw space.
2. In this real-world configuration I have twice as many RAID 10 groups, and therefore many more disks and hence more raw IOPS.
Using 185 IOPS per drive we find that the two configurations have the following characteristics.
| Performance | RAID 5 | RAID 10 |
| --- | --- | --- |
| IOPS per Drive | 185 | 185 |
| # of Drives | 27 | 48 |
| Raw IOPS | 4,995 | 8,880 |
We can now use our formula to determine if either solution will meet our requirement.
| Performance | RAID 5 | RAID 10 |
| --- | --- | --- |
| Required IOPS | 2,500 | 2,500 |
| Percent Read | 75% | 75% |
| Percent Write | 25% | 25% |
| RAID Penalty | 4 | 2 |
| Adjusted IOPS | 4,375 | 3,125 |
Comparing the adjusted IOPS each workload demands against the raw IOPS each configuration can deliver, both RAID 5 (4,375 vs. 4,995) and RAID 10 (3,125 vs. 8,880) meet the performance requirement. Although the RAID 5 configuration is tighter, it still has a reasonable amount of headroom. Given the major difference in cost, it is probably reasonable to proceed with the RAID 5 configuration.
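Putting the pieces together, the whole comparison reduces to one check, using the same assumed figure of 185 IOPS per drive (the function name is illustrative):

```python
def meets_requirement(drives, iops_per_drive, host_iops,
                      read_pct, write_pct, raid_penalty):
    """Does the spindle count deliver enough raw IOPS to cover
    the backend IOPS generated by the host workload?"""
    raw = drives * iops_per_drive
    needed = host_iops * read_pct + host_iops * write_pct * raid_penalty
    return needed <= raw

print(meets_requirement(27, 185, 2500, 0.75, 0.25, 4))  # RAID 5:  4375 <= 4995 -> True
print(meets_requirement(48, 185, 2500, 0.75, 0.25, 2))  # RAID 10: 3125 <= 8880 -> True
```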
Looking at our results you may say, "Well, the RAID 5 configuration may work, but wouldn't the RAID 10 design be a lot faster?" Not necessarily. If the speed limit is 55 and you must drive the speed limit, a Ford F150 and a Ferrari will both get you there in the same amount of time. It is the same with storage: just because one configuration could run faster doesn't mean it will; you have to be able to drive higher IOPS from the host.
The area we will explore in the next entry is cache. While cache improves performance in general, it is particularly beneficial for parity-based RAID configurations.