Monday, December 6, 2010

Storage Performance Concepts Entry 4 - The Real World

In the previous three entries on this topic we discussed several key storage performance concepts.

The Physical Disk Capabilities. Fibre Channel and SAS drives can handle more IOPS than SATA drives, making them a good choice for applications that generate a lot of random IO. From an MBPS standpoint, SATA isn’t quite as fast as Fibre Channel and SAS, but the gap is much smaller than it is for IOPS, making SATA acceptable for workloads that mainly generate sequential IO.

The common RAID implementations and their impact on storage performance. RAID 10 provides the best performance for random workloads. RAID 5 and 6 provide good performance for sequential workloads. In some cases RAID 5 may actually be faster than RAID 10, although this isn’t the norm.

The workload. The mix of random, sequential, read and write IO has a major impact on performance, with writes putting the biggest load on disk drives.

We also showed how the following formula could be used to determine the number of array groups needed to meet an application’s IOPS requirement, based on the disk drives and RAID level you choose.

Adjusted IOPS = (Total IOPS × % Read) + ((Total IOPS × % Write) × RAID Penalty)
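As a minimal sketch, the formula can be written as a small Python function. The function name `backend_iops` and the 1,000 IOPS, 50/50 workload are illustrative assumptions, not figures from this series. The write penalties of 4 for RAID 5 and 2 for RAID 10 are the values used later in this entry.

```python
def backend_iops(total_iops: float, read_fraction: float, raid_penalty: int) -> float:
    """Disk-level IOPS produced by a host workload at a given RAID write penalty.

    read_fraction is the share of reads, between 0.0 and 1.0.
    """
    write_fraction = 1.0 - read_fraction
    reads = total_iops * read_fraction                   # reads carry no penalty
    writes = total_iops * write_fraction * raid_penalty  # each write costs extra disk IO
    return reads + writes


# Hypothetical workload: 1,000 host IOPS, half reads and half writes.
print(backend_iops(1000, 0.5, 4))  # RAID 5 penalty
print(backend_iops(1000, 0.5, 2))  # RAID 10 penalty
```

The result is the load the disks have to absorb, which is what gets compared against the raw IOPS of the drives in a configuration.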

We left off pointing out that while this information is valuable, it leaves out some of the challenges we face when architecting solutions in the real world. Two factors we have not considered are capacity and cost. The majority of the time we start building our solution based on the capacity requirements.

For example, an organization might need 10TB of capacity to support a new application with a random workload consisting of 75% reads and 25% writes with a peak IOPS load of 2,500. The capacity will be added to an existing array that supports both SAS and SATA drives.

Since this is a random workload we will be recommending SAS drives but aren’t yet sure if this needs to be a RAID 10 or RAID 5 configuration. We could use RAID 6 but since we will be using 450GB drives and our array has multiple hot spares we think that RAID 5 will provide suitable protection for the data.

First we will look at the capacity requirements for each RAID level.

Capacity                          RAID 5     RAID 10
RAID Group Size                   8+1        4+4
Usable Capacity per Group (GB)    3,600      1,800
Required RAID Groups              3          6
Total Capacity (GB)               10,800     10,800

Using 450GB drives, we need twice as many RAID 10 groups as RAID 5 groups to meet the capacity requirement.
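The capacity arithmetic can be sketched in a few lines of Python. This assumes 10TB is treated as 10,000GB and ignores formatting overhead and hot spares. The helper name `groups_for_capacity` is made up for illustration.

```python
import math


def groups_for_capacity(required_gb: int, drive_gb: int, data_drives: int):
    """Return (usable GB per group, groups needed, total usable GB)."""
    usable_per_group = drive_gb * data_drives         # parity/mirror drives add no capacity
    groups = math.ceil(required_gb / usable_per_group)
    return usable_per_group, groups, groups * usable_per_group


REQUIRED_GB = 10_000  # assumption: 10TB expressed as 10,000GB
DRIVE_GB = 450

print("RAID 5 (8+1): ", groups_for_capacity(REQUIRED_GB, DRIVE_GB, data_drives=8))
print("RAID 10 (4+4):", groups_for_capacity(REQUIRED_GB, DRIVE_GB, data_drives=4))
```

An 8+1 RAID 5 group has eight data drives, while a 4+4 RAID 10 group has only four, which is why the group count doubles.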

Now we will take a look at the cost. We are using the same size drives for each configuration, so the cost per drive is constant, but we need almost twice as many drives for the RAID 10 configuration. In addition, the number of drives in the RAID 10 configuration will require additional drive trays to be added to the array. In our case we are assuming that the drives are $1,500 apiece and that each tray holds 15 drives at a cost of $10,000 per tray.

Cost                      RAID 5     RAID 10
RAID Group Size           8+1        4+4
Required RAID Groups      3          6
Total Disks Required      27         48
Trays Required            2          4
Cost of Disks             $40,500    $72,000
Cost of Trays             $20,000    $40,000
Total Cost                $60,500    $112,000
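The cost figures follow directly from the drive and tray prices. Here is a rough sketch, assuming every tray is purchased new and no free slots exist in the array's current trays. The name `configuration_cost` is hypothetical.

```python
import math

DRIVE_COST = 1_500     # dollars per drive
TRAY_COST = 10_000     # dollars per tray
DRIVES_PER_TRAY = 15


def configuration_cost(groups: int, drives_per_group: int) -> dict:
    """Drive count, tray count and cost for a number of identical RAID groups."""
    drives = groups * drives_per_group
    trays = math.ceil(drives / DRIVES_PER_TRAY)  # a partly filled tray is still a whole tray
    disk_cost = drives * DRIVE_COST
    tray_cost = trays * TRAY_COST
    return {
        "drives": drives,
        "trays": trays,
        "disk_cost": disk_cost,
        "tray_cost": tray_cost,
        "total": disk_cost + tray_cost,
    }


print("RAID 5: ", configuration_cost(groups=3, drives_per_group=9))  # 8+1
print("RAID 10:", configuration_cost(groups=6, drives_per_group=8))  # 4+4
```

Note that 48 drives fill only 3.2 trays, so the RAID 10 design pays for a fourth tray that is mostly empty.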

As you would expect the cost of RAID 10 is almost twice as high as RAID 5. What may not be as obvious are the performance differences between the two configurations. In the past we focused on comparing a single RAID group of each type, keeping the number of drives constant. In this case two things have changed.

1. I’m using an 8+1 array group rather than a 7+1. 8+1 is the RAID 5 configuration recommended by the manufacturer because of the way it aligns with the caching mechanisms of the array. In addition an 8+1 provides sufficient availability and rebuild times while making better use of the raw space.

2. In this real world configuration I have twice as many RAID 10 groups, and therefore a lot more disks and more raw IOPS.

Using 185 IOPS per drive we find that the two configurations have the following characteristics.

Performance        RAID 5    RAID 10
IOPS Per Drive     185       185
# of Drives        27        48
Raw IOPS           4,995     8,880

We can now use our formula to determine if either solution will meet our requirement.

Performance        RAID 5    RAID 10
Required IOPS      2,500     2,500
Percent Read       75%       75%
Percent Write      25%       25%
RAID Penalty       4         2
Adjusted IOPS      4,375     3,125

In our example both RAID 5 and RAID 10 meet the performance requirement. The RAID 5 configuration is tighter, needing 4,375 of its 4,995 raw IOPS, but that still leaves about 620 IOPS (roughly 12%) of headroom. Given the major difference in cost, it is probably reasonable to proceed with the RAID 5 configuration.
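To make the comparison concrete, the sketch below puts the raw IOPS of each configuration next to the adjusted IOPS the workload demands. The function names are illustrative. The inputs (185 IOPS per drive, 2,500 IOPS at 75% reads, penalties of 4 and 2) come from the tables above.

```python
IOPS_PER_DRIVE = 185  # per-drive figure used in this entry


def adjusted_iops(host_iops: float, read_fraction: float, raid_penalty: int) -> float:
    """Disk-level IOPS after applying the RAID write penalty."""
    write_fraction = 1.0 - read_fraction
    return host_iops * read_fraction + host_iops * write_fraction * raid_penalty


def headroom(drives: int, host_iops: float, read_fraction: float, raid_penalty: int):
    """Return (raw IOPS, adjusted IOPS, spare IOPS) for one configuration."""
    raw = drives * IOPS_PER_DRIVE
    needed = adjusted_iops(host_iops, read_fraction, raid_penalty)
    return raw, needed, raw - needed


for name, drives, penalty in [("RAID 5", 27, 4), ("RAID 10", 48, 2)]:
    raw, needed, spare = headroom(drives, 2500, 0.75, penalty)
    print(f"{name}: raw={raw}, needed={needed:.0f}, spare={spare:.0f} ({spare / raw:.0%})")
# RAID 5: raw=4995, needed=4375, spare=620 (12%)
# RAID 10: raw=8880, needed=3125, spare=5755 (65%)
```

A configuration passes when the spare figure is positive. How much spare is enough is a judgment call that depends on growth expectations and how reliable the peak estimate is.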

Looking at our results you may say “Well, the RAID 5 configuration may work, but wouldn’t the RAID 10 design be a lot faster?” Well, not necessarily. If the speed limit is 55 and you must drive the speed limit, a Ford F150 and a Ferrari will both get you there in the same amount of time. The same is true of storage: just because one configuration could run faster doesn’t mean it will. You have to be able to drive higher IOPS from the host.

The area that we will explore in our next entry is cache. While cache improves performance in general, it is particularly beneficial for parity-based RAID configurations.



2 comments:

  1. Well done.

    I can't wait to read the 5th entry regarding cache.

    regards

    Andreas

  2. Thanks Andreas. I'm tied up on some other projects right now but I promise to get it posted soon.
