Monday, November 22, 2010

Storage Performance Concepts Entry 2

Our first entry on storage performance focused on IOPS and MBps from a physical disk drive perspective. This entry will take a closer look at performance as it relates to various RAID levels. Since this entry is intended to be practical, we are not going to cover every RAID level or proprietary approach, but rather the most common options you might be considering as you design a storage solution. In general, these are RAID 10, RAID 5 and RAID 6. As a recap, here is how these levels are defined per the SNIA Dictionary.

RAID 5

A form of parity RAID in which the disks operate independently, the data stripe size is no smaller than the exported block size, and parity check data is distributed across the RAID array's disks.

RAID 6

Any form of RAID that can continue to execute read and write requests to all of a RAID array's virtual disks in the presence of any two concurrent disk failures. Several methods, including dual check data computations (parity and Reed Solomon), orthogonal dual parity check data and diagonal parity have been used to implement RAID Level 6.

RAID 10

RAID 10 is referred to as a nested or hybrid approach and is therefore excluded from the SNIA Dictionary. Technically, RAID 10 is just a combination of RAID 1 and RAID 0: mirroring and then striping. Years ago I would hear debates about which arrays did RAID 10 versus RAID 0+1, but I haven't heard any grumblings in a while, so we will stick with RAID 10 to keep it simple.

Putting disk drives into RAID groups does not fundamentally change how many IOPS they can handle; the raw IOPS capability of a drive is constant. The differences we experience in performance come from how the various RAID levels handle IO, and in particular from any additional IOs generated to protect the data.

· A random write operation to a RAID 10 configuration results in 2 disk IOs, one to each drive in the mirror. The same single write request to a RAID 5 set generates 4 IOs: read the data, read the parity, write the data, write the parity. RAID 6 adds two additional parity operations (a read and a write of the second parity) to each random write, for 6 IOs in total.

· A sequential write to RAID 10 works the same as a random write: one IO to each disk in the mirror. For RAID 5 and RAID 6, sequential operations are more efficient than random IO because a full stripe is written at once, so there is no existing data or parity to read; the array simply writes the data and writes the newly computed parity.

· Read operations are a bit different since neither parity nor mirroring comes into play; it is simply a matter of how many drives you can read from concurrently. Assuming each RAID configuration has the same number of drives, RAID 10 can be slightly faster than parity RAID, since either copy of a mirror can service a read. (A sketch of this arithmetic follows the list.)
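
To make the arithmetic concrete, here is a minimal Python sketch that reproduces the numbers in the chart below. The dictionary values simply encode the rules above; the fractional sequential-write costs assume a full-stripe write across a hypothetical 4+1 (RAID 5) or 4+2 (RAID 6) group, which is also what the chart's base of 4 IOs implies.

    # Back-end disk IOs generated per host IO, per the rules above.
    # Sequential-write costs assume full-stripe writes on an assumed
    # 4+1 (RAID 5) or 4+2 (RAID 6) group: 5 back-end IOs per 4 host
    # IOs for RAID 5, 6 per 4 for RAID 6.
    BACKEND_IOS_PER_HOST_IO = {
        "RAID 10": {"read": 1, "random_write": 2, "sequential_write": 2},
        "RAID 5":  {"read": 1, "random_write": 4, "sequential_write": 5 / 4},
        "RAID 6":  {"read": 1, "random_write": 6, "sequential_write": 6 / 4},
    }

    def backend_ios(level, io_type, host_ios=4):
        """Back-end disk IOs for a batch of host IOs (4, matching the chart)."""
        return BACKEND_IOS_PER_HOST_IO[level][io_type] * host_ios

    for level in ("RAID 10", "RAID 5", "RAID 6"):
        print(f"{level:8}"
              f" read={backend_ios(level, 'read'):g}"                    # 4 / 4 / 4
              f" rand_write={backend_ios(level, 'random_write'):g}"      # 8 / 16 / 24
              f" seq_write={backend_ios(level, 'sequential_write'):g}")  # 8 / 5 / 6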

Here is a chart that makes this a little clearer.

Note: The chart is based on sending 4 IOs to the RAID group. I could have used a single IO as the base for the comparison, but that would not demonstrate the differences between sequential and random IO for the parity-based RAID configurations.

                     RAID 10   RAID 5   RAID 6
Random Read              4        4        4
Random Write             8       16       24
Sequential Read          4        4        4
Sequential Write         8        5        6

As you can see, the differences are pretty significant. By combining the information in our first entry about the performance capabilities of the various drive types with the characteristics of the most common RAID levels, you can get an accurate picture of the overall back-end performance the system is capable of. Of course, in practice workloads don't typically fall neatly into these categories; they perform a combination of different IO types.
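
As a rough sketch of how that combination works, the snippet below folds a random write penalty into raw drive IOPS to estimate host-visible IOPS. The drive count, per-drive IOPS figure, and 70/30 read/write mix are illustrative assumptions for this example, not measurements from this series.

    def effective_host_iops(drives, iops_per_drive, read_fraction, write_penalty):
        """Estimate host-visible IOPS for a random read/write mix.

        Each read costs 1 back-end IO; each random write costs
        write_penalty back-end IOs (2 for RAID 10, 4 for RAID 5,
        6 for RAID 6, per the chart above).
        """
        raw_backend_iops = drives * iops_per_drive
        cost_per_host_io = read_fraction + (1.0 - read_fraction) * write_penalty
        return raw_backend_iops / cost_per_host_io

    # Assumed example: eight drives at 180 IOPS each, 70% read / 30% write.
    for level, penalty in (("RAID 10", 2), ("RAID 5", 4), ("RAID 6", 6)):
        print(level, round(effective_host_iops(8, 180, 0.70, penalty)))
        # RAID 10 ~1108, RAID 5 ~758, RAID 6 ~576 host IOPS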

Understanding the workloads and how they impact performance is the subject of our next entry.

