RAID systems are categorised by RAID levels. Levels 0 to 5 are known as the
Berkeley RAID levels, after the University of California, Berkeley research
that defined them. Other levels have been invented by various people since.
Here is a summary of the RAID levels in use.
- RAID 0 - stripes data across two (or more) disks. This gives
better read and write bandwidth for bulk transfers, and under heavy load
(multiple threads) it also gives better random access and small file
access (there is a short sketch of the striping layout after this list).
- RAID 1 - mirrors data across two disks. This gives better read
performance (roughly double the throughput under heavy load, since each read
can be served by either disk) and means that a failure of a single disk will
not result in data loss.
This can be implemented with more than two disks for either better read
performance under really heavy load, or for greater paranoia. I have never seen
or heard of such a configuration being used, but I'm sure it's out there
somewhere.
Some implementations of RAID 1, such as IBM's AIX, may read from both
disks for every access (which prevents any read performance increase). I wonder
what they would do if the two disks returned different data...
- RAID 2 and RAID 3 - I have not heard of these being
implemented. They seem only to appear in computer science textbooks.
- RAID 4 - this involves having two or more data disks over which the
data is striped in a similar fashion to RAID 0, plus a single
parity disk. Usually each block on the parity disk contains the XOR of the
data in the same block on each of the data disks. The parity disk could use
other parity algorithms - a checksum would work just as well - but I will
refer to XOR throughout this document (a sketch of the parity calculation
appears after this list).
If a disk dies then its contents can be reconstructed from the XOR of the
other disks. Reading data is the same as reading from RAID 0 when all disks
are functional. If one of the data disks is broken then the RAID system will
read the same block from all the other data disks and the parity disk and
return the XOR of that data (which will be the same as the data that had been
written to the lost disk).
Writing data requires updating the parity disk too. To do this we have to read
the original data from the block that is to be written, and the parity data
for that block. The new parity block will be the XOR of the old parity block,
the old data block, and the new data block (see the second sketch after this
list). So each small write costs two reads and two writes, and every write in
the array must touch the one parity disk, which makes write performance poor
(worse than a single non-RAID disk).
- RAID 5 - the same as RAID 4 but the parity is spread across
all the disks, so no single disk is a bottleneck for parity writes. This
makes it significantly faster than RAID 4 with no down-side, so RAID 4 is
almost never used. If a RAID 5 array is used for heavy writes then I expect
the performance to be less than 2/N times the performance of a single
disk (where N is the number of data disks). The write performance of a
3 disk array (2 data disks) in my tests is less than the performance of a
single disk; I haven't had an opportunity to test other array sizes. A sketch
of the rotating parity layout appears after this list.
This is the highest RAID level defined by the Berkeley RAID levels.
- RAID 0+1 / RAID 1+0 / RAID 10 - this means combining mirroring and
striping: RAID 0+1 runs RAID 1 over RAID 0 stripe sets, while RAID 1+0
(also written RAID 10) stripes over mirrored pairs. Either way this gives
mirroring for reliability and read bandwidth, and striping for write
bandwidth, capacity, and read bandwidth under heavy load (the last sketch
after this list shows the layering).
Mylex refers to this as RAID 6.
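
Here is a minimal sketch of the RAID 0 striping layout in Python.
stripe_location is just an illustrative name of mine, and real
implementations stripe in configurable chunk sizes rather than single blocks:

```python
# Sketch of RAID 0 striping: logical chunk i lives on disk i % num_disks
# at offset i // num_disks, so consecutive chunks alternate between disks.
def stripe_location(logical_chunk, num_disks):
    disk = logical_chunk % num_disks
    offset = logical_chunk // num_disks
    return disk, offset

# A bulk transfer of consecutive chunks keeps every disk busy at once,
# which is where the bandwidth gain comes from.
for chunk in range(6):
    print("chunk", chunk, "->", stripe_location(chunk, num_disks=2))
```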
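
The RAID 4 parity scheme can be sketched in a few lines too; xor_blocks is an
illustrative helper, and the tiny two-byte blocks are just for demonstration:

```python
from functools import reduce

def xor_blocks(blocks):
    # XOR equal-sized blocks together, byte by byte.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# The same block from each of three data disks, and its parity block.
data_disks = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_blocks(data_disks)

# Pretend disk 1 died: XORing the surviving data disks with the parity
# block gives back exactly what was on the lost disk.
assert xor_blocks([data_disks[0], data_disks[2], parity]) == data_disks[1]
```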
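
Continuing that example, the write path's read-modify-write shortcut looks
like this (update_parity is again just an illustrative name):

```python
def update_parity(old_parity, old_data, new_data):
    # New parity = old parity XOR old data XOR new data: the old data
    # cancels itself out of the old parity and the new data replaces it.
    return bytes(p ^ o ^ n for p, o, n in zip(old_parity, old_data, new_data))

new_block = b"\xaa\xbb"
new_parity = update_parity(parity, data_disks[1], new_block)

# The shortcut (two reads, two writes) agrees with recomputing the
# parity from every data disk.
assert new_parity == xor_blocks([data_disks[0], new_block, data_disks[2]])
```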
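
The only difference in RAID 5 is where the parity block for each stripe
lives. One common rotation, sketched here as an illustration rather than any
vendor's exact layout, is:

```python
def parity_disk(stripe, total_disks):
    # Rotate the parity block to a different disk on each stripe so that
    # no single disk carries all the parity writes.
    return (total_disks - 1 - stripe) % total_disks

for stripe in range(4):
    p = parity_disk(stripe, total_disks=4)
    print("stripe", stripe, ["P" if d == p else "D" for d in range(4)])
```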
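
Finally, the RAID 1+0 layering, reusing stripe_location from the first
sketch; raid10_location is an illustrative helper showing the
stripe-over-mirrors ordering:

```python
def raid10_location(logical_chunk, num_pairs):
    # The RAID 0 layer chooses a mirrored pair, then the RAID 1 layer
    # writes to both disks of that pair.
    pair, offset = stripe_location(logical_chunk, num_pairs)
    disks = (2 * pair, 2 * pair + 1)
    return disks, offset

# Chunk 0 goes to disks 0 and 1, chunk 1 to disks 2 and 3, and so on.
print(raid10_location(0, num_pairs=2))
print(raid10_location(1, num_pairs=2))
```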
Copyright © 2001 Russell Coker, may be distributed freely.