What is a RAID
Let's start with the basics. RAID stands for Redundant Array of Independent
Disks. In the old days it also used to mean Redundant Array of
Inexpensive Disks. A RAID system is a collection of hard drives
joined together according to a RAID level definition (see the levels below).
There are many uses for RAID. First, it can be used to stripe
drives together for faster overall access (level 0).
Second, it can be used to mirror drives (level 1). Third, it can
be used to increase the uptime of your overall storage by striping
drives together and keeping parity data, so that if a drive should
fail the system keeps operating (level 5). Most people use RAID
level 5 for its uptime and its ability to join together
16 drives into one large storage block. Read about the RAID levels
below and see which one suits you best.
Hot Spares
A hot spare is a standby drive assigned to an array or
to a group of arrays (global spare). If a drive goes bad
in an array, the hot spare takes over for the failed drive
automatically and the array rebuilds onto it right away, so it spends
as little time as possible running degraded.
Hot spares only make sense on levels with redundancy, such as 1, 1+0,
5, 5+0, 0+5, 1+5 and 5+1. A RAID 0 array has nothing to rebuild from.
Hot Swap
Hot swap describes drives that can be pulled from and
attached to the RAID controller while the system is running. You always
want hot swap drives so that if a drive goes bad it can be
replaced on the fly without incurring downtime.
Other features to avoid downtime
Other features of professional RAIDs include hot
swap and redundant power supplies, and hot swap and redundant fans.
Some more expensive RAID systems even have hot swap and
redundant RAID controllers.
RAID Levels
Configure and
price a RAID system
RAID 0
This is the simplest level of RAID, and it just involves striping.
Data redundancy is not even present in this level, so it is not
recommended for applications where data is critical. This level
offers the highest level of performance out of any single RAID
level. It also offers the lowest cost since no extra storage is
involved. At least 2 hard drives are required, preferably
identical, and the maximum depends on the RAID controller. None of
the space is wasted as long as the hard drives used are identical.
This level has become popular with the mainstream market for its
relatively low cost and high performance gain. It is a good fit
for people who don't need any data redundancy. There are
many SCSI and IDE/ATA implementations available. Finally, it's
important to note that if any one of the hard drives in the array
fails, you lose everything.
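To make the striping idea concrete, here is a minimal sketch of how a logical block number could map onto the drives of a RAID 0 set. It assumes a stripe unit of exactly one block and a simple round-robin order; the function name locate_block is made up for this illustration, and real controllers use larger, configurable stripe units.

```python
def locate_block(logical_block: int, num_drives: int) -> tuple[int, int]:
    """Map a logical block to (drive index, block offset on that drive).

    Assumption for this sketch: the stripe unit is one block, dealt out
    round-robin across the drives.
    """
    if num_drives < 2:
        raise ValueError("RAID 0 needs at least 2 drives")
    return logical_block % num_drives, logical_block // num_drives


# Lay out the first six logical blocks on a 3-drive stripe set.
for block in range(6):
    drive, offset = locate_block(block, 3)
    print(f"logical block {block} -> drive {drive}, offset {offset}")
```

Consecutive blocks land on different drives, which is where the speed comes from. It is also why one failed drive takes every large file with it.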
RAID 1
This level is usually implemented as mirroring. Two identical
copies of data are stored on two drives. When one drive fails, the
other drive still has the data to keep the system going.
Rebuilding a lost drive is very simple since you still have the
second copy. This adds data redundancy to the system and provides
some safety from failures. Some implementations add an extra RAID
controller to increase the fault tolerance even more. It is ideal
for applications that use critical data. Even though the
performance benefits are not great, some might just be concerned
with preserving their data. The relative simplicity and low cost
of implementing this level has increased its popularity in
mainstream RAID controllers. Most RAID controllers nowadays
implement some form of RAID 1.
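The mirroring behaviour described above can be sketched with a toy model. This is only an illustration: each "drive" is a Python dict, and the class and method names (Mirror, fail, rebuild) are invented for the example rather than taken from any real controller.

```python
class Mirror:
    """Toy two-drive mirror; each drive is a dict of block number -> bytes."""

    def __init__(self):
        self.drives = [{}, {}]          # None marks a failed drive

    def write(self, block, data):
        for drive in self.drives:
            if drive is not None:
                drive[block] = data     # every working drive gets a copy

    def read(self, block):
        for drive in self.drives:
            if drive is not None:
                return drive[block]     # first working drive answers
        raise IOError("both drives have failed")

    def fail(self, index):
        self.drives[index] = None

    def rebuild(self, index):
        survivor = self.drives[1 - index]
        if survivor is None:
            raise IOError("no surviving copy to rebuild from")
        self.drives[index] = dict(survivor)   # straight copy of the good drive


mirror = Mirror()
mirror.write(0, b"payroll")
mirror.fail(0)                  # lose one drive...
print(mirror.read(0))           # ...the data is still readable
mirror.rebuild(0)               # the replacement drive is a plain copy
```

Rebuilding is just a copy, which is why recovery on this level is so simple.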
RAID 2
This level uses bit level striping with Hamming code ECC. The
technique is related to striping with parity, but instead of a single
parity value it stores a multi-bit error correcting code. The data is
split at the bit level and spread over
a number of data and ECC disks. When data is written to the array,
the Hamming codes are calculated and written to the ECC disks.
When the data is read from the array, the Hamming codes are used to
check whether errors have occurred since the data was written.
Single bit errors can be detected and corrected
immediately. This is the only level that really deviates from the
RAID concepts talked about earlier. The complicated and expensive
RAID controller hardware and the large minimum number of hard
drives required are the reasons this level is not used today.
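As an illustration of how a Hamming code catches and fixes a single bit error, here is a sketch using the textbook Hamming(7,4) code. The choice of (7,4) is an assumption for the example, since the text does not say which Hamming code a given RAID 2 array uses. You can picture each of the 7 bits living on its own disk: 4 data disks and 3 ECC disks.

```python
def hamming_encode(data):
    """Encode 4 data bits into a 7-bit codeword laid out as
    [p1, p2, d1, p4, d2, d3, d4] (positions 1..7)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4      # checks positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4      # checks positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4      # checks positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming_correct(code):
    """Fix at most one flipped bit and return the 4 data bits."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    bad_position = s1 + 2 * s2 + 4 * s4   # 0 means no error detected
    if bad_position:
        c[bad_position - 1] ^= 1          # flip the faulty bit back
    return [c[2], c[4], c[5], c[6]]


word = [1, 0, 1, 1]
stored = hamming_encode(word)
stored[4] ^= 1                            # one bit goes bad on "disk" 5
assert hamming_correct(stored) == word    # the read still returns the data
```

Three ECC disks for every four data disks shows why the drive count, and with it the cost, climbs so quickly on this level.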
RAID 3
This level uses byte level striping with dedicated parity. In
other words, data is striped across the array at the byte level
with one dedicated parity drive holding the redundancy
information. The idea behind this level is that striping the data
increases performance and the dedicated parity takes care of
redundancy. At least 3 hard drives are required: 2 for striping and 1 as
the dedicated parity drive. Although the performance is good, the
added parity does slow down writes, because the parity information has to
be written to the parity drive whenever a write occurs. This
increased computation calls for a hardware controller, so software
implementations are not practical. RAID 3 is good for applications
that deal with large files, since the small stripe size spreads every
transfer across all the data drives.
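Here is a rough sketch of byte level striping with a dedicated parity drive, using XOR as the parity function (the usual choice for single parity). The helper names stripe_bytes and parity_of are invented for this example, and padding the last stripe with zero bytes is an assumption made to keep it short.

```python
from functools import reduce


def stripe_bytes(data: bytes, data_drives: int) -> list[bytes]:
    """Deal bytes round-robin across the data drives.

    Assumption for this sketch: the tail is padded with zero bytes so
    every drive holds the same amount.
    """
    padded = data + bytes(-len(data) % data_drives)
    return [padded[i::data_drives] for i in range(data_drives)]


def parity_of(drives: list[bytes]) -> bytes:
    """XOR the given drives together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*drives))


data_drives = stripe_bytes(b"RAID", 2)      # 2 data drives
parity_drive = parity_of(data_drives)       # 1 dedicated parity drive

# Drive 0 dies: XOR the survivor with the parity to get its contents back.
rebuilt = parity_of([data_drives[1], parity_drive])
assert rebuilt == data_drives[0]
```

The same XOR that builds the parity also rebuilds a lost drive, which is why one parity drive is enough to survive one failure.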
RAID 4
This level is very similar to RAID 3. The only difference is that
it uses block level striping instead of byte level striping. The
advantage is that you can change the stripe size to suit
application needs. This level is often seen as a mix between RAID
3 and RAID 5, having the dedicated parity of RAID 3 and the block
level striping of RAID 5. Again, you'll probably need a hardware
RAID controller for this level. The dedicated parity drive
remains a write bottleneck here too, since every write has to update it.
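To see why the parity drive gets so busy, here is a sketch of a small write using the common read-modify-write shortcut: the new parity is the old parity XOR the old data XOR the new data. The text does not say which update method a given controller uses, so treat this as one plausible approach; the function names are made up for the illustration.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """XOR two equal-sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def small_write(data_blocks, parity, index, new_block):
    """Overwrite one data block and return the updated parity block.

    Only the target block and the parity block are touched; the other
    data drives are not read at all.
    """
    old_block = data_blocks[index]
    new_parity = xor_blocks(xor_blocks(parity, old_block), new_block)
    data_blocks[index] = new_block
    return new_parity


blocks = [b"AAAA", b"BBBB", b"CCCC"]            # one block per data drive
parity = xor_blocks(xor_blocks(blocks[0], blocks[1]), blocks[2])

parity = small_write(blocks, parity, 1, b"ZZZZ")

# The shortcut gives the same parity as recomputing from scratch.
assert parity == xor_blocks(xor_blocks(blocks[0], blocks[1]), blocks[2])
```

Whichever data drive a write lands on, the parity drive is always the second drive involved, so writes to different data drives end up queuing behind it.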
RAID 5
RAID 5 uses block level striping and distributed parity. This
level tries to remove the bottleneck of the dedicated parity
drive. With the use of a distributed parity algorithm, this level
writes the data and parity data across all the drives. Basically,
the blocks of data are used to create the parity blocks which are
then stored across the array. This removes the bottleneck of
writing to just one parity drive. However, the parity information
still has to be calculated and written whenever a write occurs, so
the slowdown involved with that still applies. Fault tolerance
is maintained by always storing the parity for a set of blocks on a
different drive from the data blocks it protects. This way, when one
drive fails, all the data on that drive can be rebuilt from the data
and parity on the other
drives. Recovery is more complicated than usual because of the
distributed nature of the parity. Just as in RAID 4, the stripe
size can be changed to suit the needs of the application. Also,
using a hardware controller is probably the more practical
solution. RAID 5 is one of the most popular RAID levels being used
today. Many see it as the best combination of performance,
redundancy, and storage efficiency.
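The way parity gets spread around can be pictured with a small layout sketch. The rotation used here, with parity starting on the last drive and moving one drive to the left on each stripe, is just one simple scheme assumed for the example. Real controllers differ in the exact rotation, and raid5_layout is a made-up name.

```python
def raid5_layout(num_drives: int, num_stripes: int) -> list[list[str]]:
    """Return one row of labels per stripe: D<n> for data, P<n> for parity.

    Assumption for this sketch: parity sits on the last drive for stripe 0
    and shifts one drive to the left on every following stripe.
    """
    rows = []
    block = 0
    for stripe in range(num_stripes):
        parity_drive = (num_drives - 1 - stripe) % num_drives
        row = []
        for drive in range(num_drives):
            if drive == parity_drive:
                row.append(f"P{stripe}")
            else:
                row.append(f"D{block}")
                block += 1
        rows.append(row)
    return rows


# Print a 4-drive array, one column per drive, one line per stripe.
for row in raid5_layout(4, 4):
    print("  ".join(f"{cell:>3}" for cell in row))
```

Over 4 stripes each drive holds exactly one parity block, so no single drive has to take every parity write.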
RAID 10 or 0+1
Combining Levels of RAID
The single RAID levels don't address every application
requirement that exists. So, to get more functionality, someone
thought of the idea of combining RAID levels. What if you could
combine two levels and get the advantages of both? That was
the motivation behind creating these new levels. The main benefit
of using multiple RAID levels is the increased performance.
Usually combining RAID levels means using a hardware RAID
controller. The increased level of complexity of these levels
means that software solutions are not practical. RAID 0 has the
best performance out of the single levels and it is the one most
commonly being combined. Not all combinations of RAID levels
exist. The most common combinations are RAID 0+1 and 1+0. The
difference between 0+1 and 1+0 might seem subtle, and sometimes
companies may use the terms interchangeably. However, the
difference lies in the amount of fault tolerance. Both these
levels require at least 4 hard drives to implement. Let's look at
RAID 0+1 first.
This combination uses RAID 0 for its high performance and RAID
1 for its high fault tolerance. I actually mentioned this level
when I talked about adding striping to mirroring. Let's say you
have 8 hard drives. You can split them into 2 arrays of 4 drives
each, and apply RAID 0 to each array. Now you have 2 striped
arrays. Then you apply RAID 1 to the 2 striped arrays and
have one array mirrored on the other. If a hard drive in one
striped array fails, that entire striped array is lost. The other striped
array is left, but it has no fault tolerance: if any of the
drives in it fails, everything is lost.
RAID 1+0 applies RAID 1 first and then RAID 0 to the drives. To
apply RAID 1, you split the 8 drives into 4 sets of 2 drives each.
Now each set is mirrored and has duplicate information. To apply
RAID 0, you then stripe across the 4 sets. In essence, you have a
striped array across a number of mirrored sets. This combination
has better fault tolerance than RAID 0+1. As long as one drive in
each mirrored set is active, the array can still function. So
theoretically up to half the drives can fail, one from each set, before
you lose everything. After a first failure, only the loss of that
drive's mirror partner takes the array down, whereas in RAID 0+1 the
loss of any drive in the other striped array does.
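The difference in fault tolerance can be checked by brute force on the 8 drive example. The sketch below tries every possible combination of failed drives and counts how many each layout survives. The drive numbering (drives 0-3 and 4-7 as the two striped arrays, pairs (0,1), (2,3), (4,5), (6,7) as the mirrored sets) and the function names are assumptions made for the illustration.

```python
from itertools import combinations

DRIVES = 8


def survives_0_plus_1(failed) -> bool:
    """Two striped arrays (drives 0-3 and 4-7) mirrored on each other:
    the data survives while at least one striped array is fully intact."""
    first_intact = not any(d < 4 for d in failed)
    second_intact = not any(d >= 4 for d in failed)
    return first_intact or second_intact


def survives_1_plus_0(failed) -> bool:
    """Four mirrored pairs striped together: the data survives while
    every pair still has at least one working drive."""
    dead = set(failed)
    return all(not {2 * p, 2 * p + 1} <= dead for p in range(4))


def surviving_cases(survives, num_failures: int) -> tuple[int, int]:
    """Return (combinations survived, total combinations)."""
    cases = list(combinations(range(DRIVES), num_failures))
    return sum(1 for failed in cases if survives(failed)), len(cases)


for n in (1, 2, 4):
    ok01, total = surviving_cases(survives_0_plus_1, n)
    ok10, _ = surviving_cases(survives_1_plus_0, n)
    print(f"{n} failed drive(s): 0+1 survives {ok01}/{total}, "
          f"1+0 survives {ok10}/{total}")
```

Both layouts can be brought down by an unlucky pair of failures, but far fewer pairs are fatal for 1+0.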
The popularity of RAID 0+1 and 1+0 stems from the fact that
they are relatively simple to implement while providing high
performance and good data redundancy. With the continued drop
in hard drive prices, the 4 hard drive minimum isn't unreasonable
for the mainstream anymore. However, you still have the 50% waste
in storage space whenever you are dealing with mirroring.
Enterprise applications and servers are often willing to sacrifice
storage for increased performance and fault tolerance. Some other
combinations of RAID levels that are used include RAID 0+3, 3+0,
0+5, 5+0, 1+5, and 5+1. These levels are often complicated to
implement and require expensive hardware. Not all of the
combinations I mentioned above are used