RAID - Redundant Array of Independent Disks

What is RAID?

RAID is a family of techniques for managing multiple disks to provide desirable cost, data availability and performance characteristics to host environments. It is an acronym that, in its most current definition set by the RAID Advisory Board (RAB), stands for "Redundant Array of Independent Disks". When RAID was first introduced, the emphasis was on the fact that the technology used multiple disks to achieve desirable cost and performance (originally, the acronym stood for "Redundant Array of Inexpensive Disks"). Today, redundancy is the key focus of RAID technology. Protecting data is what RAID is all about.

A Little RAID History
In 1988, David Patterson, Garth Gibson and Randy Katz of the University of California, Berkeley, published a paper entitled "A Case for Redundant Arrays of Inexpensive Disks (RAID)". The paper predicted that, with rapid improvements in software file systems and computer processing power, the slower-moving disk technology would soon create what the authors called "an I/O crisis" (a bottleneck). Their basic idea was to combine multiple small, inexpensive disks into an array that outperforms a Single Large Expensive Disk (SLED). The disks would be arranged so that the array appears to a computer as a single logical drive.

The bad news: reliability. They calculated that the Mean Time To Failure (MTTF) of an array would equal that of a single disk drive divided by the number of disk drives in the array. Given this increased potential for data loss from a single drive failure, they defined five different array architectures (RAID Levels 1 through 5), each providing disk fault tolerance and each having different characteristics in order to achieve maximum performance in different environments. An additional non-redundant architecture (RAID Level 0) was subsequently defined. RAID Level 0 dramatically increases performance over a single drive, but provides no data protection in case of a drive failure.
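The reliability estimate above can be written as a short calculation. This is a minimal sketch of that simple model only (independent failures, no redundancy); the function name `array_mttf` and the 30,000-hour figure are illustrative assumptions, not values taken from the paper.

```python
def array_mttf(disk_mttf_hours: float, num_disks: int) -> float:
    """Estimate the MTTF of a non-redundant array.

    Simple model from the text: single-disk MTTF divided by the
    number of disks in the array.
    """
    if num_disks < 1:
        raise ValueError("num_disks must be at least 1")
    return disk_mttf_hours / num_disks


# Hypothetical single-disk MTTF, for illustration only.
single_disk_mttf = 30_000.0
for n in (1, 10, 100):
    print(f"{n:>3} disks: {array_mttf(single_disk_mttf, n):,.0f} hours")
```

Under this model the expected time to the first drive failure shrinks in direct proportion to the number of drives, which is why the redundant levels were needed.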

Now, to keep things in perspective, let's look at the drives used in their research. They used a top-of-the-line IBM 3380 mainframe disk, a Fujitsu M2361A "Super Eagle" minicomputer disk and a Conner Peripherals CP 3100 personal computer disk. The IBM mainframe disk had a formatted data capacity of 7500 MB and 14-inch diameter disk platters, required an external power supply, cost approximately $100,000 and provided a data transfer rate of 3 MB/sec. The Fujitsu drive had a capacity of 600 MB and 10.5-inch platters, weighed approximately 140 pounds, cost roughly $12,000 and had a data transfer rate of 2.5 MB/sec. The Conner personal computer drive (the "inexpensive disk") had a capacity of 100 MB and 3.5-inch platters, cost under $1,000 and had a data transfer rate of 1 MB/sec.

In today's storage industry, the SLED is dead and RAID arrays live on. Drives of virtually the same capacity are now available to mainframe and enterprise systems as well as individual personal computers. The difference is the technology used to implement the RAID and the level of data protection required. From high-end, fully redundant systems that protect against any single point (and in some cases, multiple points) of failure, to single drives in personal computers with advanced backup and recovery options, the focus in today's storage industry is data availability and integrity. Many RAID implementations therefore include fault tolerant components such as smart disk technology, enclosure and environment monitoring, support for multiple RAID level arrays, automatic rebuilding of a failed array, global and dedicated on-line spares, redundant components (controllers, power supplies and cooling devices), server clustering and failover schemes, as well as sophisticated software monitoring, reporting and notification programs. Current technologies used to implement RAID solutions include Fibre Channel (optical and copper), SCSI, iSCSI, Parallel ATA, Serial ATA, and Serial Attached SCSI. Combinations of these technologies are often used.

The Basis of RAID
The basis of RAID is that an intelligent manager (a hardware RAID controller or RAID software) can manage an array of disk drives in such a way that data is protected in the event of a single drive failure. By mapping segments of data across an array of disk drives, the array appears as one logical drive unit. By generating parity bits and storing them on the array, data can be regenerated if a disk drive fails. The manner in which data is placed onto the array provides for increased performance in different data structure environments. Today's controllers, as well as drives, are much more sophisticated and typically incorporate fault tolerant monitoring features (SAF-TE) and user-friendly firmware and software for RAID management.
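To make the parity idea concrete, here is a minimal sketch that assumes XOR parity, the scheme commonly used by single-parity RAID levels. The helper name `xor_blocks` and the block contents are made up for illustration; a real controller works on fixed-size sectors in hardware or firmware.

```python
from functools import reduce
from typing import Sequence


def xor_blocks(blocks: Sequence[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Three data blocks, one per data drive (illustrative contents).
data = [b"RAID", b"disk", b"1988"]
parity = xor_blocks(data)

# Simulate losing the second drive, then regenerate its block from
# the surviving data blocks plus the parity block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt)  # b'disk'
```

Because XOR is its own inverse, the same routine both generates the parity and regenerates a missing block.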

Data Striping and Stripe Sizes
Data striping is the foundation of RAID. Disk drives in a RAID group are partitioned into stripes, which may be as small as one kilobyte or as large as several megabytes. The stripe size is the amount of data written to one disk before moving on to the next disk in the array. The RAID controller divides data according to the stripe size and writes it across the drives. To maximize performance, RAID arrays should be configured with a stripe size that corresponds to the average I/O request size and the number of drives in the array (the stripe width). As the stripe size of an array is decreased, files are divided into smaller pieces and distributed across more drives. As the stripe size is increased, files are divided into larger pieces and distributed across fewer drives. In a perfect world where all of your files were the same size, you would take the size of your I/O request and divide it by the number of drives in the array (less one for parity) to get a stripe size optimized for throughput.
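The mapping and the rule of thumb described above can be sketched as follows. Both function names are hypothetical, and the mapping assumes plain round-robin striping with no parity blocks interleaved, so it illustrates the idea rather than any particular controller's layout.

```python
from typing import Tuple


def locate(offset: int, stripe_size: int, num_disks: int) -> Tuple[int, int]:
    """Map a logical byte offset to (disk index, byte offset on that disk)."""
    unit = offset // stripe_size   # which stripe-sized unit holds the byte
    disk = unit % num_disks        # units rotate across the disks in turn
    row = unit // num_disks        # units already placed on that disk
    return disk, row * stripe_size + offset % stripe_size


def ideal_stripe_size(io_request_size: int, num_disks: int,
                      parity_disks: int = 1) -> int:
    """Rule of thumb from the text: request size / (drives less parity)."""
    data_disks = num_disks - parity_disks
    if data_disks < 1:
        raise ValueError("need at least one data disk")
    return io_request_size // data_disks


KIB = 1024
print(locate(0, 64 * KIB, 4))             # (0, 0)
print(locate(64 * KIB, 64 * KIB, 4))      # (1, 0)
print(locate(4 * 64 * KIB, 64 * KIB, 4))  # (0, 65536)
print(ideal_stripe_size(256 * KIB, 5))    # 65536
```

In the last line, a 256 KiB request on a five-drive array with one drive's worth of parity yields a 64 KiB stripe size, so one request touches each data drive exactly once.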

Benefits of RAID

Data Protection and Availability
RAID protects your data in the event of a drive failure. If a disk drive fails in a properly designed RAID system, network clients are unaware of the incident and continue their work as if nothing had happened. The RAID system continues to perform read/write operations. If a hot spare is available, it automatically becomes part of the array, and the data that was on the failed drive is automatically regenerated onto it.

Without a RAID system, if a disk drive fails you may suffer the following economic costs:

  • Employee downtime costs
  • Emergency service costs
  • Data restoration costs
  • Data re-entry costs
  • Employee downtime costs during data re-entry
  • Lost sales costs
  • Lost customer costs
  • Lost opportunity costs
  • Intangible costs due to work day disruption

Think of a RAID system as an insurance policy for your data.

Performance
A RAID system provides increased performance over other storage systems by distributing I/O load evenly across disks with no system management or application involvement.

Data striping balances the I/O load across all the disk drives in an array. With multi-user operating systems such as Windows NT, Unix and NetWare, which support overlapped disk I/O across multiple drives, data striping keeps all the drives in the array busy and makes efficient use of storage resources. Striping provides higher performance because all drives are involved as much as possible.

Hardware and Software RAID

Hardware RAID
Hardware RAID is very efficient because it does not occupy host system memory or consume CPU cycles. It functions independently of the operating system. Hardware RAID is also highly fault tolerant because the array logic resides in hardware and no software is required to boot.

Software RAID
Software-based RAID systems cost less, but they use system resources. They occupy host memory, consume CPU cycles and are operating system dependent. Software is also required for the array to boot, and some implementations require a separate boot drive that is not part of the array.

External RAID Subsystems and Internal Host Based RAID

Internal Host Based RAID
In this case the RAID controller resides in the host system. Host based RAID offers high performance because disks can be striped over multiple channels and transfer rates are increased.

External SCSI to SCSI RAID
In this case the RAID controller resides in the external enclosure along with the disk drives. The RAID system is attached to a SCSI or Fibre Channel host adapter in the host. External RAID systems can easily be transferred to another host in the event of a host failure.

RAID levels

RAID levels can be classified according to their method of handling data. Here are just a few.

 

Level 0: Data Striping
  • Description: By definition RAID level 0 is not RAID because it does not provide data redundancy. Data is striped at the block level across all drives without parity.
  • Redundancy: None. If a drive fails all data is lost.
  • Performance: Highest performance because there is no parity related overhead.
  • Disk Utilization: 100%

Level 1: Disk Mirroring
  • Description: Data is written to both a primary disk and a secondary disk. Identical data is stored on both disks.
  • Redundancy: A mirrored set of drives is created. If a drive fails data is still available.
  • Performance: High performance in read intensive applications. If one drive is busy data can be accessed from the secondary disk. Medium write performance.
  • Disk Utilization: 50%

Level 1+0: Data Striping w/ Mirroring
  • Description: Also referred to as RAID level 10, RAID 1+0 combines RAID levels 0 and 1 by striping data across multiple mirrored pairs of disk drives.
  • Redundancy: Data is striped over mirrored sets. Can sustain the loss of more than one disk as long as the failed disks are not in the same mirrored set.
  • Performance: High performance because there is no parity related overhead and data is striped.
  • Disk Utilization: 50%

Level 3: Data Striping w/ dedicated parity and Parallel Access
  • Description: Data is segmented at a byte level across all disk drives of a RAID set, with one drive dedicated for parity. Data is accessed in parallel.
  • Redundancy: One drive is dedicated for parity. Data is regenerated in the event of a drive failure.
  • Performance: High read performance in data intensive applications because data is accessed in parallel. High transfer rates/low transaction rates. Poor write performance in multi-user environments. Write operations cannot be overlapped.
  • Disk Utilization: (N-1)/N (N = # of disks)

Level 4: Data Striping w/ dedicated parity and Independent Access
  • Description: Data is striped at a block level across an array of disk drives with one drive dedicated to parity. Data is accessed independently instead of in parallel.
  • Redundancy: One drive is dedicated for parity. Data is regenerated in the event of a drive failure.
  • Performance: High read performance in transaction intensive applications with many read requests because data is accessed independently. Poor write performance in multi-user environments because write operations cannot be overlapped.
  • Disk Utilization: (N-1)/N (N = # of disks)

Level 5: Data Striping w/ distributed parity and Independent Access
  • Description: Data is striped across a group of disk drives with distributed parity. Parity information is written to a different disk in the array for each stripe.
  • Redundancy: Parity is distributed across the disks in the array. Data is regenerated in the event of a drive failure.
  • Performance: High read performance in multiprocessing environments because there is no contention for the parity disk and I/O operations can be overlapped.
  • Disk Utilization: (N-1)/N (N = # of disks)
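The disk utilization figures listed above translate directly into usable capacity. The sketch below is illustrative only: the function names, the level labels used as strings and the 500 GB drive size are assumptions, and the two-disk minimum is just a sanity check rather than a statement about any product.

```python
def disk_utilization(level: str, num_disks: int) -> float:
    """Fraction of raw capacity available for data, per the figures above."""
    if num_disks < 2:
        raise ValueError("an array needs at least two disks")
    if level == "0":
        return 1.0                           # striping only, no redundancy
    if level in ("1", "1+0"):
        return 0.5                           # every block is stored twice
    if level in ("3", "4", "5"):
        return (num_disks - 1) / num_disks   # one disk's worth of parity
    raise ValueError(f"unknown RAID level: {level!r}")


def usable_capacity(level: str, num_disks: int, disk_size_gb: float) -> float:
    """Usable capacity in GB for an array of identical disks."""
    return disk_utilization(level, num_disks) * num_disks * disk_size_gb


# Eight hypothetical 500 GB drives at each level.
for level in ("0", "1+0", "5"):
    print(f"RAID {level}: {usable_capacity(level, 8, 500.0):,.0f} GB usable")
```

With eight drives, the parity-based levels give up one drive's worth of capacity while mirroring gives up four, which is the cost trade-off to weigh when choosing a level.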

Which RAID Levels are right for my application?

RAID Level 0 is the fastest and most cost-effective array type but provides no data protection. Reliability is actually less than that of a single disk drive. RAID Level 0 is good for high speed streaming of large file reads of non-critical data.

RAID Levels 1 and 1+0 are good choices in environments where performance and data protection are more important than cost. They are well suited to hosting an operating system, host applications, and databases with a high rate of random write transactions.

RAID Levels 3 and 4 are suitable for data intensive environments where large blocks of data are accessed sequentially. They have faster write performance than RAID 5 but do not allow multiple simultaneous write operations, so they are not suitable for multi-user environments.

RAID Level 5 is best suited to multi-user, I/O intensive environments where large numbers of small concurrent requests are performed. RAID Level 5 is the most commonly used RAID level today, especially in network environments where a performance difference between striping and mirroring cannot be detected.

RAID Terminology

Hot Swappable Disk Drives
This feature provides the ability to perform a drive replacement while the system is running and on-line. A disk drive can be disconnected and removed or replaced without bringing the system down. This is needed for the system to perform a data rebuild on-line in the event of a drive failure and maintain 24-hour data availability.

Hot Spare Disks
These are disk drives dedicated as spares for a RAID set. In the event of a disk failure they automatically come on-line and replace the failed disk. Data is automatically regenerated.

Global Hot Spare Disks
These are disk drives available as spares for multiple RAID sets. In the event of a disk failure in any RAID set, they automatically come on-line and replace the failed disk. Data is automatically regenerated.

Data Regeneration
When a disk drive in an array fails and is replaced, the RAID controller regenerates the data that was on the failed disk by using the parity bits to calculate the lost data and writing it to the replacement drive. Between the time the disk fails and the time the new one is fully regenerated, the controller continues to serve reads of data that was on the failed disk by using the parity to calculate the requested data. During this time the RAID controller also handles write operations aimed at the failed drive by temporarily writing the information to another drive; when the regeneration is complete, it transfers that information to the new drive.
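The degraded read and the rebuild described above can be sketched with a toy array. This is an illustration under stated assumptions, not a controller implementation: the class `ParityArray` and its methods are hypothetical, it assumes XOR parity stored on a dedicated last disk, it handles a single failed disk only, and it does not model the temporary redirection of writes.

```python
from functools import reduce
from typing import List, Optional


def xor_all(blocks: List[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


class ParityArray:
    """Toy array: each disk is a list of blocks; the last disk holds parity."""

    def __init__(self, data_disks: List[List[bytes]]) -> None:
        # One parity block per row (same block position on every data disk).
        parity = [xor_all(list(row)) for row in zip(*data_disks)]
        self.disks: List[Optional[List[bytes]]] = (
            [list(d) for d in data_disks] + [parity]
        )

    def fail(self, disk: int) -> None:
        self.disks[disk] = None  # the drive's contents are gone

    def read(self, disk: int, row: int) -> bytes:
        if sum(d is None for d in self.disks) > 1:
            raise RuntimeError("single parity cannot survive two failures")
        target = self.disks[disk]
        if target is not None:
            return target[row]
        # Degraded read: XOR the same row from every surviving disk.
        return xor_all([d[row] for d in self.disks if d is not None])

    def rebuild(self, disk: int) -> None:
        """Regenerate every block of a failed disk onto its replacement."""
        rows = len(next(d for d in self.disks if d is not None))
        self.disks[disk] = [self.read(disk, r) for r in range(rows)]


array = ParityArray([[b"AAAA", b"BBBB"], [b"CCCC", b"DDDD"], [b"EEEE", b"FFFF"]])
array.fail(1)
print(array.read(1, 0))  # b'CCCC', calculated from the survivors
array.rebuild(1)
print(array.disks[1])    # [b'CCCC', b'DDDD']
```

Reads of the failed disk keep working before the rebuild because each missing block is recomputed on demand from the remaining disks.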

Automatic Rebuild
Data is automatically regenerated to a hot spare disk in the event of a disk failure.

Variable Stripe Size
The size of the stripe partition on the drives in an array can be user defined to optimize performance for a particular application.

Read Ahead Cache
A caching strategy in which the RAID controller anticipates the next request for data and instructs the drives to read the data adjacent to the requested data and put it in cache. When that data is requested it is read from cache and performance is improved. This is recommended for data that is read sequentially and not randomly.
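A minimal sketch of the read-ahead idea follows. The class `ReadAheadCache`, its `read_ahead` parameter and the list standing in for the drives are all hypothetical; real controllers use more elaborate detection of sequential access and bounded cache memory.

```python
from typing import Dict, List


class ReadAheadCache:
    """Toy read-ahead: on a miss, fetch the block plus the next few."""

    def __init__(self, backing: List[bytes], read_ahead: int = 2) -> None:
        self.backing = backing          # stands in for the disk drives
        self.read_ahead = read_ahead    # extra adjacent blocks to prefetch
        self.cache: Dict[int, bytes] = {}
        self.disk_reads = 0             # number of trips to the "drives"

    def read(self, block: int) -> bytes:
        if not 0 <= block < len(self.backing):
            raise IndexError("block out of range")
        if block not in self.cache:
            self.disk_reads += 1
            end = min(block + 1 + self.read_ahead, len(self.backing))
            for i in range(block, end):
                self.cache[i] = self.backing[i]
        return self.cache[block]


drives = [bytes([i]) * 4 for i in range(6)]   # six small blocks
cache = ReadAheadCache(drives, read_ahead=2)
for block in range(6):                        # sequential access
    cache.read(block)
print(cache.disk_reads)  # 2
```

Six sequential reads cost only two trips to the drives here. With random access most prefetched blocks would never be used, which is why read-ahead is recommended for sequential data.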

Write Back Cache
A caching scheme in which the RAID controller sends a completion status to the host operating system as soon as the cache receives the data during a write operation. The target drive will receive the data at a more convenient time in order to improve performance. If a power outage occurs, all data in cache is lost unless a UPS or battery backup module on the RAID controller is in use.

Write Through Cache
A caching scheme in which data is written to the disk drive before a completion status is sent to the host operating system. This scheme performs worse than Write Back Cache, but no acknowledged data is lost if power fails.
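The difference between the two write caching schemes can be sketched with a toy controller. Everything here is hypothetical (the class `CachedController`, its methods, and the dictionaries standing in for cache memory and drives), and the power loss models a controller with no UPS or battery backup module.

```python
from typing import Dict


class CachedController:
    """Toy controller contrasting write-back and write-through caching."""

    def __init__(self, write_back: bool) -> None:
        self.write_back = write_back
        self.cache: Dict[int, bytes] = {}  # volatile controller cache
        self.disk: Dict[int, bytes] = {}   # persistent storage

    def write(self, block: int, data: bytes) -> str:
        self.cache[block] = data
        if not self.write_back:
            # Write-through: data reaches the disk before completion is sent.
            self.disk[block] = data
        # Write-back: completion is sent as soon as the cache has the data.
        return "complete"

    def flush(self) -> None:
        """Copy cached data to disk at a 'more convenient time'."""
        self.disk.update(self.cache)

    def power_loss(self) -> None:
        """Lose the cache contents (no battery backup assumed)."""
        self.cache.clear()


write_back = CachedController(write_back=True)
write_back.write(0, b"payroll")
write_back.power_loss()
print(0 in write_back.disk)    # False: acknowledged data was lost

write_through = CachedController(write_back=False)
write_through.write(0, b"payroll")
write_through.power_loss()
print(0 in write_through.disk)  # True: data was on disk before the ack
```

Write-back acknowledges sooner and so performs better, at the cost of the exposure shown in the first case.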