RAID - Redundant Array of Independent
Disks
What is RAID?
RAID is a family of techniques for managing multiple disks to
provide desirable cost, data availability and performance characteristics
to host environments. It is an acronym that, in its most current
definition set by the RAID Advisory Board (RAB), stands for "Redundant
Array of Independent Disks". When RAID was first introduced,
the emphasis was placed on the fact that the technology used multiple
disks to achieve desirable cost and performance (originally, the
acronym stood for "Redundant Array of Inexpensive Disks").
Today, "Redundancy" is the key focus in RAID technology.
Protecting data is what RAID is all about.
In 1988, David Patterson, Garth Gibson and Randy Katz, researchers
at the University of California, Berkeley, published a paper entitled
"A Case for Redundant Arrays of Inexpensive Disks (RAID)".
At that time it was predicted that with rapid improvements in
software file systems and computer processing power, the slower
moving disk technology would soon create what they referred to
as "an I/O Crisis" (a bottleneck). Their basic idea
of RAID was to combine multiple small, inexpensive disks into
an array that outperforms a Single Large Expensive Drive (SLED).
This array of disks would be arranged in such a way that they
would appear to a computer as a single logical drive.
The bad news: reliability. They calculated that the Mean Time
To Failure (MTTF) of an array would be equal to that of a single
disk drive divided by the number of disk drives in the array.
With this increased potential for data loss from a single drive
failure, they defined five array architectures (RAID Levels 1 through 5),
each providing disk fault tolerance and each having different characteristics
in order to achieve maximum performance in different environments.
An additional non-redundant architecture (RAID Level 0) was also
defined. RAID level 0 would dramatically increase performance
over a single drive, but would provide no data protection in case
of a drive failure.
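The reliability estimate above can be worked through with a small sketch. The function name and the 30,000-hour single-disk MTTF are assumptions chosen for illustration, not figures taken from this article.

```python
def array_mttf_hours(disk_mttf_hours: float, num_disks: int) -> float:
    """MTTF of a non-redundant array: single-disk MTTF divided by the disk count."""
    if num_disks < 1:
        raise ValueError("num_disks must be at least 1")
    return disk_mttf_hours / num_disks


# Assumed single-disk MTTF of 30,000 hours (illustrative value only).
for n in (1, 10, 100):
    print(f"{n:>3} disks -> array MTTF of {array_mttf_hours(30_000, n):,.0f} hours")
```

Under this simple model, every disk added to a non-redundant array shortens the expected time to the first failure, which is why the redundant levels were proposed.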
Now, to keep things in perspective, let's look at the drives used
in their research. They used a top-of-the-line IBM 3380 mainframe
disk, a Fujitsu M2361A "Super Eagle" minicomputer disk
and a Conner Peripherals CP 3100 personal computer disk. The IBM
mainframe disk had a formatted data capacity of 7500 MB, 14-inch
diameter disk platters, required an external power supply, cost
approximately $100,000 and provided a data transfer rate of 3
MB/sec. The Fujitsu drive capacity was 600 MB, with 10.5-inch
platters, weighed in at approximately 140 pounds, cost roughly
$12,000 and had a data transfer rate of 2.5 MB/sec. The Conner
personal computer drive (the 'inexpensive disks') had a capacity
of 100 MB, with 3.5-inch disks, cost under $1,000 and had a data
transfer rate of 1 MB/sec.
In today's storage industry, the SLED is dead, and RAID arrays
live on. Virtually the same capacity drives are now available
to mainframe and enterprise systems, as well as individual personal
computers. The difference is the technology that is used to implement
the RAID and the level of data protection that is required. From
high-end, fully redundant, systems that provide protection against
any single point (and in some cases, multiple points) of failure,
to single drives in personal computers with advanced backup and
recovery options, the focus in today's storage industry is data
availability and integrity. Therefore, many RAID implementations
include fault-tolerant components such as smart disk technology,
enclosure and environment monitoring, support for multiple RAID
level arrays, automatic rebuilding of a failed array, global and
dedicated on-line spares, redundant components (controllers, power
supplies and cooling devices), server clustering, failover schemes,
as well as sophisticated software monitoring, reporting and notification
programs. Current technologies used to implement RAID solutions
include Fibre Channel (optical and copper), SCSI, iSCSI, Parallel
ATA, Serial ATA, and Serial Attached SCSI. Often, combinations
of technologies are used.
The basis of RAID is that an intelligent manager (a hardware RAID
controller or RAID software) can manage an array of disk drives
in such a way that data is protected in the event of a single
drive failure. By mapping segments of data across an array of
disk drives, the array appears as one logical drive unit. By generating
parity bits and storing them on the array, data can be regenerated
if a disk drive fails. The manner in which data is placed onto
the array provides for increased performance in different data
structure environments. Today's controllers, as well as drives,
are much more sophisticated and typically incorporate fault tolerant
monitoring features (SAF-TE) and user-friendly firmware and software
for RAID management.
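As a minimal sketch of how parity lets a lost block be regenerated, the example below uses byte-wise XOR, the usual parity function for single-parity RAID levels. The helper name `xor_blocks` and the sample data are hypothetical; a real controller works on fixed-size disk blocks rather than short byte strings.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equal-length byte blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One block per data drive (illustrative contents).
data = [b"RAID", b"disk", b"sets"]
parity = xor_blocks(data)

# Simulate losing the second drive: XOR the survivors with the parity block.
recovered = xor_blocks([data[0], data[2], parity])
print(recovered == data[1])
```

Because XOR is its own inverse, the same routine both generates the parity and regenerates any single missing block.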
Data striping is the foundation of RAID. Disk drives in a RAID
group are partitioned into stripes, which may be as small as one
kilobyte or as large as several megabytes. The stripe size is
the amount of data that is written to one disk before moving on
to the next disk in the array. Data is divided up by the RAID
controller, according to the stripe size, and written across the
drives. To maximize performance RAID arrays should be configured
with stripe sizes that correspond to the average I/O request size
and the number of drives in the array (stripe width). As the stripe
size of an array is decreased, files are divided up into smaller
pieces and distributed across more drives. As the stripe size
of an array is increased, files are divided up into larger pieces
and distributed across fewer drives. In a perfect world if all
of your files were the same size, you would take the size of your
I/O request and divide it by the number of drives in the array
(less one for parity) to get your stripe size optimized for throughput.
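The striping description above can be sketched as a mapping from a logical byte offset to a drive and an offset on that drive, together with the rule of thumb for a throughput-oriented stripe size. Both function names are hypothetical, and the mapping assumes plain striping with no parity rotation.

```python
def locate(byte_offset: int, stripe_size: int, num_disks: int):
    """Map a logical byte offset to (disk index, byte offset on that disk)."""
    chunk = byte_offset // stripe_size   # which stripe-sized chunk holds the byte
    disk = chunk % num_disks             # chunks are dealt out round-robin
    row = chunk // num_disks             # how many chunks precede it on that disk
    return disk, row * stripe_size + byte_offset % stripe_size


def throughput_stripe_size(io_request_size: int, num_disks: int, parity_disks: int = 1) -> int:
    """I/O request size divided by the number of data-bearing drives."""
    data_disks = num_disks - parity_disks
    if data_disks < 1:
        raise ValueError("the array needs at least one data drive")
    return io_request_size // data_disks


KIB = 1024
print(locate(0, 64 * KIB, 4))                 # first chunk, first drive
print(locate(64 * KIB, 64 * KIB, 4))          # second chunk, second drive
print(throughput_stripe_size(256 * KIB, 5))   # 256 KiB requests, 5 drives, 1 for parity
```

With a stripe size chosen this way, a single request of the assumed size touches every data drive exactly once.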
Benefits of RAID
RAID protects your data in the unlikely event of a drive failure.
If a disk drive fails in a properly designed RAID system, network
clients are unaware of the incident and they continue on with
their work as if nothing happened. The RAID system continues to
perform read/write operations and if a hot spare is available
it automatically becomes part of the array and data that was on
the failed drive is automatically regenerated onto this new drive
in the array.
Without a RAID system, if a disk drive fails you may suffer the
following economic costs:
- Employee downtime costs
- Emergency service costs
- Data restoration costs
- Data re-entry costs
- Employee downtime costs during data re-entry
- Lost sales costs
- Lost customer costs
- Lost opportunity costs
- Intangible costs due to work day disruption
Think of a RAID system as an insurance
policy for your data.
A RAID system provides increased performance over other storage
systems by distributing I/O load evenly across disks with no system
management or application involvement.
Data striping balances the I/O load across all the disk drives
in an array. With multi-user operating systems like Windows NT,
Unix and NetWare that support overlapped disk I/O across multiple
drives, data striping keeps all the drives in the array busy and
provides for efficient use of storage resources. Striping provides
higher performance because all drives are involved as much as
possible.
Hardware and Software RAID
Hardware RAID is very efficient because it does not occupy host
system memory or consume CPU cycles. It functions independently
of the operating system. Hardware RAID is also highly fault tolerant
because the array logic is based in hardware and software is not
required to boot.
Software based RAID systems offer a lower cost but they utilize
system resources. They occupy host memory, consume CPU cycles
and are operating system dependent. Software is also required
for the array to boot and some implementations require a separate
boot drive not included in the array.
External RAID Subsystems and Internal Host Based
RAID
With internal host-based RAID, the RAID controller resides in the host
system. Host-based RAID offers high performance because disks can be striped
over multiple channels and transfer rates are increased.
With an external RAID subsystem, the RAID controller resides in the external
enclosure along with the disk drives. The RAID system is attached to a SCSI
or Fibre Channel host adapter in the host. External RAID systems
can easily be transferred to another host in the event of a host
failure.
RAID levels
RAID levels can be classified according to their method of handling
data. Here are just a few.
| Level | Description | Redundancy | Performance | Disk Utilization |
|---|---|---|---|---|
| Level 0: Data Striping | By definition RAID level 0 is not RAID because it does not provide data redundancy. Data is striped at the block level across all drives without parity. | None. If a drive fails all data is lost. | Highest performance because there is no parity-related overhead. | 100% |
| Level 1: Disk Mirroring | Data is written at the file level to a primary disk and a secondary disk. Identical data is stored on both disks. | A mirrored set of drives is created. If a drive fails data is still available. | High performance in read-intensive applications. If one drive is busy data can be accessed from the secondary disk. Medium write performance. | 50% |
| Level 1+0: Data Striping w/ Mirroring | Also referred to as RAID level 10, RAID 1+0 combines RAID levels 0 and 1 by striping data across multiple mirrored pairs of disk drives. | Data is striped over mirrored sets. Can sustain the loss of more than one disk as long as the failed disks are not in the same mirrored set. | High performance because data is striped and there is no parity-related overhead. | 50% |
| Level 3: Data Striping w/ dedicated parity and Parallel Access | Data is segmented at a byte level across all disk drives of a RAID set, with one drive dedicated for parity. Data is accessed in parallel. | One drive is dedicated for parity. Data is regenerated in the event of a drive failure. | High read performance in data-intensive applications because data is accessed in parallel. High transfer rates/low transaction rates. Poor write performance in multi-user environments. Write operations cannot be overlapped. | (N-1)/N (N = number of disks) |
| Level 4: Data Striping w/ dedicated parity and Independent Access | Data is striped at a block level across an array of disk drives with one drive dedicated to parity. Data is accessed independently instead of in parallel. | One drive is dedicated for parity. Data is regenerated in the event of a drive failure. | High read performance in transaction-intensive applications with high read request rates because data is accessed independently. Poor write performance in multi-user environments because write operations cannot be overlapped. | (N-1)/N (N = number of disks) |
| Level 5: Data Striping w/ distributed parity and Independent Access | Data is striped across a group of disk drives with distributed parity. Parity information is written to a different disk in the array for each stripe. | Parity is distributed across the disks in the array. Data is regenerated in the event of a drive failure. | High read performance in multiprocessing environments because there is no contention for the parity disk and I/O operations can be overlapped. | (N-1)/N (N = number of disks) |
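The disk utilization figures for these levels can be checked with a short sketch. The function name and the level labels passed as strings are hypothetical, and only the levels described here are covered.

```python
def usable_fraction(level: str, num_disks: int) -> float:
    """Fraction of raw capacity available for data at a given RAID level."""
    if num_disks < 2:
        raise ValueError("an array needs at least two disks")
    if level == "0":
        return 1.0                              # striping only, no redundancy
    if level in ("1", "1+0"):
        return 0.5                              # every block is stored twice
    if level in ("3", "4", "5"):
        return (num_disks - 1) / num_disks      # one disk's worth of parity
    raise ValueError(f"unsupported RAID level: {level}")


for level in ("0", "1+0", "5"):
    print(f"RAID {level} on 6 disks: {usable_fraction(level, 6):.0%} usable")
```

Note how the parity levels approach full utilization as disks are added, while the mirrored levels stay at half regardless of array size.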
Which RAID Levels are right for my application?
RAID Level 0 is the fastest and most cost-effective
array type but provides no data protection. Reliability is actually
less than that of a single disk drive. RAID Level 0 is good for
high speed streaming of large file reads of non-critical data.
RAID Levels 1 and 1+0 are good choices in environments
where performance and data protection are more important than
cost. RAID Levels 1 and 1+0 are good for hosting an operating
system, host applications, and high random write transaction databases.
RAID Level 3 is suitable for data intensive environments
where large blocks of data are being accessed sequentially. It
has faster write performance than RAID 5, but does not allow multiple
simultaneous write operations so it is not suitable for multi-user
environments.
RAID Level 5 is best suited for multi-user I/O intensive
environments where large numbers of small concurrent requests
are being performed. RAID Level 5 is the most commonly used RAID
level today especially in network environments where a performance
difference between striping and mirroring cannot be detected.
RAID Terminology
Hot Swap
This feature provides the ability to perform a drive replacement
while the system is running and on-line. A disk drive can be disconnected
and removed or replaced without bringing the system down. This
is needed for the system to perform a data rebuild on-line in
the event of a drive failure and maintain 24-hour data availability.
Dedicated Hot Spares
These are disk drives that are dedicated as spares for a RAID
set and in the event of a disk failure they automatically become
on-line and replace the failed disk. Data is automatically regenerated.
Global Hot Spares
These are disk drives that are available as spares for multiple
RAID sets and in the event of a disk failure from any RAID set,
they automatically become on-line and replace the failed disk.
Data is automatically regenerated.
Regeneration
When a disk drive in an array fails and is replaced, the RAID
controller regenerates the data that was on the failed disk by
utilizing the parity bits to calculate the lost data and write
it to a replacement drive. During the time between when the disk
failed and the new one is regenerated the controller continues
to read data that was on the failed disk by utilizing the parity
to calculate and respond to the read request. During this time the
RAID controller also handles write operations addressed to the failed
drive by temporarily writing the information to another drive; when
the regeneration is complete, it transfers the information to the
new drive.
Automatic Rebuild
Data is automatically regenerated to a hot spare disk in the event
of a disk failure.
User-Definable Stripe Size
The size of the stripe partition on the drives in an array can
be user defined to optimize performance for a particular application.
Read-Ahead Cache
A caching strategy in which the RAID controller anticipates the
next request for data and instructs the drives to read the data
adjacent to the requested data and put it in cache. When that
data is requested it is read from cache and performance is improved.
This is recommended for data that is read sequentially and not
randomly.
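A minimal sketch of the read-ahead idea described above, assuming a list of blocks stands in for the drive. The class name, the `read_ahead` parameter and the `disk_reads` counter are hypothetical and exist only to make the effect visible.

```python
class ReadAheadCache:
    """On a cache miss, fetch the requested block plus the blocks that follow it."""

    def __init__(self, disk_blocks, read_ahead=4):
        self.disk = disk_blocks      # list standing in for the physical drive
        self.read_ahead = read_ahead
        self.cache = {}              # block number -> block contents
        self.disk_reads = 0          # number of trips to the drive

    def read(self, block_number):
        if block_number not in self.cache:
            # One drive access brings in the adjacent blocks as well.
            end = min(block_number + 1 + self.read_ahead, len(self.disk))
            self.disk_reads += 1
            for i in range(block_number, end):
                self.cache[i] = self.disk[i]
        return self.cache[block_number]


cache = ReadAheadCache(list(range(100)), read_ahead=4)
sequential = [cache.read(i) for i in range(10)]
print(cache.disk_reads)  # far fewer drive accesses than blocks read
```

A random access pattern would miss the cache on most reads and gain little, which matches the recommendation to use read-ahead for sequential data.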
Write-Back Cache
A caching scheme in which the RAID controller sends a completion
status to the host operating system as soon as the cache receives
the data during a write operation. The target drive will receive
the data at a more convenient time in order to improve performance.
If a power outage occurs, all data in cache is lost unless a UPS
or battery backup module on the RAID controller is in use.
Write-Through Cache
A caching scheme in which data is written to the disk drive before
a completion status is sent to the host operating system. This
is a lower performing caching scheme than Write-Back.
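The difference between the two write caching schemes can be illustrated with a toy model. The `CachedController` class and its methods are hypothetical, dictionaries stand in for the cache and the drive, and the power-loss case assumes no UPS or battery backup module is present.

```python
class CachedController:
    """Toy controller that acknowledges writes in write-back or write-through mode."""

    def __init__(self, write_back: bool):
        self.write_back = write_back
        self.cache = {}   # volatile: block number -> data waiting to be flushed
        self.disk = {}    # stable storage: block number -> data

    def write(self, block, data):
        if self.write_back:
            self.cache[block] = data   # acknowledge now, write to disk later
        else:
            self.disk[block] = data    # write to disk before acknowledging
        return "complete"

    def flush(self):
        """Move cached writes to the drive at a convenient time."""
        self.disk.update(self.cache)
        self.cache.clear()

    def power_loss(self):
        """Assumes no battery backup: unflushed cache contents are lost."""
        self.cache.clear()


wb = CachedController(write_back=True)
wb.write(7, b"payroll")
wb.power_loss()
print(7 in wb.disk)   # the acknowledged write never reached the drive

wt = CachedController(write_back=False)
wt.write(7, b"payroll")
wt.power_loss()
print(7 in wt.disk)   # the write was on the drive before it was acknowledged
```

The model leaves out timing, but it shows the trade-off: write-back acknowledges sooner, while write-through never acknowledges data that is not yet on the drive.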