RAID Systems (Redundant Array of Independent Disks)

Introduction

More than simply storing data, storage solutions should provide efficient and timely access to information and, depending on the case, offer some kind of protection against failures. This is where RAID (Redundant Array of Independent Disks) systems come into play.

In the following lines, AbbreviationFinder explains what RAID is and presents its main levels.

What is RAID?

As already mentioned, RAID is the acronym for Redundant Array of Independent Disks. It is basically a computing solution that combines several hard disks (HDs) to form a single logical unit of data storage.

And what is a logical unit? In short, where RAID is concerned, it means that the operating system sees the set of hard drives as a single storage unit, regardless of the number of devices in use. Today, in addition to HDs, it is also possible to build RAID systems based on SSDs.

Making several storage units work together opens up a number of possibilities:

– If an HD is damaged, the data on it is not lost, because it can be replicated on another drive (redundancy);

– Storage capacity can be increased at any time by adding more hard drives;

– Access to information can become faster, because the data is distributed across all the disks;

– Depending on the case, fault tolerance is greater, because the system does not come to a halt if a drive stops working;

– A RAID system can be cheaper than a more sophisticated storage device and, at the same time, offer almost the same results.

RAID levels

To create a RAID system, at least two hard drives (or SSDs) are required. But that is not all: the RAID level of the system must also be chosen. Each level has distinct characteristics, precisely to meet a wide variety of requirements. The most common levels are described below:

RAID 0 (zero)

Also known as striping, RAID 0 is the level in which data is divided into small segments that are distributed among the disks. This level offers no protection against failures, since it has no redundancy. This means that a failure in any one of the disks can cause the loss of information for the whole system, especially because pieces of the same file can be stored on different disks.

The focus of RAID 0 is therefore performance, since the system practically adds up the data transfer speeds of the individual drives. Thus, at least in theory, the more disks there are in the system, the higher its transfer rate. It is not hard to see why: because the data is divided, the parts of a file are written to different units at the same time. If this process took place on a single HD, the write would be somewhat slower, as it would have to be done sequentially.
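
The division of data described above can be illustrated with a minimal Python sketch. The function names `stripe` and `unstripe`, the 4-byte segment size and the round-robin order are assumptions made for illustration; real controllers use much larger segments and work on raw blocks rather than Python lists.

```python
def stripe(data: bytes, num_disks: int, segment_size: int = 4) -> list:
    """Split data into fixed-size segments and deal them out round-robin.

    Hypothetical helper: each inner list stands in for one disk.
    """
    if num_disks < 2:
        raise ValueError("RAID 0 needs at least two disks")
    disks = [[] for _ in range(num_disks)]
    for start in range(0, len(data), segment_size):
        segment = data[start:start + segment_size]
        # Segment 0 goes to disk 0, segment 1 to disk 1, and so on.
        disks[(start // segment_size) % num_disks].append(segment)
    return disks


def unstripe(disks: list) -> bytes:
    """Reassemble the data by reading the segments back in round-robin order."""
    parts = []
    for row in range(max(len(disk) for disk in disks)):
        for disk in disks:
            if row < len(disk):
                parts.append(disk[row])
    return b"".join(parts)


disks = stripe(b"ABCDEFGHIJKLMNOPQRST", num_disks=2)
print(disks)            # pieces of the same "file" end up on both disks
print(unstripe(disks))  # both disks are needed to get the file back
```

Because every disk holds only some of the segments, losing any one list in `disks` makes the original data unrecoverable, which is the lack of redundancy mentioned earlier.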

Because of these characteristics, RAID 0 is widely used in applications that handle large volumes of data and cannot tolerate slowness, such as image processing and video editing.

RAID 1

RAID 1 is probably the best-known level. In it, one unit duplicates the other, that is, it keeps a copy of the first, which is why the level is also known as mirroring. With this, if the main disk fails, the data can be retrieved immediately because a copy exists on the other.

Note that, because of this characteristic, RAID 1 must work with pairs of drives, so that each unit always has a “clone”. In practice, this means that a RAID system made up of two hard drives of 500 GB each will have exactly that capacity, 500 GB, instead of 1 TB.

RAID 1 is clearly focused on data protection; it does not make access faster. In fact, there may even be a slight loss of performance, since each write has to happen twice, once on each unit.

It is important to note, however, that using RAID 1 does not do away with the need for backups. Because the data is replicated virtually in real time, if unwanted information is written to the first unit (such as a virus) or an important file is erased by mistake, the same will happen on the second disk. For this reason, RAID 1 is best suited to protecting the system against “physical” failures of the units.
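
A toy model can make both points concrete: the mirror survives the loss of one drive, but a deletion is mirrored just like a write. The class `MirroredPair` and its methods are hypothetical names, and two dictionaries stand in for the two drives; this is only a sketch of the behaviour, not how a real controller is built.

```python
from typing import Optional


class MirroredPair:
    """Toy RAID 1 array: two dictionaries stand in for the two drives."""

    def __init__(self) -> None:
        self.drives = [{}, {}]

    def write(self, name: str, data: bytes) -> None:
        # Every write is performed twice, once per drive.
        for drive in self.drives:
            drive[name] = data

    def delete(self, name: str) -> None:
        # Deletions are replicated too, which is why RAID 1 is not a backup.
        for drive in self.drives:
            drive.pop(name, None)

    def read(self, name: str, failed: Optional[int] = None) -> bytes:
        # Serve the read from the first drive that has not failed.
        for index, drive in enumerate(self.drives):
            if index != failed:
                return drive[name]
        raise IOError("no working drive left")


pair = MirroredPair()
pair.write("report.txt", b"quarterly numbers")
print(pair.read("report.txt", failed=0))  # still readable from the second drive
pair.delete("report.txt")
print(pair.drives)                        # the file is gone from both drives
```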

RAID 0+1 and RAID 10

As you may have guessed, RAID 0+1 is a “hybrid” system (hybrid RAID), that is, one that combines RAID 0 with RAID 1. For this, the system needs at least four storage units, two for each level. The result is a RAID solution that addresses both performance and redundancy.

There is a variation called RAID 10 (or RAID 1+0) that works in a similar way. The essential difference is that, in RAID 0+1, the system falls back to RAID 0 in case of a failure; in RAID 1+0, the system falls back to RAID 1.

RAID 5

RAID 5 is another quite well-known level. It also provides redundancy, but in a different way: instead of an entire storage unit acting as a replica, the disks themselves serve as the protection. This way, the system can even be built with an odd number of units. But how is this possible? Through the use of a parity scheme.

In this protection method, the data is divided into small blocks. Each block receives an additional bit, the parity bit, according to the following rule: if the number of ‘1’ bits in the block is even, its parity bit is ‘0’; if the number of ‘1’ bits is odd, the parity bit is ‘1’.
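
The rule above fits in a single line of Python. The function name `parity_bit` and the choice to represent a block as a string of ‘0’ and ‘1’ characters are assumptions made only for readability.

```python
def parity_bit(block: str) -> int:
    """Return 0 if the block holds an even number of '1' bits, otherwise 1."""
    return block.count("1") % 2


print(parity_bit("1100"))  # two '1' bits (even)
print(parity_bit("1101"))  # three '1' bits (odd)
```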

The parity information, like the data itself, is distributed among all the disks in the system. As a rule, the space dedicated to parity is equivalent to the size of one of the disks. Thus, an array formed by three 500 GB HDs will have 1 TB for storage and 500 GB for parity.

From there, if during a verification task the system finds, for example, that the parity bit of a block is ‘1’ but the block contains an even number of ‘1’ bits, it knows there is an error. If only one bit is affected and the system is able to identify it, the bit can be corrected immediately. The data can also be restored after the HD has been replaced.

As an example, imagine a block of data with the bits ‘110X’ and parity ‘1’. The X indicates a lost bit, but is it a ‘0’ or a ‘1’? Since the parity is ‘1’, the block must contain an odd number of ‘1’ bits. If X were ‘0’, the parity would also have to be ‘0’, since there would then be an even number of ‘1’ bits. This means that bit X can only be ‘1’.
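
The same reasoning can be written as a short Python sketch. The name `recover_bit` and the use of the character ‘X’ to mark the lost position are illustrative assumptions taken from the example above.

```python
def recover_bit(block: str, parity: int) -> str:
    """Fill in the single lost bit, marked 'X', using the stored parity bit."""
    if block.count("X") != 1:
        raise ValueError("exactly one lost bit can be recovered")
    known_ones = block.count("1")
    # The lost bit must make the total number of '1' bits agree with the
    # parity: an odd total for parity 1, an even total for parity 0.
    missing = (parity - known_ones) % 2
    return block.replace("X", str(missing))


print(recover_bit("110X", parity=1))  # → 1101
```

Note that parity alone can rebuild only one missing bit per block; with two unknown positions there would be more than one combination that satisfies the rule.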

During the replacement, the system can be kept in operation, especially with equipment that supports hot-swapping, that is, the exchange of components without shutting down the computer. This is possible because the data is distributed among all the disks. If one of them fails, the parity scheme makes it possible to recover its data from the information held on the other units.
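
At the scale of whole disks, the even/odd parity rule is equivalent to a bitwise XOR across the blocks stored on each drive. The sketch below assumes that XOR formulation; the helper `xor_blocks` and the two-byte sample blocks are hypothetical and chosen only to keep the example small.

```python
from functools import reduce


def xor_blocks(blocks: list) -> bytes:
    """XOR equally sized blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Two data blocks, each living on a different disk, plus their parity block.
data_blocks = [b"\x0f\xf0", b"\x33\x55"]
parity = xor_blocks(data_blocks)

# Suppose the disk holding data_blocks[1] fails: XOR-ing the surviving
# data block with the parity block reproduces the lost contents.
rebuilt = xor_blocks([data_blocks[0], parity])
print(rebuilt == data_blocks[1])  # → True
```

The same call works for any number of surviving blocks, which is why the array can keep serving data while the replacement drive is being rebuilt.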

RAID 6

RAID 5 is a quite attractive option for systems that need to combine redundancy with (relatively) low cost, but it has a considerable limitation: it can protect the system only if a single disk fails.

One way to deal with this is to add a feature called hot spare to the system. This is a scheme in which one or more disks are kept in reserve, coming into action as soon as a drive has problems.

Another interesting alternative is RAID 6. It is a more recent specification, similar to RAID 5 but with one important difference: it works with two parity bits. With this, it is possible to provide redundancy for up to two HDs in the system, instead of just one.
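
The capacity figures quoted for the levels above can be gathered into one small helper. This is a rough sketch under stated assumptions: all disks have the same size, RAID 1 is a single pair, and RAID 6 gives up the equivalent of two disks to parity with a minimum of four drives (the usual convention, not stated in the text). The name `usable_capacity_gb` is hypothetical.

```python
def usable_capacity_gb(level: str, disks: int, disk_size_gb: int) -> int:
    """Estimate usable space for an array of equally sized disks."""
    if level == "0":
        if disks < 2:
            raise ValueError("RAID 0 needs at least two disks")
        return disks * disk_size_gb          # no redundancy, all space usable
    if level == "1":
        if disks != 2:
            raise ValueError("this sketch models RAID 1 as a single pair")
        return disk_size_gb                  # second disk is a clone
    if level == "5":
        if disks < 3:
            raise ValueError("RAID 5 needs at least three disks")
        return (disks - 1) * disk_size_gb    # one disk's worth of parity
    if level == "6":
        if disks < 4:
            raise ValueError("RAID 6 is assumed to need at least four disks")
        return (disks - 2) * disk_size_gb    # two disks' worth of parity
    raise ValueError("unsupported level: " + level)


print(usable_capacity_gb("1", 2, 500))  # → 500
print(usable_capacity_gb("5", 3, 500))  # → 1000
print(usable_capacity_gb("6", 4, 500))  # → 1000
```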

RAID 2, 3 and 4

The RAID levels shown so far are the most widely used, but there are some lesser-known ones, among them RAID 2, RAID 3 and RAID 4:

RAID 2

RAID is a type of storage solution that appeared at the end of the 1980s. At that time, and in the years that followed, HDs did not have the same standard of reliability that they have today. For this reason, RAID 2 was created. It is, to a certain extent, similar to RAID 0, but includes an error detection mechanism of the ECC (Error Correcting Code) type. Today this level is hardly used, since virtually all HDs include this feature themselves.

RAID 3

This level is similar to RAID 5 in that it uses parity. The main difference is that RAID 3 reserves one storage unit just to hold the parity information, which is why at least three disks are needed to build the system. This level can also be more complex to implement, because read and write operations involve all of the disks together rather than treating them individually.

RAID 4

RAID 4 also uses the parity scheme and works much like RAID 3, with the difference that it splits the data into larger blocks and offers individual access to each disk in the system.

This level can suffer some loss of performance, because every write operation requires an update on the parity drive. For this reason, it is better suited to systems that prioritize reading data, that is, those that perform far more reads than writes.
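
The contrast between a dedicated parity drive (RAID 3 and RAID 4) and parity distributed among all disks (RAID 5) can be sketched as follows. The function `parity_disk`, the choice of the last disk as the dedicated parity drive and the particular rotation order used for RAID 5 are all assumptions for illustration; real implementations support several layouts.

```python
def parity_disk(level: int, stripe: int, num_disks: int) -> int:
    """Return the index of the disk holding parity for a given stripe."""
    if level in (3, 4):
        # Dedicated parity: every stripe updates the same drive.
        return num_disks - 1
    if level == 5:
        # Distributed parity: the parity position rotates stripe by stripe.
        return (num_disks - 1 - stripe) % num_disks
    raise ValueError("level has no parity in this sketch")


for level in (4, 5):
    layout = [parity_disk(level, stripe, num_disks=3) for stripe in range(4)]
    print(level, layout)
# → 4 [2, 2, 2, 2]
# → 5 [2, 1, 0, 2]
```

With the dedicated layout every write lands on disk 2, which is the bottleneck described above; with the rotating layout the parity updates are spread across all three disks.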

JBOD (Just a Bunch Of Disks)

When the subject is RAID, you may also hear about JBOD, short for Just a Bunch Of Disks. This is not a RAID level, but a method that simply allows two or more hard drives (regardless of capacity) to be used together so that the operating system sees the arrangement as a single logical unit.

In fact, JBOD is similar to RAID, but it does not focus on performance or redundancy; it only increases storage capacity. Here the data is simply written and, when one disk is filled to capacity, the operation continues on the next. This way, if an HD is damaged, the data on the other disks is not harmed.
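
The "fill one disk, then continue on the next" behaviour amounts to a simple address translation. In the sketch below, the function name `locate` and the disk sizes are hypothetical; sizes are given in arbitrary units and need not be equal.

```python
def locate(offset: int, disk_sizes: list) -> tuple:
    """Map an offset in the logical unit to (disk index, offset on that disk)."""
    if offset < 0:
        raise ValueError("offset must not be negative")
    for index, size in enumerate(disk_sizes):
        if offset < size:
            return index, offset
        # This disk is full before the offset is reached: move to the next one.
        offset -= size
    raise ValueError("offset beyond total capacity")


sizes = [100, 250]          # two disks of different capacity
print(locate(99, sizes))    # → (0, 99)
print(locate(100, sizes))   # → (1, 0)
```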

RAID implementation

In the past, building RAID systems was not the simplest of tasks, and their use was normally limited to servers. Today, however, they can be implemented even on personal computers, since practically any modern operating system (Windows, Linux, Mac OS X, etc.) supports this feature.

The easiest way to do this is to acquire a motherboard that has a RAID controller. In short, this device, which can work with PATA, SATA or SCSI interfaces, identifies the connected storage units and makes them work as a RAID system. Its configuration is usually done from the BIOS setup, although some control software may be provided to run on the operating system.

If the motherboard does not have a RAID controller, it is possible to install cards that add this function. These devices are usually found with a PCI or PCI Express interface. One example is a card that connects to the computer through a PCI Express slot and has four SATA connectors, to which the HDs (or SSDs) that will be part of the RAID system are attached.

A RAID system can also be implemented via software, without the need for controllers. In these cases, all the management is done by the operating system, so the computer needs a good hardware configuration so that it is not overloaded.

An important note: on motherboards, it is common to find RAID controllers that in fact combine software resources available in the operating system with some features that can be enabled via the BIOS. In these cases, the performance of the RAID system is usually lower than what a “real” controller can offer.

Conclusion

RAID is not a new invention. It appeared in 1987 at the hands of David Patterson, Garth Gibson and Randy Katz, at the time researchers at the University of California, Berkeley, in the United States. The question that hangs in the air is: is a technology that has been around for so long still useful today? The answer is a resounding yes.

Using RAID today can be much more advantageous than it was years ago. First, because costs have decreased. In the past, RAID could only be done with SCSI drives (which are more expensive), for example. Currently, RAID controllers are somewhat cheaper, compatible with multiple interfaces and relatively simple to implement.

In addition, there are now more applications that benefit from this type of system. So, even with the emergence of new data storage technologies, we will keep hearing about RAID for a long time to come.
