# RAID Arrays Explained



## CrAsHnBuRnXp (Nov 2, 2007)

I have compiled a list of RAID arrays so that people are more aware of what each one does and if it would be useful for them. I will provide advantages and disadvantages of each RAID array given. First, a brief history about RAID.

History of RAID: RAID (Redundant Array of Inexpensive [or Independent] Disks) was designed to solve the problem of providing high-capacity storage combined with data availability and redundancy. In the past, when hard drive capacities were limited and higher-capacity drives were expensive, a single drive offered little data protection or redundancy. Compounding the problem, CPU performance was increasing at an exponential rate, while disk subsystems were quickly falling behind and creating a bottleneck for server performance.

Back in 1988, a few researchers from the University of California-Berkeley came up with a set of guidelines for the original implementation of RAID. These guidelines would be referred to as RAID-1 through RAID-5; RAID-0 and RAID-6 were added to the naming scheme later. A higher number does NOT mean a better array: RAID-6 is not inherently better than, say, RAID-1 or RAID-5. Your needs will determine what RAID level is best for your current situation.

Now, when using RAID, it’s recommended to use drives of the same size. You can in fact use drives of various sizes in an array, but every drive will be treated as if it were the size of the smallest one, and the leftover space on the larger drives will go unused. For example, if you wanted to set up a RAID-0 array, it's recommended to have a minimum of two drives of the same size, such as 2x80GB. A RAID array combines all of the drives you are using and presents them as one single drive. So if you were to RAID-0 2x80GB hard drives, this would make a 160GB drive. (It will be less once you factor in formatting.)

*RAID-0 - Data Striping w/out parity– 2 disk minimum:*
Provides improved performance over a single, non-RAID drive, and provides additional storage space to work with. This RAID array splits data into blocks, which are distributed in turn across the hard drives in the array.
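To make the striping idea concrete, here is a minimal sketch in Python. It is not how any particular controller works; the function names `stripe` and `unstripe` and the tiny block size are made up for illustration, and each "disk" is just a list of blocks in memory.

```python
def stripe(data: bytes, num_disks: int, block_size: int) -> list[list[bytes]]:
    """Cut data into fixed-size blocks and deal them round-robin to the disks."""
    disks = [[] for _ in range(num_disks)]
    for start in range(0, len(data), block_size):
        block_index = start // block_size
        disks[block_index % num_disks].append(data[start:start + block_size])
    return disks


def unstripe(disks: list[list[bytes]]) -> bytes:
    """Rebuild the original data by taking one block from each disk in turn."""
    out = []
    longest = max(len(disk) for disk in disks)
    for row in range(longest):
        for disk in disks:
            if row < len(disk):
                out.append(disk[row])
    return b"".join(out)


if __name__ == "__main__":
    disks = stripe(b"ABCDEFGHIJ", num_disks=2, block_size=2)
    print(disks)
    print(unstripe(disks))
```

Because consecutive blocks sit on different drives, both drives can work on one large read or write at the same time, which is where the speedup comes from. It also shows why losing either drive loses every other block of every file.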

Array size: Size of Smallest Drive x Number of Drives

Advantages: This particular array is the easiest and cheapest to implement, and almost all controllers support RAID-0. It can make boot times quicker and make applications load faster.

Disadvantages: Not fault tolerant. In other words if one drive fails, all data is lost. 

Recommendation: Do not use it in an environment where data is of the utmost importance, such as a law firm or school corporation. If you implement this array, it is HIGHLY recommended that you schedule daily or weekly backups (preferably every couple of days or whenever you add new data). I would not use more than four drives either, because every additional drive raises the chance that one of them fails and takes the whole array with it. RAID-0 is best suited to an environment where applications require a high amount of performance, such as gaming or working with digital imaging. Backups are required so that if one (or all) of the drives fail, you can recover from the failure.
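The warning about adding drives can be put into numbers with a rough sketch. It assumes drive failures are independent and that every drive has the same chance of failing over some period; the 3% figure below is a made-up value for illustration, not a measured failure rate.

```python
def raid0_failure_probability(per_drive_failure: float, num_drives: int) -> float:
    """Chance that at least one drive fails, which destroys a RAID-0 array.

    Assumes independent failures with the same probability for every drive.
    """
    return 1.0 - (1.0 - per_drive_failure) ** num_drives


if __name__ == "__main__":
    for drives in (1, 2, 4, 8):
        chance = raid0_failure_probability(0.03, drives)  # 0.03 is hypothetical
        print(f"{drives} drive(s): {chance:.1%} chance of losing the array")
```

Under these assumptions the risk grows with every drive added, which is the reasoning behind keeping RAID-0 arrays small and backed up.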

*RAID-1 – Mirroring & Duplexing – 2 disk minimum without parity:*
A set of two or more disks that mirror one another: data written to the primary disk is duplicated on the secondary disk (or on all other disks in the array). Data is written to all disks at the same time and can be read from each disk separately, which improves read performance. The transfer rate per written block is equal to that of a single disk. If the primary disk in the array fails, the array can be configured to use the mirrored copy on one of the other disks until you can replace the failed hard drive, after which the data can be restored onto the new drive from the remaining drives in the array. This is NOT a substitute for backups.
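The mirror, fail, and rebuild cycle described above can be sketched as a toy model. The `Mirror` class and its method names are hypothetical, and each disk is only a dictionary mapping block numbers to data.

```python
class Mirror:
    """Toy RAID-1: every write goes to all working disks."""

    def __init__(self, num_disks: int):
        self.disks = [dict() for _ in range(num_disks)]  # block number -> data
        self.failed = set()

    def write(self, block: int, data: bytes) -> None:
        for index, disk in enumerate(self.disks):
            if index not in self.failed:
                disk[block] = data

    def read(self, block: int) -> bytes:
        # Any working disk can serve the read.
        for index, disk in enumerate(self.disks):
            if index not in self.failed:
                return disk[block]
        raise IOError("all mirrors have failed")

    def fail(self, index: int) -> None:
        self.failed.add(index)
        self.disks[index] = {}  # the contents of the dead disk are gone

    def replace(self, index: int) -> None:
        # Rebuild the new disk by copying from any surviving mirror.
        source = next(d for i, d in enumerate(self.disks) if i not in self.failed)
        self.disks[index] = dict(source)
        self.failed.discard(index)


if __name__ == "__main__":
    array = Mirror(2)
    array.write(0, b"payroll")
    array.fail(0)
    print(array.read(0))   # still readable from the second disk
    array.replace(0)
    print(array.disks[0])  # the new disk has been rebuilt
```

Note that a file deleted or corrupted by mistake is deleted or corrupted on every mirror at once, which is why mirroring does not replace backups.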

Array size: takes the size of the smallest drive.

Advantages: 100% redundant. In other words, if a single drive is lost to a failure, you will not lose data. With more than two drives in the mirror, RAID-1 can withstand multiple drive failures, as long as one drive survives. RAID-1 is another simple array to implement.

Disadvantages: One of the least efficient RAID arrays, since the usable capacity is only that of a single drive.

Recommendation: Best used in an environment that requires high read performance such as accounting, company payroll, or financial situations. You are still highly recommended to backup your data. 

*RAID-2 – Hamming Code ECC – 1 or more disks:*
This RAID array performs disk striping at the bit level. The error-checking and correction can only be supported with a certain kind of hard drive. When a hard drive read occurs, the data on the drive is checked with the ECC codes to establish that everything is correct. If it happens to be incorrect, the data is corrected on the “fly”.
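As an illustration of "on the fly" correction, here is a sketch of the classic Hamming(7,4) code, which protects 4 data bits with 3 check bits and can locate and flip back any single wrong bit. Using this particular code size is an assumption made for the example; the post does not say which Hamming code a RAID-2 array would use, and the function names are made up.

```python
def hamming74_encode(data_bits: list[int]) -> list[int]:
    """Encode 4 data bits into a 7-bit codeword (positions 1..7)."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # checks positions 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # checks positions 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # checks positions 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_correct(codeword: list[int]) -> tuple[list[int], int]:
    """Return the corrected data bits and the position that was fixed (0 if none)."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4  # the syndrome spells out the bad position
    if position:
        c[position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], position


if __name__ == "__main__":
    word = hamming74_encode([1, 0, 1, 1])
    word[4] ^= 1  # simulate one bit read back wrong
    print(hamming74_correct(word))
```

In a RAID-2 style layout each bit of the codeword would live on a different drive, so one misbehaving drive shows up as one wrong bit per codeword and can be corrected during the read.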

Array size: Varies

Advantages: Fault tolerant, “on the fly” data correction, high data transfers, simpler RAID design compared to RAID-3, 4, and 5. 

Disadvantages: Not commercially available, high entry level cost, and it requires a high transfer rate. 

Recommendation: Best left for business purposes. 

*RAID-3 – Parallel Transfer (Striping) with Parity – 3 disk minimum:*
Data is divided among and written to the separate hard drives. The parity is generated on writes, written to the dedicated parity drive, and checked on reads. If a disk happens to fail, its data is rebuilt from the remaining drives using the parity information on the parity drive. Disk read performance in RAID-3 is comparable to that of a RAID-0 implementation. The parity drive must match or exceed the physical size of the individual data drives, so if you grow the array by moving to larger drives, the parity drive must grow as well.
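Parity in arrays like this is commonly computed as a byte-by-byte XOR of the data blocks. The sketch below assumes that scheme; `xor_blocks` is a made-up helper name and the byte values are arbitrary.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


if __name__ == "__main__":
    data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]  # one block per data drive
    parity = xor_blocks(data)                        # stored on the parity drive

    # Drive 1 dies: XOR the surviving data blocks with the parity block.
    rebuilt = xor_blocks([data[0], data[2], parity])
    print(rebuilt == data[1])
```

XOR-ing everything that survived, parity included, cancels out the surviving data and leaves exactly the missing block. This only works when a single block per stripe is missing, which is why one parity drive protects against one drive failure.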

Array size: Size of Smallest Drive x (Number of Drives - 1)

Advantages: Fault tolerant, high read and write data transfer rates, a disk failure has very little impact, and high efficiency.

Disadvantages: Difficult and resource intensive if used in software RAID, complex, and the transaction rate is equal to a single hard drive (so long as the spindles are in sync)

Recommendation: Video production and/or live streaming, image and video editing, and any other application requiring high throughput. Best for applications that require sequential data reads.

*RAID-4 – Independent Data Disks w/ Shared Parity – 3 disk minimum:*
It is similar to RAID-3 in that it contains a number of striped disks and a separate parity disk. However, it stripes data in larger blocks rather than at the byte level, so a small read can be served by a single disk. RAID-4 therefore has basically the same layout as RAID-3, but it removes the bottleneck that affected transactional reads in RAID-3.

Array size: Size of Smallest Drive x (Number of Drives - 1)

Advantages: High read rate, high aggregate read, Low parity (high efficiency)

Disadvantages: The worst write rate, worst write aggregate rate, difficult to rebuild in the event of a hard drive failure, block read rate is that of a single disk, not commercially available

Recommendation: Not a recommended use. There are better options to choose from. 

*RAID-5 – Striping with Parity – 3 disk minimum:*
This is the most widely used RAID array today. In RAID-5, the parity information is distributed among all drives within the array, unlike RAID-3 or 4, which keep it on a dedicated drive. A portion of the total disk space is unavailable for data so that the parity can be written to disk. The space given to parity is equal to the size of one entire drive in the array. For example, an array of 4x10GB drives would give you approximately 30GB of space for your data, while the remaining 10GB would be reserved for parity information.
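Here is a small sketch of what "distributed parity" can look like on disk. Controllers differ in how they rotate the parity block, so the rotation order below is only one possible arrangement chosen for illustration, and `raid5_layout` is a made-up name.

```python
def raid5_layout(num_disks: int, num_stripes: int) -> list[list[str]]:
    """Return one row per stripe; 'P' marks parity and 'Dn' marks data block n."""
    rows = []
    for stripe in range(num_stripes):
        # Move the parity block one disk to the left on each stripe (assumed order).
        parity_disk = (num_disks - 1 - stripe) % num_disks
        data_index = stripe * (num_disks - 1)
        row = []
        for disk in range(num_disks):
            if disk == parity_disk:
                row.append("P")
            else:
                row.append(f"D{data_index}")
                data_index += 1
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in raid5_layout(num_disks=4, num_stripes=4):
        print("  ".join(f"{cell:>3}" for cell in row))
```

With four disks, each stripe holds three data blocks and one parity block, which matches the 30GB usable out of 4x10GB in the example. Every disk takes a turn holding parity, so no single drive has to absorb all the parity writes the way the dedicated drive does in RAID-3 and 4.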

Array size: Size of Smallest Drive x (Number of Drives - 1)

Advantages: Fault tolerant, read speeds are quite high, high efficiency, good transfer rate

Disadvantages: Disk failure has a medium impact on the array (meaning you can only sustain one drive failure at a given time), has the most complex design, difficult to rebuild after a disk failure

Recommendation: File servers, database servers, Web servers, Email servers, Intranet servers, etc. 

*RAID-6 – Striping with Double Parity – 4 disk minimum plus a proprietary RAID controller:*
RAID-6 works the same way as RAID-5, but it stores double the parity, so you can sustain a two-disk failure and still retain your data.

Array size: Size of Smallest Drive x (Number of Drives - 2)

Advantages: Fault tolerant, can sustain a two disk failure, perfect for a mission critical environment

Disadvantages: More complex, and the controller overhead for the parity is very high.

Recommendation: File servers, database servers, Web servers, Email servers, Intranet servers, etc. 

Please note that no RAID array is a substitute for regular backups. Backups are still highly recommended in case of an unforeseeable event.
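The "Array size" rules and disk minimums listed for each level can be collected into one small calculator. This is only a sketch of the formulas in this post; the function name is made up, RAID-2 is left out because its size is listed as "Varies", and formatting overhead is ignored.

```python
def usable_capacity_gb(level: int, drive_sizes_gb: list[float]) -> float:
    """Usable space for the RAID levels described in this guide."""
    n = len(drive_sizes_gb)
    # level -> (minimum drives, how many drives' worth of space is usable)
    rules = {
        0: (2, n),
        1: (2, 1),
        3: (3, n - 1),
        4: (3, n - 1),
        5: (3, n - 1),
        6: (4, n - 2),
    }
    if level not in rules:
        raise ValueError(f"RAID-{level} is not covered by this sketch")
    minimum, usable_drives = rules[level]
    if n < minimum:
        raise ValueError(f"RAID-{level} needs at least {minimum} drives")
    # Every drive is treated as if it were the size of the smallest one.
    return min(drive_sizes_gb) * usable_drives


if __name__ == "__main__":
    print(usable_capacity_gb(0, [80, 80]))          # the 2x80GB RAID-0 example
    print(usable_capacity_gb(5, [10, 10, 10, 10]))  # the 4x10GB RAID-5 example
    print(usable_capacity_gb(1, [80, 120]))         # mismatched mirror
```

The mismatched mirror shows the point made earlier about drive sizes: the extra 40GB on the larger drive is simply not used.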


----------



## d44ve (Nov 2, 2007)

Very nice... at first I thought you stole this from here : http://www.hardwareanalysis.com/content/topic/67628/

But then I noticed you were the same person!


----------



## Hawk1 (Nov 2, 2007)

Very nice write up. Now you just need to add raid 10 (I think thats the last one). Anyway, I think this is stickyable.

Edit: lol D44ve, I was about to tell you that when you deleted/edited your post. But nice quick catch.


----------



## d44ve (Nov 2, 2007)

RAID 10?


----------



## Hawk1 (Nov 2, 2007)

d44ve said:


> RAID 10?



I think it combines Raid 1 and 0 into an array. Let me do a quick search.

Edit: see here


----------



## Ben Clarke (Nov 2, 2007)

That's RAID 1+0...


----------



## Disparia (Nov 2, 2007)

d44ve said:


> RAID 10?



Nested arrays.

RAID-10 is a RAID-0 of RAID-1 arrays.

RAID-01 is a RAID-1 of RAID-0 arrays.

You can find RAID-51, 50, and some others on certain controllers.
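The order of nesting matters for fault tolerance. The sketch below compares the two layouts on four disks, assuming the simple model where a RAID-01 stripe set counts as dead as soon as any one of its disks fails; the disk numbering and function names are made up for the example.

```python
from itertools import combinations

# Disks 0 and 1 form one group, disks 2 and 3 the other.
GROUPS = [(0, 1), (2, 3)]


def raid10_survives(failed: set[int]) -> bool:
    """RAID-10: groups are mirror pairs; each pair needs one working disk."""
    return all(any(d not in failed for d in pair) for pair in GROUPS)


def raid01_survives(failed: set[int]) -> bool:
    """RAID-01: groups are stripe sets; one whole set must be intact."""
    return any(all(d not in failed for d in stripe_set) for stripe_set in GROUPS)


def count_survivable(survives, num_failures: int) -> int:
    """Count the failure combinations of the given size that the array survives."""
    return sum(survives(set(c)) for c in combinations(range(4), num_failures))


if __name__ == "__main__":
    print("RAID-10 survives", count_survivable(raid10_survives, 2), "of 6 two-disk failures")
    print("RAID-01 survives", count_survivable(raid01_survives, 2), "of 6 two-disk failures")
```

Both layouts survive any single failure and give the same capacity, but under this model RAID-10 tolerates more of the possible double failures.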


----------



## d44ve (Nov 2, 2007)

I gotcha.... I have always just called it 1+0 or 0+1 or whatever way you want to go


----------



## Disparia (Nov 2, 2007)

Luckily we're only up to RAID-7 (AFAIK), so we can do 8 and 9 before having to change the terminology


----------



## Deleted member 3 (Nov 2, 2007)

Really, check your info on RAID2, it's obsolete, there is no use for it. Its advantage is built into every modern disk nowadays. It is not a simple array either, it requires a mad amount of disks for no apparent reason.


----------



## Mediocre (Nov 2, 2007)

Or you could say thanks for the great information and contribution to the forums...here's some additional info...

Thanks for the compilation...did you add it to the wiki??


----------



## CrAsHnBuRnXp (Nov 2, 2007)

Mediocre said:


> Thanks for the compilation...did you add it to the wiki??



No I did not.


----------



## CrAsHnBuRnXp (Nov 2, 2007)

DanTheBanjoman said:


> Really, check your info on RAID2, it's obsolete, there is no use for it. Its advantage is built into every modern disk nowadays. It is not a simple array either, it requires a mad amount of disks for no apparent reason.



I know its an obsolete array, but I figured I would throw it in anyway.


----------



## Mediocre (Nov 2, 2007)

I'd throw it up here: http://reference.techpowerup.com/Category:Storage

and make a 'RAID' category


----------



## CrAsHnBuRnXp (Nov 2, 2007)

Mediocre said:


> I'd throw it up here: http://reference.techpowerup.com/Category:Storage
> 
> and make a 'RAID' category



There appears to already be a "RAID" category, but it lacks sufficient information.


----------



## Mediocre (Nov 2, 2007)

I SWEAR that wasn't in there 2 minutes ago 

Oh well, I suppose editing (and not creating from scratch) may be more work than it's worth

Ahh well, thanks anyway


----------



## CrAsHnBuRnXp (Nov 2, 2007)

Mediocre said:


> I SWEAR that wasn't in there 2 minutes ago
> 
> Oh well, I suppose editing (and not creating from scratch) may be more work than it's worth
> 
> Ahh well, thanks anyway


 No problem.


----------



## surfsk8snow.jah (Nov 3, 2007)

*Very Nice*

VERY nice bro. This should DEFINITELY be stickied. You responded to my suggestion damn fast, good reaction time 
I do feel slightly proud in that at least I suggested the idea haha. But mad props for following through so thoroughly.

Now whenever a thread starts or ends up on RAID, we just point them here. Sweet.

Oh, and I still think someone should do a comprehensive benchmark read/write test of 5 Identical HDDs in every configuration of RAID possible, with both onboard and PCI Raid Controllers, to have an absolute performance comparison, instead of so many opinions and scattered recommendations on which RAID array to use. Of course then apply a fault tolerance bullet to each benchmark. That would make TPU a hotspot for sure... you know how many search results you get asking "Which RAID array do I use!?" haha, including myself.


----------



## tkpenalty (Nov 3, 2007)

Very nice guide man... Cleared some stuff up about the difference between 0+1 and 1+0 and 10, Sticky please.


----------



## ex_reven (Nov 3, 2007)

Nice guide, very well written. Concise.


----------



## CrAsHnBuRnXp (Nov 3, 2007)

surfsk8snow.jah said:


> VERY nice bro. This should DEFINITELY be stickied. You responded to my suggestion damn fast, good reaction time
> I do feel slightly proud in that at least I suggested the idea haha. But mad props for following through so thoroughly.
> 
> Now whenever a thread starts or ends up on RAID, we just point them here. Sweet.
> ...



Thanks. I appreciate all the feedback.

If I could I would do the benchmarks for the hard drives, but I don't have enough spare hard drives to do that.


----------



## CrAsHnBuRnXp (Nov 3, 2007)

ex_reven said:


> Nice guide, very well written. Concise.



Thank you very much.


----------



## CrAsHnBuRnXp (Nov 12, 2008)

Bump from the 1yr 1w and 2d grave. 

Sticky?


----------



## AsRock (Nov 12, 2008)

I vote for a sticky.


----------



## newtekie1 (Nov 12, 2008)

One advantage of RAID-1 that you missed is that you can usually take a drive from a failed controller and connect it to any other controller and get the data.

A disadvantage of RAID 0 and 5 is that if the controller fails or you want to switch controllers (i.e. swap a motherboard if you are using onboard), then the array usually won't work on the new controller, so the data is lost.


----------



## AphexDreamer (Nov 12, 2008)

Wait can someone explain to me Strip Size??? 64K 128K which is better?


----------



## IggSter (Nov 12, 2008)

AphexDreamer said:


> Wait can someone explain to me Strip Size??? 64K 128K which is better?



Simplest way to explain this is:

When you create a raid array the disks are segmented into blocks, this is the minimum amount of space a file will take up:

So if you have block size of 64k and write a 2k file, the file will use up 64k of space, if you use block size of 128k the file will take up 128k


So why bother:

If you have a file which is 64000k it will take up 1000 blocks with 64k block size, and 500 with 128k blocks.
Now since each block has its own location on the disk with 128k block size the disk will seek 50% less

So in short: 
Small block/stripe size = best use of space but more seeks on the disk (very good if you are storing lots and lots of very small files or are using a database application that makes lots and lots of small reads)
Large block/stripe size = more wasted space but saves on disk seeks (Very good if you are storing large files and want to minimise the effects of disk access times (seeks))


----------



## CrAsHnBuRnXp (Aug 1, 2010)

Bump for a sticky?


----------



## 7.62 (Aug 1, 2010)

That would have to be the best explanation on the entire internet.

Well done IggSter


----------



## Disparia (Aug 4, 2010)

You need not worry about wasted space when choosing a stripe size. Stripe ≠ cluster.

While yes the controller works with data at these sizes, the OS does not. It's not even aware that it's on an array of any sort.


----------



## freaksavior (Aug 4, 2010)

This needs to be a sticky and moved to storage


----------



## CrAsHnBuRnXp (Aug 4, 2010)

Ive been saying sticky since i first wrote this.


----------



## Solarsails (Apr 5, 2019)

IggSter said:


> Simplest way to explain this is:
> 
> When you create a raid array the disks are segmented into blocks, this is the minimum amount of space a file will take up:
> 
> ...


RAID STRIPE SIZE AND SLACK SPACE 

There is a misconception with regard to space usage and stripe size. There seem to be a few people creating posts with the concern that large stripe sizes are potentially "wasteful" if there are many small files to be written to a RAID array. Such statements are woefully inaccurate, to put it mildly.

In the majority of cases stripe size has absolutely no bearing on space usage (unlike cluster size).  If a "stripe" isn't fully populated (filled) by a single file then the next file, or files, will be written into that stripe's space until the stripe is completely "filled".  In other words a single stripe can hold multiple files and stripes are always completely filled before disk writes continue on to the next stripe, which is on the next disk. The misconception  of space being "wasted" by configuring a raid array with large sized stripes is more relevant to cluster or allocation unit size but is completely inaccurate with regard to stripe sizes. Stripes are always completely filled and if a stripe isn't filled on the first pass it will eventually be used during the next write(s) until it is filled.  

For example: The assertion is that if you choose a 64kb stripe size and you store a 2kb text file then that file will be written to a stripe and no other files will be able to be stored in that stripe thereby wasting 62kb of disk space. This concept is as wrong as wrong can be.

The only instance of disk space being used in this manner is with regard to FILESYSTEM BLOCKS. As files are stored within a disk's filesystem, they are written into an available block, and as each block becomes filled, writing continues to the next available block until the file is completely written to the disk. In many instances the last block written will only be partially filled, which results in "slack space". If a file is smaller than the block being written to, this will also result in slack space. Due to the way files are stored on disk, this slack space cannot be used in future writes to the disk and is considered to be "wasted" space.

This is not the case with regard to "stripes". To the OS a RAID array appears as any other storage device and multiple filesystem blocks may reside in a single stripe element.
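The difference between the two sizes can be sketched in a few lines. The example assumes a plain RAID-0 layout, a 4 KiB filesystem block (cluster), and a 64 KiB stripe element; those sizes and the function names are chosen only for illustration.

```python
def locate(offset: int, stripe_unit: int, num_disks: int) -> tuple[int, int]:
    """Map a byte offset on the array to (disk number, byte offset on that disk)."""
    unit_index = offset // stripe_unit
    disk = unit_index % num_disks
    row = unit_index // num_disks
    return disk, row * stripe_unit + offset % stripe_unit


def slack_bytes(file_size: int, cluster_size: int) -> int:
    """Unused bytes in the last filesystem block of a file."""
    return -file_size % cluster_size


if __name__ == "__main__":
    cluster, stripe_unit = 4096, 65536

    # Sixteen 4 KiB clusters sit side by side in the first stripe element.
    for cluster_number in (0, 1, 15, 16):
        print(cluster_number, locate(cluster_number * cluster, stripe_unit, num_disks=2))

    # Slack for a 2 KiB file depends on the cluster size only.
    print(slack_bytes(2048, cluster))
```

Note that `slack_bytes` never looks at the stripe size: a 2 KiB file wastes 2 KiB of its 4 KiB cluster whether the stripe element is 64 KiB or 128 KiB, and the next file's cluster goes right next to it in the same stripe element.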

As an aside you will notice better performance if you manage to match the filesystem block size to the raid stripe size and only in that arrangement is it possible for a stripe to have wasted slack space.

Summary: If a stripe isn't fully populated (filled) by a single file then the next file, or files, will be placed into that stripe until it is completely "filled". In other words a single stripe can hold multiple files and stripes are always completely filled before disk writes continue on to the next stripe which is on the next disk. Stripes are always completely filled and if a stripe isn't filled on the first pass it will eventually be used during one of the corresponding write(s) until that area of disk is filled. 

RAID STRIPE SIZE AND SLACK SPACE by Solarsails


----------



## Shrek (Feb 15, 2022)

CrAsHnBuRnXp said:


> RAID (Redundant Array of Inexpensive [or independent] Disks)
> 
> *RAID-0 - Data Striping w/out parity– 2 disk minimum:*
> Provides improved performance to that of a single, non-RAID-0 drive, and provides additional storage space to work with. This RAID array breaks down the information stored on the hard drive into blocks which are stored on each corresponding RAID-0 hard drive.
> ...



Nice to add that RAID-0 is not really a RAID as there is no redundancy.


----------

