On 03/06/2013 08:00 AM, Mark Snyder wrote:
- Avoid software RAID5 or 6; only use software RAID for RAID1 or 10. Software RAID5 performance can be abysmal because of the parity calculations and the fact that each write to the array requires that all drives be read and written.
My understanding of Linux mdadm RAID5 is that a write will read the block being written and the parity block. The calculations can be done with only those blocks, and the two are written. That's one extra read per write plus parity calculations.
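To illustrate the arithmetic behind that read-modify-write, here is a rough sketch in Python. It is not mdadm's code; the function names and the tiny two-byte "blocks" are made up for the example. It only shows that the new parity can be computed from the old data block, the old parity block, and the new data, without touching the other drives.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must be the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def full_parity(data_blocks):
    """Parity computed the slow way: XOR of every data block in the stripe."""
    parity = bytes(len(data_blocks[0]))  # all-zero block
    for block in data_blocks:
        parity = xor_blocks(parity, block)
    return parity


def rmw_parity(old_data: bytes, old_parity: bytes, new_data: bytes) -> bytes:
    """Read-modify-write: remove the old data from the parity, add the new."""
    return xor_blocks(xor_blocks(old_parity, old_data), new_data)


# Hypothetical three-data-disk stripe with two-byte blocks.
stripe = [b"\x01\x02", b"\x10\x20", b"\xaa\x55"]
parity = full_parity(stripe)

# Overwrite the block on the second disk using only that block and the parity.
new_block = b"\xff\x00"
updated_parity = rmw_parity(stripe[1], parity, new_block)
stripe[1] = new_block

# The shortcut matches a full recomputation over the whole stripe.
assert updated_parity == full_parity(stripe)
```

That works out to two reads (old data, old parity) and two writes (new data, new parity) per small write, regardless of how many drives are in the array.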
I'm quite certain that I've seen some hardware RAID arrays that will read the entire stripe to do a write.
RAID5 will always write more slowly than RAID1 or RAID10, but that can sometimes be acceptable if capacity is more important than performance.
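For a sense of the capacity side of that trade-off, a quick back-of-the-envelope sketch. It assumes equal-sized disks and, for RAID10, the classic layout of two-way mirrored pairs with an even number of disks; Linux md supports other RAID10 layouts that this does not model. The function names are invented for the example.

```python
def raid5_usable(n_disks: int, disk_tb: float) -> float:
    """Usable capacity of RAID5: one disk's worth of space goes to parity."""
    if n_disks < 3:
        raise ValueError("RAID5 needs at least 3 disks")
    return (n_disks - 1) * disk_tb


def raid10_usable(n_disks: int, disk_tb: float) -> float:
    """Usable capacity of RAID10 as mirrored pairs: half the raw space."""
    if n_disks < 4 or n_disks % 2:
        raise ValueError("this sketch assumes an even number of disks, 4 or more")
    return (n_disks // 2) * disk_tb


# Hypothetical example: six 2 TB disks.
print(raid5_usable(6, 2.0))   # 10.0
print(raid10_usable(6, 2.0))  # 6.0
```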
Older hardware RAID controllers can be pretty cheap on eBay. I'm using an old 3Ware on my home CentOS server.
If there's anything to avoid, it'd be old 3ware hardware. Those cards are often less reliable than the disks they're attached to, and that's saying something.
Avoid hostraid adapters; these are just software RAID in the controller rather than the OS.
All hardware raid is "just software raid in the controller rather than the OS". The advantages of hardware RAID are offloading parity calculations to dedicated hardware so that the CPU doesn't need to do it, and a battery backed write cache.
The write cache is critical to safely writing a RAID array in the event of a power loss, and can greatly improve performance provided that you don't write enough data to fill the cache.
The host CPU is very often faster with parity than the dedicated hardware, which is why Alan Cox has been quoted as saying that the best RAID controllers in the world are made by Intel and AMD. However, if you think you need the couple of percent of CPU cycles that would have been used by software RAID, you might prefer the hardware solution.
If you are using drives over 1TB, consider partitioning the drives into smaller chunks, say around 500MB, and creating multiple arrays. That way if you get a read error on one sector that causes one of the RAID partitions to be marked as bad, only that partition needs to be rebuilt rather than the whole drive.
If you have a disk on which a bad sector is found, it's time to replace it no matter how your partitions are set up. Drives reserve a set of sectors for re-mapping sectors that are detected as bad. If your OS sees a bad sector, it's because that reserve has been exhausted. More sectors will continue to go bad, and you will lose data. Always replace a drive as soon as your OS sees bad sectors, or before based on SMART data.
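As a rough sketch of what watching SMART data can look like, the Python below scans the attribute table that `smartctl -A` prints and reports any sector-related counters with a non-zero raw value. The sample text is made up for illustration, the set of attribute names is only a common subset (names and meanings vary by vendor), and the parsing assumes the usual ten-column ATA attribute layout.

```python
# Attributes commonly associated with failing sectors (an assumption; vendors differ).
WATCHED = {"Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable"}


def nonzero_sector_counters(smartctl_output: str) -> dict:
    """Return {attribute_name: raw_value} for watched attributes above zero."""
    found = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in WATCHED:
            raw = fields[9]
            if raw.isdigit() and int(raw) > 0:
                found[fields[1]] = int(raw)
    return found


# Hypothetical output in the style of `smartctl -A /dev/sdX`.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       12
  9 Power_On_Hours          0x0032   081   081   000    Old_age   Always       -       17212
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
"""

print(nonzero_sector_counters(SAMPLE))
```

Anything this reports is a reason to plan a replacement before the OS starts seeing errors.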
Partitioning into many smaller chunks is probably a waste of time. Like most of the other participants in this thread, I create software RAID sets of one or two partitions per disk and use LVM on top of that.
Hopefully BTRFS will simplify this even further in the near future. :)