[CentOS] cyrus spool on btrfs?

Fri Sep 8 21:08:18 UTC 2017
Gordon Messmer <gordon.messmer at gmail.com>

On 09/08/2017 11:06 AM, hw wrote:
> Make a test and replace a software RAID5 with a hardware RAID5.  Even with
> only 4 disks, you will see an overall performance gain.  I'm guessing that
> the SATA controllers they put onto the mainboards are not designed to handle
> all the data --- which gets multiplied to all the disks --- and that the
> PCI bus might get clogged.  There's also the CPU being burdened with the
> calculations required for the RAID, and that may not be displayed by tools
> like top, so you can be fooled easily.


That sounds like a whole lot of guesswork, which I'd suggest should 
inspire slightly less confidence than you are showing in it.

RAID parity calculations are accounted under a kernel thread named 
md<number>_raid<level>.  You will see time consumed by that code under 
all of the normal process accounting tools, including total time under 
"ps" and current usage under "top". Typically, your CPU is vastly faster 
than the cheap processors on hardware RAID controllers, and the 
advantage will go to software RAID over hardware.  If your system is CPU 
bound, however, and you need that extra fraction of a percent of CPU 
cycles that go to calculating parity, hardware might offer an advantage.
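
If you would rather collect that accounting from a script than read it off "ps", something like the following rough sketch should work on Linux.  It assumes the threads are named exactly md<number>_raid<level> as described above, and it reads the standard utime and stime fields from /proc/<pid>/stat.  The function names are my own, chosen for illustration.

```python
import os
import re

# Kernel thread names as described above, e.g. md0_raid5 or md127_raid6
MD_THREAD = re.compile(r"^md\d+_raid\d+$")


def parse_stat(stat_line):
    """Return (comm, cpu_ticks) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may contain spaces,
    # so split on the last ')' rather than on whitespace.
    head, _, tail = stat_line.rpartition(")")
    comm = head.split("(", 1)[1]
    fields = tail.split()
    # fields[0] is the state (field 3 of the file);
    # utime and stime are fields 14 and 15.
    utime, stime = int(fields[11]), int(fields[12])
    return comm, utime + stime


def md_raid_cpu_seconds(proc_root="/proc"):
    """Map each md RAID thread name to its total CPU time in seconds."""
    ticks_per_second = os.sysconf("SC_CLK_TCK")
    usage = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                comm, ticks = parse_stat(f.read())
        except OSError:
            continue  # the process exited while we were scanning
        if MD_THREAD.match(comm):
            usage[comm] = ticks / ticks_per_second
    return usage


if __name__ == "__main__":
    for name, seconds in sorted(md_raid_cpu_seconds().items()):
        print(f"{name}: {seconds:.2f} s CPU")
```

Comparing that figure against the machine's uptime gives a quick idea of how small or large the parity cost really is on a given system.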

The last system I purchased had its storage controller on a PCIe 3.0 x16 
port, so its throughput to the card should be around 16GB/s.  Yours 
might be different.  I should be able to put roughly 20 disks on that 
card before the PCIe bus is the bottleneck.  If this were a RAID6 
volume, a hardware RAID card would be able to support sustained writes 
to 22 drives vs 20 for md RAID, because the card generates the two 
drives' worth of parity itself, while md has to send the parity across 
the bus along with the data.  I don't see that as a compelling 
advantage, but it is potentially an advantage for a hypothetical 
hardware RAID card.
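
The arithmetic behind those drive counts can be sketched as below.  The 800 MB/s per-disk rate is an assumption, simply what 16GB/s divided by 20 disks implies, and the model only covers sustained full-stripe writes.  The function name is made up for illustration.

```python
def max_drives(bus_mb_s, per_disk_mb_s, parity_disks, parity_on_card):
    """Drives that can be written at full speed before the bus saturates."""
    # How many disks' worth of write traffic the bus can carry
    streams = bus_mb_s // per_disk_mb_s
    if parity_on_card:
        # The host sends only data; the card generates parity itself,
        # so the parity drives add no bus traffic.
        return streams + parity_disks
    # md computes parity on the host, so every drive's writes cross the bus.
    return streams


BUS_MB_S = 16_000      # PCIe 3.0 x16, roughly
PER_DISK_MB_S = 800    # assumed: 16 GB/s spread over 20 disks

print(max_drives(BUS_MB_S, PER_DISK_MB_S, 2, parity_on_card=False))  # 20
print(max_drives(BUS_MB_S, PER_DISK_MB_S, 2, parity_on_card=True))   # 22
```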

When you are testing your 4 disk RAID5 array, microbenchmarks like 
bonnie++ will show you a very significant advantage toward the hardware 
RAID as very small writes are added to the battery-backed cache on the 
card and the OS considers them complete.  However, on many cards, if the 
system writes data to the card faster than the card writes to disks, the 
cache will fill up, and at that point, the system performance can 
suddenly and unexpectedly plummet.  I've run a few workloads where that 
happened, and we had to replace the system entirely, and use software 
RAID instead.  Software RAID's performance tends to be far more 
predictable as the workload increases.
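
To put a rough number on how quickly that can happen, here is a toy model of a write-back cache filling up.  The cache size and both rates are hypothetical values picked for illustration, not measurements from any particular card, and real controllers behave less linearly than this.

```python
def seconds_until_cache_full(cache_mb, incoming_mb_s, drain_mb_s):
    """Time a write-back cache can absorb a sustained write load.

    Returns None when the card drains at least as fast as the host
    writes, in which case the cache never fills.
    """
    surplus = incoming_mb_s - drain_mb_s
    if surplus <= 0:
        return None
    return cache_mb / surplus


# Hypothetical card: 1 GB cache, disks absorb 400 MB/s
print(seconds_until_cache_full(1024, 600, 400))  # 5.12
print(seconds_until_cache_full(1024, 300, 400))  # None
```

A short benchmark run can finish entirely inside that window, which is one reason it looks so good compared with a sustained production workload.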

Outside of microbenchmarks like bonnie++, software RAID often offers 
much better performance than hardware RAID controllers. Having tested 
systems extensively for many years, my advice is this:  there is no 
simple answer to the question of whether software or hardware RAID is 
better.  You need to test your specific application on your specific 
hardware to determine what configuration will work best.  There are some 
workloads where a hardware controller will offer better write 
performance, since a battery-backed write-cache can complete very small 
random writes very quickly.  If that is not the specific behavior of 
your application, software RAID will very often offer you better 
performance, as well as other advantages.  On the other hand, software 
RAID absolutely requires a monitored UPS and tested auto-shutdown in 
order to be remotely reliable, just as a hardware RAID controller 
requires a battery-backed write-cache, and monitoring of the battery state.