[CentOS] ssacli start rebuild?

Sat Nov 14 21:37:01 UTC 2020
Warren Young <warren at etr-usa.com>

On Nov 14, 2020, at 5:56 AM, hw <hw at gc-24.de> wrote:
> 
> On Wed, 2020-11-11 at 16:38 -0700, Warren Young wrote:
>> On Nov 11, 2020, at 2:01 PM, hw <hw at gc-24.de> wrote:
>>> I have yet to see software RAID that doesn't kill the performance.
>> 
>> When was the last time you tried it?
> 
> I'm currently using it, and the performance sucks.

Be specific.  Give chip part numbers, drivers used, whether this is on-board software RAID or something entirely different like LVM or MD RAID, etc.  For that matter, I don’t even see that you’ve identified whether this is CentOS 6, 7 or 8.  (I hope it isn't older!)

> Perhaps it's
> not the software itself or the CPU but the on-board controllers
> or other components being incapable of handling multiple disks in a
> software raid.  That's something I can't verify.

Sure you can.  Benchmark RAID-0 vs RAID-1 in 2, 4, and 8 disk arrays.

In a 2-disk array, a proper software RAID system should give 2x a single disk’s performance for both read and write in RAID-0, but single-disk write performance for RAID-1.

Such values should scale reasonably as you add disks: RAID-0 over 8 disks gives 8x performance, RAID-1 over 8 disks arranged as striped mirror pairs (i.e. RAID-10) gives 4x write but 8x read, etc.  (A single 8-way mirror would still write at 1x.)

These are rough numbers, but what you’re looking for are failure cases where it’s 1x a single disk for read or write.  That tells you there’s a bottleneck or serialization condition, such that you aren’t getting the parallel I/O you should be expecting.
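To make that check concrete, here is a minimal Python sketch of the rough scaling model above, plus a test for the "stuck at 1x" failure case.  The function names, the level strings and the 1.25 tolerance are my own illustrative choices, not part of any RAID tool, and the multi-disk mirror case is modeled as striped mirror pairs, since that is the layout that yields the 4x write figure.  You would feed it throughput numbers from whatever benchmark you trust.

```python
def expected_multiplier(level: str, disks: int) -> dict:
    """Rough ideal throughput as a multiple of one disk's throughput."""
    if disks < 2:
        raise ValueError("need at least 2 disks for an array")
    if level == "raid0":
        # Pure striping: reads and writes both spread over all disks.
        return {"read": disks, "write": disks}
    if level == "raid1":
        # N-way mirror: every write hits all disks; reads can be spread.
        return {"read": disks, "write": 1}
    if level == "raid10":
        # Striped mirror pairs: writes scale with the number of pairs.
        if disks % 2:
            raise ValueError("raid10 model assumes an even disk count")
        return {"read": disks, "write": disks // 2}
    raise ValueError(f"unknown level: {level}")


def looks_serialized(measured: float, single_disk: float,
                     expected: int, tolerance: float = 1.25) -> bool:
    """True if the array runs at about 1x one disk where more was expected.

    `measured` and `single_disk` are throughputs in the same unit.
    `tolerance` is an arbitrary cutoff for "about 1x" (assumption).
    """
    return expected > 1 and measured <= single_disk * tolerance


if __name__ == "__main__":
    model = expected_multiplier("raid10", 8)
    # Hypothetical measurements in MB/s, for illustration only.
    print(model, looks_serialized(180.0, 170.0, model["read"]))
```

A result of `True` from `looks_serialized` is the red flag: the array is doing no better than one disk in a case where the model says it should.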

>> Why would you expect that a modern 8-core Intel CPU would impede I/O
> 
> It doesn't matter what I expect.

It *does* matter if you know what the hardware’s capable of.

TLS is a much harder problem than the XOR parity calculation of traditional RAID, yet it imposes [approximately zero][1] performance penalty on modern server hardware.  If your CPU can fill a 10GE pipe with TLS, it should have no problem with the simpler calculations needed to keep up with the ~2 Gbit/sec flat-out max data rate of a typical RAID-grade 4 TB spinning HDD.

Even with 8 in parallel in the best case where they’re all reading linearly, you’re still within a small multiple of the Ethernet case, so we should still expect the software RAID stack not to become CPU-bound.

And realize that HDDs don’t fall into this max data rate case often outside of benchmarking.  Once you start throwing ~5 ms seek times into the mix, the CPU’s job becomes even easier.
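The arithmetic behind those two paragraphs can be sketched in a few lines of Python.  The ~2 Gbit/sec, 10GE, 8-disk and ~5 ms figures come from the text above; the 64 KiB request size is purely an assumption of mine to make the seek case computable, and the model ignores rotational latency and queuing.

```python
HDD_SEQ_GBIT = 2.0          # ~2 Gbit/sec flat-out per disk (from the text)
TEN_GBE_GBIT = 10.0         # 10GE line rate
DISKS = 8
SEEK_MS = 5.0               # ~5 ms seek (from the text)
IO_SIZE_BYTES = 64 * 1024   # assumed request size, for illustration only


def aggregate_gbit(disks: int, per_disk_gbit: float) -> float:
    """Best-case combined rate when every disk streams linearly."""
    return disks * per_disk_gbit


def random_io_gbit(seek_ms: float, io_size_bytes: int,
                   per_disk_gbit: float) -> float:
    """Effective per-disk rate when each request pays one seek first."""
    bits = io_size_bytes * 8
    transfer_s = bits / (per_disk_gbit * 1e9)
    total_s = seek_ms / 1000.0 + transfer_s
    return bits / total_s / 1e9


if __name__ == "__main__":
    best = aggregate_gbit(DISKS, HDD_SEQ_GBIT)
    print(f"8 disks linear: {best:.1f} Gbit/sec, "
          f"{best / TEN_GBE_GBIT:.1f}x the 10GE case")
    seeky = random_io_gbit(SEEK_MS, IO_SIZE_BYTES, HDD_SEQ_GBIT)
    print(f"one disk with seeks: {seeky:.3f} Gbit/sec")
```

Under these assumptions the seek-bound rate per disk comes out more than an order of magnitude below the linear rate, which is why the CPU has so much less to do outside of benchmarks.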

[1]: https://stackoverflow.com/a/548042/142454

> 
>>> And where
>>> do you get cost-efficient cards that can do JBOD?
>> 
>> $69, 8 SATA/SAS ports: https://www.newegg.com/p/0ZK-08UH-0GWZ1
> 
> That says it's for HP.  So will you still get firmware updates once
> the warranty is expired?  Does it exclusively work with HP hardware?
> 
> And are these good?

You asked for “cost-efficient,” which I took to be a euphemism for “cheapest thing that could possibly work.”

If you’re willing to spend money, then I fully expect you can find JBOD cards you’ll be happy with.

Personally, I get servers with enough SFF-8087 SAS connectors on them to address all the disks in the system.  I haven’t bothered with add-on SATA cards in years.

I use ZFS, so absolute flat-out benchmark speed isn’t my primary consideration.  Data durability and data set features matter to me far more.

>>> What has HP been thinking?
>> 
>> That the hardware vs software RAID argument is over in 2020.
> 
> Do you have a reference for that, like a final statement from HP?

Since I’m not posting from an hpe.com email address, I think it’s pretty obvious that that is my opinion, not an HP corporate statement.

I base it on observing the Linux RAID market since the mid-90s.  The massive consolidation among hardware RAID vendors is a big part of it.  That’s what happens when a market becomes “mature,” which is often the step just prior to “moribund.”

> Did they stop developing RAID controllers, or do they ship their
> servers now without them

Were you under the impression that HP was trying to provide you the best possible technology for all possible use cases, rather than make money by maximizing the ratio of cash in vs cash out?

Just because they’re serving it up on a plate doesn’t mean you hafta pick up a fork.