Hi,
is there a way to rebuild an array using ssacli with a P410?
A failed disk has been replaced and now the array is not rebuilding like it should:
Array A (SATA, Unused Space: 1 MB)
logicaldrive 1 (14.55 TB, RAID 1+0, Ready for Rebuild)
physicaldrive 1I:0:1 (port 1I:box 0:bay 1, SATA HDD, 4 TB, OK)
physicaldrive 1I:0:2 (port 1I:box 0:bay 2, SATA HDD, 4 TB, OK)
physicaldrive 1I:0:3 (port 1I:box 0:bay 3, SATA HDD, 4 TB, OK)
physicaldrive 1I:0:4 (port 1I:box 0:bay 4, SATA HDD, 8 TB, OK)
physicaldrive 2I:0:5 (port 2I:box 0:bay 5, SATA HDD, 4 TB, OK)
physicaldrive 2I:0:6 (port 2I:box 0:bay 6, SATA HDD, 4 TB, OK)
physicaldrive 2I:0:7 (port 2I:box 0:bay 7, SATA HDD, 4 TB, OK)
physicaldrive 2I:0:8 (port 2I:box 0:bay 8, SATA HDD, 4 TB, OK)
I'd expect the rebuild to start automatically after 1I:0:4 was replaced. Is the new drive being larger than the old one (4-->8) causing issues?
On Fri, 6 Nov 2020 at 00:52, hw hw@gc-24.de wrote:
[...] logicaldrive 1 (14.55 TB, RAID 1+0, Ready for Rebuild) [...]
Have you checked the rebuild priority:
❯ ssacli ctrl slot=0 show config detail | grep "Rebuild Priority"
   Rebuild Priority: Medium
❯
The slot number needs to be adjusted to your configuration.
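If it is only at Low or Medium, raising it should work with something along these lines (syntax from memory, so double-check against "ssacli help"; the slot again adjusted to your configuration):

❯ ssacli ctrl slot=0 modify rebuildpriority=high
❯ ssacli ctrl slot=0 show config detail | grep "Rebuild Priority"   # verify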
Kind regards Thomas
On Fri, 2020-11-06 at 12:08 +0100, Thomas Bendler wrote:
On Fri, 6 Nov 2020 at 00:52, hw hw@gc-24.de wrote:
[...] logicaldrive 1 (14.55 TB, RAID 1+0, Ready for Rebuild) [...]
Have you checked the rebuild priority:
❯ ssacli ctrl slot=0 show config detail | grep "Rebuild Priority"
   Rebuild Priority: Medium
❯
The slot number needs to be adjusted to your configuration.
Yes, I've set it to high:
ssacli ctrl slot=3 show config detail | grep Prior
   Rebuild Priority: High
   Expand Priority: Medium
Some search results indicate that other disks in the array having read errors can prevent a RAID 5 from rebuilding. I don't know whether there are read errors here, and if there are, I'd think they would have to affect the disk that mirrors the failed one, this being a RAID 1+0. But if the RAID is striped across all the disks, it could be any or all of them.
The array is still in production and still works, so it should just rebuild. Now the plan is to use another 8TB disk once it arrives, make a new RAID 1 with the two new disks and copy the data over. The remaining 4TB disks can then be used to make a new array.
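Creating the new mirror should then be a single ssacli call; a rough sketch, with placeholder bay addresses that would have to be replaced by wherever the two 8TB drives actually end up:

ssacli ctrl slot=3 create type=ld drives=1I:0:4,2I:0:5 raid=1   # bay addresses are placeholders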
Learn from this that it can be a bad idea to use a RAID 0 for backups and that at least one generation of backups must be on redundant storage ...
On Fri, 6 Nov 2020 at 20:38, hw hw@gc-24.de wrote:
[...] Some search results indicate that other disks in the array having read errors can prevent a RAID 5 from rebuilding. I don't know whether there are read errors here, and if there are, I'd think they would have to affect the disk that mirrors the failed one, this being a RAID 1+0. But if the RAID is striped across all the disks, it could be any or all of them.
The array is still in production and still works, so it should just rebuild. Now the plan is to use another 8TB disk once it arrives, make a new RAID 1 with the two new disks and copy the data over. The remaining 4TB disks can then be used to make a new array.
Learn from this that it can be a bad idea to use a RAID 0 for backups and that at least one generation of backups must be on redundant storage ...
I just checked on one of my HP boxes; you indeed cannot figure out whether one of the discs has read errors. Do you have the option to reboot the box and check on the controller directly?
Kind regards Thomas
On Mon, 2020-11-09 at 16:30 +0100, Thomas Bendler wrote:
On Fri, 6 Nov 2020 at 20:38, hw hw@gc-24.de wrote:
[...] Some search results indicate that other disks in the array having read errors can prevent a RAID 5 from rebuilding. I don't know whether there are read errors here, and if there are, I'd think they would have to affect the disk that mirrors the failed one, this being a RAID 1+0. But if the RAID is striped across all the disks, it could be any or all of them.
The array is still in production and still works, so it should just rebuild. Now the plan is to use another 8TB disk once it arrives, make a new RAID 1 with the two new disks and copy the data over. The remaining 4TB disks can then be used to make a new array.
Learn from this that it can be a bad idea to use a RAID 0 for backups and that at least one generation of backups must be on redundant storage ...
I just checked on one of my HP boxes; you indeed cannot figure out whether one of the discs has read errors. Do you have the option to reboot the box and check on the controller directly?
Thanks! The controller (its BIOS) doesn't show up during boot, so I can't check there for errors.
The controller is extremely finicky: the plan to make a RAID 1 from the two new drives has failed because the array with the failed drive is unusable when the failed drive is missing entirely.
In the process of moving the 8TB drives back and forth, it turned out that when an array made from them is missing one drive, that array is unusable --- and when the missing drive is put back in, the array remains 'Ready for Rebuild' without the rebuild starting. There is also no way to delete an array that is missing a drive.
So the theory that the array isn't being rebuilt because other disks have errors is likely wrong. That means that whenever a disk fails and is being replaced, there is no way to rebuild the array (unless it would happen automatically, which it doesn't).
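For anyone who finds this thread later: the commands that get suggested for this state look roughly like the following (slot and LD numbers from my setup; I can't vouch for them, and the 'reenable' one in particular should be treated with care):

ssacli ctrl slot=3 pd all show status          # are all physical drives OK?
ssacli ctrl slot=3 ld 1 show                   # state of the logical drive
ssacli ctrl slot=3 ld 1 modify reenable forced # supposedly kicks a stuck/failed LD back to life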
With this experience, these controllers are now deprecated. RAID controllers that can't rebuild an array after a disk has failed and has been replaced are virtually useless.
On Wed, 11 Nov 2020 at 07:28, hw hw@gc-24.de wrote:
[...] With this experience, these controllers are now deprecated. RAID controllers that can't rebuild an array after a disk has failed and has been replaced are virtually useless. [...]
HW RAID is often delivered with quite limited functionality. Because of this I have meanwhile switched to software RAID in most cases and configured the HW RAID as JBOD. The funny thing is that when you reuse the discs previously used in the HW RAID in such a setup, the software RAID detects them as RAID disks. It looks like a significant number of HW RAID controllers use the Linux software RAID code in their firmware.
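You can check this yourself by pointing mdadm at such a disk; something like the following (the device name is only an example) prints whatever RAID metadata the controller left behind, if mdadm recognizes it:

mdadm --examine /dev/sdb      # dump any RAID superblock/metadata on the disk
mdadm --examine --scan        # summary of everything mdadm recognizes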
Kind regards Thomas
On Wed, 2020-11-11 at 11:34 +0100, Thomas Bendler wrote:
On Wed, 11 Nov 2020 at 07:28, hw hw@gc-24.de wrote:
[...] With this experience, these controllers are now deprecated. RAID controllers that can't rebuild an array after a disk has failed and has been replaced are virtually useless. [...]
HW RAID is often delivered with quite limited functionality. Because of this I have meanwhile switched to software RAID in most cases and configured the HW RAID as JBOD. The funny thing is that when you reuse the discs previously used in the HW RAID in such a setup, the software RAID detects them as RAID disks. It looks like a significant number of HW RAID controllers use the Linux software RAID code in their firmware.
I have yet to see software RAID that doesn't kill the performance. And where do you get cost-efficient cards that can do JBOD? I don't have any.
It turned out that the controller does not rebuild the array even with a disk that is the same model and capacity as the others. What has HP been thinking?
On Nov 11, 2020, at 2:01 PM, hw hw@gc-24.de wrote:
I have yet to see software RAID that doesn't kill the performance.
When was the last time you tried it?
Why would you expect that a modern 8-core Intel CPU would impede I/O in any measurable way as compared to the outdated single-core 32-bit RISC CPU typically found on hardware RAID cards? These are the same CPUs, mind, that regularly crunch through TLS 1.3 on line-rate fiber Ethernet links, a much tougher task than mediating spinning disk I/O.
And where do you get cost-efficient cards that can do JBOD?
$69, 8 SATA/SAS ports: https://www.newegg.com/p/0ZK-08UH-0GWZ1
Search for “LSI JBOD” for tons more options. You may have to fiddle with the firmware to get it to stop trying to do clever RAID stuff, which lets you do smart RAID stuff like ZFS instead.
What has HP been thinking?
That the hardware vs software RAID argument is over in 2020.
On Wed, Nov 11, 2020 at 3:38 PM Warren Young warren@etr-usa.com wrote:
On Nov 11, 2020, at 2:01 PM, hw hw@gc-24.de wrote:
I have yet to see software RAID that doesn't kill the performance.
When was the last time you tried it?
Why would you expect that a modern 8-core Intel CPU would impede I/O in any measurable way as compared to the outdated single-core 32-bit RISC CPU typically found on hardware RAID cards? These are the same CPUs, mind, that regularly crunch through TLS 1.3 on line-rate fiber Ethernet links, a much tougher task than mediating spinning disk I/O.
the only 'advantage' hardware raid has is write-back caching.
with ZFS you can get much the same performance boost out of a small fast SSD used as a ZIL / SLOG.
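adding one to an existing pool is a one-liner; a sketch, assuming a pool named 'tank' and an SSD addressed by a made-up by-id path:

zpool add tank log /dev/disk/by-id/nvme-EXAMPLE
# or mirrored, which is safer for sync writes if a log device dies:
zpool add tank log mirror /dev/disk/by-id/ssd-A /dev/disk/by-id/ssd-B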
On Nov 11, 2020, at 6:00 PM, John Pierce jhn.pierce@gmail.com wrote:
On Wed, Nov 11, 2020 at 3:38 PM Warren Young warren@etr-usa.com wrote:
On Nov 11, 2020, at 2:01 PM, hw hw@gc-24.de wrote:
I have yet to see software RAID that doesn't kill the performance.
When was the last time you tried it?
Why would you expect that a modern 8-core Intel CPU would impede I/O in any measurable way as compared to the outdated single-core 32-bit RISC CPU typically found on hardware RAID cards? These are the same CPUs, mind, that regularly crunch through TLS 1.3 on line-rate fiber Ethernet links, a much tougher task than mediating spinning disk I/O.
the only 'advantage' hardware raid has is write-back caching.
Just for my information: how do you map a failed software RAID drive to the physical port of, say, a SAS-attached enclosure? I'd love to hot-replace failed drives in software RAIDs; I have over a hundred physical drives attached to a machine. Do not criticize, this is a box installed by someone else that I have "inherited". To replace a drive I have to query its serial number, power off the machine, and pull the drives one at a time to read the labels...
With hardware RAID that is not an issue: I always know which physical port the failed drive is in, and I can tell the controller to "indicate" a specific drive (it blinks the respective port LED). I always hot-replace drives in hardware RAIDs, and no one ever knows it has been done. I'd love to deal with drives in software RAIDs the same way.
Thanks in advance for any advice. And my apologies for "stealing the thread".
Valeri
with ZFS you can get much the same performance boost out of a small fast SSD used as a ZIL / SLOG.
--
-john r pierce
recycling used bits in santa cruz
On Nov 11, 2020, at 6:37 PM, Valeri Galtsev galtsev@kicp.uchicago.edu wrote:
how do you map a failed software RAID drive to the physical port of, say, a SAS-attached enclosure?
With ZFS, you set a partition label on the whole-drive partition pool member, then mount the pool with something like “zpool mount -d /dev/disk/by-partlabel”, which then shows the logical disk names in commands like “zpool status” rather than opaque “/dev/sdb3” type things.
It is then up to you to assign sensible drive names like “cage-3-left-4” for the 4th drive down on the left side of the third drive cage. Or, maybe your organization uses asset tags, so you could label the disk the same way, “sn123456”, which you find by looking at the front of each slot.
On Nov 11, 2020, at 7:04 PM, Warren Young warren@etr-usa.com wrote:
zpool mount -d /dev/disk/by-partlabel
Oops, I’m mixing the zpool and zfs commands. It’d be “zpool import”.
And you do this just once: afterward, the automatic on-boot import brings the drives back in using the names they had before, so when you’ve got some low-skill set of remote hands in front of the machine, and you’re looking at a failure indication in zpool status, you just say “Swap out the drive in the third cage, left side, four slots down.”
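Roughly, the one-time sequence looks like this (pool name and device are examples; sgdisk is from the gdisk package):

sgdisk --change-name=1:cage-3-left-4 /dev/sdb   # name the pool member's partition after its physical slot
zpool export tank
zpool import -d /dev/disk/by-partlabel tank     # zpool status now shows the labels instead of sdX names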
in large raids, I label my disks with the last 4 or 6 digits of the drive serial number (or for SAS disks, the WWN). this is visible via smartctl, and I record it with the zpool documentation I keep on each server (typically a text file on a cloud drive). zpools don't actually care WHAT slot a given pool member is in, you can shut the box down, shuffle all the disks, boot back up and find them all and put them back in the pool.
the physical error reports that precede a drive failure should list the drive identification beyond just the /dev/sdX kind of thing, which is subject to change if you add more SAS devices.
I once researched what it would take to implement the drive failure lights on the typical brand-name server/storage chassis. There's a command for manipulating SES devices such as those lights; the catch is figuring out the mapping between the drives and the lights, which is not always evident, so it would require trial and error.
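for the serial/WWN part smartctl is enough, and for the locate LEDs the ledctl tool from the ledmon package can sometimes drive SES/SGPIO enclosures directly -- whether it works depends on the chassis, and the device name here is just an example:

smartctl -i /dev/sdb | grep -E 'Serial Number|LU WWN'
ledctl locate=/dev/sdb       # blink the slot's locate LED
ledctl locate_off=/dev/sdb   # and turn it off again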
On Wed, Nov 11, 2020 at 5:37 PM Valeri Galtsev galtsev@kicp.uchicago.edu wrote:
On Nov 11, 2020, at 6:00 PM, John Pierce jhn.pierce@gmail.com wrote:
On Wed, Nov 11, 2020 at 3:38 PM Warren Young warren@etr-usa.com wrote:
On Nov 11, 2020, at 2:01 PM, hw hw@gc-24.de wrote:
I have yet to see software RAID that doesn't kill the performance.
When was the last time you tried it?
Why would you expect that a modern 8-core Intel CPU would impede I/O in any measurable way as compared to the outdated single-core 32-bit RISC CPU typically found on hardware RAID cards? These are the same CPUs, mind, that regularly crunch through TLS 1.3 on line-rate fiber Ethernet links, a much tougher task than mediating spinning disk I/O.
the only 'advantage' hardware raid has is write-back caching.
Just for my information: how do you map a failed software RAID drive to the physical port of, say, a SAS-attached enclosure? I'd love to hot-replace failed drives in software RAIDs; I have over a hundred physical drives attached to a machine. Do not criticize, this is a box installed by someone else that I have "inherited". To replace a drive I have to query its serial number, power off the machine, and pull the drives one at a time to read the labels...
With hardware RAID that is not an issue: I always know which physical port the failed drive is in, and I can tell the controller to "indicate" a specific drive (it blinks the respective port LED). I always hot-replace drives in hardware RAIDs, and no one ever knows it has been done. I'd love to deal with drives in software RAIDs the same way.
Thanks in advance for any advice. And my apologies for "stealing the thread".
Valeri
with ZFS you can get much the same performance boost out of a small fast SSD used as a ZIL / SLOG.
--
-john r pierce
recycling used bits in santa cruz
On Nov 11, 2020, at 8:04 PM, John Pierce jhn.pierce@gmail.com wrote:
in large raids, I label my disks with the last 4 or 6 digits of the drive serial number (or for SAS disks, the WWN). this is visible via smartctl, and I record it with the zpool documentation I keep on each server (typically a text file on a cloud drive).
I get information about software RAID failures from the cron job executing raid-check (which comes with the mdadm rpm). I can get the S/N of the failed drive (they are not dead-dead, one can still query them) using smartctl, but I am too lazy to have the serial numbers of all drives printed and affixed to the fronts of the drive trays… but so far I see no other way ;-(
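For the record, the serials can also be pulled in one go without printed labels; the md device name here is just an example:

mdadm --detail /dev/md0           # shows which member is marked faulty
lsblk -d -o NAME,MODEL,SERIAL     # maps device names to serial numbers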
Valeri
zpools don't actually care WHAT slot a given pool member is in, you can shut the box down, shuffle all the disks, boot back up and find them all and put them back in the pool.
the physical error reports that precede a drive failure should list the drive identification beyond just the /dev/sdX kind of thing, which is subject to change if you add more SAS devices.
I once researched what it would take to implement the drive failure lights on the typical brand-name server/storage chassis. There's a command for manipulating SES devices such as those lights; the catch is figuring out the mapping between the drives and the lights, which is not always evident, so it would require trial and error.
On Wed, Nov 11, 2020 at 5:37 PM Valeri Galtsev galtsev@kicp.uchicago.edu wrote:
On Nov 11, 2020, at 6:00 PM, John Pierce jhn.pierce@gmail.com wrote:
On Wed, Nov 11, 2020 at 3:38 PM Warren Young warren@etr-usa.com wrote:
On Nov 11, 2020, at 2:01 PM, hw hw@gc-24.de wrote:
I have yet to see software RAID that doesn't kill the performance.
When was the last time you tried it?
Why would you expect that a modern 8-core Intel CPU would impede I/O in any measurable way as compared to the outdated single-core 32-bit RISC CPU typically found on hardware RAID cards? These are the same CPUs, mind, that regularly crunch through TLS 1.3 on line-rate fiber Ethernet links, a much tougher task than mediating spinning disk I/O.
the only 'advantage' hardware raid has is write-back caching.
Just for my information: how do you map a failed software RAID drive to the physical port of, say, a SAS-attached enclosure? I'd love to hot-replace failed drives in software RAIDs; I have over a hundred physical drives attached to a machine. Do not criticize, this is a box installed by someone else that I have "inherited". To replace a drive I have to query its serial number, power off the machine, and pull the drives one at a time to read the labels...
With hardware RAID that is not an issue: I always know which physical port the failed drive is in, and I can tell the controller to "indicate" a specific drive (it blinks the respective port LED). I always hot-replace drives in hardware RAIDs, and no one ever knows it has been done. I'd love to deal with drives in software RAIDs the same way.
Thanks in advance for any advice. And my apologies for "stealing the thread".
Valeri
with ZFS you can get much the same performance boost out of a small fast SSD used as a ZIL / SLOG.
--
-john r pierce
recycling used bits in santa cruz
On Nov 11, 2020, at 6:00 PM, John Pierce jhn.pierce@gmail.com wrote:
On Wed, Nov 11, 2020 at 3:38 PM Warren Young warren@etr-usa.com wrote:
On Nov 11, 2020, at 2:01 PM, hw hw@gc-24.de wrote:
I have yet to see software RAID that doesn't kill the performance.
When was the last time you tried it?
Why would you expect that a modern 8-core Intel CPU would impede I/O in any measurable way as compared to the outdated single-core 32-bit RISC CPU typically found on hardware RAID cards? These are the same CPUs, mind, that regularly crunch through TLS 1.3 on line-rate fiber Ethernet links, a much tougher task than mediating spinning disk I/O.
the only 'advantage' hardware raid has is write-back caching.
Just for my information: how do you map a failed software RAID drive to the physical port of, say, a SAS-attached enclosure? I'd love to hot-replace failed drives in software RAIDs; I have over a hundred physical drives attached to a machine. Do not criticize, this is a box installed by someone else that I have "inherited". To replace a drive I have to query its serial number, power off the machine, and pull the drives one at a time to read the labels...
There are different methods depending on how the disks are attached. In some cases you can use a tool to show the corresponding disk or slot. Otherwise, once you have hot-removed the drive from the RAID, you can either dd to the broken drive or generate some traffic on the still-working RAID, and you'll spot the disk immediately by looking at the disks' busy LEDs.
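Concretely, something like this on the suspect drive keeps its activity LED lit while the rest of the enclosure stays mostly idle (the device name is an example; reading is the safe direction if the drive is still part of an array):

dd if=/dev/sdx of=/dev/null bs=1M status=progress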
I've used Linux Software RAID during the last two decades and it has always worked nicely while I started to hate hardware RAID more and more. Now with U.2 NVMe SSD drives, at least when we started using them, there were no RAID controllers available at all. And performance with Linux Software RAID1 on AMD EPYC boxes is amazing :-)
Regards, Simon
On Nov 11, 2020, at 5:38 PM, Warren Young warren@etr-usa.com wrote:
On Nov 11, 2020, at 2:01 PM, hw hw@gc-24.de wrote:
I have yet to see software RAID that doesn't kill the performance.
When was the last time you tried it?
Why would you expect that a modern 8-core Intel CPU would impede I/O in any measurable way as compared to the outdated single-core 32-bit RISC CPU typically found on hardware RAID cards? These are the same CPUs, mind, that regularly crunch through TLS 1.3 on line-rate fiber Ethernet links, a much tougher task than mediating spinning disk I/O.
And where do you get cost-efficient cards that can do JBOD?
$69, 8 SATA/SAS ports: https://www.newegg.com/p/0ZK-08UH-0GWZ1
Search for “LSI JBOD” for tons more options. You may have to fiddle with the firmware to get it to stop trying to do clever RAID stuff, which lets you do smart RAID stuff like ZFS instead.
I’m sure you can reflash an LSI card to make it a SATA or SAS HBA, or a MegaRAID hardware RAID adapter. As far as I recollect it is the same electronics board. I reflashed a couple of HBAs to make them MegaRAID boards.
One thing about LSI bothers me though: now that it was last bought by Intel, its future fate worries me. Intel has already pushed 3ware, which it acquired in the same package with LSI, into oblivion…
Valeri
What has HP been thinking?
That the hardware vs software RAID argument is over in 2020.
On Wed, Nov 11, 2020 at 5:47 PM Valeri Galtsev galtsev@kicp.uchicago.edu wrote:
I’m sure you can reflash an LSI card to make it a SATA or SAS HBA, or a MegaRAID hardware RAID adapter. As far as I recollect it is the same electronics board. I reflashed a couple of HBAs to make them MegaRAID boards.
you can reflash SOME megaraid cards to put them in IT 'hba' mode, but not others.
One thing about LSI bothers me though: now that it was last bought by Intel, its future fate worries me. Intel has already pushed 3ware, which it acquired in the same package with LSI, into oblivion…
It's Avago, formerly Agilent, and before that HP, which bought LSI, 3Ware, and then Broadcom, and renamed itself Broadcom.
On Nov 11, 2020, at 8:07 PM, John Pierce jhn.pierce@gmail.com wrote:
On Wed, Nov 11, 2020 at 5:47 PM Valeri Galtsev galtsev@kicp.uchicago.edu wrote:
I’m sure you can reflash an LSI card to make it a SATA or SAS HBA, or a MegaRAID hardware RAID adapter. As far as I recollect it is the same electronics board. I reflashed a couple of HBAs to make them MegaRAID boards.
you can reflash SOME megaraid cards to put them in IT 'hba' mode, but not others.
One thing about LSI bothers me though: now that it was last bought by Intel, its future fate worries me. Intel has already pushed 3ware, which it acquired in the same package with LSI, into oblivion…
It's Avago, formerly Agilent, and before that HP, which bought LSI, 3Ware, and then Broadcom, and renamed itself Broadcom.
I am apparently wrong, at least about LSI; it still belongs to Broadcom. Thanks!
Long before Broadcom acquired LSI and 3ware, I was awfully displeased by their WiFi chip: the infamous BCM43xx. It is a 32-bit chip sitting on a 64-bit bus; no [sane] open source programmer will be happy to write a driver for that. For ages we were using the NDIS wrapper… As much as I disliked Broadcom for their wireless chipset, I loved them for their Ethernet one. And I recollect this was long before Broadcom's acquisition of LSI and 3ware. Or am I wrong?
Valeri
--
-john r pierce
recycling used bits in santa cruz
On Nov 11, 2020, at 5:38 PM, Warren Young warren@etr-usa.com wrote:
On Nov 11, 2020, at 2:01 PM, hw hw@gc-24.de wrote:
I have yet to see software RAID that doesn't kill the performance.
When was the last time you tried it?
Why would you expect that a modern 8-core Intel CPU would impede I/O in any measurable way as compared to the outdated single-core 32-bit RISC CPU typically found on hardware RAID cards? These are the same CPUs, mind, that regularly crunch through TLS 1.3 on line-rate fiber Ethernet links, a much tougher task than mediating spinning disk I/O.
And where do you get cost-efficient cards that can do JBOD?
$69, 8 SATA/SAS ports: https://www.newegg.com/p/0ZK-08UH-0GWZ1
Search for “LSI JBOD” for tons more options. You may have to fiddle with the firmware to get it to stop trying to do clever RAID stuff, which lets you do smart RAID stuff like ZFS instead.
What has HP been thinking?
That the hardware vs software RAID argument is over in 2020.
I’d rather have distributed redundant storage on multiple machines… but I still have [mostly] hardware RAIDs ;-)
Valeri
On Wed, 2020-11-11 at 16:38 -0700, Warren Young wrote:
On Nov 11, 2020, at 2:01 PM, hw hw@gc-24.de wrote:
I have yet to see software RAID that doesn't kill the performance.
When was the last time you tried it?
I'm currently using it, and the performance sucks. Perhaps it's not the software itself or the CPU but the on-board controllers or other components being incapable of handling multiple disks in a software RAID. That's something I can't verify.
Why would you expect that a modern 8-core Intel CPU would impede I/O in any measurable way as compared to the outdated single-core 32-bit RISC CPU typically found on hardware RAID cards? These are the same CPUs, mind, that regularly crunch through TLS 1.3 on line-rate fiber Ethernet links, a much tougher task than mediating spinning disk I/O.
It doesn't matter what I expect.
And where do you get cost-efficient cards that can do JBOD?
$69, 8 SATA/SAS ports: https://www.newegg.com/p/0ZK-08UH-0GWZ1
That says it's for HP. So will you still get firmware updates once the warranty is expired? Does it exclusively work with HP hardware?
And are these good?
Search for “LSI JBOD” for tons more options. You may have to fiddle with the firmware to get it to stop trying to do clever RAID stuff, which lets you do smart RAID stuff like ZFS instead.
What has HP been thinking?
That the hardware vs software RAID argument is over in 2020.
Do you have a reference for that, like a final statement from HP? Did they stop developing RAID controllers, or do they ship their servers now without them and tell customers to use btrfs or mdraid?
On Sat, Nov 14, 2020, 4:57 AM hw hw@gc-24.de wrote:
On Wed, 2020-11-11 at 16:38 -0700, Warren Young wrote:
And where do you get cost-efficient cards that can do JBOD?
$69, 8 SATA/SAS ports: https://www.newegg.com/p/0ZK-08UH-0GWZ1
That says it's for HP. So will you still get firmware updates once the warranty is expired? Does it exclusively work with HP hardware?
And are these good?
That specific card is a bad choice: it's the very obsolete SAS1068E chip, which was SAS 1.0, with a 2 TB per-disk limit.
Cards based on the SAS 2008, 2308, and 3008 chips are a much better choice.
Any oem card with these chips can be flashed with generic LSI/Broadcom IT firmware.
On Sat, 2020-11-14 at 07:11 -0800, John Pierce wrote:
On Sat, Nov 14, 2020, 4:57 AM hw hw@gc-24.de wrote:
On Wed, 2020-11-11 at 16:38 -0700, Warren Young wrote:
And where do you get cost-efficient cards that can do JBOD?
$69, 8 SATA/SAS ports: https://www.newegg.com/p/0ZK-08UH-0GWZ1
That says it's for HP. So will you still get firmware updates once the warranty is expired? Does it exclusively work with HP hardware?
And are these good?
That specific card is a bad choice: it's the very obsolete SAS1068E chip, which was SAS 1.0, with a 2 TB per-disk limit.
Thanks! That's probably why it isn't so expensive.
Cards based on the SAS 2008, 2308, and 3008 chips are a much better choice.
Any oem card with these chips can be flashed with generic LSI/Broadcom IT firmware.
I don't like the idea of flashing one. I don't have the firmware and I don't know if they can be flashed with Linux. Aren't there any good --- and cost efficient --- ones that do JBOD by default, preferably including 16-port cards with mini-SAS connectors?
On Sat, Nov 14, 2020 at 6:32 PM hw hw@gc-24.de wrote:
I don't like the idea of flashing one. I don't have the firmware and I don't know if they can be flashed with Linux. Aren't there any good --- and cost efficient --- ones that do JBOD by default, preferably including 16-port cards with mini-SAS connectors?
the firmware is freely downloadable from LSI/Broadcom, and Linux has the sas2flash and sas3flash command-line tools (for the 2x08 and 3008 chips respectively) to do the flashing.
it's pretty much standard procedure for the ZFS crowd to flash those... the 2x08 cards often come with "IR" firmware that does limited RAID, and it's preferable to flash them with the IT firmware, which puts them in plain HBA mode ("IT" stands for Initiator-Target).
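the rough shape of it, from the usual crossflashing guides (the firmware and BIOS file names come from the LSI download package and differ per card, so treat them as placeholders, and read a guide for your exact model first -- a botched flash can brick the card):

sas2flash -listall                          # identify the controller
sas2flash -o -f 2118it.bin -b mptsas2.rom   # write IT firmware plus boot ROM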
On Wed, 2020-11-11 at 16:38 -0700, Warren Young wrote:
On Nov 11, 2020, at 2:01 PM, hw hw@gc-24.de wrote:
I have yet to see software RAID that doesn't kill the performance.
When was the last time you tried it?
I'm currently using it, and the performance sucks. Perhaps it's not the software itself or the CPU but the on-board controllers or other components being incapable of handling multiple disks in a software RAID. That's something I can't verify.
Why would you expect that a modern 8-core Intel CPU would impede I/O in any measurable way as compared to the outdated single-core 32-bit RISC CPU typically found on hardware RAID cards? These are the same CPUs, mind, that regularly crunch through TLS 1.3 on line-rate fiber Ethernet links, a much tougher task than mediating spinning disk I/O.
It doesn't matter what I expect.
And where do you get cost-efficient cards that can do JBOD?
$69, 8 SATA/SAS ports: https://www.newegg.com/p/0ZK-08UH-0GWZ1
That says it's for HP. So will you still get firmware updates once the warranty is expired? Does it exclusively work with HP hardware?
And are these good?
Search for “LSI JBOD” for tons more options. You may have to fiddle with the firmware to get it to stop trying to do clever RAID stuff, which lets you do smart RAID stuff like ZFS instead.
What has HP been thinking?
That the hardware vs software RAID argument is over in 2020.
Do you have a reference for that, like a final statement from HP? Did they stop developing RAID controllers, or do they ship their servers now without them and tell customers to use btrfs or mdraid?
HPE and the other large vendors won't tell you directly because they love to sell you their outdated SAS/SATA Raid stuff. They were quite slow to introduce NVMe storage, be it as PCIe cards or U.2 format, but it's also clear to them that NVMe is the future and that it's used with software redundancy provided by MDraid, ZFS, Btrfs etc. Just search for HPE's 4AA4-7186ENW.pdf file which also mentions it.
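With NVMe the redundancy side is a single mdadm call anyway; a minimal sketch, assuming two whole U.2 devices and no partitioning:

mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/nvme0n1 /dev/nvme1n1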
In fact local storage was one reason why we turned away from HPE and Dell after many years because we just didn't want to invest in outdated technology.
Regards, Simon
On Sat, 2020-11-14 at 18:55 +0100, Simon Matter wrote:
On Wed, 2020-11-11 at 16:38 -0700, Warren Young wrote:
On Nov 11, 2020, at 2:01 PM, hw hw@gc-24.de wrote:
I have yet to see software RAID that doesn't kill the performance.
When was the last time you tried it?
I'm currently using it, and the performance sucks. Perhaps it's not the software itself or the CPU but the on-board controllers or other components being incapable of handling multiple disks in a software RAID. That's something I can't verify.
Why would you expect that a modern 8-core Intel CPU would impede I/O in any measurable way as compared to the outdated single-core 32-bit RISC CPU typically found on hardware RAID cards? These are the same CPUs, mind, that regularly crunch through TLS 1.3 on line-rate fiber Ethernet links, a much tougher task than mediating spinning disk I/O.
It doesn't matter what I expect.
And where do you get cost-efficient cards that can do JBOD?
$69, 8 SATA/SAS ports: https://www.newegg.com/p/0ZK-08UH-0GWZ1
That says it's for HP. So will you still get firmware updates once the warranty is expired? Does it exclusively work with HP hardware?
And are these good?
Search for “LSI JBOD” for tons more options. You may have to fiddle with the firmware to get it to stop trying to do clever RAID stuff, which lets you do smart RAID stuff like ZFS instead.
What has HP been thinking?
That the hardware vs software RAID argument is over in 2020.
Do you have a reference for that, like a final statement from HP? Did they stop developing RAID controllers, or do they ship their servers now without them and tell customers to use btrfs or mdraid?
HPE and the other large vendors won't tell you directly because they love to sell you their outdated SAS/SATA Raid stuff. They were quite slow to introduce NVMe storage, be it as PCIe cards or U.2 format, but it's also clear to them that NVMe is the future and that it's used with software redundancy provided by MDraid, ZFS, Btrfs etc. Just search for HPE's 4AA4-7186ENW.pdf file which also mentions it.
In fact local storage was one reason why we turned away from HPE and Dell after many years because we just didn't want to invest in outdated technology.
I'm currently running an mdadm raid-check on two RAID-1 arrays, and the server shows two processes with 24--27% CPU each and two others at around 5%. And you want to tell me that the CPU load is almost non-existent.
I've also consistently seen much better performance with hardware RAID than with software RAID over the years, with ZFS having the worst performance of anything, even with SSD caches.
It speaks for itself, and, like I said, I have yet to see a software RAID that doesn't bring the performance down. Show me one that doesn't.
Are there any hardware RAID controllers designed for NVMe storage you could use to compare software RAID with? Are there any ZFS or btrfs hardware controllers you could compare with?
On Nov 14, 2020, at 8:45 PM, hw hw@gc-24.de wrote:
On Sat, 2020-11-14 at 18:55 +0100, Simon Matter wrote:
On Wed, 2020-11-11 at 16:38 -0700, Warren Young wrote:
On Nov 11, 2020, at 2:01 PM, hw hw@gc-24.de wrote:
I have yet to see software RAID that doesn't kill the performance.
When was the last time you tried it?
I'm currently using it, and the performance sucks. Perhaps it's not the software itself or the CPU but the on-board controllers or other components being incapable of handling multiple disks in a software RAID. That's something I can't verify.
Why would you expect that a modern 8-core Intel CPU would impede I/O in any measurable way as compared to the outdated single-core 32-bit RISC CPU typically found on hardware RAID cards? These are the same CPUs, mind, that regularly crunch through TLS 1.3 on line-rate fiber Ethernet links, a much tougher task than mediating spinning disk I/O.
It doesn't matter what I expect.
And where do you get cost-efficient cards that can do JBOD?
$69, 8 SATA/SAS ports: https://www.newegg.com/p/0ZK-08UH-0GWZ1
That says it's for HP. So will you still get firmware updates once the warranty is expired? Does it exclusively work with HP hardware?
And are these good?
Search for “LSI JBOD” for tons more options. You may have to fiddle with the firmware to get it to stop trying to do clever RAID stuff, which lets you do smart RAID stuff like ZFS instead.
What has HP been thinking?
That the hardware vs software RAID argument is over in 2020.
Do you have a reference for that, like a final statement from HP? Did they stop developing RAID controllers, or do they ship their servers now without them and tell customers to use btrfs or mdraid?
HPE and the other large vendors won't tell you directly because they love to sell you their outdated SAS/SATA Raid stuff. They were quite slow to introduce NVMe storage, be it as PCIe cards or U.2 format, but it's also clear to them that NVMe is the future and that it's used with software redundancy provided by MDraid, ZFS, Btrfs etc. Just search for HPE's 4AA4-7186ENW.pdf file which also mentions it.
In fact local storage was one reason why we turned away from HPE and Dell after many years because we just didn't want to invest in outdated technology.
I'm currently running an mdadm raid-check on two RAID-1 arrays, and the server shows two processes with 24--27% CPU each and two others at around 5%. And you want to tell me that the CPU load is almost non-existent.
The hardware vs software RAID discussion is like a clash of two different religions. I am, BTW, on your religious side: hardware RAID, though for a different reason: a hardware RAID is a small piece of code (hence well debugged) running on dedicated hardware. Thus, things like a kernel panic (of the main system, the one that would be running the software RAID) do not affect the hardware RAID's function, whereas a software RAID cannot do its job during a kernel panic. And whereas an unclean filesystem can be dealt with, an "unclean" RAID pretty much cannot.
But again, it is akin to religion, and after both sides have shot off all their ammunition, everyone goes back to the same side they were on before the "discussion".
So, I would just suggest… Hm, never mind, everyone, do what feels right to you ;-)
Valeri
I've also consistently seen much better performance with hardware RAID than with software RAID over the years, with ZFS having the worst performance of anything, even with SSD caches.
It speaks for itself, and, like I said, I have yet to see a software RAID that doesn't bring the performance down. Show me one that doesn't.
Are there any hardware RAID controllers designed for NVMe storage you could use to compare software RAID with? Are there any ZFS or btrfs hardware controllers you could compare with?
On Nov 14, 2020, at 5:56 AM, hw hw@gc-24.de wrote:
On Wed, 2020-11-11 at 16:38 -0700, Warren Young wrote:
On Nov 11, 2020, at 2:01 PM, hw hw@gc-24.de wrote:
I have yet to see software RAID that doesn't kill the performance.
When was the last time you tried it?
I'm currently using it, and the performance sucks.
Be specific. Give chip part numbers, drivers used, whether this is on-board software RAID or something entirely different like LVM or MD RAID, etc. For that matter, I don’t even see that you’ve identified whether this is CentOS 6, 7 or 8. (I hope it isn't older!)
Perhaps it's not the software itself or the CPU but the on-board controllers or other components being incapable of handling multiple disks in a software RAID. That's something I can't verify.
Sure you can. Benchmark RAID-0 vs RAID-1 in 2, 4, and 8 disk arrays.
In a 2-disk array, a proper software RAID system should give 2x a single disk’s performance for both read and write in RAID-0, but single-disk write performance for RAID-1.
Such values should scale reasonably as you add disks: RAID-0 over 8 disks gives 8x performance, RAID-1 over 8 disks gives 4x write but 8x read, etc.
These are rough numbers, but what you’re looking for are failure cases where it’s 1x a single disk for read or write. That tells you there’s a bottleneck or serialization condition, such that you aren’t getting the parallel I/O you should be expecting.
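As a concrete starting point, a non-destructive sequential read test with fio against the array and then against a single member disk already shows whether reads scale; a sketch, with the device names as placeholders:

fio --name=seqread --filename=/dev/md0 --rw=read --bs=1M \
    --ioengine=libaio --iodepth=16 --direct=1 --runtime=30 --time_based
# repeat with --filename=/dev/sdb (a single member) and compare the throughput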
Why would you expect that a modern 8-core Intel CPU would impede I/O
It doesn't matter what I expect.
It *does* matter if you know what the hardware’s capable of.
TLS is a much harder problem than XOR checksumming for traditional RAID, yet it imposes [approximately zero][1] performance penalty on modern server hardware, so if your CPU can fill a 10GE pipe with TLS, then it should have no problem dealing with the simpler calculations needed by the ~2 Gbit/sec flat-out max data rate of a typical RAID-grade 4 TB spinning HDD.
Even with 8 in parallel in the best case where they’re all reading linearly, you’re still within a small multiple of the Ethernet case, so we should still expect the software RAID stack not to become CPU-bound.
And realize that HDDs don’t fall into this max data rate case often outside of benchmarking. Once you start throwing ~5 ms seek times into the mix, the CPU’s job becomes even easier.
[1]: https://stackoverflow.com/a/548042/142454
And where do you get cost-efficient cards that can do JBOD?
$69, 8 SATA/SAS ports: https://www.newegg.com/p/0ZK-08UH-0GWZ1
That says it's for HP. So will you still get firmware updates once the warranty is expired? Does it exclusively work with HP hardware?
And are these good?
You asked for “cost-efficient,” which I took to be a euphemism for “cheapest thing that could possibly work.”
If you’re willing to spend money, then I fully expect you can find JBOD cards you’ll be happy with.
Personally, I get servers with enough SFF-8087 SAS connectors on them to address all the disks in the system. I haven’t bothered with add-on SATA cards in years.
I use ZFS, so absolute flat-out benchmark speed isn’t my primary consideration. Data durability and data set features matter to me far more.
What has HP been thinking?
That the hardware vs software RAID argument is over in 2020.
Do you have a reference for that, like a final statement from HP?
Since I’m not posting from an hpe.com email address, I think it’s pretty obvious that that is my opinion, not an HP corporate statement.
I base it on observing the Linux RAID market since the mid-90s. The massive consolidation for hardware RAID is a big part of it. That’s what happens when a market becomes “mature,” which is often the step just prior to “moribund.”
Did they stop developing RAID controllers, or do they ship their servers now without them
Were you under the impression that HP was trying to provide you the best possible technology for all possible use cases, rather than make money by maximizing the ratio of cash in vs cash out?
Just because they’re serving it up on a plate doesn’t mean you hafta pick up a fork.
On Sat, 2020-11-14 at 14:37 -0700, Warren Young wrote:
On Nov 14, 2020, at 5:56 AM, hw hw@gc-24.de wrote:
On Wed, 2020-11-11 at 16:38 -0700, Warren Young wrote:
On Nov 11, 2020, at 2:01 PM, hw hw@gc-24.de wrote:
I have yet to see software RAID that doesn't kill the performance.
When was the last time you tried it?
I'm currently using it, and the performance sucks.
Be specific. Give chip part numbers, drivers used, whether this is on-board software RAID or something entirely different like LVM or MD RAID, etc. For that matter, I don’t even see that you’ve identified whether this is CentOS 6, 7 or 8. (I hope it isn't older!)
I don't need to be specific because I have seen the difference in practical usage over the last 20 years. I'm not setting up scientific testing environments that would cost tremendous amounts of money; I am using available and cost-efficient hardware and software.
Perhaps it's not the software itself or the CPU but the on-board controllers or other components being incapable of handling multiple disks in a software RAID. That's something I can't verify.
Sure you can. Benchmark RAID-0 vs RAID-1 in 2, 4, and 8 disk arrays.
No, I can't. I don't have tons of different CPUs, mainboards, controller cards and electronic diagnostic equipment around to do that, and what would you even benchmark? Is it the user telling you that the software they are using in a VM --- stored on an NFS server and run by another server connected to it --- is now running faster or slower? Is it SQL queries that create rarely required reports and take a while to run? And what is even relevant?
I am seeing that a particular piece of software running in a VM is now running no slower, and maybe even faster, than before the failed disk was replaced. That means the hardware RAID --- 8 disks in RAID 1+0 --- is not faster, and even slower, than the software RAID of two disks (each presented by the controller as a single-drive RAID 0), on otherwise the same hardware. The CPU load on the storage server is also higher, which in this case does not matter. I'm happy with the result so far, and that is what matters.
If the disks were connected to the mainboard instead, the software might be running slower. I can't benchmark that either, because I can't connect the disks to the SATA ports on the board. If there were 8 disks in a RAID 1+0, all connected to the board, it might be a lot slower. I can't benchmark that; the board doesn't have that many SATA connectors.
I only have two new disks and no additional or different hardware. Telling me to specify particular chips and such is totally pointless. Benchmarking is not feasible, and it would be pointless anyway.
Sure, you can do some kind of benchmarking in a lab if you can afford it, but how does that correlate to the results you'll be getting in practice? Even if you involve users, those users will be different from the users I'm dealing with.
In a 2-disk array, a proper software RAID system should give 2x a single disk’s performance for both read and write in RAID-0, but single-disk write performance for RAID-1.
Such values should scale reasonably as you add disks: RAID-0 over 8 disks gives 8x performance, RAID-1 over 8 disks gives 4x write but 8x read, etc.
These are rough numbers, but what you’re looking for are failure cases where it’s 1x a single disk for read or write. That tells you there’s a bottleneck or serialization condition, such that you aren’t getting the parallel I/O you should be expecting.
And?
Why would you expect that a modern 8-core Intel CPU would impede I/O
It doesn't matter what I expect.
It *does* matter if you know what the hardware’s capable of.
I can expect hardware to do something as much as I want; it will always just do whatever it does regardless.
TLS is a much harder problem than XOR checksumming for traditional RAID, yet it imposes [approximately zero][1] performance penalty on modern server hardware, so if your CPU can fill a 10GE pipe with TLS, then it should have no problem dealing with the simpler calculations needed by the ~2 Gbit/sec flat-out max data rate of a typical RAID-grade 4 TB spinning HDD.
Even with 8 in parallel in the best case where they’re all reading linearly, you’re still within a small multiple of the Ethernet case, so we should still expect the software RAID stack not to become CPU-bound.
And realize that HDDs don’t fall into this max data rate case often outside of benchmarking. Once you start throwing ~5 ms seek times into the mix, the CPU’s job becomes even easier.
This may all be nice and good in theory. In practice, I'm seeing up to 30% CPU during an mdraid resync for a single 2-disk array. How much performance impact does that indicate for "normal" operations?
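For what it's worth, the resync rate is capped by the md speed limits, which can be inspected and lowered at runtime (values are in KB/s), though throttling the resync says nothing about what normal operation costs:

sysctl dev.raid.speed_limit_min dev.raid.speed_limit_max
sysctl -w dev.raid.speed_limit_max=50000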
And where do you get cost-efficient cards that can do JBOD?
$69, 8 SATA/SAS ports: https://www.newegg.com/p/0ZK-08UH-0GWZ1
That says it's for HP. So will you still get firmware updates once the warranty is expired? Does it exclusively work with HP hardware?
And are these good?
You asked for “cost-efficient,” which I took to be a euphemism for “cheapest thing that could possibly work.”
Buying crap tends not to be cost-efficient.
If you’re willing to spend money, then I fully expect you can find JBOD cards you’ll be happy with.
Like $500+ cards? That's not cost-efficient for my backup server, which I run about once a month to put backups on. If I can get one good 16-port card or two 8-port cards for max. $100, I'll consider it. Otherwise, I can keep using the P410s, turn all the disks into RAID 0 volumes and use btrfs.
Personally, I get servers with enough SFF-8087 SAS connectors on them to address all the disks in the system. I haven’t bothered with add-on SATA cards in years.
How do you get all these servers?
I use ZFS, so absolute flat-out benchmark speed isn’t my primary consideration. Data durability and data set features matter to me far more.
Well, I tried ZFS and was not happy with it, though it does have some nice features.
What has HP been thinking?
That the hardware vs software RAID argument is over in 2020.
Do you have a reference for that, like a final statement from HP?
Since I’m not posting from an hpe.com email address, I think it’s pretty obvious that that is my opinion, not an HP corporate statement.
I haven't paid attention to the email address.
I base it on observing the Linux RAID market since the mid-90s. The massive consolidation for hardware RAID is a big part of it. That’s what happens when a market becomes “mature,” which is often the step just prior to “moribund.”
Did they stop developing RAID controllers, or do they ship their servers now without them
Were you under the impression that HP was trying to provide you the best possible technology for all possible use cases, rather than make money by maximizing the ratio of cash in vs cash out?
Just because they’re serving it up on a plate doesn’t mean you hafta pick up a fork.
If they had stopped making hardware RAID controllers, that would show that they have turned away from hardware RAID, and that might be seen as putting an end to the discussion --- *because* they are trying to make money. If they haven't stopped making them, that might indicate that there is still sufficient demand for the technology, and there are probably good reasons for that. That different technologies have matured over time doesn't mean that others have become bad. Besides, always "picking the best technology" comes with its own disadvantages while all technology will eventually fail, and sometimes hardware RAID can be the "best technology".