On Jul 1, 2019, at 8:26 AM, Valeri Galtsev galtsev@kicp.uchicago.edu wrote:
> RAID function, which boils down to simple, short, easy to debug well program.
RAID firmware will be harder to debug than Linux software RAID, if only because the debugging tools on Linux are easier to use.
Furthermore, MD RAID only had to be debugged once, rather than once per company and product line as with hardware RAID.
I hope you’re not assuming that hardware RAID has no bugs. It’s basically a dedicated CPU running dedicated software that’s difficult to upgrade.
> if kernel (big and buggy code) is panicked, current RAID operation will never be finished which leaves the mess.
When was the last time you had a kernel panic? And of those times, when was the last time it happened because of something other than a hardware or driver fault? If it wasn’t for all this hardware doing strange things, the kernel would be a lot more stable. :)
You seem to be saying that hardware RAID can’t lose data. You’re ignoring the RAID 5 write hole:
https://en.wikipedia.org/wiki/RAID#WRITE-HOLE
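To make the write hole concrete, here is a toy sketch in Python. It is only an illustration under simplifying assumptions (one stripe, two data blocks, XOR parity), not how MD or any real controller is implemented, and the names xor_blocks, d0, d1, and rebuilt_d1 are made up for the example:

```python
from functools import reduce

def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length blocks byte by byte (RAID 5 style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# A consistent stripe: two data blocks plus their parity.
d0, d1 = b"AAAA", b"BBBB"
parity = xor_blocks(d0, d1)

# Start updating d0, then "crash" before the matching parity write lands.
d0 = b"CCCC"
# parity = xor_blocks(d0, d1)   # never reached: power lost here

# Later the disk holding d1 dies, so d1 is rebuilt from d0 and the stale parity.
rebuilt_d1 = xor_blocks(d0, parity)
write_hole_hit = rebuilt_d1 != b"BBBB"
```

The rebuilt block is garbage even though d1 itself was never written to. That is the part that bites: the damage shows up in data you did not touch, and only once a disk fails.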
If you then bring up battery backups, now you’re adding cost to the system. Then, some 3-5 years later, you need downtime to swap the battery, and more downtime each time it needs replacing after that. All of that just to work around the RAID write hole.
Copy-on-write filesystems like ZFS and btrfs avoid the write hole entirely, so that the system can crash at any point, and the filesystem is always consistent.
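The copy-on-write idea can be sketched the same way. This is a deliberately tiny model, assuming a single root pointer that is switched in one atomic step; the CowStore class and its crash_before_commit flag are hypothetical and do not reflect the actual on-disk design of ZFS or btrfs:

```python
class CowStore:
    """Toy copy-on-write store: blocks are never overwritten in place."""

    def __init__(self, initial: bytes) -> None:
        self.blocks = {0: initial}  # block id -> contents
        self.root = 0               # the single pointer to live data
        self.next_id = 1

    def write(self, data: bytes, crash_before_commit: bool = False) -> None:
        new_id = self.next_id
        self.next_id += 1
        self.blocks[new_id] = data  # step 1: write to a fresh block
        if crash_before_commit:
            return                  # simulated crash: root still points at old data
        self.root = new_id          # step 2: one atomic pointer switch

    def read(self) -> bytes:
        return self.blocks[self.root]

store = CowStore(b"old")
store.write(b"half-done", crash_before_commit=True)
after_crash = store.read()   # still the old, consistent contents
store.write(b"new")
after_commit = store.read()  # the new contents, visible only after the switch
```

Because the old block is left alone until the pointer moves, a crash in the middle of a write leaves you with the previous state rather than a mix of old and new.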