On Sat, February 28, 2015 4:22 pm, Chris Murphy wrote:
> On Sat, Feb 28, 2015 at 1:26 PM, Valeri Galtsev
> <galtsev at kicp.uchicago.edu> wrote:
>> Indeed. That is why: no LVMs in my server room. Even no software RAID.
>> Software RAID relies on the system itself to fulfill its RAID function;
>> what if kernel panics before software RAID does its job? Hardware RAID
>> (for huge filesystems I can not afford to back up) is what only makes
>> sense for me. RAID controller has dedicated processors and dedicated
>> simple system which does one simple task: RAID.
>
> Biggest problem is myriad defaults aren't very well suited for
> multiple device configurations. There are a lot of knobs in Linux and
> on the drives and in hardware RAID cards. None of this is that simple.
>
> Drives, and hardware RAID cards are subject to firmware bugs, just as
> we have software bugs in the kernel. We know firmware bugs cause
> corruption.

Speaking of which: I would only use good hardware cards and good
external RAID boxes. Over the last decade and a half I have never had
trouble caused by RAID firmware bugs. What I use is:

1. 3ware (mostly)
2. LSI MegaRAID (a few; I don't like their user interface and poor
   notification abilities)
3. Areca (also a few; better UI than LSI's)

External RAID boxes: Infortrend

I will never go for cheap fake RAID (Adaptec is one off the top of my
head). Also, not by my choice, I have had to deal with some, hm,
not-so-good external RAID boxes: Promise and Raid.com, to mention two.

You are implying that the firmware of hardware RAID cards is somehow
buggier than the software of software RAID plus the Linux kernel (sorry
if I misinterpreted your point). I disagree: the embedded system of a
RAID card, and the RAID function it has to fulfill, are much simpler
than everything involved in software RAID. Therefore, with the same
effort invested, the firmware of (good) hardware is less buggy.
And again, the Linux kernel is more likely to panic than the trivial
embedded system of a hardware RAID card or box. At least my experience
over a decade and a half confirms that.

I have heard horror stories from people who used the same good hardware
I mentioned (3ware). However, when I went deep into the details of each
case, I discovered that they simply had not set everything up correctly,
which is trivial to do as a matter of fact. Namely, the common mistake
in all cases was not setting up the RAID verify task (it is scheduled at
the RAID configuration level). I have my RAIDs verified once a week. If
you don't verify them for a year, what happens is this: you don't
discover the degradation of individual drives until it is too late, and
more drives than the redundancy level allows get kicked out with fatal
failures. Even then, a 3ware card that has already lost redundancy does
not kick out newly failing drives; it just makes the RAID read-only, so
you can still salvage something. Anyway, IMHO these horror stories came
down purely to poor sysadmin work.

> Not all hardware RAID cards are the same, some are total
> junk. Many others get you vendor lock in due to proprietary metadata
> written to the drives. You can't get your data off if the card dies,
> you have to buy a similar model card sometimes with the same firmware
> version in order to regain access.

I would not consider that a disadvantage. I have yet to see a dead 3ware
card (yes, you can burn one if you plug it into a slot grossly
misaligned, e.g. tilted). And with 3ware, a later model will accept
drives that originally made up a RAID on an older model; it will only
make the RAID read-only, so you can salvage your data first and then
re-create the RAID with the new card's metadata format.

I guess I may have a different philosophy than you do. If I use a RAID
card, I choose a really good one. Once I use a good one, I feel no need
to move drives to a card made by a different manufacturer.

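To put some numbers on the verify point above, here is a toy sketch in
Python. It is not anything the 3ware firmware does. It assumes that every
verify pass finds all drives that degraded since the previous pass and
that those drives are replaced and rebuilt right away. The function name
and the defect days are made up for illustration.

```python
import math
from collections import Counter


def worst_undetected(defect_days, verify_interval_days):
    """Largest number of degraded drives sitting undetected at once.

    Assumption: a verify pass runs every verify_interval_days and catches
    every drive that went bad since the previous pass, so only defects that
    fall into the same verify window can pile up together.
    """
    windows = Counter(
        math.ceil(day / verify_interval_days) for day in defect_days
    )
    return max(windows.values(), default=0)


# Hypothetical days (counted from day 1) on which four drives start to degrade.
defect_days = [40, 130, 210, 300]
redundancy = 2  # e.g. RAID6 survives two failed drives

for interval in (7, 365):
    worst = worst_undetected(defect_days, interval)
    status = "more than redundancy" if worst > redundancy else "within redundancy"
    print(f"verify every {interval} days: up to {worst} bad drive(s), {status}")
```

With these made-up numbers, weekly verify never leaves more than one bad
drive in the array at a time, while a yearly verify lets all four
accumulate, which is more than RAID6 can absorb.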
And one last, yet important, thing: if you have to use these drives with
a different card (even just a different model by the same manufacturer),
then you had better re-create the RAID from scratch on the new card. If
you value your data...

Just my $0.02

Valeri

++++++++++++++++++++++++++++++++++++++++
Valeri Galtsev
Sr System Administrator
Department of Astronomy and Astrophysics
Kavli Institute for Cosmological Physics
University of Chicago
Phone: 773-702-4247
++++++++++++++++++++++++++++++++++++++++