On Thu, May 28, 2015 11:43 am, Gordon Messmer wrote:
> On 05/28/2015 09:12 AM, Kirk Bocek wrote:
>> I suggest everyone stay away from 3Ware from now on.
>
> My experience has been that 3ware cards are less reliable than software
> RAID for a long, long time. It took me a while to convince my previous
> employer to stop using them. Inability to migrate disk sets across
> controller families, data corruption, and boot failures due to bad
> battery daughter cards eventually proved my point. I have never seen a
> failure due to software RAID, but I've seen quite a lot of 3ware
> failures in the last 15 years.

I strongly disagree with that. I have a large number of 3ware-based RAIDs
in my server room, and in the last 13 years or so I have never had a
single failure or data loss on any of them that was due to the hardware
(knocking on wood, I guess I should start calling myself lucky).

Occasionally researchers (mostly coming from places where they
self-manage their machines) bring stories of disasters and data losses.
Every one of those that I looked into in detail turned out to be caused
by a poorly configured hardware RAID in the first place. So I end up
telling them: before telling others that 3ware RAID cards are bad and
will let you down, check that your setup does not contain any obvious
blunders. Let me list the major ones I have heard of:

1. Bad choice of drives for the RAID. Any "green" (spin-down to conserve
energy) drives are not suitable for RAID. Even drives that do not spin
down but are of poor quality have, when they work in parallel (say 16 in
a single RAID unit), a much larger chance of more than one failing at the
same time (see the probability sketch after this list for rough numbers).
If you went as far as buying a hardware RAID card, spend some 10% more on
good drives (and buy them from a good source); do not just follow the
"price grabber".

2. Bad configuration of the RAID itself. You need to run "verification"
of the RAID every so often; my RAIDs are verified once a week. At the
very least this forces the drives to scan their whole surface regularly,
so bad blocks get discovered and re-allocated. If you don't do it for
over a year, you stand a fair chance of losing the RAID because several
drives fail at once (due to accumulated, never-discovered bad blocks)
while accessing a particular stripe... and then you lose the RAID with
its data. That is purely a configuration mistake (one way to schedule the
verification is shown in the scheduling sketch after this list).

3. Smaller, yet still real blunders: running the RAID with the write
cache enabled (which makes the RAID device much faster) on a card that
has no battery backup for its memory. If you yank the power in that
configuration, you lose the contents of the cache, and the RAID will
quite likely be screwed up big time.
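To put some rough numbers on point 1, here is a small Python sketch. The
per-drive failure probabilities in it are made-up illustrative values,
not measurements from my machines, and it assumes drive failures are
independent (in a real enclosure they often are not), so take it only as
a back-of-the-envelope argument:

#!/usr/bin/env python3
# Back-of-the-envelope: chance that at least two drives out of a
# 16-drive unit fail within the same window (say, a long rebuild),
# assuming independent failures with per-drive probability p_fail.
# The probabilities below are made-up illustrative values.

def prob_two_or_more_fail(n_drives, p_fail):
    p_none = (1 - p_fail) ** n_drives
    p_one = n_drives * p_fail * (1 - p_fail) ** (n_drives - 1)
    return 1 - p_none - p_one

for label, p in [("decent drive", 0.002), ("cheap/green drive", 0.02)]:
    print("%-18s P(>=2 of 16 fail) = %.4f"
          % (label, prob_two_or_more_fail(16, p)))

With these made-up numbers, a drive that is ten times more likely to die
during the window makes the 16-drive unit roughly eighty times more
likely to see a double failure, and on a single-parity unit a double
failure is exactly the case the RAID cannot save you from.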
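As for the weekly verification from point 2, it does not have to be done
by hand. On the software RAID side the interface I know is the md
sync_action file in sysfs; a sketch like the one below, run as root from
a weekly cron job, will kick off the checks. On a 3ware card I schedule
the verify from the controller's own tools (tw_cli or the 3DM2 web
interface); the exact commands depend on the card and firmware, so check
the manual rather than taking my word for the syntax.

#!/usr/bin/env python3
# Weekly scrub for Linux software RAID (md): ask every md array to run
# a consistency check by writing "check" to its sync_action file.
# Run as root from cron or a systemd timer.
import glob
import sys

def start_checks():
    paths = glob.glob("/sys/block/md*/md/sync_action")
    if not paths:
        print("no md arrays found", file=sys.stderr)
        return
    for path in paths:
        try:
            with open(path) as f:
                state = f.read().strip()
            if state != "idle":
                # don't interrupt a rebuild or a check already running
                continue
            with open(path, "w") as f:
                f.write("check\n")
            print("started check on", path)
        except OSError as err:
            print("failed on %s: %s" % (path, err), file=sys.stderr)

if __name__ == "__main__":
    start_checks()

If I remember correctly, the mdadm package on CentOS already ships a
raid-check cron job that does essentially the same thing, so on a stock
install you may only need to confirm it is enabled.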
Of course, hardware RAID has some restrictions of its own. In particular,
you cannot always attach the drives to a different card model and have
the RAID keep functioning. 3ware cards usually detect that situation and
export the RAID read-only, so you can copy the data elsewhere and then
re-create the RAID so it is compatible with the internals of the new
card.

I do not want to start a "software" vs "hardware" RAID war here, but I
really have to mention this: software RAID is implemented in the kernel,
which means your system has to be running for the software RAID to do
its job. If you panic the kernel, the software RAID stops in the middle
of whatever it was doing and had not finished yet. Hardware RAID, on the
contrary, does not need the system running. It is implemented in the
card's embedded system, and all it needs is power. That embedded system
is rudimentary and performs a single rudimentary function: it chops up
the data flow and calculates the RAID stripes. I have never heard of one
of these embedded systems panicking, which is mainly a result of that
simplicity.

So, even though I'm strongly in favor of hardware RAID, I still consider
the choice just a matter of taste. And I would be much happier if the
software RAID people had the same attitude as well ;-)

Just my $0.02

Valeri

++++++++++++++++++++++++++++++++++++++++
Valeri Galtsev
Sr System Administrator
Department of Astronomy and Astrophysics
Kavli Institute for Cosmological Physics
University of Chicago
Phone: 773-702-4247
++++++++++++++++++++++++++++++++++++++++