hi, after we switched our servers from centos-3 to centos-4 (aka. rhel-4), one of our servers always crashes about once a week without any oops. this happens with both the normal kernel-2.6.9-11.EL and kernel-2.6.9-11.106.unsupported. after we changed the motherboard, the raid controller and the cables, we still got it. finally we started netdump and, last but not least, yesterday we got a crash log and a core file. it seems there is a bug in the raid5 code of the kernel. this is our backup server with 8 x 200GB hdd in a raid5 (for the data) plus 2 x 40GB hdd in raid1 (for the system), running on a 3ware 8xxx raid controller. i attached the netdump log of the last crash. how can i fix it? yours.
On 7/7/05, Farkas Levente lfarkas@bppiac.hu wrote:
hi, after we switched our servers from centos-3 to centos-4 (aka. rhel-4), one of our servers always crashes about once a week without any oops. this happens with both the normal kernel-2.6.9-11.EL and kernel-2.6.9-11.106.unsupported. after we changed the motherboard, the raid controller and the cables, we still got it. finally we started netdump and, last but not least, yesterday we got a crash log and a core file. it seems there is a bug in the raid5 code of the kernel. this is our backup server with 8 x 200GB hdd in a raid5 (for the data) plus 2 x 40GB hdd in raid1 (for the system), running on a 3ware 8xxx raid controller. i attached the netdump log of the last crash. how can i fix it? yours.
Hi,
I have seen similar (but not quite the same) crashes in the raid code on RHEL 3 kernels. They typically occurred due to a race condition between something updating the linked lists of raid devices and something trying to read them. For RHEL 3, my co-workers and I found where one particular race condition was fixed in the 2.6 kernel and backported the fix to the RHEL 3 kernel. Ultimately this patch was placed in one of the updates for the RHEL 3 kernel.
Anyway, it is likely your problem is yet another race condition. What I would suggest is getting a box configured with true RHEL 4 and reproducing the crash there. Once reproduced, file a bugzilla report with Red Hat. We have had very good success with this approach with a number of kernel bugs we found in the CentOS 3/RHEL 3 kernels. Fixes have not always come quickly, but they generally do come.
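When filing that Bugzilla report, it helps to attach the basics up front. A minimal sketch of gathering them (the report path and exact command set are my own suggestion, not something from this thread):

```shell
# Collect the details Red Hat will typically ask for in a kernel bug report.
# /tmp/raid5-crash-report.txt is an arbitrary, illustrative path.
report=/tmp/raid5-crash-report.txt
{
  echo "== kernel =="
  uname -r
  echo "== md state =="
  cat /proc/mdstat 2>/dev/null || echo "(no /proc/mdstat on this box)"
  echo "== recent kernel messages =="
  dmesg 2>/dev/null | tail -n 50
} > "$report"
```

Attach that file alongside the netdump log and the exact kernel package versions.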
Good Luck...james
--
On Thu, 2005-07-07 at 11:45 +0200, Farkas Levente wrote:
after we changed the motherboard, the raid controller and the cables, we still got it. finally we started netdump and, last but not least, yesterday we got a crash log and a core file. it seems there is a bug in the raid5 code of the kernel. this is our backup server with 8 x 200GB hdd in a raid5 (for the data) plus 2 x 40GB hdd in raid1 (for the system), running on a 3ware 8xxx raid controller. i attached the netdump log of the last crash. how can i fix it?
It looks like you are using the 3Ware Escalade card as "just a bunch of disks" (JBOD) and not using hardware RAID-5. Is there any reason you're doing this? Kinda seems to defeat the purpose of the card?
Also be sure to always update the firmware of the card to the latest, and match it with the latest driver. Running an older firmware with a newer driver is never ideal.
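One way to check what you are actually running, sketched with the stock driver name for the 7000/8000-series cards (3w-xxxx); the exact dmesg wording varies by firmware and kernel version:

```shell
# Which 3ware driver module the kernel ships, and its version string.
modinfo 3w-xxxx 2>/dev/null | grep -i version

# The card logs its firmware revision at boot; grep it back out.
dmesg | grep -i '3w-'
```

Compare those against the latest matched driver/firmware pair on 3Ware's support site before upgrading either one.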
-- Bryan
P.S. I agree with your design of a RAID-1 (or RAID-10) "system" and RAID-5 "data." I use 3Ware Escalade 8 and 12 channel cards all the time in this exact configuration: 2+6 and 4+8 for RAID-1/10 plus RAID-5 on 8506-8 and 8506-12 cards, respectively.
Bryan J. Smith wrote:
On Thu, 2005-07-07 at 11:45 +0200, Farkas Levente wrote:
after we changed the motherboard, the raid controller and the cables, we still got it. finally we started netdump and, last but not least, yesterday we got a crash log and a core file. it seems there is a bug in the raid5 code of the kernel. this is our backup server with 8 x 200GB hdd in a raid5 (for the data) plus 2 x 40GB hdd in raid1 (for the system), running on a 3ware 8xxx raid controller. i attached the netdump log of the last crash. how can i fix it?
It looks like you are using the 3Ware Escalade card as "just a bunch of disks" (JBOD) and not using hardware RAID-5. Is there any reason you're doing this? Kinda seems to defeat the purpose of the card?
we have used it this way for about 3 years. back then we read all kinds of docs and most people suggested using it this way: faster (even if you don't think so), easier, safer. just look through the linux-raid list.
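For reference, a software-RAID layout like the one described would be built with mdadm roughly as follows; the /dev/sd* member names are assumptions about how the JBOD disks appear, not taken from the original post:

```shell
# 8 x 200GB data disks as software RAID-5 (assumed names sda..sdh,
# one raid-autodetect partition each)
mdadm --create /dev/md0 --level=5 --raid-devices=8 /dev/sd[a-h]1

# 2 x 40GB system disks as software RAID-1 (assumed names sdi/sdj)
mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdi1 /dev/sdj1
```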
Also be sure to always update the firmware of the card to the latest, and match it with the latest driver. Running an older firmware with a newer driver is never ideal.
we always use the latest:-)
On Sat, 2005-07-09 at 20:52 +0200, Farkas Levente wrote:
we have used it this way for about 3 years. back then we read all kinds of docs and most people suggested using it this way.
I know they do. I just don't agree with it.
faster (even if you don't think so),
Oh, it probably is faster with software, because the Linux kernel will buffer up writes far better than the small SRAM cache of the 3Ware Escalade 8506 (only 2MB in the -8, 4MB in the -12).
The lack of a DRAM buffer is why 3Ware introduced the Escalade 9500S series -- now you get several MBs of zero-wait-state SRAM for the switching ASIC plus a good amount of DRAM buffer for RAID-5.
easier,
Actually, I very much disagree with that assertion. What's easier than leaving everything -- from build to failover to rebuild -- to the on-board, intelligent ASIC? GPL drivers have been in the stock kernel since 2.2.15 (yes, that's _2.2_).
I'm not making this up: I have _numerous_ 3Ware Escalade 7000 series cards that I have deployed since Red Hat Linux 6.x / kernel 2.2.x and upgraded through kernel 2.6, and I have changed _zilch_ except for maybe 1-2 firmware upgrades. Dealing with LVM and MD changes over the same period has been far more difficult.
I've also moved probably a dozen volumes from 6000 series to 7000 series in my time, and even one volume set from 8000 series to 9000 series more recently (although I have been avoiding the 9000 series because of others with reported issues -- typical of a new 3Ware series).
safer.
Again, totally disagree with that assertion. I'd rather leave RAID to a fairly static and proven firmware and driver in an intelligent, massively queuing design, which makes the OS/software merely a dumb block device that is hard to "screw up." ;->
Not to trample on your issues and kick you when you are down, but didn't you just have a problem? ;->
The _only_ RAID-5 issue I have _ever_ had with 3Ware was when they added it to the Escalade 6000 series. 3Ware quickly realized there was a design consideration in the 6000 that took issue with the RAID-5 algorithm, which prompted the 7000 series design (which is also used in the 8000 and 9000 too).
just look through the linux-raid list.
Well, there's several things:
1. A lot of people throw all ATA RAID solutions into the same bucket, and don't recognize the difference with 3Ware.
2. Even those that do recognize that 3Ware uses on-board ASIC intelligence don't realize how efficiently it queues and transfers blocks compared to traditional, "yesteryear" i960 designs in the Promise SuperTrak and Adaptec 2400A/2800A.
3. Some arguments I've heard say "well, I don't want the volume to be tied to the card" when 3Ware volumes are directly movable to newer card versions.
Yes, 3Ware cards (prior to the 9000 series) "suck" at RAID-5 writes, because they use a small amount of costly (transistor-wise) SRAM. But using SRAM also means the card doesn't need battery backup either.
we always use the latest:-)
Just wondering why you're buying 3Ware cards when you're not using the hardware ASIC at all.
You'd be better off buying RAIDCore cards for the ATA channels if you're going to use LVM/MD for all RAID functionality.
The only time I use LVM with 3Ware is when I'm RAID-0 striping across two cards/volumes (on two separate PCI[-X] channels).
On Sat, 2005-07-09 at 15:46 -0500, Bryan J. Smith wrote:
Yes, 3Ware cards (prior to the 9000 series) "suck" at RAID-5 writes, because they use a small amount of costly (transistor-wise) SRAM. But using SRAM also means the card doesn't need battery backup either.
Let me rephrase that ...
As long as power is delivered to the card, or has been delivered within the last few seconds, the data in the SRAM remains and can be flushed. This is different from DRAM, where you need both A) logic that refreshes the data and B) power to do A.
The power usage of SRAM versus DRAM differs by orders of magnitude.
Now on the Escalade 9000 series, you do need battery backup to guarantee the DRAM is maintained between freezes and lock-ups.
Bryan J. Smith wrote:
easier,
Actually, I very much disagree with that assertion. What's easier than leaving everything -- from build to failover to rebuild -- to the on-board, intelligent ASIC? GPL drivers have been in the stock kernel since 2.2.15 (yes, that's _2.2_).
I'm not making this up: I have _numerous_ 3Ware Escalade 7000 series cards that I have deployed since Red Hat Linux 6.x / kernel 2.2.x and upgraded through kernel 2.6, and I have changed _zilch_ except for maybe 1-2 firmware upgrades. Dealing with LVM and MD changes over the same period has been far more difficult.
what happens if the 3ware card goes wrong? do you always have a backup raid controller (of the same type)? with software raid you can plug the disks into any kind of controller and save your data!
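The portability claim rests on the md superblock being written on the member disks themselves, so any controller that can present the drives can reassemble the array. A sketch (the device name is an assumption):

```shell
# Print the md superblock stored on one member disk.
mdadm --examine /dev/sda1

# Reassemble arrays purely from the superblocks found on attached disks,
# regardless of which controller they now hang off.
mdadm --assemble --scan
```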
safer.
Again, totally disagree with that assertion. I'd rather leave RAID to a fairly static and proven firmware and driver in an intelligent, massively queuing design, which makes the OS/software merely a dumb block device that is hard to "screw up." ;->
Not to trample on your issues and kick you when you are down, but didn't you just have a problem? ;->
The _only_ RAID-5 issue I have _ever_ had with 3Ware was when they added it to the Escalade 6000 series. 3Ware quickly realized there was a design consideration in the 6000 that took issue with the RAID-5 algorithm, which prompted the 7000 series design (which is also used in the 8000 and 9000 too).
in the first place we started using 3ware's raid5, but it crashed in the first week and we got a mail from 3ware that it was a known issue with the current firmware. that was enough!
we always use the latest:-)
Just wondering why you're buying 3Ware cards when you're not using the hardware ASIC at all.
how else can you plug 1.5TB into a machine? and the only good kernel support was for 3ware (at least 2-3 years ago). it's that simple.
Farkas Levente wrote:
what happens if the 3ware card goes wrong? do you always have a backup raid controller (of the same type)? with software raid you can plug the disks into any kind of controller and save your data!
Can't answer for Bryan, but:
How often does a raid controller card go wrong (or any other PCI card, for that matter)? Sure, if you are careless when handling it, static electricity accumulated in your body can fry it. For cards installed in production servers, very darn close to never.
Even with that, if you keep a number of servers running, you usually standardize on the cards you use (for example, you go with 3ware and stick with them, or you go with Adaptec and stick with them). Having a spare card or two in a drawer is a good idea anyhow, if you unexpectedly need to build the server overnight. And usually, you don't need to have the exact same model of the card. In most cases, cards produced by one manufacturer can read RAID metadata written by different card models (of the same manufacturer). So you can access your data as long as the replacement card has enough ports to connect all the disks from the failed card.
On Sat, 2005-07-09 at 16:58 -0500, Aleksandar Milivojevic wrote:
How often does a raid controller card go wrong (or any other PCI card, for that matter)? Sure, if you are careless when handling it, static electricity accumulated in your body can fry it. For cards installed in production servers, very darn close to never.
GPUs and MACs (NIC ICs) tend to be very high density and under-cooled. They are solid-state devices that tend to go belly up in 3-5 years.
3Ware's ASIC, SRAM and ATA logic are rather low-density, and are always cold to the touch in my experience. That's why I have _never_ seen them fail -- and I'm talking dozens upon dozens of cards deployed.
Even with that, if you keep a number of servers running, you usually standardize on the cards you use (for example, you go with 3ware and stick with them, or you go with Adaptec and stick with them).
Exactly. I _never_ recommend just 1 device. I'll standardize on 2+ servers or 2+ desktops with 3Ware cards. E.g., even at home, I have (2) 3Ware 7000 series and (4) 3Ware 6000 series.
I really hate it when I go into a department and they have a hodge-podge of servers -- it's just a horrendously poor sign of configuration management. Money is _never_ a factor, it's how you use it, and I've made do with far less than most.
Standardization is very key, even if it gets revised regularly or you even change vendors a couple of times. There are ways to deal with those situations.
Having a spare card or two in a drawer is a good idea anyhow, if you unexpectedly need to build the server overnight.
Well, when you're talking 4+ servers, you typically _should_ have a "test" server which is also your "standby" server. Right now in one department in the organization I'm working with, that is their #1 issue.
[ I tire of having to do "new" things on production servers. ]
And usually, you don't need to have the exact same model of the card. In most cases, cards produced by one manufacturer can read RAID metadata written by different card models (of the same manufacturer).
Not always. Adaptec has had a nasty habit of buying other solutions and ending up with 3-4 incompatible meta-data formats. Even LSI Logic has done this a few times too -- although far less because they are OEM focused, not a retail vendor that customizes for OEMs (like Adaptec).
I've found 3Ware to be a company that has produced only their own designs and meta-data format. They are also very, very good on stating when a firmware change does affect the meta-data format. Their newer firmware have _always_ allowed a newer firmware to read an older one. 3Ware's products are very proliferated.
While I understand the "risk" arguments made here, understand that over my past 5+ years of 3Ware deployments, my clients and I have undergone the _least_ risk thanx to 3Ware compared to other vendors and solutions. That's why I find most of the LVM/MD arguments that may be applicable to _some_ vendors to be wholly inapplicable to 3Ware.
So you can access your data as long as the replacement card has enough ports to connect all the disks from the failed card.
In the case of 3Ware, this has been 100% true.
On Sat, 2005-07-09 at 18:04, Bryan J. Smith wrote:
I really hate it when I go into a department and they have a hodge-podge of servers -- it's just a horrendously poor sign of configuration management. Money is _never_ a factor, it's how you use it, and I've made do with far less than most.
More likely it is a sign of age and unwillingness to replace equipment before its useful life is over.
Standardization is very key, even if it gets revised regularly or you even change vendors a couple of times. There are ways to deal with those situations.
Even if you stay with the same vendor you usually can't buy the same design with the same parts for more than a year. Which means that unless you buy them the day they come out, your next order is likely to end up being something different. Out of the gazillion kinds of equipment out there you can come up with a few in hindsight that have survived, but it's really a matter of luck regarding what you bought and installed.
Having a spare card or two in a drawer is a good idea anyhow, if you unexpectedly need to build the server overnight.
That's the real trick. If you need something to work, keep a spare yourself, preferably a whole box that you can swap the drives from your production box into.
While I understand the "risk" arguments made here, understand that over my past 5+ years of 3Ware deployments, my clients and I have undergone the _least_ risk thanx to 3Ware compared to other vendors and solutions.
5 years isn't really that long a time in terms of business data, and I do recall seeing rumors in print a few years back that 3ware was going to stop producing the ide raid cards. They didn't and I'm not sure why, or why it was rumored that they would, but it still works out to a matter of luck every time you have to buy a replacement part and it still happens to be available.
On Sun, 2005-07-10 at 16:46 -0500, Les Mikesell wrote:
More likely it is a sign of age and unwillingness to replace equipment before its useful life is over.
Not true. I've been in many departments that get "hand me downs." The difference is when a department/organization cares about consistency and/or configuration management and a department/organization that does not care about consistency and/or configuration management.
Simple rule I use, _always_ request at least 2 of the same "hand me down" systems. More often than not, a department/organization is unwisely distributing them one-to-a-customer, which is just ludicrous. I'd rather have 2 slower systems than 1 faster.
Even if you stay with the same vendor you usually can't buy the same design with the same parts for more than a year. Which means that unless you buy them the day they come out, your next order is likely to end up being something different.
Which is why you _always_ request at least 3 of the same systems when you buy new. By the time it is a "hand me down," you should have at least 2 working systems.
Out of the gazillion kinds of equipment out there you can come up with a few in hindsight that have survived, but it's really a matter of luck regarding what you bought and installed.
Not if you don't buy 1 unit at a time. Unfortunately, companies seem to not consider this.
That's the real trick. If you need something to work, keep a spare yourself, preferably a whole box that you can swap the drives from your production box into.
If I'm "white boxing," then I think that's an issue that should be more formally addressed. Again, this is not a money issue, some people just make it that. You make do with what you got, but you don't cross the lines of good consistency and configuration management.
If that means not putting a high-end, but single, "hand me down" into production, so be it. I'm more worried about downtime than the ultimate performance. ;->
5 years isn't really that long a time in terms of business data, and I do recall seeing rumors in print a few years back that 3ware was going to stop producing the ide raid cards.
The only thing 3Ware stopped producing was its short-lived iSCSI storage arrays.
And as far as 3Ware stopping producing the ATA RAID cards (e.g., the 7000 series), you can buy their newer SATA RAID cards (e.g., 8000 and 9000 series) and use their ATA-SATA converters. Not ideal, but they are officially supported by 3Ware.
But 3Ware is still very much producing the ATA RAID series (e.g., the 7000 series). The 6000 series became dead the second 3Ware realized that the RAID-5 firmware was never going to work correctly on it. They had already developed the 7000 series that massively improved RAID-5 performance.
Almost every single complaint I hear about 3Ware is the 6000 series and RAID-5, a series _never_ designed for RAID-5. BTW, I'm still using 6000 series cards for RAID-0, 1 and 10 without issue, some 6+ years after release.
They didn't and I'm not sure why, or why it was rumored that they would, but it still works out to a matter of luck every time you have to buy a replacement part and it still happens to be available.
3Ware is very well proliferated.
I really dislike when people "blanket assume" risk. I like to get very specific and when it comes to 3Ware solutions, you're talking a very proliferated set of products with _full_ backward compatibility in _every_ new version.
That's pretty low risk if you ask me. ;->
Bryan J. Smith wrote:
On Sun, 2005-07-10 at 16:46 -0500, Les Mikesell wrote:
More likely it is a sign of age and unwillingness to replace equipment before its useful life is over.
Not true. I've been in many departments that get "hand me downs." The difference is when a department/organization cares about consistency and/or configuration management and a department/organization that does not care about consistency and/or configuration management.
Simple rule I use, _always_ request at least 2 of the same "hand me down" systems. More often than not, a department/organization is unwisely distributing them one-to-a-customer, which is just ludicrous. I'd rather have 2 slower systems than 1 faster.
Meanwhile... back to the subject. I must apologize, as I've not been following this thread, but did you use LVM for the partitioning scheme? I had big issues with LVM on a Compaq raid. I rarely ever use such things, but this was just a simple 'test' install, and it was good that it was just a 'test', as the LVM 'test' failed. Disk Druid fixed the issue. I did the install twice with LVM and both times it failed (I don't remember at what point, but it seems not to have made it to boot). Then I used Disk Druid... and it's been fine on several like machines. So I'm left thinking that at least some raid drivers/hardware might have issues with LVM.
John Hinton
Aleksandar Milivojevic wrote:
Farkas Levente wrote:
what happens if the 3ware card goes wrong? do you always have a backup raid controller (of the same type)? with software raid you can plug the disks into any kind of controller and save your data!
Can't answer for Bryan, but:
How often does a raid controller card go wrong (or any other PCI card, for that matter)? Sure, if you are careless when handling it, static electricity accumulated in your body can fry it. For cards installed in production servers, very darn close to never.
Even with that, if you keep a number of servers running, you usually standardize on the cards you use (for example, you go with 3ware and stick with them, or you go with Adaptec and stick with them). Having a spare card or two in a drawer is a good idea anyhow, if you unexpectedly need to build the server overnight. And usually, you don't need to have the exact same model of the card. In most cases, cards produced by one manufacturer can read RAID metadata written by different card models (of the same manufacturer). So you can access your data as long as the replacement card has enough ports to connect all the disks from the failed card.
I can verify that if a 3Ware card fails, you can simply plug in another 3Ware card; it reads some magic bits from the array on startup and just plain works. I've done this with both 7X00 and 8X00 cards. However, a failed card is extremely rare. I think I've had exactly 2 fail (in a universe of 50-75 total cards) over the course of 5 or 6 years. It is FAR more likely that you'll lose a disk than the hardware RAID card. I can't imagine why the original poster is using the 3Ware card as a "dumb" IDE controller. That makes *zero* sense. It's at least as reliable as (and maybe more so than) the Linux software RAID bits (which are also pretty darned reliable).
Cheers,
C
On Sat, 2005-07-09 at 21:54 -0400, Chris Mauritz wrote:
It is FAR more likely that you'll lose a disk than the hardware RAID card.
Exactly. "On-line" backup (RAID, snapshots, etc...) is not a replacement for "off-line" backup (tape, RRD, maybe some HDs), although near-line (power/usage managed HD) is somewhat of an option (especially when not sending everything to tape, which is a good strategy today).
I can't imagine why the original poster is using the 3Ware card as a "dumb" IDE controller. That makes *zero* sense.
Well, he did say something I didn't think of: there was no real hot-swap support except with a 3Ware card. But I want to say that RAIDCore (typically 8-channel ATA PCI cards that use their own FRAID driver(s)) has hot-swap too.
It's at least as reliable as (and maybe more so than) the Linux software RAID bits (which are also pretty darned reliable).
Other than the 3Ware 6000 series, I have _never_ heard of an issue with RAID-5.
On Sat, 2005-07-09 at 23:40 +0200, Farkas Levente wrote:
what happens if the 3ware card goes wrong? do you always have a backup raid controller
The issue is no different than with tape cartridges.
Yes. I _always_ deploy at least (2) 3Ware cards at any organization, so there is a fall-back in the case of disaster.
(of the same type)?
There is no such thing as "same type" with 3Ware.
As long as you put the volume in a device that is the same or newer firmware, you're fine.
BTW, I have deployed over 50 (five-zero) 3Ware cards and have _never_ had a failure of a device yet. Some are 4+ years old and still in use.
Plus I personally have an original 3Ware Escalade 7800 8-channel at home I have upgraded from Red Hat Linux 6.1 to Fedora Core 3 -- *0* non-sense with LVM/MD.
with software raid you can plug it to any kind of controller and save your data!
That's what backup is for! Geez. ;->
And not always, if they disagree over low-level format. I've had that happen with different SCSI host adapters from different vendors on more than one occasion (and even the same vendor -- OEM v. Retail Adaptec).
And when it comes to RAID-1, it's just a full disk mirror anyway.
in the first place we started using 3ware's raid5, but it crashed in the first week and we got a mail from 3ware that it was a known issue with the current firmware. that was enough!
That was the 3Ware Escalade 6000 series. 3Ware added RAID-5 support to the 6000 series when customers requested it. 3Ware regretted ever doing so.
_Never_ seen an issue on the Escalade 7000+ series.
On the 6000 series, I always use just RAID-1 or RAID-10. In many cases, RAID-5 (regardless of implementation) is _not_ fast enough, so I use RAID-10.
how else can you plug 1.5TB into a machine? and the only good kernel support was for 3ware (at least 2-3 years ago). it's that simple.
Hmmm, I thought RAIDCore's solution was integrated with LVM/MD? And they _do_ have hot-plug last time I checked.
Just seems like a massive waste of the ASIC.
although this mail created a long thread, does anybody have a good solution to the original question?
Farkas Levente wrote:
hi, after we switched our servers from centos-3 to centos-4 (aka. rhel-4), one of our servers always crashes about once a week without any oops. this happens with both the normal kernel-2.6.9-11.EL and kernel-2.6.9-11.106.unsupported. after we changed the motherboard, the raid controller and the cables, we still got it. finally we started netdump and, last but not least, yesterday we got a crash log and a core file. it seems there is a bug in the raid5 code of the kernel. this is our backup server with 8 x 200GB hdd in a raid5 (for the data) plus 2 x 40GB hdd in raid1 (for the system), running on a 3ware 8xxx raid controller. i attached the netdump log of the last crash. how can i fix it? yours.
On Mon, 2005-07-11 at 10:14 +0200, Farkas Levente wrote:
although this mail created a long thread, does anybody have a good solution to the original question?
Nope. I looked through his netdump and couldn't find a thing that was helpful. Now, he's clearly using MD (not LVM), and I'd use the standard HOWTO troubleshooting to try to resolve it.
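The standard MD troubleshooting pass amounts to something like this (device names are assumptions):

```shell
cat /proc/mdstat              # array state, degraded members, resync progress
mdadm --detail /dev/md0       # per-array view: failed/spare devices, UUID
mdadm --examine /dev/sda1     # superblock as stored on one member disk
smartctl -a /dev/sda          # rule out a flaky disk underneath the md layer
```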
But otherwise, the use of the 3Ware card for drives in JBOD with MD atop it just makes me scratch my head. I guess using it for hot-swap is somewhat of a reason, but it still seems like a waste.
BTW, one thing I _have_ seen a lot of people complain about with 3Ware cards is when they were using software RAID-5 on them -- especially in the early 2.4.x series, and possibly the early 2.6.x as well. All MD issues, not 3Ware card issues.