[CentOS] 3ware disk failure -> hang

Fri Jan 6 20:30:06 UTC 2006
Adam Gibson <agibson at ptm.com>

Joshua Baker-LePain wrote:
> On Fri, 6 Jan 2006 at 2:47pm, Adam Gibson wrote
> 
>> I have a similar problem using the hardware RAID (mirror).  Every 
>> time 3dmd started a scheduled verify at midnight, the kernel would 
>> crash anywhere from 0 to 26 minutes later.  This happened every 
>> night.  I finally disabled the verify task in 3dmd and the crashes 
>> stopped.  I now just use smartd to run extended tests, which do not 
>> show any problems with the disks.  The crash dump and log indicate 
>> that port0 is bad, though.
> 
> Did you have smartd set up to monitor the disks as well as 3dmd?  Did 
> you get the bad port error preceding every crash?
> 

At first I had smartd running as well, scheduled much later in the 
morning, well away from the 3dmd verify.  I tried disabling smartd, 
but the crashes still occurred.

The port0 error appeared after every crash.

>> I have the crash dumps and it is reproducible if I enable verify 
>> again... Anyone know of a way to get to the bottom of the crash and 
>> find a fix?  I keep getting the feeling of "See... you should have 
>> bought RHEL to get support!".   Too expensive for my use of this 
>> system though.
> 
> There's always RH's bugzilla, but not if you're in a hurry, and they 
> do seem to frown on CentOS-derived bugs.
> 

I would not feel right reporting this to RH bugzilla.  Is that 
something that has been done in the past?  I am just really surprised 
that RH has not found this problem on their own.  I would think 3ware 
controllers are used by a lot of people, so a problem like this would 
be a pretty big deal.