[CentOS] weird XFS problem

Sun Jan 22 22:36:09 UTC 2012
Boris Epstein <borepstein at gmail.com>

On Sun, Jan 22, 2012 at 1:34 PM, Keith Keller <
kkeller at wombat.san-francisco.ca.us> wrote:

> On 2012-01-22, Boris Epstein <borepstein at gmail.com> wrote:
> >
> > Also, here's something else I have discovered. Apparently there is a
> > potential intermittent RAID disk trouble. At least I found the following
> > in the system log:
> >
> > Jan 22 09:17:53 nrims-bs kernel: 3w-9xxx: scsi6: AEN: ERROR
> > (0x04:0x0026): Drive ECC error reported:port=4, unit=0.
> > Jan 22 09:17:53 nrims-bs kernel: 3w-9xxx: scsi6: AEN: ERROR
> > (0x04:0x002D): Source drive error occurred:port=4, unit=0.
>
> Which 3ware controller is this?  I have had lots of problems with the
> 3ware 9550SX controller and WD-EA[RD]S drives in a similar
> configuration.  (Yes, I know all about the EARS drives, but they work
> mostly fine with the 3ware 9650 controller, so I suspect some weird
> interaction between the cheap drives and the old not-so-great
> controller.  I also suspect an intermittently failing port, which I'll
> be testing more later this week.)
>
> > Jan 22 09:55:23 nrims-bs kernel: 3w-9xxx: scsi6: AEN: WARNING
> > (0x04:0x000F): SMART threshold exceeded:port=9.
> > Jan 22 09:55:23 nrims-bs kernel: 3w-9xxx: scsi6: AEN: WARNING
> > (0x04:0x000F): SMART threshold exceeded:port=9.
> > Jan 22 09:56:17 nrims-bs kernel: 3w-9xxx: scsi6: AEN: INFO (0x04:0x000B):
> > Rebuild started:unit=0.
>
> What does your RAID look like?  Are you using the 3ware's RAID6 (in
> which case it's not a 9550) or mdraid?  Are the 3ware errors in the logs
> across a large number of ports or just a few?  Have you used the drive
> tester for your drives to verify that they're still good?  On all my
> other systems, when the controller has reported a failure, and I've run
> it through the tester, it's reported a failure.  (Often when my 9550
> reports a failure the drive passes all tests.)
>
> If you happen to have real RAID drive models, you may also try
> contacting LSI support.  They will steadfastly refuse to help if you
> have desktop-edition drives, but can be at least somewhat helpful if you
> have enterprise drives.
>
> --keith
>
>
> --
> kkeller at wombat.san-francisco.ca.us
Keith, thanks!

The RAID is done at the controller level. Yes, I believe the controller is a
3ware 9xxx series; I don't recall the exact model right now.
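
As for whether the errors are spread across many ports or just a few, a
quick tally of the AEN lines by port should settle it. Below is a rough
sketch, assuming each log entry sits on one line in the same format as the
lines quoted above. The names AEN_RE, tally_by_port and SAMPLE are made up
for illustration, and the pattern has only been checked against these few
messages:

```python
import re
from collections import Counter

# Pattern for 3w-9xxx AEN lines as they appear in the quoted log, e.g.
# "... 3w-9xxx: scsi6: AEN: ERROR (0x04:0x0026): Drive ECC error reported:port=4, unit=0."
# Assumption: other AEN messages follow the same "text:key=value, ..." layout.
AEN_RE = re.compile(
    r"3w-9xxx: scsi\d+: AEN: (?P<severity>\w+) "
    r"\((?P<code>0x[0-9A-Fa-f]+:0x[0-9A-Fa-f]+)\): "
    r"(?P<message>.*?):(?P<fields>.*)$"
)


def tally_by_port(lines):
    """Count AEN events per (port, severity); lines without a port are skipped."""
    counts = Counter()
    for line in lines:
        match = AEN_RE.search(line)
        if not match:
            continue
        port = re.search(r"port=(\d+)", match.group("fields"))
        if port:
            counts[(int(port.group(1)), match.group("severity"))] += 1
    return counts


# The five entries from the log excerpt above, each joined onto one line.
SAMPLE = [
    "Jan 22 09:17:53 nrims-bs kernel: 3w-9xxx: scsi6: AEN: ERROR (0x04:0x0026): Drive ECC error reported:port=4, unit=0.",
    "Jan 22 09:17:53 nrims-bs kernel: 3w-9xxx: scsi6: AEN: ERROR (0x04:0x002D): Source drive error occurred:port=4, unit=0.",
    "Jan 22 09:55:23 nrims-bs kernel: 3w-9xxx: scsi6: AEN: WARNING (0x04:0x000F): SMART threshold exceeded:port=9.",
    "Jan 22 09:55:23 nrims-bs kernel: 3w-9xxx: scsi6: AEN: WARNING (0x04:0x000F): SMART threshold exceeded:port=9.",
    "Jan 22 09:56:17 nrims-bs kernel: 3w-9xxx: scsi6: AEN: INFO (0x04:0x000B): Rebuild started:unit=0.",
]

if __name__ == "__main__":
    for (port, severity), count in sorted(tally_by_port(SAMPLE).items()):
        print(f"port {port}: {count} x {severity}")
```

To run it over the real log, replace SAMPLE with an open file handle (on
CentOS the kernel messages normally end up in /var/log/messages). On the
excerpt above it only shows ports 4 and 9.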

What are you referring to as "drive tester"?

Boris.