[CentOS] Centos locking up system with mptscsih driver error

Thu Jan 5 12:59:09 UTC 2006
Johnny Hughes <mailing-lists at hughesjr.com>

On Thu, 2006-01-05 at 06:12 -0600, Johnny Hughes wrote:
> On Wed, 2006-01-04 at 23:43 -0700, Paul R. Ganci wrote:
> > Bryan J. Smith wrote:
> > 
> > >"Paul R. Ganci" <ganci at nurdog.com> wrote:
> > >  
> > >
> > >>Alas, I have the original BIOS installed. I was checking
> > >>and it does appear there are BIOS updates available.
> > >>
> > >Actually, the overwhelming majority of PowerNow/Cool'n Quiet
> > >issues I've seen are due to inappropriate APIC/ACPI/POST
> > >setup by the BIOS.  Nearly all BIOS updates fix these issues
> > >in my experience.
> > >
> > Well I did more research and have to clarify my statement. The installed 
> > BIOS is v4.05, which is indeed the original BIOS, but it is only one 
> > version below the last BIOS provided by Tyan for the Tiger MPX 
> > (S2466N-4M). The new features and fixes for v4.06 are:
> > 
> > Fixes bios resetting during reboot issue.
> > Fixes hang on shutdown issue when no keyboard is present.
> > 
> > I find it hard to believe that flashing the BIOS to v4.06 is going to 
> > fix the mptscsi problem.
> > 
> > I did install 2.6.9-27.ELsmp. This kernel has the same problem as the 
> > 2.6.9-22.0.1.ELsmp. The actual error messages look like:
> > 
> > MPTSCSI: ioc0: attempting task abort! (sc=f24891c0)
> > SCSI: destination target 1, lun 0
> >     command = Test Unit Ready 00 00 00 00 00
> > MPTSCSI: ioc0: task abort: SUCCESS (sc=f24891c0)
> > 
> > These messages scroll up the monitor with the target drive and sc 
> > address changing. This system is only stable if I run 2.6.9-11.ELsmp. 
> > Unfortunately I don't find any other information in the log file related 
> > to the SCSI errors indicated above ... the system clearly can't write 
> > to the drives and the raid5 array is corrupted. Again I find it hard to 
> > believe that the BIOS can be responsible since the system is so stable 
> > using kernel 2.6.9-11.ELsmp.
> > 
> > I have placed the latest dmesg log, an excerpt of the messages log and 
> > my grub.conf in http://www.nurdog.com/~ganci/crash/dmesg, 
> > http://www.nurdog.com/~ganci/crash/messages and 
> > http://www.nurdog.com/~ganci/crash/grub.conf respectively, in case 
> > someone would like to take a look. I would really like to get to the 
> > root of this problem and will be happy to provide any other information 
> > for anyone willing to help me debug this problem. In the meantime I will 
> > just run with 2.6.9-11.ELsmp.
> 
> Don't find it hard to believe ... try flashing the BIOS ... :)
> I have had problems like this many, many, many times.
> 
> They add lots of things besides what they list in BIOS upgrades.
> 
> It might not fix it (I have no experience with this SPECIFIC board) but
> that is always the first thing I check.

I am using that same driver on a slightly different SCSI controller ...

scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.36
        <Adaptec aic7899 Ultra160 SCSI adapter>
        aic7899: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs

With no major issues at all.
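
For anyone who wants to see which targets the aborts are hitting, the
messages quoted above can be tallied per target. This is only a minimal
sketch in Python, assuming the log lines look like the excerpt in Paul's
message (an "attempting task abort" line followed by a "destination target"
line). The function name count_aborts and the patterns are made up for
illustration and are not taken from the driver source.

```python
import re
from collections import Counter

# Patterns are an assumption based on the excerpt quoted in this thread,
# e.g. "MPTSCSI: ioc0: attempting task abort! (sc=f24891c0)".
ABORT_RE = re.compile(
    r"mptscsi\w*: (ioc\d+): attempting task abort! \(sc=([0-9a-f]+)\)",
    re.IGNORECASE,
)
TARGET_RE = re.compile(r"SCSI: destination target (\d+), lun (\d+)")


def count_aborts(lines):
    """Count abort attempts per (ioc, target, lun).

    Each "attempting task abort" line is paired with the next
    "destination target" line that follows it.
    """
    counts = Counter()
    pending_ioc = None
    for line in lines:
        abort = ABORT_RE.search(line)
        if abort:
            pending_ioc = abort.group(1)
            continue
        target = TARGET_RE.search(line)
        if target and pending_ioc is not None:
            key = (pending_ioc, int(target.group(1)), int(target.group(2)))
            counts[key] += 1
            pending_ioc = None
    return counts


# The four lines quoted in the message above, used as sample input.
sample = [
    "MPTSCSI: ioc0: attempting task abort! (sc=f24891c0)",
    "SCSI: destination target 1, lun 0",
    "    command = Test Unit Ready 00 00 00 00 00",
    "MPTSCSI: ioc0: task abort: SUCCESS (sc=f24891c0)",
]
sample_counts = count_aborts(sample)
```

To run it against a real log, pass an open file instead of the sample list,
for example count_aborts(open("/var/log/messages")). If one target dominates
the counts, that points at a drive or cable; if the aborts are spread evenly
across targets, the controller, driver or BIOS setup is the more likely
suspect.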