[CentOS] Centos locking up system with mptscsih driver error

Thu Jan 5 13:07:57 UTC 2006
Johnny Hughes <mailing-lists at hughesjr.com>

On Thu, 2006-01-05 at 06:59 -0600, Johnny Hughes wrote:
> On Thu, 2006-01-05 at 06:12 -0600, Johnny Hughes wrote:
> > On Wed, 2006-01-04 at 23:43 -0700, Paul R. Ganci wrote:
> > > Bryan J. Smith wrote:
> > > 
> > > >"Paul R. Ganci" <ganci at nurdog.com> wrote:
> > > >  
> > > >
> > > >>Alas, I have the original BIOS installed. I was checking
> > > >>and it does appear there are BIOS updates available.
> > > >>
> > > >Actually, the overwhelming majority of PowerNow/Cool'n Quiet
> > > >issues I've seen are due to inappropriate APIC/ACPI/POST
> > > >setup by the BIOS.  Nearly all BIOS updates fix these issues
> > > >in my experience.
> > > >
> > > Well I did more research and have to clarify my statement. The installed 
> > > BIOS is v4.05, which was indeed the original BIOS, but it is only one 
> > > version below the last BIOS provided by Tyan for the Tiger MPX 
> > > (S2466N-4M). The new features and fixes for v4.06 are:
> > > 
> > > Fixes bios resetting during reboot issue.
> > > Fixes hang on shutdown issue when no keyboard is present.
> > > 
> > > I find it hard to believe that flashing the BIOS to v4.06 is going to 
> > > fix the mptscsi problem.
> > > 
> > > I did install 2.6.9-27.ELsmp. This kernel has the same problem as the 
> > > 2.6.9-22.0.1.ELsmp. The actual error messages look like:
> > > 
> > > MPTSCSI: ioc0: attempting task abort! (sc=f24891c0)
> > > SCSI: destination target 1, lun 0
> > >     command = Test Unit Ready 00 00 00 00 00
> > > MPTSCSI: ioc0: task abort: SUCCESS (sc=f24891c0)
> > > 
> > > These messages scroll up the monitor with the target drive and sc 
> > > address changing. This system is only stable if I run 2.6.9-11.ELsmp. 
> > > Unfortunately I don't find any other information in the log file related 
> > > to the SCSI errors indicated above ... the system clearly can't write 
> > > to the drives and the raid5 array is corrupted. Again I find it hard to 
> > > believe that the BIOS can be responsible since the system is so stable 
> > > using kernel 2.6.9-11.ELsmp.
> > > 
> > > I have placed the latest dmesg log, an excerpt of the messages log and 
> > > my grub.conf in http://www.nurdog.com/~ganci/crash/dmesg, 
> > > http://www.nurdog.com/~ganci/crash/messages and 
> > > http://www.nurdog.com/~ganci/crash/grub.conf respectively, in case 
> > > someone would like to take a look. I would really like to get to the 
> > > root of this problem and will be happy to provide any other information 
> > > for anyone willing to help me debug this problem. In the meantime I will 
> > > just run with 2.6.9-11.ELsmp.
> > 
> > Don't find it hard to believe ... try flashing the BIOS ... :)
> > I have had problems like this many, many, many times.
> > 
> > They add lots of things besides what they list in BIOS upgrades.
> > 
> > It might not fix it (I have no experience with this SPECIFIC board) but
> > that is always the first thing I check.
> 
> I am using that same driver on a slightly different SCSI controller ...
> 
> scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.36
>         <Adaptec aic7899 Ultra160 SCSI adapter>
>         aic7899: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
> 
> With no major issues at all.

I looked at the Tyan site and see that the SCSI controller is not built
onto the board ... so it is a standalone card?

Does it have the latest BIOS?
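
If it helps narrow things down, here is a rough sketch (not run against
your actual logs) that tallies the "attempting task abort" messages per
target. It assumes the messages file uses the same two-line pattern quoted
above; the function name and the idea of passing the log path on the
command line are made up for illustration.

```python
import re
import sys
from collections import Counter

# Patterns taken from the two log lines quoted in this thread:
#   MPTSCSI: ioc0: attempting task abort! (sc=f24891c0)
#   SCSI: destination target 1, lun 0
ABORT_RE = re.compile(r"MPTSCSI: (ioc\d+): attempting task abort! \(sc=([0-9a-f]+)\)")
TARGET_RE = re.compile(r"SCSI: destination target (\d+), lun (\d+)")


def count_aborts(lines):
    """Return a Counter keyed by (ioc, target, lun) of attempted task aborts."""
    counts = Counter()
    pending_ioc = None  # controller named by the most recent abort line
    for line in lines:
        m = ABORT_RE.search(line)
        if m:
            pending_ioc = m.group(1)
            continue
        m = TARGET_RE.search(line)
        if m and pending_ioc is not None:
            counts[(pending_ioc, int(m.group(1)), int(m.group(2)))] += 1
            pending_ioc = None
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: pass the path to a copy of the messages log.
    with open(sys.argv[1], errors="replace") as fh:
        for key, n in sorted(count_aborts(fh).items()):
            print(key, n)
```

If the aborts pile up on one target, that points at a drive or cable; if
they are spread across all targets, the controller, its firmware, or the
driver is the more likely suspect.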