I have a Tyan Tiger S2466 MPX motherboard with dual Athlon MP 2800+ CPUs and 1GB of PC2100 DDR SDRAM. For disk storage I have an LSI53C1030 controller and 4 Seagate ST336607LWs in a software RAID 5 configuration. I installed CentOS 4.1 and everything was fine running kernel-smp-2.6.9-11.EL. However, since I updated to CentOS 4.2 I have run into problems. Namely, after a finite amount of disk traffic the system locks up completely. On the monitor console I get message after message from the MPT Fusion SCSI driver mptscsih indicating that there was a failure and that an "ABORT was successful". Unfortunately I don't have the exact message, but I have tracked the problem to kernel-smp-2.6.9-22.0.1.EL. Since I updated from 4.1 I still had the smp-2.6.9-11 kernel around, and as long as I boot into kernel-smp-2.6.9-11.EL with all other 4.2 updates installed everything is stable. I can reliably get the mptscsih driver to fail after a few minutes of system uptime (or sooner if doing disk writes) when booted into kernel-smp-2.6.9-22.0.1.EL.
I checked the CentOS archives and did not find anything related to this SCSI card or motherboard. Before I get lambasted for not having the exact SCSI error message: yes, I am willing to boot into kernel-smp-2.6.9-22.0.1.EL to get the message, even though my RAID partition has to be rebuilt afterwards. I was wondering if anyone else has had problems with this hardware/driver and/or kernel-smp-2.6.9-22.0.1.EL. If this is a new mptscsih problem I will post more details of my system, but I thought I would start with just a general question in case I missed something.
On Tue, 2006-01-03 at 18:34 -0700, Paul R. Ganci wrote:
I was wondering if anyone else has had problems with this hardware/driver and/or kernel-smp-2.6.9-22.0.1.EL. If this is a new mptscsih problem I will post more details of my system but I thought I would start with just a general question in case I missed something.
I have not seen this particular problem ... do you want to try the new 2.6.9-27.EL kernel that was released as part of EL4-u3beta?
Also, verify you have the latest BIOS updates for your motherboard.
On Tue, Jan 03, 2006 at 07:41:36PM -0600, Johnny Hughes wrote:
I have not seen this particular problem ... do you want to try the new 2.6.9-27.EL kernel that was released as part of EL4-u3beta?
Do you have any idea if UDF write support was backported into it?
-- Rodrigo Barbosa rodrigob@suespammers.org "Quid quid Latine dictum sit, altum viditur" "Be excellent to each other ..." - Bill & Ted (Wyld Stallyns)
On Wed, 2006-01-04 at 00:05 -0200, Rodrigo Barbosa wrote:
Do you have any idea if UDF write support was backported into it?
Here is the changelog since the last kernel update (holy batdroppings batman) ... nothing in there about UDF that I see though:
* Wed Dec 21 2005 Jason Baron jbaron@redhat.com [2.6.9-27]
-re-build for gcc changes affecting the kernel-debuginfo package

* Wed Dec 21 2005 Jason Baron jbaron@redhat.com [2.6.9-26]
-nfs: fix return value setxattr() (Steve Dickson) [175812]
-revert: speedup raw dio and aio paths (Geoff Gustafson) [167645]
-fix security_ops kabi breakage (Jason Baron) [175680]
-set ipv6 fragment id correctly (David Miller) [173118]

* Tue Dec 13 2005 Jason Baron jbaron@redhat.com [2.6.9-25]
-Keys: Permission checking fix for key update vs add (David Howells) [171705]
-fix vmware mpt fusion regression (Mike Christie) [170985]
-gfs: fix DirectIO Deadlock (Wendy Cheng) [173912]
-Fix audit filtering on syscall failure (David Woodhouse) [175132]
-Turn off PCI parity check by default (Alan Cox)
-New range of userspace audit message types (David Woodhouse) [175415]
-fix scsi_eh_tur retry logic (Pete Zaitcev) [175188 160308]
-nfs: fix I/O stalls waiting on revalidation (Steve Dickson) [175236]
-azx: update (John Linville) [171985 172129 172920]

* Sat Dec 10 2005 Jason Baron jbaron@redhat.com [2.6.9-24.1]
-revert: aic94xx SAS driver [165134]
-restore diskdump for SATA (Peter Martuccelli) [175123]
-ia64: include iomap.h in io.h (Prarit Bhargava) [174734]
-update Emulex FC lpfc driver (Mike Christie) [149294 163150 175136 175142]
-ia64: Fix improper initcall loading on sn arch (Prarit Bhargava) [173523 174985]
-fix potential cifs data corrupter in 1.34a update (Steve Dickson) [170498]
-x86_64: fix random single bit corruption (Jason Baron) [175128 175292]
-sky2: add new driver (John Linville) [168246 171060]
-qlogic aen handling and jiffies fix (Mike Christie) [174427]
-prevent system deadlock (Larry Woodman) [174895 161101 162759]
-cciss: ioctl fixes (Tom Coughlan) [168571]
-e1000: restore 8086:1099 to PCI ID table (John Linville) [175092]

* Thu Dec 01 2005 Jason Baron jbaron@redhat.com [2.6.9-24]
-fix scsi delete timer race (Doug Ledford) [164629]
-x86_64: include iomap.h in io.h (Peter Martuccelli) [174583]
-fix powernow-k8 pending bit stuck (Brian Maly)

* Tue Nov 29 2005 Jason Baron jbaron@redhat.com [2.6.9-23]
-revert: fix memory mapped files not updating timestamps [173226]
* Thu Nov 24 2005 Jason Baron jbaron@redhat.com [2.6.9-22.27]
-revert: netpoll: avoid calling the napi poll routine recursiviely [146164]

* Thu Nov 24 2005 Jason Baron jbaron@redhat.com [2.6.9-22.26]
-ia64 diskdump using DUMP_EXCLUDE_FREE hangs when called from INIT (Keiichiro Tokunaga) [169519]
-autofs4: fix broken expiry of negative dentries (Jeff Moyer) [172986]
-netpoll: avoid calling the napi poll routine recursiviely (Jeff Moyer) [146164]
-add dell_rbu driver for Dell BIOS image updates (Brian Maly) [170132]
-fix Platform SMIs interfere with tsc based delay calibration (Brian Maly) [168811]
-update sg_io verify_command list (Mike Christie) [158861] {CVE-2004-1190}
-fix lost fput and sockfd_put could lead to DoS (Alexander Viro) [168659]
-SATA update (Jeff Garzik) [131889 145061 166862 166880 169488]
-cciss: driver updates (Tom Coughlan) [168571]
-add aic94xx SAS driver (Jeff Garzik) [165134]
-add ATI RN50 ids to drivers/video/aty (Nathan Lynch)
-updated i82593 file license (John Linville) [172663]
-fix offb crash on IBM ppc blade (Nathan Lynch)
-ixgb: update to version 1.0.100-k2 (John Linville) [168502]
-fix memory mapped files not updating timestamps (Peter Staubach) [173226]
-Prevent sn2 code from executing on all ia64 platforms (Prarit Bhargava) [173354]
-Fix for SystemTap- return probe on do_execve (Dave Anderson) [173304]

* Wed Nov 23 2005 Jason Baron jbaron@redhat.com [2.6.9-22.25]
-Wacom driver update (Kristian Høgsberg) [158842]
-fix endless loops in HID on disconnect (Pete Zaitcev) [167070]
-Make DVD-RAM writable on legacy iSeries VIOCD (David Howells) [168816]
-fix erratic behaviour when system fd limit reached (Peter Staubach) [166524]
-fix 32-bit program can hang x86_64 kernel (David Woodhouse) [168374]
-update MPT Fusion driver (Mike Christie) [168414]
-nfs: fix hangs with directio and aio using NFS (Steve Dickson) [161362]
-Fix "No such file or directory" errors when using autofs w/ghosting (Jeff Moyer) [173194]
-fix Diskdump fails through ipr driver (Nobuhiro Tachino) [159869]
-fix copy correct number of opcode bytes in sg_scsi_ioctl (Mike Christie) [169402]
-x8664: set NR_CPUS to 64

* Tue Nov 22 2005 Jason Baron jbaron@redhat.com [2.6.9-22.24]
-fix oops in raid1 code (Doug Ledford)
-fix sysctl races (Alexander Viro) [168924] {CVE-2005-2709}
-fix ia64 nested_dtlb_miss does not handle hugetlb address correctly (Dave Anderson) [168599]
-fix incorrect BUG_ON in signal.c after do_coredump() (Dave Anderson) [165581]
-speedup raw dio and aio paths (Geoff Gustafson) [167645]
-diskdump - support compressing dump data (Akira Imamura) [171141]
-fix kernel reporting init process cutime as very large negative value (Peter Staubach) [170146]
-add host port id to fc transport class (Mike Christie) [163150]
-add issue lisp fc class attribute (Mike Christie) [149294]
-update qla2xxx driver version to 8.01.02-d3 (Mike Christie)
-improve sctp receive buffer accounting (Neil Horman) [156602]
-add pci ids for 915/945 graphics (Geoff Gustafson) [170517 173882]
-ICH4L chipset support - pci id (Geoff Gustafson) [163171]
-fix message queue refcounting (Alexander Viro) [169130]
-retry iscsi portal address after session address failure (Mike Christie) [170656]
-qla2xxx: update qlogic update (Mike Christie) [168544]
-fix /proc/scsi/scsi DoS (Doug Ledford) [167696] {CVE-2005-2800}
-Keys: Fix missed "struct key" types in reiserfs (David Howells) [171765]
-Keys: Permit key expiry time to be set (David Howells) [173486]
-Keys: Discard duplicate keys from a keyring on link (David Howells) [173486]
-fix mount/umount can cause the block device reads to fail (Peter Staubach) [166589]
-Fix syscall auditing success indication on IA64 (David Woodhouse) [173500]
-Keys: Permit running process to instantiate keys (David Howells) [173493]
-IPMI - bug fix for dmi table off by one erro (Peter Martuccelli) [173815]

* Tue Nov 22 2005 Jason Baron jbaron@redhat.com [2.6.9-22.23]
-fix oops in gss_pipe_release() (Steve Dickson) [169149 171112]
-NFSv3 locking misses important kernel patches (Steve Dickson) [167192]
-Various Device Mapper updates (Alasdair Kergon) [168483 168824 170864 172892 173155 173156 173157 173158 173159 173161 173163 173164 173174 173206 173360]
-Support more detailed setting on partial dump (Keiichiro Tokunaga) [168638]
-fix HFS oops (Peter Staubach) [171002] {CVE-2005-3109}
-fix Kernel PANIC - not syncing: fatal exception (Steve Dickson) [163738]
-x64: x86_64: EDAC support (Alan Cox) [158247]
-must And iscsi opcode header (Mike Christie) [172487]
-prevent module unloading for ide-scsi (Tom Coughlan) [169648]
-fix NFS Cache invalidation bug in nfs v3 (Steve Dickson) [170423]
-ia64: fix access to extended config space (Jason Baron) [146516]
* Mon Nov 21 2005 Jason Baron jbaron@redhat.com [2.6.9-22.22]
-js20++ cpu enablement (Nathan Lynch) [170542]
-Generate a hotplug event when a CPU comes online (David Howells) [167469]
-Eliminate "no IOMMU" panic on ASUS motherboards (Jim Paradis) [169115]
-Update aacraid driver to 1.1-5[2412] (Tom Coughlan) [168567]
-Cure IA64 unaligns in sk_filter() (David Miller) [169396]
-ppc64: Assign CPUs to the correct NUMA node (David Howells) [164425]
-Update powernow-k8.c to support RevF Opterons (Brian Maly) [162178 171058]
-autofs4: don't expire in-use directory hierarchies (Jeff Moyer) [168431]
-IPMI: various fixes (Peter Martuccelli) [168090 169629 168796 169859 168596 168558 173343]

* Sun Nov 20 2005 Jason Baron jbaron@redhat.com [2.6.9-22.21]
-x86_64: set NR_CPUS to 255 for largesmp kernel
-Add initial Open Infiniband support (Doug Ledford) [108827 168445]
-Add megaraid_sas driver (Tom Coughlan) [167926]

* Sat Nov 19 2005 Jason Baron jbaron@redhat.com [2.6.9-22.20]
-wake-balance optimizations (Ingo Molnar) [167645]
-tg3: update to 3.43-rh (John Linville) [164892 165810 166111 167936 170527 168547]
-add PQ3 scsi scanning support (Mike Christie) [155725]
-add bnx2 driver (John Linville) [164825]
-mii: update to support gigabit (John Linville) [164825]
-Unisys Rascal Support (Brian Maly) [155017 137347 151986 157586 163847 167153 168604]

* Fri Nov 18 2005 Jason Baron jbaron@redhat.com [2.6.9-22.19]
-s390: add vm watchdog driver (Jan Glauber) [170358]
-s390: add vmcp device driver (Jan Glauber) [168513]
-s390: stop debug feature on oops (Jan Glauber) [168546]
-s390: add vm logreader driver (Jan Glauber) [168573]
-fix exec_mmap race DoS (Dave Anderson) [170262] {CAN-2005-3106}
-s390: qeth driver layer 2 support (Jan Glauber) [168569 163741 168303]
-CIFS upgrade from version 1.20 to 1.34a (Steve Dickson) [170502 166667 170498]
-ppc64: inform hypervisor of VMX register use (Nathan Lynch) [170544]
-USB deadlock fix (Kimball Murray) [171220]
-fix calling disassociate ctty semantics (Jason Baron) [172740]
-fix gdb crashes on hugemem (Dave Anderson) [171980]

* Wed Nov 16 2005 Jason Baron jbaron@redhat.com [2.6.9-22.18]
-fix diskdump during OS_INIT (Norm Murray) [168262]
-fix NaT bit corrupution with coredump (Dave Anderson) [168954]
-fix ls hangs on krb5 mountd when user has not kinit-ed (Steve Dickson) [169184]
-ide: serverworks ht1000 support (John Linville) [168478]
-b44: alternate allocation option for DMA descriptors (John Linville) [161846]
-x86_64: implement dma_sync_single_range_for_{cpu,device} (John Linville) [161846]
-ia64: re-implement dma_get_cache_alignment to avoid EXPORT_SYMBOL (John Linville) [161846]
-swiotlb: allow sync of DMA_BIDIRECTIONAL mappings (John Linville) [161846]
-Fix missing finish_wait, resulting in OOPS (David Miller) [167211]
-Power5+: Add power5+ CPU support (David Howells) [168566]
-allow modules to load from prior kernel rpms (Jason Baron) [171989]
-fix sys_get|setpriority() semantics fix (Ingo Molnar) [162731]
-nfs: fix potential client oops when debugging is on (Steve Dickson) [169197]

* Mon Nov 14 2005 Jason Baron jbaron@redhat.com [2.6.9-22.17]
-correct wrong CPU frequency in acpi-cpufreq (Geoff Gustafson) [150893]
-add 'noht' boot option to disable hyper-threading (Jim Paradis) [165747]
-i810_audio: re-order release_region calls in i810_probe (John Linville) [165154]
-fixup usb-handoff for hotplug (Kimball Murray) [146859]
-x86_64: make spinlocks work for > 128 cpus (Ingo Molnar) [170711]
-update SCSI whitelist (Tom Coughlan) [167932 160082 170448 172214]

* Fri Nov 11 2005 Jason Baron jbaron@redhat.com [2.6.9-22.16]
-pcnet32: fix leak in loopback test (John Linville) [163272]
-bonding: replicate IGMP outbound traffic on inactive slaves (John Linville) [167630]
-fix all-tasks-pinned (Ingo Molnar) [164444]
-forcedeth: update to 0.41 (John Linville) [167927]
-fix SHUTDOWN notification on 1:1 SCTP sockets (Neil Horman) [156785]
-pcnet32: support ethtool set_ringparam (John Linville) [167729]
-scheduler inlines (Geoff Gustafson) [167645]
-noop merge optimizations (Geoff Gustafson) [167645]
-ia64: hint@pause in udelay (Geoff Gustafson) [141699]
-kprobes: scalability enhancements - lockless handler execution (Ananth Mavinakayanahalli) [170747]
-cpu_relax on i386 and x86_64 (Geoff Gustafson) [141851]

* Wed Nov 09 2005 Jason Baron jbaron@redhat.com [2.6.9-22.15]
-s390: correct wrong swap space offset (Jan Glauber) [171006]
-aio run iocb optimization (Geoff Gustafson) [167645]
-s390: fix kernel internal return values returned to userspace (Jan Glauber) [171374]
-sym53c8xx_2 should only use PPR on LVD bus (Jeff Layton) [139949]
-fix a null pointer dereference in netpoll (Jeff Moyer) [172595]
-Add usb-handoff (Pete Zaitcev) [146859]
-Use ethtool ops in VETH driver and name driver consistently (David Howells) [168129]
-s390: fix cio path retry (Jan Glauber) [171013]
* Sat Nov 05 2005 Jason Baron jbaron@redhat.com [2.6.9-22.14]
-fix error messages during install, CD-ROM sizing (Pete Zaitcev) [162122 143539]
-fix kobject_register failed for sdb1 (-17) (Pete Zaitcev) [153971 166281]
-hotplug: fix Slot powered off after enabling (Keiichiro Tokunaga) [157241]
-SGI arch 2.6.13 backport (Prarit Bhargava) [158959 168953 168952]
-fix netdump hangs in processing of CPU stop after diskdump failed (Keiichiro Tokunaga) [170427]
-fix diskdump can generate a corrupted dump if dump_level=4 is used (Nobuhiro Tachino) [169522]
-allow renaming of directories located in NFS mounts (Neil Horman) [172081]
-remove i2o_config debug printk (Mike Christie) [169075]
-x86_64 largesmp update (Jim Paradis)

* Fri Nov 04 2005 Jason Baron jbaron@redhat.com [2.6.9-22.13]
-add ACL support for NFSv3 (Steve Dickson) [151549 158838]

* Thu Nov 03 2005 Jason Baron jbaron@redhat.com [2.6.9-22.12]
-fix restarts for SCTP associations (Neil Horman) [167907]
-x86_64 reboot fix (Jim Paradis) [166888 168229 171950]
-Fix VFS readahead performance problems for random large IOs (Stephen Tweedie) [167233]
-Fix log_do_checkpoint() assert failures (Stephen Tweedie) [162814]
-Fix ext3 reservations performance problems (Stephen Tweedie) [156437 167231]
-Keys: Remove incorrect obsolete '!' operators (David Howells) [171705]
-s390: fix pfault interrupt race (Jan Glauber) [171008]
-s390: fix ptrace peek and poke problem (Jan Glauber) [171373]
-fix read() with count > 0xffffffff panics kernel (Peter Staubach) [162094]

* Wed Nov 02 2005 Jason Baron jbaron@redhat.com [2.6.9-22.11]
-rename hugeproc kernels to largesmp
-e1000: update to version 6.1.16-k2 (John Linville) [165118]
-prevent BUG in prio_tree.c (Larry Woodman) [171778]
-Fix locking bug in xmon (David Howells) [165584]
-Evade hypervisor bug when setting the time during iSeries boot (David Howells) [168535]
-Fix VSCSI client incorrect timeout in tape handling (David Howells) [164851]
-Update qla2xxx driver version to 8.01.02-d2 (Mike Christie) [168544]
-bonding: update docs (John Linville) [166603]

* Tue Nov 01 2005 Jason Baron jbaron@redhat.com [2.6.9-22.10]
-Improvements for key management facility (David Howells) [171705]
  -add supplementary rights to the mask for processes/threads that possess a key in a keyring
  -export user-defined key type operations
  -move the permissions check function from a .h file into a .c file
  -improve the request-key documentation
  -make possessor permissions additive with normal UID/GID/Other permissions
  -remove the key duplication facility
  -add LSM hooks for key management
  -fix a warning in kmod.c if keys are disabled

* Sat Oct 29 2005 Jason Baron jbaron@redhat.com [2.6.9-22.9]
-ia64: turn on SCHED_SMT (Geoff Gustafson) [158846]
-Fix AVM B1 ISDN deadlock (David Woodhouse) [158848]
-Fix usb keys (Pete Zaitcev) [160308]
-typhoon: update to version 1.5.7 (John Linville) [167489]
-fix blksectget 32-bit emulation breakage (Alexander Viro) [162906]
-VETH: Reduce verbosity (David Howells) [145557]
-Fix capifs oops (James Morris) [170487]
-x86_64: add missing include/asm-i386/ files (David Woodhouse) [165115]

* Fri Oct 28 2005 Jason Baron jbaron@redhat.com [2.6.9-22.8]
-ia64: multi-core / multi-thread detection (Geoff Gustafson) [164470]
-NFS/RPC - fix timestamp conversion (Steve Dickson) [165959]
-NFS/RPC - fix PANIC at rpc_wake_up_status (Steve Dickson) [164298 161617]
-fix nfsd oops on module unload/reload (Neil Horman) [165232]
-system accounting can not handle largefiles (Peter Staubach) [165741]
-s2io: update to 2.0.8.1 (John Linville) [167730 170887]
-fix i2o passthrough ioctl return value (Mike Christie) [160546]
* Wed Oct 26 2005 Jason Baron jbaron@redhat.com [2.6.9-22.7]
-nfsd: clear signals before exiting the nfsd() thread (Steve Dickson) [171715]
-add jasmine digi neo serial driver (Nathan Lynch) [168122 145370]
-Keys: Fix key management syscall interface bugs (David Howells) [165092]
-prevent panic in drop_buffers() (Larry Woodman) [162987]
-bonding: ALB -- allow slave to use bond's MAC address if its own MAC address conflicts (John Linvile) [144477]
-Fix netfilter reference bug in af_packet code (James Morris) [165744]
-ia64: fix __copy_user for unaligned accesses (Neil Horman) [167634]

* Tue Oct 25 2005 Jason Baron jbaron@redhat.com [2.6.9-22.6]
-fix dangling POSIX locks after close (Peter Staubach) [160844]
-exec-shield updates (Ingo Molnar) [152569]
-ia32 apps that are not large file aware can access files >= 4GB (Peter Staubach) [144703]
-autofs4: fix panic when using bind mounts (Jeff Moyer) [145374]
-fix memory leak with large sendmsg/rcvmsg calls in 32-bit apps (Jeff Layton) [169875]
-fix packet corruption in ip_conntrack_amanda (Jeff Layton) [152036]
-kNFSd/RPC - umount fails on nfs server side when nfs client does heavy io (Steve Dickson) [154387]
-fix TUX/ftp crash (Ingo Molnar) [172598]

* Sat Oct 22 2005 Jason Baron jbaron@redhat.com [2.6.9-22.5]
-create 'hugeproc' kernels: 512 cpus for ia64, 128 ppc64, 64 x86_64 (David Howells) [143166]
-IPv6 address addition error handling fix (David Howells) [164547]

* Thu Oct 20 2005 Jason Baron jbaron@redhat.com [2.6.9-22.4]
-orinoco: plug etherleak (John Linville) [170277] {CAN-2005-3180}
-fix sys_set_mempolicy() bounds check (Larry Woodman) [168993] {CAN-2005-3053}
-fix race in ebtables (James Morris) [170268] {CAN-2005-3110}
-fix memory leak in key management (David Howells) [170274] {CAN-2005-3119}
-fix names_cache memory leak (David Woodhouse) [170283] {CAN-2005-3181}
-nfs: add missing unlock_kernel() (Larry Woodman) [170546]
-fix gzip/zlib flaws (Peter Staubach) [165679] {CAN-2005-2458}

* Wed Oct 12 2005 Jason Baron jbaron@redhat.com [2.6.9-22.3]
-veth: iSeries veth driver fixes (David Howells) [145557 157935]
-Fix kallsyms vs insmod/rmmod race (David Howells) [145719]
-net: Disable queueing when carrier is lost (John Linville) [165018 167115]
-sys_get_thread_area has minor info leak (Larry Woodman) [168777]
-fix false ECHILD result from wait (Dave Anderson) [166454 168775]
-Add EINVAL to sys_io_cancel (Wendy Cheng) [162732]
-x86_64: iounmap fix [168217 160135 170264] {CAN-2005-3108 }
-x86_64: add pageattr text mapping (Jim Paradis) [170154]

* Fri Oct 07 2005 Jason Baron jbaron@redhat.com [2.6.9-22.2]
-fix NX text/large-page interaction [163238 168936]
-Allow ICMP response source address configurable (David Miller) [164571]
-fix oops in sysfs_remove_dir() (Pete Zaitcev) [161597]
-fix ip_queue crash (James Morris)
-bindresvport: Address already in use (Steve Dickson) [169042]
-fix usb memory sticks, kabi fixes. (Pete Zaitcev) [167032]
-Fix cpu sibling count with buggy BIOS on i686 (Eric Paris) [169472]
-fix disassociate_ctty() vs. fork() race [165835]

* Tue Sep 27 2005 Jason Baron jbaron@redhat.com [2.6.9-22.1]
-remove: fix for NX text/large-page interaction (Ingo Molnar) [163238]
-update hangcheck-timer to 0.9.0 and add for ia64, ppc64 and s390 [167731]
-fix usb memory sticks (Pete Zaitcev) [167032]
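For anyone who wants to scan a changelog like this one for a particular driver rather than reading it top to bottom, here is a minimal Python sketch. It assumes the changelog text has been saved with one entry per line under each "* date author [version]" header, which is how rpm -q --changelog normally prints it. The find_entries helper and the short sample text are only illustrative; they are not part of any tool mentioned in this thread.

```python
import re

# Matches a release header such as:
#   * Tue Dec 13 2005 Jason Baron jbaron@redhat.com [2.6.9-25]
HEADER = re.compile(r"^\*\s.*\[(?P<version>[^\]]+)\]\s*$")


def find_entries(changelog, keyword):
    """Return (version, entry) pairs whose entry mentions keyword (case-insensitive)."""
    hits = []
    version = None
    for line in changelog.splitlines():
        line = line.strip()
        header = HEADER.match(line)
        if header:
            # Remember which release the following entries belong to.
            version = header.group("version")
            continue
        if version and keyword.lower() in line.lower():
            hits.append((version, line))
    return hits


# A few entries copied from the changelog in this thread.
sample = """\
* Tue Dec 13 2005 Jason Baron jbaron@redhat.com [2.6.9-25]
-fix vmware mpt fusion regression (Mike Christie) [170985]
-gfs: fix DirectIO Deadlock (Wendy Cheng) [173912]
* Wed Nov 23 2005 Jason Baron jbaron@redhat.com [2.6.9-22.25]
-update MPT Fusion driver (Mike Christie) [168414]
-nfs: fix hangs with directio and aio using NFS (Steve Dickson) [161362]
"""

for version, entry in find_entries(sample, "mpt"):
    print(version, entry)
```

Searching for "udf" the same way returns nothing for this sample, which matches the observation that no UDF entry appears in the list.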
Johnny Hughes wrote:
On Tue, 2006-01-03 at 18:34 -0700, Paul R. Ganci wrote:
I was wondering if anyone else has had problems with this hardware/driver and/or kernel-smp-2.6.9-22.0.1.EL. If this is a new mptscsih problem I will post more details of my system but I thought I would start with just a general question in case I missed something.
I have not seen this particular problem ... do you want to try the new 2.6.9-27.EL kernel that was released as part of EL4-u3beta?
Also, verify you have the latest BIOS updates for your motherboard.
Alas, I have the original BIOS installed. I was checking and it does appear there are BIOS updates available. I always hate the idea of flashing a ROM given the implications of something going wrong. However, I was just checking /var/log/dmesg and am seeing things like:
mtrr: v2.0 (20020519)
mtrr: your CPUs had inconsistent fixed MTRR settings
mtrr: probably your BIOS does not setup all CPUs.
mtrr: corrected configuration.
BIOS failed to enable PCI standards compliance, fixing this error.
so perhaps I have a root cause to address.
I may just try the beta kernel anyhow ... if it doesn't work I will be able to capture the actual mptscsih error message. :(
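As a rough aid for checking a boot log for firmware complaints like the ones quoted above, here is a small Python sketch. It only filters lines by a few keywords taken from the messages in this post; the bios_warnings name and the keyword list are assumptions made for illustration, and a real /var/log/dmesg may word such warnings differently.

```python
# Keywords are taken from the dmesg lines quoted in this thread; this is
# an assumed, incomplete list, not an authoritative set of BIOS warnings.
KEYWORDS = ("bios", "inconsistent", "corrected configuration")


def bios_warnings(dmesg_text):
    """Return the lines of dmesg_text that mention any of KEYWORDS."""
    return [
        line.strip()
        for line in dmesg_text.splitlines()
        if any(word in line.lower() for word in KEYWORDS)
    ]


sample = """\
mtrr: v2.0 (20020519)
mtrr: your CPUs had inconsistent fixed MTRR settings
mtrr: probably your BIOS does not setup all CPUs.
mtrr: corrected configuration.
BIOS failed to enable PCI standards compliance, fixing this error.
"""

for line in bios_warnings(sample):
    print(line)
```

To run it against a real log, the text could be read with open("/var/log/dmesg").read() and passed to bios_warnings in place of the sample.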
"Paul R. Ganci" ganci@nurdog.com wrote:
Alas, I have the original BIOS installed. I was checking and it does appear there are BIOS updates available. I always hate the idea of flashing a ROM given the implications of something going wrong.
Actually, the overwhelming majority of PowerNow/Cool'n Quiet issues I've seen are due to inappropriate APIC/ACPI/POST setup by the BIOS. Nearly all BIOS updates fix these issues in my experience.
Bryan J. Smith wrote:
"Paul R. Ganci" ganci@nurdog.com wrote:
Alas, I have the original BIOS installed. I was checking and it does appear there are BIOS updates available.
Actually, the overwhelming majority of PowerNow/Cool'n Quiet issues I've seen are due to inappropriate APIC/ACPI/POST setup by the BIOS. Nearly all BIOS updates fix these issues in my experience.
Well, I did more research and have to clarify my statement. The installed BIOS is v4.05, which was indeed the original BIOS, but it is only one version behind the last BIOS provided by Tyan for the Tiger MPX (S2466N-4M). The new features and fixes for v4.06 are:
Fixes bios resetting during reboot issue.
Fixes hang on shutdown issue when no keyboard is present.
I find it hard to believe that flashing the BIOS to v4.06 is going to fix the mptscsi problem.
I did install 2.6.9-27.ELsmp. This kernel has the same problem as the 2.6.9-22.0.1.ELsmp. The actual error messages look like:
MPTSCSI: ioc0: attempting task abort! (sc=f24891c0)
SCSI: destination target 1, lun 0
command = Test Unit Ready 00 00 00 00 00
MPTSCSI: ioc0: task abort: SUCCESS (sc=f24891c0)
These messages scroll up the monitor with the target drive and sc address changing. This system is only stable if I run 2.6.9-11.ELsmp. Unfortunately I don't find any other information in the log file related to the SCSI errors indicated above ... the system clearly can't write to the drives and the raid5 array is corrupted. Again, I find it hard to believe that the BIOS can be responsible since the system is so stable using kernel 2.6.9-11.ELsmp.
I have placed the latest dmesg log, an excerpt of the messages log and my grub.conf in http://www.nurdog.com/~ganci/crash/dmesg, http://www.nurdog.com/~ganci/crash/messages and http://www.nurdog.com/~ganci/crash/grub.conf respectively, in case someone would like to take a look. I would really like to get to the root of this problem and will be happy to provide any other information for anyone willing to help me debug this problem. In the meantime I will just run with 2.6.9-11.ELsmp.
Thanks for any help.
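Since the abort messages are described as repeating with the target and sc address changing, a per-target tally can show whether one drive or all of them are affected. The Python sketch below is one way to do that, assuming the messages have exactly the shape quoted in this post (the real driver output may use a different prefix or spacing). The aborts_per_target name is made up for illustration, and the second message in the sample is hypothetical, added only to show two targets.

```python
import re
from collections import Counter

# Pattern built from the message text quoted in this thread. \s+ lets the
# "attempting task abort" and "destination target" parts sit on one line
# or on consecutive lines.
PATTERN = re.compile(
    r"attempting task abort! \(sc=[0-9a-f]+\)\s+"
    r"SCSI: destination target (?P<target>\d+), lun (?P<lun>\d+)"
)


def aborts_per_target(log_text):
    """Count task-abort attempts per (target, lun) pair."""
    counts = Counter()
    for match in PATTERN.finditer(log_text):
        counts[(int(match["target"]), int(match["lun"]))] += 1
    return counts


sample = """\
MPTSCSI: ioc0: attempting task abort! (sc=f24891c0)
SCSI: destination target 1, lun 0
command = Test Unit Ready 00 00 00 00 00
MPTSCSI: ioc0: task abort: SUCCESS (sc=f24891c0)
MPTSCSI: ioc0: attempting task abort! (sc=f2489340)
SCSI: destination target 2, lun 0
command = Test Unit Ready 00 00 00 00 00
MPTSCSI: ioc0: task abort: SUCCESS (sc=f2489340)
"""

for (target, lun), count in sorted(aborts_per_target(sample).items()):
    print(f"target {target} lun {lun}: {count} abort(s)")
```

If the aborts turn out to be spread evenly across all four drives, that would point away from a single failing disk, though it would not by itself distinguish a driver problem from a controller or BIOS one.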
On Wed, 2006-01-04 at 23:43 -0700, Paul R. Ganci wrote:
I find it hard to believe that flashing the BIOS to v4.06 is going to fix the mptscsi problem.
Don't find it hard to believe ... try flashing the BIOS ... :) I have had problems like this many, many, many times.
They add lots of things besides what they list in BIOS upgrades.
It might not fix it (I have no experience with this SPECIFIC board) but that is always the first thing I check.
On Thu, 2006-01-05 at 06:12 -0600, Johnny Hughes wrote:
Don't find it hard to believe ... try flashing the BIOS ... :) I have had problems like this many, many, many times.
They add lots of things besides what they list in BIOS upgrades.
It might not fix it (I have no experience with this SPECIFIC board) but that is always the first thing I check.
I am using that same driver on a slightly different SCSI controller ...
scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.36
<Adaptec aic7899 Ultra160 SCSI adapter>
aic7899: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
With no major issues at all.
On Thu, 2006-01-05 at 06:59 -0600, Johnny Hughes wrote:
On Thu, 2006-01-05 at 06:12 -0600, Johnny Hughes wrote:
On Wed, 2006-01-04 at 23:43 -0700, Paul R. Ganci wrote:
Bryan J. Smith wrote:
"Paul R. Ganci" ganci@nurdog.com wrote:
Alas, I have the original BIOS installed. I was checking and it does appear there are BIOS updates available.
Actually, the overwhelming majority of PowerNow/Cool'n Quiet issues I've seen are due to inappropriate APIC/ACPI/POST setup by the BIOS. Nearly all BIOS updates fix these issues in my experience.
I looked at the tyan site and see that the SCSI controller is not built on to the board ... so it is a standalone card?
Does it have the latest bios?
Johnny Hughes wrote:
On Thu, 2006-01-05 at 06:59 -0600, Johnny Hughes wrote:
I looked at the tyan site and see that the SCSI controller is not built on to the board ... so it is a standalone card?
Does it have the latest bios?
Yes and no. I have BIOS 5.11.00 and it appears that there is a BIOS 5.11.01. Of even more interest, it appears that there are two different BIOS versions: 5.10.03 contains support for integrated mirroring and 5.11.01 for integrated striping. I will try flashing both the board and the SCSI controller and see what happens.
Johnny Hughes wrote:
On Wed, 2006-01-04 at 23:43 -0700, Paul R. Ganci wrote:
Bryan J. Smith wrote:
Nearly all BIOS updates fix these issues in my experience.
I find it hard to believe that flashing the BIOS to v4.06 is going to fix the mptscsi problem.
Don't find it hard to believe ... try flashing the BIOS ... :) I have had problems like this many, many, many times.
They add lots of things besides what they list in BIOS upgrades.
It might not fix it (I have no experience with this SPECIFIC board) but that is always the first thing I check.
Okay Johnny and Bryan, you made a believer out of me. I flashed the Tyan motherboard (S2466N-4M) BIOS to version v4.06 and the LSI SCSI card (LSI53C1030) BIOS/firmware to version 5.11.01, using the images found on the Tyan and LSI Logic web sites respectively. So far the initial result is good ... after 8 hours the system is up and happily chugging along:
uptime
08:15:32 up 8:10, 4 users, load average: 0.56, 0.31, 0.27
uname -r
2.6.9-27.ELsmp
This system typically dies within the first 5 minutes running this kernel. It has lasted for as long as ~30 minutes if limited disk writes are attempted but has never remained stable for 8 hours. Thanks for "forcing" me to update the BIOS/firmware as this action so far seems to have provided the cure.
Bryan J. Smith wrote:
Well I did more research and have to clarify my statement. The installed BIOS is v4.05, which was indeed the original BIOS but is actually only one version below the last BIOS provided by Tyan for the Tiger MPX (S2466N-4M).
I'm confused. I didn't know Athlon MPs were PowerNow!/Cool'n Quiet processors. What exact models are you trying to use?
I've never seen an old Athlon MP system with such power management.
The new features and fixes for v4.06 are: Fixes bios resetting during reboot issue. Fixes hang on shutdown issue when no keyboard is present. I find it hard to believe that flashing the BIOS to v4.06 is going to fix the mptscsi problem.
Oh, this is unrelated to the PowerNow/Cool'n Quiet. My apologies (did I cross threads/concepts?).
Bryan J. Smith wrote:
What exact models are you trying to use?
I've never seen an old Athlon MP system with such power management.
The Tyan Tiger MPX S2466N-4M motherboard with BIOS version v4.05 runs happily with dual Athlon MP 2800+ processors. Hard to believe that the system was only put together a little over two years ago and now it is old. :)
My apologies (did I cross threads/concepts?).
I think so but no problem.
Is the firmware up to date?
Craig