Folks,
For about half a year I ran two servers with CentOS4, and both were
very stable.
Then I upgraded to CentOS5 (fresh install). Since the upgrade the servers
have been quite unstable: the system freezes after a short uptime.
I suspect the driver for the 3ware controller, because I get error
messages like these:
3w-xxxx: scsi0: Character ioctl (0x1f) timed out, resetting card.
3w-xxxx: scsi0: Character ioctl (0x1f) timed out, resetting card.
3w-xxxx: scsi0: Character ioctl (0x1f) timed out, resetting card.
(many repeats of this message)
sd 0:0:0:0: rejecting I/O to offline device
sd 0:0:0:0: rejecting I/O to offline device
sd 0:0:0:0: rejecting I/O to offline device
EXT3-fs error (device sda6): ext3_find_entry: reading directory #2 offset 0
sd 0:0:0:0: rejecting I/O to offline device
sd 0:0:0:0: rejecting I/O to offline device
sd 0:0:0:0: rejecting I/O to offline device
After some time (a few hours) the systems freeze completely, with no
errors shown on the console or in the logs.
Here is some relevant data about my system:
00:0a.0 RAID bus controller: 3ware Inc 7xxx/8xxx-series PATA/SATA-RAID (rev 01)
    Subsystem: 3ware Inc 7xxx/8xxx-series PATA/SATA-RAID
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
    Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
    Latency: 32 (2250ns min), Cache Line Size: 32 bytes
    Interrupt: pin A routed to IRQ 185
    Region 0: I/O ports at e100 [size=16]
    Region 1: Memory at fa824000 (32-bit, non-prefetchable) [size=16]
    Region 2: Memory at fa000000 (32-bit, non-prefetchable) [size=8M]
    [virtual] Expansion ROM at 80060000 [disabled] [size=64K]
    Capabilities: [40] Power Management version 1
        Flags: PMEClk- DSI- D1+ D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 PME-Enable- DSel=0 DScale=0 PME-
Kernel: 2.6.18-8.1.6.el5
It cannot be a 3ware hardware fault, because I replaced the controller
and got the same results.
Any hints are very welcome.
Thanks,
Luc