[CentOS] Storage/SCSI Error on our CentOS server

Sat Feb 24 20:07:22 UTC 2007
Hairul Ikmal Mohamad Fuzi <hairul.ikmal at gmail.com>

Hi,

We are currently running CentOS 4.x on a 2-way Opteron machine.
The machine is connected through an Adaptec SCSI host adapter to a
2TB storage unit (an external RAID-5 disk array).

Everything ran smoothly until a recent unplanned power trip. Since
then we have had problems accessing the storage (intermittent
filesystem errors, the partition refusing to mount in read-write
mode, unacceptably slow writes, etc.), especially when we start to
write to the storage.

After a few checks, we suspect one of the following:

1) the storage unit is failing (although its control panel does not
report any disk/raidset failure), or
2) the SCSI host adapter is failing, or
3) the filesystem itself is corrupted (although 'fsck.ext3 -v -f'
did not find any errors)

...but we are not sure which one. We have also started seeing messages
in 'dmesg' that we never saw before (please see below).
Based on the information above and the 'dmesg' output below, we would
appreciate it if somebody could help us identify what actually went
wrong and how we could fix it (if possible).
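
To get a rough idea of how often the adapter goes into recovery, the log can be scanned for the driver messages that appear in the output below. This is only a minimal sketch in Python 3: the marker strings are copied from this post, the function name `count_markers` is made up for illustration, and the sample text is a shortened stand-in for a real log.

```python
from collections import Counter

# Marker strings copied from the dmesg excerpt in this post.
MARKERS = (
    "Attempting to abort cmd",
    "Transmission error detected",
    "Dump Card State Begins",
)


def count_markers(log_text):
    """Count how many log lines contain each marker string."""
    counts = Counter({marker: 0 for marker in MARKERS})
    for line in log_text.splitlines():
        for marker in MARKERS:
            if marker in line:
                counts[marker] += 1
    return dict(counts)


if __name__ == "__main__":
    # Shortened sample; in practice the text would come from dmesg
    # or /var/log/messages.
    sample = (
        "scsi4:0:0:0: Attempting to abort cmd 0000010082297a80:\n"
        "scsi4: At time of recovery, card was not paused\n"
        "scsi4: Transmission error detected\n"
    )
    for marker, n in count_markers(sample).items():
        print(n, marker)
```

A count that keeps growing while writing to the array would at least confirm that the errors are raised on the SCSI bus side rather than by the filesystem.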

TIA.

-Ikmal

dmesg output:
===========================================
scsi4:0:0:0: Attempting to abort cmd 0000010082297a80: 0x28 0x0 0x0 0x0 0x1 0x3f 0x0 0x0 0x8 0x0
scsi4: At time of recovery, card was not paused
>>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
scsi4: Dumping Card State at program address 0xc Mode 0x33
Card was paused
HS_MAILBOX[0x0] INTCTL[0x80] SEQINTSTAT[0x0] SAVED_MODE[0x11]
DFFSTAT[0x33] SCSISIGI[0x0] SCSIPHASE[0x0] SCSIBUS[0x0]
LASTPHASE[0x1] SCSISEQ0[0x0] SCSISEQ1[0x12] SEQCTL0[0x0]
SEQINTCTL[0x0] SEQ_FLAGS[0xc0] SEQ_FLAGS2[0x0] SSTAT0[0x0]
SSTAT1[0x0] SSTAT2[0x0] SSTAT3[0x0] PERRDIAG[0xc0]
SIMODE1[0xa4] LQISTAT0[0x0] LQISTAT1[0x0] LQISTAT2[0x0]
LQOSTAT0[0x0] LQOSTAT1[0x0] LQOSTAT2[0x0]

SCB Count = 4 CMDS_PENDING = 1 LASTSCB 0xffff CURRSCB 0x2 NEXTSCB 0x0
qinstart = 59 qinfifonext = 59
QINFIFO:
WAITING_TID_QUEUES:
Pending list:
 2 FIFO_USE[0x0] SCB_CONTROL[0x64] SCB_SCSIID[0x7]
Total 1
Kernel Free SCB list: 3 1 0
Sequencer Complete DMA-inprog list:
Sequencer Complete list:
Sequencer DMA-Up and Complete list:

scsi4: FIFO0 Free, LONGJMP == 0x80ff, SCB 0x0
SEQIMODE[0x3f] SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]
SG_CACHE_SHADOW[0x2] SG_STATE[0x0] DFFSXFRCTL[0x0]
SOFFCNT[0x0] MDFFSTAT[0x5] SHADDR = 0x00, SHCNT = 0x0
HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]
scsi4: FIFO1 Free, LONGJMP == 0x81d8, SCB 0x3
SEQIMODE[0x3f] SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]
SG_CACHE_SHADOW[0x2] SG_STATE[0x0] DFFSXFRCTL[0x0]
SOFFCNT[0x0] MDFFSTAT[0x5] SHADDR = 0x00, SHCNT = 0x0
HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]
LQIN: 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0
scsi4: LQISTATE = 0x0, LQOSTATE = 0x0, OPTIONMODE = 0x52
scsi4: OS_SPACE_CNT = 0x20 MAXCMDCNT = 0x0
SIMODE0[0xc]
CCSCBCTL[0x0]
scsi4: REG0 == 0xffff, SINDEX = 0x1e0, DINDEX = 0xe1
scsi4: SCBPTR == 0x3, SCB_NEXT == 0x2, SCB_NEXT2 == 0x2
CDB 28 0 0 80 19 ac
STACK: 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0
<<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
DevQ(0:0:0): 0 waiting
(scsi4:A:0:0): Device is disconnected, re-queuing SCB
Recovery code sleeping
Recovery SCB completes
Recovery code awake
scsi4: Transmission error detected
LQISTAT1[0x0] LASTPHASE[0x1] SCSISIGI[0x0] PERRDIAG[0x1]
>>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
scsi4: Dumping Card State at program address 0x26 Mode 0x11
Card was paused
HS_MAILBOX[0x0] INTCTL[0x80] SEQINTSTAT[0x0] SAVED_MODE[0x11]
DFFSTAT[0x33] SCSISIGI[0x1a] SCSIPHASE[0x1] SCSIBUS[0xff]
LASTPHASE[0x1] SCSISEQ0[0x40] SCSISEQ1[0x12] SEQCTL0[0x0]
SEQINTCTL[0x0] SEQ_FLAGS[0xc0] SEQ_FLAGS2[0x0] SSTAT0[0x10]
SSTAT1[0x11] SSTAT2[0x0] SSTAT3[0x0] PERRDIAG[0x0]
SIMODE1[0xac] LQISTAT0[0x0] LQISTAT1[0x0] LQISTAT2[0x0]
LQOSTAT0[0x0] LQOSTAT1[0x0] LQOSTAT2[0x0]

SCB Count = 4 CMDS_PENDING = 1 LASTSCB 0xffff CURRSCB 0x2 NEXTSCB 0x0
qinstart = 61 qinfifonext = 61
QINFIFO:
WAITING_TID_QUEUES:
      0 ( 0x2 )
Pending list:
 2 FIFO_USE[0x0] SCB_CONTROL[0x50] SCB_SCSIID[0x7]
Total 1
Kernel Free SCB list: 3 1 0
Sequencer Complete DMA-inprog list:
Sequencer Complete list:
Sequencer DMA-Up and Complete list:

scsi4: FIFO0 Free, LONGJMP == 0x80ff, SCB 0x0
SEQIMODE[0x3f] SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]
SG_CACHE_SHADOW[0x2] SG_STATE[0x0] DFFSXFRCTL[0x0]
SOFFCNT[0x1] MDFFSTAT[0x5] SHADDR = 0x00, SHCNT = 0x0
HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]
scsi4: FIFO1 Free, LONGJMP == 0x81d8, SCB 0x3
SEQIMODE[0x3f] SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]
SG_CACHE_SHADOW[0x2] SG_STATE[0x0] DFFSXFRCTL[0x0]
SOFFCNT[0x1] MDFFSTAT[0x5] SHADDR = 0x00, SHCNT = 0x0
HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]
LQIN: 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0
scsi4: LQISTATE = 0x0, LQOSTATE = 0x0, OPTIONMODE = 0x52
scsi4: OS_SPACE_CNT = 0x20 MAXCMDCNT = 0x0
SIMODE0[0xc]
CCSCBCTL[0x4]
scsi4: REG0 == 0x3, SINDEX = 0x11d, DINDEX = 0xe1
scsi4: SCBPTR == 0x3, SCB_NEXT == 0x2, SCB_NEXT2 == 0x2
CDB 28 0 0 80 19 ac
STACK: 0x13 0x0 0x0 0x0 0x0 0x0 0x0 0x0
<<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
DevQ(0:0:0): 0 waiting
(scsi4:A:0): 80.000MB/s transfers (40.000MHz DT, 16bit)
===========================================
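
For reference, the ten bytes printed after "Attempting to abort cmd" at the top of the output look like a standard SCSI READ(10) command block (opcode 0x28). Assuming that is what they are, the block address and length can be read out of them with a few lines of Python 3. This is only a sketch: the function name `decode_read10` is made up for illustration, and the field layout is the generic READ(10) one, not something taken from the driver.

```python
def decode_read10(cdb_text):
    """Decode a 10-byte READ(10) CDB given as space-separated hex bytes."""
    cdb = [int(token, 16) for token in cdb_text.split()]
    if len(cdb) != 10:
        raise ValueError("expected exactly 10 bytes")
    if cdb[0] != 0x28:
        raise ValueError("not a READ(10) opcode")
    # Bytes 2..5: logical block address, big-endian.
    lba = int.from_bytes(bytes(cdb[2:6]), "big")
    # Bytes 7..8: transfer length in blocks, big-endian.
    blocks = int.from_bytes(bytes(cdb[7:9]), "big")
    return {"opcode": cdb[0], "lba": lba, "blocks": blocks}


if __name__ == "__main__":
    # Byte string copied from the first line of the dmesg output above.
    print(decode_read10("0x28 0x0 0x0 0x0 0x1 0x3f 0x0 0x0 0x8 0x0"))
```

Under that assumption the aborted command is a read of 8 blocks starting at block 319, which would mean reads are timing out as well as writes.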