We're having a problem with Linux and/or the SCSI controller failing to access an external RAID device. Components are:
SuperMicro 6025B-TR+V
CentOS 5.0 with yum update as of a week or two ago (kernel 2.6.18-8.1.8.el5)
Promise Vtrak M310p
Adaptec 29320 LPE -- aic79xx module
Sometimes this fails on the initial device probing (before the kernel even begins to boot); sometimes the device is recognized ok and then we get a failure later on in the boot sequence (and then the device disappears).
RedHat Linux is in the compatibility list for that Adaptec card, and the card is listed as compatible with the Promise VTrak. We were able to get this combination to work by giving up on the Ultra320 setting and dropping back to 160 in the controller setup, but if anyone has suggestions or info it would be appreciated.
Here's the dmesg output:
sd 2:0:0:0: Attempting to queue an ABORT message:CDB: 0x1a 0x0 0x5c 0x0 0x40 0x0
scsi2: At time of recovery, card was not paused
>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
scsi2: Dumping Card State at program address 0x4 Mode 0x22
Card was paused
INTSTAT[0x0] SELOID[0x0] SELID[0x0] HS_MAILBOX[0x0]
INTCTL[0x80] SEQINTSTAT[0x0] SAVED_MODE[0x11] DFFSTAT[0x33]
SCSISIGI[0x24] SCSIPHASE[0x0] SCSIBUS[0x0] LASTPHASE[0x1]
SCSISEQ0[0x0] SCSISEQ1[0x12] SEQCTL0[0x0] SEQINTCTL[0x0]
SEQ_FLAGS[0x0] SEQ_FLAGS2[0x4] QFREEZE_COUNT[0x0]
KERNEL_QFREEZE_COUNT[0x0] MK_MESSAGE_SCB[0xff00] MK_MESSAGE_SCSIID[0xff]
SSTAT0[0x0] SSTAT1[0x8] SSTAT2[0x0] SSTAT3[0x0] PERRDIAG[0xc0]
SIMODE1[0xa4] LQISTAT0[0x0] LQISTAT1[0x0] LQISTAT2[0x80]
LQOSTAT0[0x0] LQOSTAT1[0x0] LQOSTAT2[0xe1]
SCB Count = 4 CMDS_PENDING = 1 LASTSCB 0xffff CURRSCB 0x2 NEXTSCB 0xffc0
qinstart = 85 qinfifonext = 85
QINFIFO:
WAITING_TID_QUEUES:
Pending list:
2 FIFO_USE[0x0] SCB_CONTROL[0x60] SCB_SCSIID[0x7]
Total 1
Kernel Free SCB list: 1 3 0
Sequencer Complete DMA-inprog list:
Sequencer Complete list:
Sequencer DMA-Up and Complete list:
Sequencer On QFreeze and Complete list:
scsi2: FIFO0 Free, LONGJMP == 0x825a, SCB 0x2
SEQIMODE[0x3f] SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]
SG_CACHE_SHADOW[0x2] SG_STATE[0x0] DFFSXFRCTL[0x0] SOFFCNT[0x0]
MDFFSTAT[0x5] SHADDR = 0x00, SHCNT = 0x0
HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]
scsi2: FIFO1 Free, LONGJMP == 0x8063, SCB 0x3
SEQIMODE[0x3f] SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]
SG_CACHE_SHADOW[0x2] SG_STATE[0x0] DFFSXFRCTL[0x0] SOFFCNT[0x0]
MDFFSTAT[0x5] SHADDR = 0x00, SHCNT = 0x0
HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]
LQIN: 0x8 0x0 0x0 0x2 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0
scsi2: LQISTATE = 0x1, LQOSTATE = 0x0, OPTIONMODE = 0x52
scsi2: OS_SPACE_CNT = 0x20 MAXCMDCNT = 0x1
scsi2: SAVED_SCSIID = 0x0 SAVED_LUN = 0x0
SIMODE0[0xc] CCSCBCTL[0x4]
scsi2: REG0 == 0x2, SINDEX = 0x102, DINDEX = 0x102
scsi2: SCBPTR == 0xff02, SCB_NEXT == 0xff00, SCB_NEXT2 == 0x0
CDB 2 1 0 0 0 0
STACK: 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0
<<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
sd 2:0:0:0: Unable to deliver message
scsi2: Command abort returning 0x2003
sd 2:0:0:0: Attempting to queue a TARGET RESET message:CDB: 0x1a 0x0 0x5c 0x0 0x40 0x0
scsi2: Device reset code sleeping
scsi2: Device reset timer expired (active 2)
scsi2: Device reset returning 0x2003
Recovery SCB completes
sd 2:0:0:0: scsi: Device offlined - not ready after error recovery
Bart Schaefer spake the following on 9/5/2007 9:26 AM:
> We're having a problem with Linux and/or the SCSI controller failing to access an external RAID device. Components are:
> SuperMicro 6025B-TR+V
> CentOS 5.0 with yum update as of a week or two ago (kernel 2.6.18-8.1.8.el5)
> Promise Vtrak M310p
> Adaptec 29320 LPE -- aic79xx module
> Sometimes this fails on the initial device probing (before the kernel even begins to boot); sometimes the device is recognized ok and then we get a failure later on in the boot sequence (and then the device disappears).
> RedHat Linux is in the compatibility list for that Adaptec card, and the card is listed as compatible with the Promise VTrak. We were able to get this combination to work by giving up on the Ultra320 setting and dropping back to 160 in the controller setup, but if anyone has suggestions or info it would be appreciated.
There has been a lot of traffic about Adaptec cards and Ultra320 not working properly with the current drivers. The only fix I have seen posted is to get an LSI card.