Hi,
I have a SATA PCIe 6Gbps 4 port controller card made by Startech. The kernel (Linux viz1 2.6.32-220.4.1.el6.x86_64) sees it as
Marvell Technology Group Ltd. 88SE9123
I use it to provide extra SATA ports to a RAID system.
The HDs are all "WD2003FYYS" and so run at 3Gbps on the 6Gbps controller.
However, I am seeing lots of instances of errors like this:
-----------------------------------------
Jun 22 03:13:23 viz1 kernel: ata13.00: exception Emask 0x10 SAct 0x4 SErr 0x400000 action 0x6 frozen
Jun 22 03:13:23 viz1 kernel: ata13.00: irq_stat 0x08000000, interface fatal error
Jun 22 03:13:23 viz1 kernel: ata13: SError: { Handshk }
Jun 22 03:13:23 viz1 kernel: ata13.00: failed command: WRITE FPDMA QUEUED
Jun 22 03:13:23 viz1 kernel: ata13.00: cmd 61/e8:10:98:05:1b/01:00:66:00:00/40 tag 2 ncq 249856 out
Jun 22 03:13:23 viz1 kernel: ata13.00: status: { DRDY }
Jun 22 03:13:23 viz1 kernel: ata13: hard resetting link
Jun 22 03:13:24 viz1 kernel: ata13: SATA link up 3.0 Gbps (SStatus 123 SControl 330)
Jun 22 03:13:24 viz1 kernel: ata13.00: configured for UDMA/133
Jun 22 03:13:24 viz1 kernel: ata13: EH complete
---------------------------------------
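For readers following along, the SErr value in the first log line is a bitmask, and the "SError: { Handshk }" line is the kernel's own decoding of it. The sketch below reproduces that decoding for an arbitrary value. The function name decode_serr is made up for this note, and the bit-to-name table is written from memory of the abbreviations libata prints, so treat it as an approximation rather than a copy of the kernel source.

```shell
# Decode a SATA SError value into libata-style flag names.
# Assumption: the bit positions and abbreviations below match what the
# kernel prints in "SError: { ... }" lines.
decode_serr() {
    serr_value=$(( $1 ))
    serr_out=""
    for serr_entry in \
        0:RecovData 1:RecovComm 8:UnrecovData 9:Persist 10:Proto 11:HostInt \
        16:PHYRdyChg 17:PHYInt 18:CommWake 19:10B8B 20:Dispar 21:BadCRC \
        22:Handshk 23:LinkSeq 24:TrStaTrns 25:UnrecFIS 26:DevExch
    do
        serr_bit=${serr_entry%%:*}          # number before the colon
        if [ $(( $serr_value & (1 << $serr_bit) )) -ne 0 ]; then
            serr_out="$serr_out ${serr_entry#*:}"   # name after the colon
        fi
    done
    echo "${serr_out# }"
}

decode_serr 0x400000
```

For the value in the log above, only bit 22 is set, which corresponds to the handshake flag the kernel reported.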
Vendor ID: 1b4b, Device ID: 9123
I tried to see which drivers were currently being used, but the command below returned nothing:
grep -i 1b4b /lib/modules/*/modules.alias | grep -i 9123
I have changed the card and cables but still get the same errors. I am wondering whether the el6 kernel is using the correct drivers. I checked "elrepo" against the "Vendor:Device ID pairing" and it also came up blank.
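One possible reason the vendor/device grep comes up empty, assuming the card is handled by the stock ahci module, is that a generic AHCI driver can claim controllers by PCI class code (0x010601, SATA in AHCI mode) rather than by a specific vendor:device pair. A minimal sketch for checking that, with ahci_class_aliases being a made-up helper name:

```shell
# List module aliases that match on PCI class 0x010601 (mass storage,
# SATA, AHCI programming interface) instead of a vendor:device pair.
# In modules.alias that class appears as the fragment "bc01sc06i01".
ahci_class_aliases() {
    # $1: path to a modules.alias file
    grep -i 'bc01sc06i01' "$1"
}
```

It could be run as `ahci_class_aliases /lib/modules/$(uname -r)/modules.alias`. If a line ending in "ahci" comes back, the card may well be bound to ahci even though no alias mentions 1b4b or 9123.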
Any ideas would be much appreciated.
Regards,
Steve
Crap. First question: what make & model are the drives on it? If they're Caviar Green, you're hosed. WD, and *maybe* Seagate as well, disabled a certain function you used to be able to set on the lower-cost, consumer-grade models (in '09, I believe). When a server controller is doing I/O and hits a problem, a server-grade drive gives up after something like 6 sec and does error handling, I *think* to other sectors. The consumer ones, on the other hand, keep trying for 1? 2? *minutes*; the disabled function allowed a user to tell the drive to give up in a shorter time. Meanwhile, a hardware controller will, as I said, have fits.
mark "you'd think I just spent months dealing with this...."
As mentioned in the original post, the drives are all "WD2003FYYS". I am convinced it has nothing to do with TLER being enabled on the WD drives, as we have run hundreds of them using Linux mdadm RAID on motherboard SATA controllers with no problems over the last eight or so years. This appears to be specific to the SATA PCIe 6Gbps 4 port controller card made by Startech. There are four other HDs (WD2003FYYS) in the machine running on an onboard "Intel Corporation Patsburg 6-Port SATA AHCI Controller" with no problems.
Steve
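For anyone who wants to rule the timeout setting in or out, smartmontools exposes it as SCT Error Recovery Control through `smartctl -l scterc`, with timeouts given in units of 100 ms. The sketch below only builds the command lines rather than running them. The helper names and the /dev/sdb device in the usage note are placeholders, and whether a given drive accepts the setting is something to check, not assume.

```shell
# Build (but do not run) smartctl command lines for SCT Error Recovery
# Control. smartctl expects the read and write timeouts in units of
# 100 ms, so 7 seconds becomes 70.
scterc_query_cmd() {
    # $1: device node
    echo "smartctl -l scterc $1"
}

scterc_set_cmd() {
    # $1: timeout in whole seconds, $2: device node
    echo "smartctl -l scterc,$(( $1 * 10 )),$(( $1 * 10 )) $2"
}
```

For example, `scterc_set_cmd 7 /dev/sdb` prints the command that would request a 7 second limit for both reads and writes on a hypothetical /dev/sdb.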
Steve Brooks wrote:
> As mentioned in the original post the drives are all "WD2003FYYS". I am
Missed the original post; sorry.
> convinced it has nothing to do with TLER enabled on the WD drives as we
Thanks, that was the acronym I was trying to remember.
> run hundreds of them using linux mdadm raid on motherboard SATA controllers with no problems in the last eight or so years. This appears to be specific to the SATA PCIe 6Gbps 4 port controller card made by Startech. There are four other HD's (WD2003FYYS) in the machine running on an onboard "Intel Corporation Patsburg 6-Port SATA AHCI Controller" with no problems.
I also see those are "enterprise" drives, not consumer grade, which implies that they ought to work. It still looks to me as though it's timing out, which I'd think is a function of the RAID card. You might see if it has any firmware configuration options.
mark
Thanks for the reply. The card is purely JBOD, with no RAID or other configuration available; it simply presents the attached SATA devices to the OS. I am wondering if it could be a strange symptom of running 3Gbps SATA drives on this particular 6Gbps SATA controller, but that is just a stab in the dark.
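To see what speed each port actually negotiated, the "SATA link up" lines in the kernel log can be summarised. This is a rough sketch that assumes the log lines look like the ones quoted earlier in the thread; link_speeds is a made-up name.

```shell
# Read kernel log text on stdin and print one "ataN <speed>" line for
# each distinct port/speed pair found in "SATA link up" messages.
link_speeds() {
    sed -n 's/.*\(ata[0-9]*\): SATA link up \([0-9.]* Gbps\).*/\1 \2/p' | sort -u
}
```

It can be fed from `dmesg | link_speeds` or `link_speeds < /var/log/messages`. A port that shows more than one speed over time has renegotiated its link at some point, which would be worth comparing against the timestamps of the errors.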
Is this your card?
mark
Hi Mark,
Yes, that is the very card. The page says the chipset is Marvell 88SE9128, but "lspci" shows
Marvell Technology Group Ltd. 88SE9123 PCIe SATA 6.0 Gb/s controller
Steve
It is odd: the kernel reports it as "88SE9123", while the web page and the manual supplied with the card both say "88SE9128". The motherboard already has an onboard Marvell "88SE9128" controller, which is correctly identified by the kernel and works properly, so I know the correct drivers are in the kernel, but the Startech card does not seem to be using them.
[root@viz1 ~]# lspci | grep SATA
00:1f.2 SATA controller: Intel Corporation Patsburg 6-Port SATA AHCI Controller (rev 05)
04:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9123 PCIe SATA 6.0 Gb/s controller (rev 11)
05:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9123 PCIe SATA 6.0 Gb/s controller (rev 11)
0f:00.0 SATA controller: ASMedia Technology Inc. ASM1062 Serial ATA Controller (rev 01)
10:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9128 PCIe SATA 6 Gb/s RAID controller with HyperDuo (rev 11)
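One way to check whether the Startech card and the onboard Marvell controller really use different drivers is to read the driver symlink that sysfs keeps for each PCI device. The sketch below assumes the usual /sys layout and PCI domain 0000. The function name pci_driver and the SYSFS_ROOT override are made up for this note, the latter only so the function can be tried against a fake directory tree.

```shell
# Print the name of the kernel driver bound to a PCI device, taken from
# the "driver" symlink under /sys/bus/pci/devices/<address>/.
# SYSFS_ROOT is a hypothetical override for testing; it defaults to /sys.
pci_driver() {
    # $1: full PCI address, e.g. 0000:04:00.0
    pci_link="${SYSFS_ROOT:-/sys}/bus/pci/devices/$1/driver"
    if [ -L "$pci_link" ]; then
        basename "$(readlink "$pci_link")"
    else
        echo "(no driver bound)"
    fi
}
```

With the addresses from the listing above, something like `for dev in 0000:04:00.0 0000:05:00.0 0000:10:00.0; do echo "$dev $(pci_driver "$dev")"; done` would show the three Marvell devices side by side. `lspci -k` reports the same information where the installed pciutils supports that option.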
Hi, Steve,
Steve Brooks wrote:
> It is odd because the kernel reports it as "88SE9123" the web page says it is "88SE9128" as does the manual supplied with the card. Now the
Yeah, I noticed that too, and thought it odd.
<snip>
I looked at the "manual", and the only thing that came to mind was to try going into the BIOS and making sure that it was set to AHCI rather than, say, IDE, or whatever.
mark
Thanks for the reply, Mark. I hadn't thought about the card posting drives in the BIOS; I assumed only the onboard SATA devices would allow you to change the mode in the motherboard's BIOS. I will have a look on Monday to see if anything has appeared in the BIOS. I guess the default mode on a 6Gbps SATA card would be AHCI, but yes, it is worth a check.
Cheers,
Steve