On Fri, 22 Jun 2012, m.roth@5-cent.us wrote:
Steve Brooks wrote:
I have a SATA PCIe 6Gbps 4 port controller card made by Startech. The kernel (Linux viz1 2.6.32-220.4.1.el6.x86_64) sees it as
Marvell Technology Group Ltd. 88SE9123
I use it to provide extra SATA ports to a RAID system. The HDs are all "WD2003FYYS" and so run at 3Gbps on the 6Gbps controller. However, I am seeing lots of errors like this:
Jun 22 03:13:23 viz1 kernel: ata13.00: exception Emask 0x10 SAct 0x4 SErr 0x400000 action 0x6 frozen
Jun 22 03:13:23 viz1 kernel: ata13.00: irq_stat 0x08000000, interface fatal error
Jun 22 03:13:23 viz1 kernel: ata13: SError: { Handshk }
Jun 22 03:13:23 viz1 kernel: ata13.00: failed command: WRITE FPDMA QUEUED
Jun 22 03:13:23 viz1 kernel: ata13.00: cmd 61/e8:10:98:05:1b/01:00:66:00:00/40 tag 2 ncq 249856 out
Jun 22 03:13:23 viz1 kernel: ata13.00: status: { DRDY }
Jun 22 03:13:23 viz1 kernel: ata13: hard resetting link
<snip>

Crap. First question: what make & model are the drives on it? If they're Caviar Green, you're hosed. WD, and *maybe* Seagate as well, disabled a certain function you used to be able to set on the lower-cost, consumer-grade models (in '09, I believe). When a server controller is trying to do i/o and hits a problem, a server-grade drive gives up after something like 6 sec and does error handling, I *think* to other sectors. The consumer ones, on the other hand, keep trying for 1? 2? *minutes*; the disabled function allowed a user to tell the drive to give up in a shorter time. Meanwhile, a hardware controller will, as I said, have fits.
mark "you'd think I just spent months dealing with this...."
As mentioned in the original post, the drives are all "WD2003FYYS". I am convinced it has nothing to do with TLER being enabled on the WD drives, as we have run hundreds of them using Linux mdadm RAID on motherboard SATA controllers with no problems over the last eight or so years. This appears to be specific to the SATA PCIe 6Gbps 4 port controller card made by Startech. There are four other HDs (WD2003FYYS) in the machine running on an onboard "Intel Corporation Patsburg 6-Port SATA AHCI Controller" with no problems.
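
For anyone wanting to check the same thing on their own box, here is a rough sketch (not something from the original post) that tallies link-level errors per ataN port from syslog-style lines, so you can see whether they cluster on one controller's ports. The function name tally_link_errors and the two message strings it looks for are my own choices, taken from the log excerpt above; other kernels may word these messages differently.

```python
import re
from collections import Counter

# Matches libata messages in syslog format, e.g.
#   "Jun 22 03:13:23 viz1 kernel: ata13.00: irq_stat 0x08000000, interface fatal error"
# Group 1 is the port (ata13), group 2 is the rest of the message.
LINE_RE = re.compile(r"kernel: (ata\d+)(?:\.\d+)?: (.*)")

# Message fragments to count. These are assumptions based on the
# excerpt in this thread, not an exhaustive list of libata errors.
EVENTS = ("interface fatal error", "hard resetting link")


def tally_link_errors(lines):
    """Return a Counter keyed by (port, event) for the events above."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue
        port, message = match.groups()
        for event in EVENTS:
            if event in message:
                counts[(port, event)] += 1
    return counts
```

Feed it the lines of /var/log/messages (or wherever your kernel log ends up) and compare the ports that show up against the ones belonging to each controller. If only the ports on the add-in card appear, that points at the card, cabling or driver rather than the drives.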
Steve