On Fri, 22 Jun 2012, m.roth@5-cent.us wrote:
Steve Brooks wrote:
I have a SATA PCIe 6Gbps 4-port controller card made by Startech. The kernel (Linux viz1 2.6.32-220.4.1.el6.x86_64) sees it as
Marvell Technology Group Ltd. 88SE9123
I use it to provide extra SATA ports to a RAID system. The HDs are all "WD2003FYYS" and so run at 3Gbps on the 6Gbps controller. However, I am seeing many errors like this:
Jun 22 03:13:23 viz1 kernel: ata13.00: exception Emask 0x10 SAct 0x4 SErr 0x400000 action 0x6 frozen
Jun 22 03:13:23 viz1 kernel: ata13.00: irq_stat 0x08000000, interface fatal error
Jun 22 03:13:23 viz1 kernel: ata13: SError: { Handshk }
Jun 22 03:13:23 viz1 kernel: ata13.00: failed command: WRITE FPDMA QUEUED
Jun 22 03:13:23 viz1 kernel: ata13.00: cmd 61/e8:10:98:05:1b/01:00:66:00:00/40 tag 2 ncq 249856 out
Jun 22 03:13:23 viz1 kernel: ata13.00: status: { DRDY }
Jun 22 03:13:23 viz1 kernel: ata13: hard resetting link
<snip>
Crap. First question: what make & model are the drives on it? If they're Caviar Green, you're hosed. WD, and *maybe* Seagate as well, disabled a certain function you used to be able to set on the lower-cost, consumer-grade models (in '09, I believe). When a server controller is doing I/O and hits a problem, a server-grade drive gives up after something like 6 sec and does error handling, I *think* to other sectors. The consumer ones, on the other hand, keep trying for 1? 2? *minutes*; the disabled function allowed a user to tell the drive to give up in a shorter time. Meanwhile, a hardware controller will, as I said, have fits.
mark "you'd think I just spent months dealing with this...."
As mentioned in the original post, the drives are all "WD2003FYYS".
Missed the original post; sorry.
I am convinced it has nothing to do with TLER being enabled on the WD drives, as we have run hundreds of them using Linux mdadm RAID on motherboard SATA controllers with no problems over the last eight or so years. This appears to be specific to the SATA PCIe 6Gbps 4-port controller card made by Startech. There are four other HDs (WD2003FYYS) in the machine running on an onboard "Intel Corporation Patsburg 6-Port SATA AHCI Controller" with no problems.
Thanks, TLER was the acronym I was trying to remember.
I also see those are "enterprise" drives, not consumer grade, which implies that they ought to work. It still looks to me as though it's timing out, which I'd think is a function of the RAID card. You might see if it has any firmware configuration options.
Thanks for the reply. The card is purely JBOD, with no RAID or other configuration available. It simply presents the attached SATA devices to the OS. I am wondering if it could be a strange symptom of running 3Gbps SATA drives on this particular 6Gbps SATA controller, but that is just a stab in the dark.
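One way to test that stab in the dark is to check what speed each port actually negotiated and how often each port throws exceptions. The following is only a rough sketch, not a diagnosis tool: it assumes the kernel log also contains libata's usual "SATA link up ... Gbps" lines (none were quoted in this thread), and the function name summarize is made up for illustration.

```python
import re
from collections import Counter

# Assumed format of libata link-up messages (not quoted in this thread), e.g.
# "ata13: SATA link up 3.0 Gbps (SStatus 123 SControl 300)"
LINK_UP = re.compile(r"\b(ata\d+): SATA link up ([\d.]+) Gbps")

# Exception lines as quoted in the thread, e.g.
# "ata13.00: exception Emask 0x10 SAct 0x4 SErr 0x400000 action 0x6 frozen"
EXCEPTION = re.compile(r"\b(ata\d+)(?:\.\d+)?: exception Emask")


def summarize(log_text):
    """Return (last negotiated speed per port, exception count per port)."""
    # Later link-up messages overwrite earlier ones, so a port that dropped
    # to a lower speed after a reset shows its most recent speed.
    speeds = {port: float(gbps) for port, gbps in LINK_UP.findall(log_text)}
    errors = Counter(EXCEPTION.findall(log_text))
    return speeds, dict(errors)


if __name__ == "__main__":
    import sys

    # Usage: python ata_summary.py /var/log/messages
    with open(sys.argv[1], errors="replace") as fh:
        speeds, errors = summarize(fh.read())
    for port in sorted(set(speeds) | set(errors)):
        print(port, speeds.get(port, "unknown"), "Gbps,", errors.get(port, 0), "exceptions")
```

If the ports on the Startech card report 3.0 Gbps like the onboard Intel ports but are the only ones accumulating exceptions, that would at least point at the card rather than at the negotiated speed.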