[CentOS] Screaming Interrupt error and hang.

Mon Jun 27 19:14:03 UTC 2005
Lamar Owen <lowen at pari.edu>

One system I have is configured as follows:
AMD Duron 1200 (I have frequency restrictions in this application, and must 
stay below 1.3GHz or must be above 2.0GHz, and the 1200 fits nicely in our 
interference plan; besides, I was given the chip, and it's fast enough for 
the application).
Soyo DRAGON KT400 motherboard, 512MB DDR RAM (DDR266).  Silicon Image SATA 
interface card.

The machine boots the CentOSPlus 4.1 kernel (2.6.9-11.106.unsupported) 
perfectly fine UNTIL I plug a Maxtor SATA drive into the drive slide I have in 
the machine.  The drive and slide work perfectly fine at home with a Gigabyte 
KT600-based motherboard with a VIA SATA interface.  After plugging in the SATA 
drive and rebooting (since libata doesn't yet do hotplug), I get a hang after 
seeing an 'irq 10: nobody cared! (screaming interrupt?)' error.

I have tried the suggested kernel command line options 'pci=biosirq' and 
'acpi=off', and I have disabled USB and everything else that the BIOS will 
allow.  Nothing at this point is sharing IRQ 10 with the Sil 3112 controller.  
I think the board has the latest BIOS version for that motherboard.  No matter 
what I do, it issues the screaming interrupt error until I remove the SATA 
drive.
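
For anyone who wants to double-check the "nothing is sharing IRQ 10" claim on 
a similar box, here is a minimal Python sketch that lists the handlers 
registered on one IRQ line, given the text of /proc/interrupts.  It assumes 
the 2.6-era layout (one counter column per CPU, then a single controller 
field such as XT-PIC, then a comma-separated handler list).  The function 
name and the sample table are made up for illustration; the sample is NOT 
output from this machine.

```python
# Hypothetical sample in the 2.6-era /proc/interrupts layout.
# Invented for illustration; not captured from the system described here.
SAMPLE_INTERRUPTS = """\
           CPU0
  0:     104233          XT-PIC  timer
 10:        512          XT-PIC  libata
 11:       2048          XT-PIC  example_a, example_b
"""


def devices_on_irq(interrupts_text, irq):
    """Return the handler names listed for `irq` in /proc/interrupts text.

    Assumes exactly one controller-type field (e.g. "XT-PIC") follows the
    per-CPU counters, which is an assumption about the kernel's format.
    """
    lines = interrupts_text.splitlines()
    if not lines:
        return []
    # The header row carries one "CPUn" label per online CPU.
    cpu_count = len(lines[0].split())
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep or label.strip() != str(irq):
            continue
        fields = rest.split()
        # Skip the per-CPU counters and the controller type; what is left
        # is the comma-separated list of handler names.
        names = " ".join(fields[cpu_count + 1:])
        return [name.strip() for name in names.split(",") if name.strip()]
    return []
```

On a live system (booted without the drive, since the hang prevents reading 
anything with it installed), something like 
devices_on_irq(open("/proc/interrupts").read(), 10) could be used; more than 
one name in the result would mean the line is shared after all.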

The DRAGON motherboard has HPT372 software RAID on board (I'm using it as just 
another two-channel ATA host adapter) as well as audio, network, etc.  LAN 
is enabled; both COMs and the LPT are disabled, USB is disabled, and audio is 
disabled.

And the motherboard works and has worked perfectly fine for months without the 
SATA drive in the slide (and with all the peripherals enabled!).

Red Hat Bugzilla has been unresponsive; maybe someone here has seen the 
problem and knows of some other thing I can tweak.  It is odd that it works 
fine with the card in with no drive, but plugging the drive in kills it.  With 
the drive not installed, the card still uses the IRQ, still loads its BIOS, 
and still shows up in dmesg (this was produced with a default, non-quiet boot):
[snip]
ACPI: (supports S0 S1 S4 S5)
ACPI wakeup devices:
SLPB PCI0 USB0 USB1 USB2 USB6 USB7 USB8 USB9 LAN0
Freeing unused kernel memory: 148k freed
SCSI subsystem initialized
libata version 1.10 loaded.
sata_sil version 0.8
ACPI: PCI interrupt 0000:00:0b.0[A] -> GSI 10 (level, low) -> IRQ 10
ata1: SATA max UDMA/100 cmd 0xE0832080 ctl 0xE083208A bmdma 0xE0832000 irq 10
ata2: SATA max UDMA/100 cmd 0xE08320C0 ctl 0xE08320CA bmdma 0xE0832008 irq 10
ata1: no device found (phy stat 00000000)
scsi0 : sata_sil
ata2: no device found (phy stat 00000000)
scsi1 : sata_sil
device-mapper: 4.4.0-ioctl (2005-01-12) initialised: dm at uk.sistina.com
[snip]
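
To make comparing boot logs a bit less error-prone, here is a rough Python 
sketch that pulls the per-port facts out of libata lines like the ones above: 
the port name, the advertised mode, the IRQ, and whether a "no device found" 
line followed.  The regular expressions are written against the exact message 
wording in this excerpt and are only an assumption for other libata versions; 
summarize_ports is a made-up helper name.

```python
import re

# Lines copied from the dmesg excerpt in this post.
SAMPLE_DMESG = """\
libata version 1.10 loaded.
sata_sil version 0.8
ata1: SATA max UDMA/100 cmd 0xE0832080 ctl 0xE083208A bmdma 0xE0832000 irq 10
ata2: SATA max UDMA/100 cmd 0xE08320C0 ctl 0xE08320CA bmdma 0xE0832008 irq 10
ata1: no device found (phy stat 00000000)
scsi0 : sata_sil
ata2: no device found (phy stat 00000000)
scsi1 : sata_sil
"""

# Patterns match the wording seen above; other kernels may word this differently.
PORT_RE = re.compile(r"^(ata\d+): SATA max (\S+) .* irq (\d+)$")
EMPTY_RE = re.compile(r"^(ata\d+): no device found \(phy stat ([0-9A-Fa-f]+)\)$")


def summarize_ports(dmesg_text):
    """Map each ataN port to its mode, IRQ, and device-detection status."""
    ports = {}
    for raw in dmesg_text.splitlines():
        line = raw.strip()
        match = PORT_RE.match(line)
        if match:
            ports[match.group(1)] = {
                "mode": match.group(2),
                "irq": int(match.group(3)),
                # None means no "no device found" line has been seen (yet).
                "device_found": None,
                "phy_stat": None,
            }
            continue
        match = EMPTY_RE.match(line)
        if match and match.group(1) in ports:
            ports[match.group(1)]["device_found"] = False
            ports[match.group(1)]["phy_stat"] = match.group(2)
    return ports
```

Running summarize_ports over a capture from a boot with the drive installed 
(if one can be grabbed, e.g. over a serial console) and over this one would 
show which port stops reporting "no device found" while the IRQ assignment 
stays the same.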

When the drive is plugged in, the screaming interrupt occurs shortly after the 
sata_sil driver gets loaded, and the system hangs.  It DOES detect the drive, 
though.  Oh, and let me head off the obvious thing at the pass: I'm not using 
any pseudo-RAID features on either the HPT372 or the Sil 3112.  Both chips 
work fine; the Sil 3112 card was in a Windows box previously and has worked 
with this drive.  The HPT372 is working FINE right now as two independent ATA 
channels.

I'm just looking for some ideas of things to look at (other than 'don't use a 
FRAID card', 'try another motherboard', and the like; those pseudoanswers will 
be ignored).  Please only answer if you have an idea; don't waste bandwidth 
telling me you don't know, but that SOYO motherboards are junk, etc.  There is 
too much of that around here, unfortunately, and it is annoying.
-- 
Lamar Owen
Director of Information Technology
Pisgah Astronomical Research Institute
1 PARI Drive
Rosman, NC  28772
(828)862-5554
www.pari.edu