[CentOS] Sun Fire X4200 M2 / CentOS 5 APIC issues

Wed Jan 9 19:22:01 UTC 2008
Ryan Ordway <rordway at oregonstate.edu>

I had been running CentOS 5 happily on my Sun Fire X4200 M2 systems,
then I upgraded the BIOS and ILOM firmware. Now I'm running into what
seems to be a fairly common problem with newer motherboards: I cannot
boot unless I use the 'noapic' kernel option. If I try to boot the
kernel normally, I get the error:

"MP-BIOS bug: 8254 timer not connected to IO-APIC"

I can boot using noapic, but interrupt handling is *HORRIBLE* with
noapic on the X4200 M2. Here is a system with the new BIOS running
with noapic:

[rordway at aphrodite ~]$ uname -a
Linux aphrodite 2.6.18-53.1.4.el5.centos.plus #1 SMP Fri Dec 7 07:05:12 EST 2007 x86_64 x86_64 x86_64 GNU/Linux
[rordway at aphrodite ~]$ cat /proc/interrupts
            CPU0       CPU1       CPU2       CPU3
   0:  149047419     568805     215848      76392          XT-PIC  timer
   1:          0          2          0          0          XT-PIC  i8042
   2:          0          0          0          0          XT-PIC  cascade
   4:        399         80         58         26          XT-PIC  serial
   5:        138         19          4          4          XT-PIC  ehci_hcd:usb2
   7:   16203243  179851060  180334480  180522274          XT-PIC  ioc0, eth1
   8:          1          0          0          0          XT-PIC  rtc
   9:          0          0          0          0          XT-PIC  acpi
  11:        764         64         17         21          XT-PIC  ohci_hcd:usb1
  12:          2          2          0          0          XT-PIC  i8042
  14:         77          6         13          3          XT-PIC  ide0
  15:   15353489     192264      58884      10141          XT-PIC  eth0
NMI:          0          0          0          0
LOC:  149884909  149891944  149889634  149890602
ERR:  377601098
MIS:          0
Compare this to an identical system running an identical kernel, but  
the older BIOS:

[rordway at selene ~]$ uname -a
Linux selene 2.6.18-53.1.4.el5.centos.plus #1 SMP Fri Dec 7 07:05:12 EST 2007 x86_64 x86_64 x86_64 GNU/Linux
[rordway at selene ~]$ cat /proc/interrupts
            CPU0       CPU1       CPU2       CPU3
   0:  415472769          0          0          0    IO-APIC-edge  timer
   1:          2          0          0          0    IO-APIC-edge  i8042
   4:        270          0         10         17    IO-APIC-edge  serial
   8:          1          0          0          0    IO-APIC-edge  rtc
   9:          0          0          0          0   IO-APIC-level  acpi
  12:          4          0          0          0    IO-APIC-edge  i8042
  14:         25         74          0          0    IO-APIC-edge  ide0
  58:        323        464          0          0   IO-APIC-level  ohci_hcd:usb1
  66:         26          0          0          0   IO-APIC-level  ehci_hcd:usb2
  74:       5940     981922       7254        524   IO-APIC-level  ioc0
  82:       2026   53828037          0     138601   IO-APIC-level  eth0
  90:       2237   61820570          0     689058   IO-APIC-level  eth1
NMI:          0          0          0          0
LOC:  415433007  415433000  415432928  415432856
ERR:          0
MIS:          0

Note the huge ERR count with XT-PIC (377601098 versus 0 on the older
BIOS), and that ioc0 and eth1 now share IRQ 7 (the SAS controller and
my private network interface).
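
For anyone who wants to check their own systems for the same two symptoms, here is a minimal Python sketch that scans /proc/interrupts-style text for IRQ lines shared by more than one device and pulls out the ERR counter. The function names (parse_interrupts, shared_irqs) and the SAMPLE excerpt are illustrative only; the excerpt is copied from the noapic output above. The sketch assumes each IRQ entry sits on a single line (not mail-wrapped as in some archives of this post).

```python
# Excerpt of the noapic /proc/interrupts output from this post (illustrative).
SAMPLE = """\
           CPU0       CPU1       CPU2       CPU3
  0:  149047419     568805     215848      76392          XT-PIC  timer
  7:   16203243  179851060  180334480  180522274          XT-PIC  ioc0, eth1
 15:   15353489     192264      58884      10141          XT-PIC  eth0
ERR:  377601098
MIS:          0
"""


def parse_interrupts(text):
    """Split /proc/interrupts text into numbered IRQ rows and named counters."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ncpu = len(lines[0].split())          # header row: CPU0 CPU1 ...
    irqs, counters = {}, {}
    for ln in lines[1:]:
        if ":" not in ln:                 # not a labelled row, ignore it
            continue
        label, _, rest = ln.partition(":")
        label, fields = label.strip(), rest.split()
        if label.isdigit():
            names = " ".join(fields[ncpu + 1:])
            irqs[int(label)] = {
                "counts": [int(x) for x in fields[:ncpu]],
                "controller": fields[ncpu] if len(fields) > ncpu else "",
                "devices": [d.strip() for d in names.split(",") if d.strip()],
            }
        else:                             # NMI, LOC, ERR, MIS
            counters[label] = [int(x) for x in fields if x.isdigit()]
    return irqs, counters


def shared_irqs(irqs):
    """Return {irq: [devices]} for IRQ lines with more than one device."""
    return {n: r["devices"] for n, r in irqs.items() if len(r["devices"]) > 1}


irqs, counters = parse_interrupts(SAMPLE)
print("shared IRQs:", shared_irqs(irqs))
print("ERR count:", counters["ERR"][0])
```

On a live system the same functions could be fed open("/proc/interrupts").read() instead of SAMPLE.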

Does anyone know whether this has been fixed in the mainline kernel,
and if so, whether the fix can be integrated into the CentOS 5.1
kernel (namely the CentOS Plus kernel)?

Thanks!

Ryan

--
Ryan Ordway                           E-mail: rordway at oregonstate.edu
Unix Systems Administrator               rordway at library.oregonstate.edu
OSU Libraries, Corvallis, OR 97331    Office: Valley Library #4657