========================================================================
Memory for crash kernel (0x0 to 0x0) notwithin permissible range
..MP-BIOS bug: 8254 timer not connected to IO-APIC
Red Hat nash version 5.1.19.6 starting
Welcome to CentOS release 5 (Final)
....
.....
and continues normal booting.
2008/2/6 Ross S. W. Walker rwalker@medallion.com:
I don't think that is the "harmless" error message mentioned in the release notes as that had to do with the "crash kernel".
I saw this same error on a Dell AMD system. It seems the motherboard in that system didn't do ACPI IRQ routing as the kernel expected, and the system experienced a lot of random problems until "acpi=noirq" was passed as a kernel option to disable ACPI IRQ routing, falling back to APIC IRQ routing. If that still gives you problems, then you may need to use "irq=poll", which forces the kernel to poll for IRQ changes.
First, I am sorry for my late reply. I was very busy.
Well, "acpi=noirq" didn't work, but after using the "irq=poll" option, the message "MP-BIOS bug: 8254 timer not connected to IO-APIC" stopped appearing. However, the message "Memory for crash kernel (0x0 to 0x0) notwithin permissible range" is still appearing. I have started my computation program after booting the OS with the "irq=poll" option. I will report later whether it really worked and the system no longer freezes after running the program for a long time.
This is the kernel line in grub.conf:
kernel /boot/vmlinuz-2.6.18-53.el5PAE ro root=LABEL=/12 irq=poll early-login quiet
Also, the daemon "acpid" is not running.
============================================================================
2008/2/6 Tru Huynh tru@centos.org:
Looks like some hardware crash to me, otherwise you would have some logs for oops/hangs.
Can you make your /var/log/messages available somewhere (don't send a few MB file to the list), along with the /proc/cmdline content?
You said you used "acpi=off" and had acpid disabled; is that still the case?
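For anyone wanting to confirm which options the running kernel actually received, here is a minimal Python sketch that splits a kernel command line into options. The helper names `parse_cmdline` and `read_cmdline` are made up for illustration, and the sample string is simply the kernel line from the grub.conf quoted in this thread with the image path removed; the real /proc/cmdline on that machine may differ.

```python
def parse_cmdline(text):
    """Split a kernel command line into {option: value}.

    Bare flags such as "ro" or "quiet" map to None. Only the first "="
    in a token separates the key from the value, so "root=LABEL=/12"
    becomes {"root": "LABEL=/12"}.
    """
    options = {}
    for token in text.split():
        key, sep, value = token.partition("=")
        options[key] = value if sep else None
    return options


def read_cmdline(path="/proc/cmdline"):
    """Return the command line of the running kernel (Linux only)."""
    with open(path) as f:
        return f.read().strip()


if __name__ == "__main__":
    # Sample taken from the grub.conf line quoted in this thread.
    sample = "ro root=LABEL=/12 irq=poll early-login quiet"
    opts = parse_cmdline(sample)
    for name in ("acpi", "noapic", "irq"):
        if name in opts:
            print(name, "is set to", opts[name])
        else:
            print(name, "is not present")
```

On a live system, `parse_cmdline(read_cmdline())` would show whether options such as "acpi=off" or "noapic" are still being passed at boot.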
~> chkconfig --list cpuspeed
cpuspeed 0:off 1:on 2:off 3:off 4:off 5:off 6:off
As far as the kernel log messages and the content of "/proc/cmdline" are concerned, I will certainly make these available if the aforementioned "irq=poll" option also fails. And yes, until last time, the "acpi=off, noapic" options were passed to the kernel and acpid was kept stopped. The output of "chkconfig --list cpuspeed" is "cpuspeed 0:off 1:on 2:on 3:on 4:on 5:on 6:off". However, the "service cpuspeed start" and "service cpuspeed stop" commands don't show any message. Also, the GUI to control the services (system-config-services) shows that cpuspeed is stopped. So, I guess cpuspeed has no effect. But anyway, I will report the details a little later, after I finish checking the "irq=poll" option.
=================================================================
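Note that "chkconfig --list" reports the runlevels in which a service is configured to start at boot, not whether it is running right now, so an "on" entry and a stopped service are not contradictory. As a rough illustration, the Python sketch below extracts the enabled runlevels from one line of that output. The function name `enabled_runlevels` is hypothetical, and the input is the cpuspeed line quoted above.

```python
def enabled_runlevels(chkconfig_line):
    """Parse one line of `chkconfig --list <service>` output.

    Returns (service_name, [runlevels marked "on"]).
    """
    fields = chkconfig_line.split()
    service, states = fields[0], fields[1:]
    levels = []
    for state in states:
        level, _, status = state.partition(":")
        if status == "on":
            levels.append(int(level))
    return service, levels


if __name__ == "__main__":
    # Line quoted from the message above.
    line = "cpuspeed 0:off 1:on 2:on 3:on 4:on 5:on 6:off"
    name, levels = enabled_runlevels(line)
    print(name, "is enabled at boot in runlevels", levels)
```

For the quoted line this yields runlevels 1 through 5, meaning cpuspeed is set to start at boot even if the init script is silent or the service is currently stopped.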
In the meantime, I also verified that it is NOT a hardware problem. I installed FC5 in one of the other partitions and ran a SERIAL version of the same program (i.e. no OpenMP, gcc without the -fopenmp flag) and it didn't freeze at all. Well, I had to pass the "noapic" option during this installation and it didn't recognize my network card ;). When I ran the PARALLEL version of the program (gcc with the -fopenmp option), it ran for a few hours and stopped with an error message something like "libopenmp: not sufficient memory...allocating 60 bytes". However, the system didn't hang or reboot. So, I believe it has something to do with OpenMP, not the hardware.
Anyway, thank you for all your replies. I will keep posting the updates here.
- Chandra