Hi John,
I'll have a look at that. This seems odd because, if I understand correctly, those settings would only matter when the system is idle, and the lockups occur during regular/busy hours.
BUT... they should be off anyway.
On Mon, Dec 27, 2010 at 5:34 PM, John Plemons john@mavin.com wrote:
Try turning off the green features on the board completely. Never allow the board to go to sleep, and don't even let it put the monitor into power-saving mode.
John
On 12/27/2010 4:19 PM, John R Pierce wrote:
On 12/27/10 11:04 AM, robert mena wrote:
Hi,
I've installed CentOS 5.5 (plus updates) on a machine with an Intel DP43BF motherboard. To make Linux detect the PCI devices, I added pci=assign-busses to my GRUB conf.
Everything runs fine, but within less than 2 days of uptime the machine simply freezes (black console, no connectivity). This has happened more than once, so I consider it a real problem. Memtest passed without errors, and the machine uses a CompactFlash card (SanDisk Extreme III 4GB) as its disk.
The only error messages I could find are in /var/log/messages, but they appear hours before the actual lockup.
kernel: 0000:00:1a.7 EHCI: BIOS handoff failed (BIOS bug ?) 01010001
kernel: 0000:00:1d.7 EHCI: BIOS handoff failed (BIOS bug ?) 01010001
kernel: eth4: PCI Bus error a290.
kernel: eth4: PCI Bus error 0290.
kernel: eth3: PCI Bus error 2290.
kernel: eth3: PCI Bus error 0290.
Any tips?
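To see whether the bus errors cluster on particular interfaces, the log lines above could be tallied with a small script. This is only a sketch: the function name count_pci_bus_errors and the regular expression are illustrative assumptions based on the six lines quoted here, not a general parser for kernel messages.

```python
import re
from collections import Counter

# Pattern assumed from the lines quoted in this thread, e.g.
# "kernel: eth4: PCI Bus error a290."
PCI_ERR = re.compile(r"kernel: (eth\d+): PCI Bus error ([0-9a-fA-F]+)\.")


def count_pci_bus_errors(lines):
    """Return a Counter mapping interface name to number of PCI bus errors."""
    counts = Counter()
    for line in lines:
        match = PCI_ERR.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# The messages quoted in this thread, used as sample input.
SAMPLE = [
    "kernel: 0000:00:1a.7 EHCI: BIOS handoff failed (BIOS bug ?) 01010001",
    "kernel: 0000:00:1d.7 EHCI: BIOS handoff failed (BIOS bug ?) 01010001",
    "kernel: eth4: PCI Bus error a290.",
    "kernel: eth4: PCI Bus error 0290.",
    "kernel: eth3: PCI Bus error 2290.",
    "kernel: eth3: PCI Bus error 0290.",
]

if __name__ == "__main__":
    for iface, n in sorted(count_pci_bus_errors(SAMPLE).items()):
        print(iface, n)
```

On the real machine, the same function could be fed the log file directly, for example by opening /var/log/messages and passing the file object as lines.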
That's a desktop board, right? So it probably doesn't have ECC or any of the other system integrity features of a server board, and desktop boards usually don't have the I/O bus bandwidth to handle substantial I/O workloads.
PCI bus errors are not a good thing at all, either. Do you have 5 Ethernet adapters in use? What sort of Ethernet controller? I believe those PCI bus errors are being reported by your Ethernet adapters, and could be the result of excess bus contention. A single GigE port running full duplex can more than saturate a 32-bit, 33 MHz (parallel) PCI bus. All the PCI slots on a desktop board like yours are on the same bus and contend for the same bandwidth.
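The bandwidth argument can be checked with some quick arithmetic. The sketch below compares theoretical peak figures only: it assumes the nominal 33.33 MHz PCI clock and gigabit line rate, and ignores protocol overhead and arbitration, which make real sustained PCI throughput lower still. The function names are made up for illustration.

```python
# Theoretical peak figures only; real sustained throughput is lower.
PCI_WIDTH_BITS = 32
PCI_CLOCK_HZ = 33_333_333           # nominal "33 MHz" PCI clock (assumption)
GIGE_BITS_PER_SEC = 1_000_000_000   # gigabit Ethernet line rate, one direction


def pci_peak_bytes_per_sec(width_bits=PCI_WIDTH_BITS, clock_hz=PCI_CLOCK_HZ):
    """Peak transfer rate of a parallel PCI bus: bytes per clock x clock rate."""
    return (width_bits // 8) * clock_hz


def gige_bytes_per_sec(full_duplex=True):
    """Line rate of one GigE port in bytes/s, counting both directions if full duplex."""
    one_way = GIGE_BITS_PER_SEC // 8
    return one_way * 2 if full_duplex else one_way


if __name__ == "__main__":
    pci = pci_peak_bytes_per_sec()
    nic = gige_bytes_per_sec()
    print(f"PCI 32-bit/33 MHz peak: {pci / 1e6:.1f} MB/s (shared by all devices)")
    print(f"One GigE port, full duplex: {nic / 1e6:.1f} MB/s")
    print(f"One port alone oversubscribes the bus: {nic > pci}")
```

With several adapters on the same shared bus, the combined demand only grows, which is consistent with contention being a plausible source of the reported bus errors.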
Also, as mentioned, thermal problems are a definite possibility. Intel CPUs tend to self-throttle if they get too hot, but the chipset might not be as good at it, so watch the chipset and memory temperatures as well as the CPU. Another possible cause would be silent memory corruption, although that would be more likely to cause a kernel fault ("Fatal kernel error - system halted"). If your display is in a GUI mode, however, you won't see that message unless the console is directed to a serial port that is being monitored.
CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos