Try turning off the green (power-saving) features completely on the board. Never allow the board to go to sleep; don't even let the board put the monitor into power-saving mode.

John

On 12/27/2010 4:19 PM, John R Pierce wrote:
> On 12/27/10 11:04 AM, robert mena wrote:
>> Hi,
>>
>> I've installed CentOS 5.5 (plus updates) on a machine with an Intel
>> DP43BF motherboard. In order to make Linux detect the PCIs I've added
>> pci=assign-busses to my GRUB conf.
>>
>> Everything runs fine, but within less than 2 days of uptime the machine
>> simply freezes (black console, no connectivity). This has happened
>> more than once, so I'm considering it a problem. Memtest
>> passed without a problem, and the machine uses a compact flash card
>> (SanDisk Extreme III 4GB) as a disk.
>>
>> I could only find these error messages in my /var/log/messages, but they
>> appear hours before the actual lockup.
>>
>> kernel: 0000:00:1a.7 EHCI: BIOS handoff failed (BIOS bug ?) 01010001
>>
>> kernel: 0000:00:1d.7 EHCI: BIOS handoff failed (BIOS bug ?) 01010001
>>
>>
>> kernel: eth4: PCI Bus error a290.
>>
>> kernel: eth4: PCI Bus error 0290.
>>
>> kernel: eth3: PCI Bus error 2290.
>>
>> kernel: eth3: PCI Bus error 0290.
>>
>>
>> Any tips?
>>
>
> That's a desktop board, right? So it probably doesn't have ECC or any of
> the other system integrity features of a server board, nor do desktop
> boards usually have the IO bus bandwidth to handle substantial IO workloads.
>
> PCI bus errors are not a good thing at all, either. You have 5 Ethernet
> adapters in use? What sort of Ethernet controller? I believe those
> PCI Bus errors are being reported by your Ethernet adapters, and could
> be the result of excess bus contention. A single GigE adapter can more than
> saturate a 32-bit 33 MHz PCI (parallel) bus. All the PCI slots on a
> desktop board like yours are on the same bus and contend for the same
> bandwidth.
>
> Also, as mentioned, thermal problems are a definite possibility. Although
> Intel CPUs tend to self-throttle if they get too hot, the chipset might
> not be that good at it (e.g., watch the chipset and memory temperature as
> well as the CPU). Another possible cause would be silent memory
> corruption, although that would be more likely to cause a kernel fault
> ("Fatal kernel error - system halted"). However, if your display is in a
> GUI mode, you won't see this unless the console is directed to a serial
> port which is being monitored.
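
For what it's worth, the bus-contention point in the quoted reply can be checked with rough arithmetic. This is only a sketch: it assumes the nominal figures for classic 32-bit, 33 MHz parallel PCI and the raw line rate of gigabit Ethernet, the function names are made up for illustration, and real-world throughput on both sides is lower because of protocol and arbitration overhead. It also assumes the adapters actually sit on the parallel PCI bus, which the thread does not confirm.

```python
# Rough peak-bandwidth comparison: 32-bit/33 MHz PCI vs. gigabit Ethernet.
# All numbers are theoretical maxima (assumption: no overhead of any kind).

PCI_WIDTH_BITS = 32
PCI_CLOCK_HZ = 33_000_000          # nominal clock used in the quoted reply
GIGE_BITS_PER_SEC = 1_000_000_000  # raw line rate, one direction


def pci_peak_mb_per_s(width_bits=PCI_WIDTH_BITS, clock_hz=PCI_CLOCK_HZ):
    """Peak bandwidth of the shared bus in MB/s (one transfer per clock)."""
    return width_bits / 8 * clock_hz / 1_000_000


def gige_mb_per_s(full_duplex=True):
    """Raw gigabit Ethernet rate in MB/s, optionally counting both directions."""
    one_way = GIGE_BITS_PER_SEC / 8 / 1_000_000
    return one_way * 2 if full_duplex else one_way


if __name__ == "__main__":
    print(f"PCI 32-bit/33 MHz peak : {pci_peak_mb_per_s():.0f} MB/s (shared by all slots)")
    print(f"GigE, one direction    : {gige_mb_per_s(full_duplex=False):.0f} MB/s")
    print(f"GigE, full duplex      : {gige_mb_per_s():.0f} MB/s")
```

Under these assumptions one gigabit port running full duplex could ask for roughly twice what the whole shared bus can carry, before any other card on the same bus is counted.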