This is an A8N32-SLI Deluxe motherboard with an AMD Athlon(tm) 64 X2 Dual Core 4400+ processor. It freezes and cannot be reached over the network. Keyboard, mouse, and display are all stuck. It had an uptime of 17 days after I reconnected all cables and boards following a similar crash.
My feeling is that this must be a hardware-related problem. I inspected the logs under /var/log, but found nothing.
What would be the best approach to debug this problem? What is the most likely cause? Or should I just accept it (it's a single-user desktop machine)? The machine has FC4 installed. Is there a chance that it would become more stable with CentOS? (As said, I don't actually suspect the OS.) For new machines I have started installing CentOS, and I'm happy with it, especially because my EDA software vendors support it well. I monitor the machine with Zabbix, but it shows nothing unusual. Just before the freeze, the machine had a processor load of 1 while running a user simulation. Any suggestions would be welcome.
Thanks, Theo
Theo Band wrote:
What would be the best approach to debug this problem? What is the most likely cause? Or should I just accept it (it's a single-user desktop machine)? The machine has FC4 installed. Is there a chance that it would become more stable with CentOS? (As said, I don't actually suspect the OS.) For new machines I have started installing CentOS, and I'm happy with it, especially because my EDA software vendors support it well. I monitor the machine with Zabbix, but it shows nothing unusual. Just before the freeze, the machine had a processor load of 1 while running a user simulation.
Turn on nmi_watchdog: http://www.mjmwired.net/kernel/Documentation/nmi_watchdog.txt#34
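To confirm after a reboot that the watchdog is really switched on, a small check like the one below may help. It is only a sketch: it assumes you added nmi_watchdog=1 (I/O APIC mode) or nmi_watchdog=2 (local APIC mode) to the kernel line of your boot loader configuration, as described in nmi_watchdog.txt, and the function name nmi_watchdog_mode is made up for illustration.

```shell
# Print the nmi_watchdog value found in a kernel command line string,
# or "off" when the parameter is not present.
# (nmi_watchdog_mode is a hypothetical helper name, not a system tool.)
nmi_watchdog_mode() {
    for arg in $1; do
        case "$arg" in
            nmi_watchdog=*)
                echo "${arg#nmi_watchdog=}"
                return 0
                ;;
        esac
    done
    echo "off"
}

# Report the setting of the running kernel (Linux only).
if [ -r /proc/cmdline ]; then
    echo "nmi_watchdog: $(nmi_watchdog_mode "$(cat /proc/cmdline)")"
    # With the watchdog active, the NMI counters should keep increasing
    # when this line is checked again a little later.
    grep NMI /proc/interrupts || true
fi
```

If the NMI line in /proc/interrupts stays at zero, the chosen mode probably does not work on this board and the other value is worth a try.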
Run memtest86+ for 24 to 48 hours.
Stress your disks:
  for i in $(seq -w 20); do cp -ax / /tmp/$i & done
(Be sure to stop the 20 cp processes before your disk(s) fill up)
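If you would rather not watch the disk yourself, the same idea can be wrapped so that the copies are killed automatically when free space runs low. This is a rough sketch, not a tested tool: the helper names free_kb and stress_disks, the 5-second polling interval, and the threshold in the usage line are all my own assumptions.

```shell
# Free space, in KB, on the filesystem that holds directory $1.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Copy SRC into COUNT numbered directories under DEST in parallel.
# A background watcher kills the copies once free space on DEST
# drops below MIN_KB.
stress_disks() {
    src=$1 dest=$2 count=$3 min_kb=$4
    pids=""
    for i in $(seq -w "$count"); do
        cp -ax "$src" "$dest/$i" &
        pids="$pids $!"
    done

    # Watcher: poll free space every 5 seconds (assumed interval).
    (
        while :; do
            if [ "$(free_kb "$dest")" -lt "$min_kb" ]; then
                kill $pids 2>/dev/null
                break
            fi
            sleep 5
        done
    ) >/dev/null 2>&1 &
    watcher=$!

    # Wait for all copies to finish (or to be killed), then stop the watcher.
    wait $pids || true
    kill "$watcher" 2>/dev/null || true
    wait "$watcher" 2>/dev/null || true
}

# Example (not run here): 20 copies of / into /tmp, stop below ~1 GB free.
# stress_disks / /tmp 20 1048576
```

Remember to remove the numbered directories under /tmp afterwards.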
Mike