On Dec 18, 2006, at 16:17, Mark Belanger wrote:
> I have many different CentOS machines that are hanging regularly. I believe this is due to something our application is doing, not a CentOS-specific problem.
I have the same problem. I even posted something to this list titled "Strange system hangs" on 11/27 but didn't get any responses.
> When the machines hang, there is no access to the console or remote access (ssh, rsh, etc.).
I have that symptom as well. There is no way to do any debugging once a machine gets into that state, so I added the following two lines to /etc/syslog.conf:

kern.*                                        @<central server>
*.info;mail.none;authpriv.none;cron.none      @<central server>
Should I add any other levels to the selector field?

BTW, my systems are running a completely stock CentOS distribution EXCEPT for the binary nVidia driver, which was the only way I could get these systems to drive the 20" LCD displays at their native 1600x1200 resolution with the correct refresh rate.
I had another report of a hang this morning, but in this case even though the machine appears frozen (the screen saver is stuck and I can't get to the alternate consoles), I can in fact log into the machine remotely and top shows me that the X server is using 100% of the CPU:
top - 08:44:22 up 10 days, 23:00, 10 users,  load average: 1.04, 1.01, 1.00
Tasks: 115 total,   2 running, 113 sleeping,   0 stopped,   0 zombie
Cpu(s): 99.7% us,  0.3% sy,  0.0% ni,  0.0% id,  0.0% wa,  0.0% hi,  0.0% si
Mem:   3113468k total,  1361240k used,  1752228k free,    87312k buffers
Swap:  3047416k total,        0k used,  3047416k free,   957756k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 4381 root      25   0 67748  42m 7776 R 99.8  1.4 782:53.37 X
I also see the following in /var/log/messages:
Dec 18 19:56:02 hepdsw04 kernel: NVRM: Xid (0001:00): 8, Channel 00000001
Dec 18 19:56:03 hepdsw04 kernel: NVRM: Xid (0001:00): 9, Channel 00000020 Instance 00000000 Intr 00100000
Dec 18 19:56:09 hepdsw04 Synergy 1.3.1: NOTE: CServerProxy.cpp, 315: server is dead
Dec 18 19:56:10 hepdsw04 kernel: NVRM: Xid (0001:00): 8, Channel 00000020
Dec 18 19:56:11 hepdsw04 kernel: NVRM: Xid (0001:00): 9, Channel 00000020 Instance 00000000 Intr 00100000
Dec 18 19:56:18 hepdsw04 kernel: NVRM: Xid (0001:00): 8, Channel 00000020
Dec 18 19:56:19 hepdsw04 kernel: NVRM: Xid (0001:00): 9, Channel 00000020 Instance 00000000 Intr 00100000
Dec 18 19:56:26 hepdsw04 kernel: NVRM: Xid (0001:00): 8, Channel 00000020
Dec 18 19:56:27 hepdsw04 kernel: NVRM: Xid (0001:00): 9, Channel 00000020 Instance 00000000 Intr 00100000
Dec 18 19:56:34 hepdsw04 kernel: NVRM: Xid (0001:00): 8, Channel 00000001
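In case it helps anyone compare logs across machines, here is a rough Python sketch for tallying these entries by Xid number. It only assumes the line layout shown in the excerpt above (timestamp, host, "kernel: NVRM: Xid (...): <number>,"). The function name summarize_xid and the regular expression are my own inventions, not part of any nVidia tool, and the script makes no attempt to interpret what the numbers mean.

```python
import re
from collections import Counter

# Matches kernel lines shaped like the ones in the excerpt, e.g.
#   Dec 18 19:56:02 hepdsw04 kernel: NVRM: Xid (0001:00): 8, Channel 00000001
# The pattern is an assumption based on that excerpt only.
XID_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s[\d:]{8})\s+(?P<host>\S+)\s+kernel:\s+"
    r"NVRM: Xid \((?P<bus>[^)]+)\): (?P<code>\d+),"
)


def summarize_xid(lines):
    """Return (Counter of Xid numbers, first timestamp, last timestamp)."""
    counts = Counter()
    first = last = None
    for line in lines:
        m = XID_RE.match(line)
        if m is None:
            continue  # not an NVRM Xid line (for example the Synergy note)
        counts[int(m.group("code"))] += 1
        if first is None:
            first = m.group("stamp")
        last = m.group("stamp")
    return counts, first, last


# A few lines copied from the excerpt, used only as a demonstration.
SAMPLE = [
    "Dec 18 19:56:02 hepdsw04 kernel: NVRM: Xid (0001:00): 8, Channel 00000001",
    "Dec 18 19:56:03 hepdsw04 kernel: NVRM: Xid (0001:00): 9, Channel 00000020 Instance 00000000 Intr 00100000",
    "Dec 18 19:56:09 hepdsw04 Synergy 1.3.1: NOTE: CServerProxy.cpp, 315: server is dead",
    "Dec 18 19:56:10 hepdsw04 kernel: NVRM: Xid (0001:00): 8, Channel 00000020",
]

if __name__ == "__main__":
    counts, first, last = summarize_xid(SAMPLE)
    for code, n in sorted(counts.items()):
        print(f"Xid {code}: {n} entries")
    print(f"first seen {first}, last seen {last}")
```

To run it against a real log, pass an open file instead of SAMPLE, for example summarize_xid(open("/var/log/messages")); reading that file normally requires root.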
What is the meaning of the NVRM entries? The Synergy entry is from the keyboard/mouse sharing Synergy utility (great program BTW, I couldn't live without it).
Anyway, sorry to inject my own problems into this thread, but maybe these hangs are all related.
Alfred