On Wed, 29 Jun 2011, Keith Keller wrote:
In addition to the suggestions already made, one possibility is to attach a serial console or IP KVM. Logging in may still be awful, but at least you won't have to go through sshd. I've been able to log in through a serial getty when sshd was not responding or was taking too long (this works maybe 50-75% of the time; the rest of the time it's too late, and even getty is unresponsive). You have the added advantage of being able to log in directly as root even if you have PermitRootLogin no in your sshd_config.
Even with OOB console access, there's still the problem of /bin/login timing out on highly loaded servers. The login.c source in the util-linux package hardwires the login timeout to 60 seconds. If your server can't process the login request in under a minute (not unusual if the load average is high and/or the machine is using swap), you can't log in via *any* console.
So if killing the machine doesn't appeal to you, you still need OOB console access plus
* a patched version of /bin/login with a longer timeout, or
* a process-watcher that aggressively kills known troublemakers, or
* a remotely accessible console that never logs out.
I actually relied for a while on the last choice. I had a remotely accessible root shell that never logged out. When things got sluggish, I was able to /bin/kill to my heart's content. It wasn't a pretty solution, but it kept me running until I was able to solve the problem properly.
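For what it's worth, the process-watcher option could be sketched roughly as follows. This is only a minimal illustration in Python, not something taken from a real deployment: the process names, the thresholds, and the function names (read_procs, pick_victims, watch) are all hypothetical, and it assumes a Linux /proc filesystem and a watcher running as root.

```python
"""Sketch of a process-watcher that kills known troublemakers under high load."""
import os
import signal
import time

# Hypothetical values; tune these for your own server.
TROUBLEMAKERS = {"convert", "php-cgi"}   # process names known to run away
LOAD_LIMIT = 8.0                         # 1-minute load average that triggers action
RSS_LIMIT_KB = 512 * 1024                # only kill offenders above this resident size
POLL_SECONDS = 10


def read_procs(proc_root="/proc"):
    """Yield (pid, name, rss_kb) for each process visible under /proc."""
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        name, rss_kb = None, 0
        try:
            with open(os.path.join(proc_root, entry, "status")) as fh:
                for line in fh:
                    if line.startswith("Name:"):
                        name = line.split(None, 1)[1].strip()
                    elif line.startswith("VmRSS:"):
                        rss_kb = int(line.split()[1])
        except (OSError, ValueError, IndexError):
            continue  # the process exited while we were reading it
        if name:
            yield int(entry), name, rss_kb


def pick_victims(procs, load, names=TROUBLEMAKERS,
                 load_limit=LOAD_LIMIT, rss_limit_kb=RSS_LIMIT_KB):
    """Return pids of listed processes over the memory limit, but only under high load."""
    if load < load_limit:
        return []
    return [pid for pid, name, rss_kb in procs
            if name in names and rss_kb > rss_limit_kb]


def watch():
    """Poll forever, sending SIGKILL to every victim found on each pass."""
    while True:
        load = os.getloadavg()[0]
        for pid in pick_victims(read_procs(), load):
            try:
                os.kill(pid, signal.SIGKILL)
            except OSError:
                pass  # already gone, or not ours to kill
        time.sleep(POLL_SECONDS)
```

Calling watch() from something started at boot would run the loop. Keeping the decision in pick_victims, separate from the /proc scan, makes it easy to check what would be killed before letting it loose. Note that the watcher itself has to get CPU time on an overloaded box, so in practice it would probably need to run at a raised priority.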