[CentOS] Server hangs on CentOS 5.5

Wed Mar 9 18:14:57 UTC 2011
m.roth at 5-cent.us <m.roth at 5-cent.us>

Michael Eager wrote:
> m.roth at 5-cent.us wrote:
>> Michael Eager wrote:
>>> John Hodrien wrote:
>>>> On Wed, 9 Mar 2011, Michael Eager wrote:
>> <snip>
>> Here's one more, off-the-wall thought: do the setterm --powersave off,
>> and find some way to make it work, so that you can see what's on the
>> screen when it dies.
>
> Yes, I did this.  Switched to console screen.  The correct command
> is "setterm -powersave off -blank off", otherwise the screen gets
> blanked.  Turned the monitor off.  I hope it shows something
> useful on the next fault.

Best of luck. And thanks, I may try that.
>
>> What may be very important here is I recently had a problem
>> with a honkin' big server crashing... and it turned out that a user was
>> running a parallel processing job that kicked off three? four? dozen
>> threads, and towards the end of the job, every single thread wanted
>> 10G... on a system with 256G RAM (which size still boggles my mind). The
>> OOM-Killer didn't even have a chance to do its thing.... Yes, he's
>> limited what his job requests, and the system hasn't crashed since.
>
> Strange.  OOM-Killer should get priority.  That's what it's for.
> Although it usually seems to kill the innocent bystanders before
> it gets around to killing the offenders.

Yeah, but apparently too many of them hit too quickly - that's all I can
think.
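
For what it's worth, here is the back-of-the-envelope arithmetic for that job as a rough Python sketch. The thread counts of 36 and 48 are only my "three or four dozen" guess, the function and variable names are made up for illustration, and swap is ignored entirely:

```python
# Rough estimate only: the thread counts and names are illustrative assumptions.
RAM_GB = 256          # installed RAM on the server, as stated in the post
PER_THREAD_GB = 10    # what each thread wanted near the end of the job


def demand_gb(threads, per_thread_gb=PER_THREAD_GB):
    """Total memory requested if every thread allocates at the same time."""
    return threads * per_thread_gb


for threads in (36, 48):  # "three? four? dozen" threads
    total = demand_gb(threads)
    print(f"{threads} threads want {total}G, {total - RAM_GB}G over {RAM_GB}G RAM")
```

Even at the low end of that guess, the job asks for over 100G more than the box physically has, and it asks for it all at roughly the same moment.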

          mark