[CentOS] out of memory

Thu Jul 31 02:19:08 UTC 2008
Filipe Brandenburger <filbranden at gmail.com>

On Wed, Jul 30, 2008 at 20:31, Craig White <craigwhite at azapple.com> wrote:
> how does one determine who the culprit was?

Very hard... the kernel's OOM killer tries to "guess" which process is
causing the issue, but from what I've seen (and I see OOMs every week) it
guesses wrong most of the time. In my case, the victim usually ends up
being "nscd", even when I'm sure it's neither using a lot of memory nor
leaking.
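
One way to see how the kernel ranks its candidates before anything gets
killed is to read /proc/<pid>/oom_score, which Linux exposes for each
process (a higher score means a more likely victim). Below is a minimal
Python sketch, assuming a Linux /proc layout; the function names and the
proc_root parameter are only illustrative, and it tells you who the kernel
would pick, not who is actually at fault.

```python
import os


def _process_name(status_path):
    """Read the 'Name:' field from a /proc/<pid>/status file."""
    with open(status_path) as f:
        for line in f:
            if line.startswith("Name:"):
                return line.split(":", 1)[1].strip()
    return "?"


def read_oom_scores(proc_root="/proc"):
    """Return (score, pid, name) tuples, highest oom_score first.

    proc_root is a parameter only so the function can be pointed at a
    test directory; on a real machine the default is what you want.
    """
    if not os.path.isdir(proc_root):
        return []
    results = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        base = os.path.join(proc_root, entry)
        try:
            with open(os.path.join(base, "oom_score")) as f:
                score = int(f.read().strip())
            name = _process_name(os.path.join(base, "status"))
        except (OSError, ValueError):
            continue  # process exited meanwhile, or file unreadable
        results.append((score, int(entry), name))
    return sorted(results, reverse=True)


if __name__ == "__main__":
    # Show the ten processes the kernel currently rates as the best victims.
    for score, pid, name in read_oom_scores()[:10]:
        print(score, pid, name)
```

Running this periodically on a machine that tends to OOM gives a rough
picture of which processes the kernel considers expendable at that moment.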

When I start having OOMs, they usually hit several machines running the
same programs (it's a grid), so it's fairly easy to find the culprit by
looking at the jobs that were running on all of the affected machines.
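
That cross-check is just a set intersection. Here is a small Python sketch
of the idea; the host names and job names are made up for illustration, and
how you actually collect the job list per host depends on your scheduler.

```python
def common_jobs(jobs_by_host):
    """Return the set of jobs that were running on every affected host."""
    job_sets = [set(jobs) for jobs in jobs_by_host.values()]
    if not job_sets:
        return set()
    return set.intersection(*job_sets)


# Hypothetical data: jobs running on each machine at the time of its OOM.
jobs_by_host = {
    "node01": ["render_a", "simulate_x", "index_b"],
    "node02": ["simulate_x", "compress_c"],
    "node03": ["index_b", "simulate_x"],
}

suspects = common_jobs(jobs_by_host)
print(sorted(suspects))  # prints ['simulate_x']
```

If the intersection comes back empty, the OOMs probably don't share a single
cause, and each machine has to be looked at on its own.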

In any case, my policy is to always reboot a machine after an OOM,
since it may be left in an inconsistent state.

HTH,
Filipe