Re: [CentOS] out of memory

31 Jul 2008


On Wed, 2008-07-30 at 22:19 -0400, Filipe Brandenburger wrote:
...
On Wed, Jul 30, 2008 at 20:31, Craig White <craigwhite@azapple.com> wrote:
...
how does one determine who the culprit was?
Very hard... the kernel tries to "guess" which process is causing the
issue, but from what I've seen (and I see OOMs every week) it guesses
wrong most of the time. For me, the victim usually ends up being "nscd",
even when I'm sure it's neither using a lot of memory nor leaking.
When I start having OOMs, I usually have them on several machines
running the same programs (it's a grid), so it's fairly easy to find
the culprit by looking at the jobs that were running on all affected
machines.
In any case, my policy is to always reboot a machine after an OOM,
since it may be in an incoherent state.
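
The kernel log does at least record which process was killed and when, which narrows the time window to check against running jobs. A minimal sketch, assuming kernel messages go to /var/log/messages (the CentOS syslog default); the function name oom_events is made up for illustration, and the search patterns are a guess at common wording, which differs between kernel versions:

```shell
# Print log lines that look like OOM-killer activity.
# Assumption: the patterns below are approximate; message wording
# varies between kernel versions, so adjust them to what your logs show.
oom_events() {
    log=${1:-/var/log/messages}   # default path is an assumption
    grep -i -E 'out of memory|oom-killer|killed process' "$log"
}
```

Running `oom_events` with no argument searches the default log; pass a rotated file such as `/var/log/messages.1` to look further back. Note that this shows the victim the kernel picked, not necessarily the culprit.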
----
Well, I stopped using nscd a few years ago. It is definitely off after
the reboot, and chkconfig says it shouldn't start by itself, so I put
it in the realm of possible but unlikely.
I updated to 5.2 on Sunday and updated nss-ldap yesterday and today,
and then boink. I have no way to know what actually caused this, though,
as the logs don't reveal enough as far as I can tell. The system has
been up for quite some time.
I suppose I could run some type of cron script that does something
like...
top -n 1 -b >> /tmp/top.log
so if it happens again, I get a memory snapshot history...is there a
better idea?
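
A slightly more targeted version of that cron idea, as a rough sketch rather than a tested recipe: it records a timestamp, the overall memory picture, and only the largest processes by resident memory, so the log stays small enough to read after the fact. The function name mem_snapshot, the default log path, and the 15-line cutoff are arbitrary choices for illustration, and the `--sort` option assumes the procps version of ps:

```shell
# Append one timestamped memory snapshot to a log file.
# Assumptions: procps "ps" (for --sort) and "free" are installed;
# the default log path and the 15-line cutoff are illustrative only.
mem_snapshot() {
    log=${1:-/tmp/mem-snapshot.log}
    {
        echo "=== $(date '+%Y-%m-%d %H:%M:%S') ==="
        free -m
        # Header line plus the biggest resident-memory consumers first.
        ps -eo pid,ppid,rss,vsz,comm --sort=-rss | head -n 15
        echo
    } >> "$log" 2>&1
}
```

Saved in a script that ends with a call to `mem_snapshot` and run from cron every few minutes, this leaves a history to compare against the time of the next OOM. The log grows without bound, so it would need rotating or occasional cleanup.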
Craig
