[CentOS] out of memory

Thu Jul 31 17:16:07 UTC 2008
Craig White <craigwhite at azapple.com>

On Thu, 2008-07-31 at 06:47 -0400, William L. Maltby wrote:
> On Wed, 2008-07-30 at 19:55 -0700, Craig White wrote:
> > <snip>
> 
> > I suppose I could run some type of cron script that does something
> > like...
> > 
> > top -n 1 -b >> /tmp/top.log
> > 
> > so if it happens again, I get a memory snapshot history...is there a
> > better idea?
> 
> If you have the sar packages installed the available reports will nail
> it for you.
----
hmmm....seems pretty clear that I've got something leaking memory

from this morning (sar -r)
06:30:01 AM kbmemfree kbmemused  %memused kbbuffers  kbcached kbswpfree kbswpused  %swpused  kbswpcad
06:40:01 AM     17456   1017668     98.31     23520    222468   1600880    430728     21.20    133388

1600880 kbswpfree

from yesterday, the 30 minutes leading up to the moment of death...
05:00:01 PM     22672   1012468     97.81     35868    131760      1052   2030556     99.95     29452
05:10:01 PM     16228   1018912     98.43     31596    167148       108   2031500     99.99     12288
05:20:02 PM     12136   1023004     98.83     55064     76868      6860   2024748     99.66     55768
05:30:01 PM     12472   1022668     98.80     18608     81296         0   2031608    100.00     48364

So you can see that kbswpfree went from 1052 => 108 => 6860 => 0
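
To pull just that column out of a longer report, something along these
lines might help. It is only a sketch: the function name swpfree_trend is
made up here, and it assumes the `sar -r` layout shown in this thread
(12-hour timestamps with AM/PM, kbswpfree as the sixth value after the
timestamp). Other sysstat versions or locales may order the columns
differently.

```shell
# Print the timestamp and kbswpfree for each data row of `sar -r` output
# read on stdin. Assumes the column layout shown in this thread:
#   time AM/PM kbmemfree kbmemused %memused kbbuffers kbcached kbswpfree ...
# so kbswpfree is awk field 8. Header and "Average:" rows are skipped
# because their second or third field does not match.
swpfree_trend() {
    awk '($2 == "AM" || $2 == "PM") && $3 ~ /^[0-9]+$/ && NF >= 8 {
        print $1, $2, $8
    }'
}
```

Usage would be `sar -r | swpfree_trend`, or `sar -r -f <file> | swpfree_trend`
for one of the saved daily files.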

and on July 25 (two days before I updated to 5.2), when there were users
in the office (same time period)...
05:00:01 PM     21092   1014048     97.96     47580    133536     82320   1949288     95.95     67468
05:10:01 PM     50332    984808     95.14     60632    107352     83560   1948048     95.89     49740
05:20:01 PM     26060   1009080     97.48     51484    123264     87192   1944416     95.71     56560
05:30:01 PM     55480    979660     94.64     24660    123368     87952   1943656     95.67     58716

but on July 27 - the day I updated - no users in office - same time
period, the kbswpfree started swinging wildly.

sar doesn't tell me which program is leaking memory, though; perhaps it
was just the update without a reboot that was the issue.
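
To get the per-process view that sar lacks, a periodic snapshot sorted by
memory use might do. This is a rough sketch, not something I have run
here: the function name log_top_mem, the default log path and the 15-row
cutoff are arbitrary choices, and the `--sort` option assumes the procps
version of ps that ships with Linux.

```shell
# Append a timestamped list of the largest processes by resident memory.
# The log path is a hypothetical default; pass another path as $1.
log_top_mem() {
    log=${1:-/tmp/mem-by-proc.log}
    {
        date '+%Y-%m-%d %H:%M:%S'
        # Header line plus the 15 biggest processes, largest RSS first.
        # VSZ is included because pages already pushed out to swap no
        # longer count toward RSS.
        ps -eo pid,user,rss,vsz,comm --sort=-rss | head -n 16
    } >> "$log"
}
```

Saved as a script and run from cron every ten minutes, the snapshots would
line up with the sar samples, so a process whose RSS or VSZ keeps climbing
between entries would be the first suspect.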

Craig