[CentOS] Centos4 SMP Kernel OOM

Tue May 31 22:17:42 UTC 2005
Maciej Żenczykowski <maze at cela.pl>

Hello,

I've just run out of memory on a dual Xeon with 5GB of RAM.
Considering there should have been around 4GB free (not counting
buffers and cache), this is unusual.

After it OOM'ed I ran top, and memory usage looked fine
(around 1GB of the 5GB in use, and none of the 12GB of swap in use).

So I thought it was a temporary thing, but processes kept getting
OOM-killed for no understandable reason, while swap was empty and
memory continued to show 3GB free
(and look at the weird log messages...)

A reboot helped though, but still... :)

[this is the normal CentOS4 i686 SMP kernel 2.6.9-5.0.5.ELsmp]

May 31 22:31:25 tcs kernel: oom-killer: gfp_mask=0xd0
May 31 22:31:25 tcs kernel: DMA per-cpu:
May 31 22:31:25 tcs kernel: cpu 0 hot: low 2, high 6, batch 1
May 31 22:31:25 tcs kernel: cpu 0 cold: low 0, high 2, batch 1
May 31 22:31:25 tcs kernel: cpu 1 hot: low 2, high 6, batch 1
May 31 22:31:25 tcs kernel: cpu 1 cold: low 0, high 2, batch 1
May 31 22:31:25 tcs kernel: cpu 2 hot: low 2, high 6, batch 1
May 31 22:31:25 tcs kernel: cpu 2 cold: low 0, high 2, batch 1
May 31 22:31:25 tcs kernel: cpu 3 hot: low 2, high 6, batch 1
May 31 22:31:25 tcs kernel: cpu 3 cold: low 0, high 2, batch 1
May 31 22:31:25 tcs kernel: Normal per-cpu:
May 31 22:31:25 tcs kernel: cpu 0 hot: low 32, high 96, batch 16
May 31 22:31:25 tcs kernel: cpu 0 cold: low 0, high 32, batch 16
May 31 22:31:26 tcs postfix: Process did not exit cleanly, returned 0 with signal 9
May 31 22:31:26 tcs kernel: cpu 1 hot: low 32, high 96, batch 16
May 31 22:31:26 tcs kernel: cpu 1 cold: low 0, high 32, batch 16
May 31 22:31:26 tcs kernel: cpu 2 hot: low 32, high 96, batch 16
May 31 22:31:26 tcs kernel: cpu 2 cold: low 0, high 32, batch 16
May 31 22:31:26 tcs kernel: cpu 3 hot: low 32, high 96, batch 16
May 31 22:31:26 tcs kernel: cpu 3 cold: low 0, high 32, batch 16
May 31 22:31:26 tcs kernel: HighMem per-cpu:
May 31 22:31:26 tcs kernel: cpu 0 hot: low 32, high 96, batch 16
May 31 22:31:26 tcs kernel: cpu 0 cold: low 0, high 32, batch 16
May 31 22:31:26 tcs kernel: cpu 1 hot: low 32, high 96, batch 16
May 31 22:31:26 tcs kernel: cpu 1 cold: low 0, high 32, batch 16
May 31 22:31:26 tcs kernel: cpu 2 hot: low 32, high 96, batch 16
May 31 22:31:26 tcs kernel: cpu 2 cold: low 0, high 32, batch 16
May 31 22:31:26 tcs kernel: cpu 3 hot: low 32, high 96, batch 16
May 31 22:31:26 tcs kernel: cpu 3 cold: low 0, high 32, batch 16
May 31 22:31:26 tcs kernel:
May 31 22:31:26 tcs kernel: Free pages:     3579344kB (3578432kB HighMem)
May 31 22:31:26 tcs kernel: Active:96900 inactive:20481 dirty:0 writeback:0 unstable:0 free:894836 slab:212200 mapped:86012 pagetables:1438
May 31 22:31:26 tcs kernel: DMA free:16kB min:16kB low:32kB high:48kB active:16kB inactive:0kB present:16384kB
May 31 22:31:26 tcs kernel: protections[]: 0 0 0
May 31 22:31:26 tcs kernel: Normal free:896kB min:936kB low:1872kB high:2808kB active:240kB inactive:416kB present:901120kB
May 31 22:31:26 tcs kernel: protections[]: 0 0 0
May 31 22:31:26 tcs kernel: HighMem free:3578432kB min:512kB low:1024kB high:1536kB active:387344kB inactive:81508kB present:4325372kB
May 31 22:31:26 tcs kernel: protections[]: 0 0 0
May 31 22:31:26 tcs kernel: DMA: 0*4kB 0*8kB 1*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 16kB
May 31 22:31:26 tcs kernel: Normal: 0*4kB 0*8kB 2*16kB 1*32kB 1*64kB 0*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 896kB
May 31 22:31:26 tcs kernel: HighMem: 1560*4kB 5190*8kB 4683*16kB 1632*32kB 2304*64kB 1512*128kB 781*256kB 349*512kB 157*1024kB 66*2048kB 583*4096kB = 3578432kB
May 31 22:31:26 tcs kernel: Swap cache: add 0, delete 0, find 0/0, race 0+0
May 31 22:31:26 tcs kernel: Out of Memory: Killed process 11250 (MailScanner).
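
As an aid to reading a dump like this one, here is a minimal Python
sketch (not part of the original report) that compares each zone's
"free" figure with its "min" watermark and converts the page count on
the Active:/free: line to kB. The function name, the regex and the 4 kB
page size are assumptions made for illustration; the sample lines are
the three zone lines from the log with the syslog prefix and the mail
wrapping removed.

```python
import re

# Zone summary lines copied from the log, unwrapped, syslog prefix dropped.
ZONE_LINES = [
    "DMA free:16kB min:16kB low:32kB high:48kB active:16kB inactive:0kB present:16384kB",
    "Normal free:896kB min:936kB low:1872kB high:2808kB active:240kB inactive:416kB present:901120kB",
    "HighMem free:3578432kB min:512kB low:1024kB high:1536kB active:387344kB inactive:81508kB present:4325372kB",
]

# Hypothetical helper pattern: zone name, then the free and min figures in kB.
ZONE_RE = re.compile(r"^(\w+) free:(\d+)kB min:(\d+)kB")

# Assumption: 4 kB pages, the usual page size on i686.
PAGE_KB = 4


def zones_at_or_below_min(lines):
    """Return (zone, free_kB, min_kB) for every zone whose free <= min."""
    hits = []
    for line in lines:
        m = ZONE_RE.match(line)
        if m is None:
            continue  # not a zone summary line
        name, free_kb, min_kb = m.group(1), int(m.group(2)), int(m.group(3))
        if free_kb <= min_kb:
            hits.append((name, free_kb, min_kb))
    return hits


def pages_to_kb(pages):
    """Convert a page count from the Active:/free: line to kB."""
    return pages * PAGE_KB


if __name__ == "__main__":
    for name, free_kb, min_kb in zones_at_or_below_min(ZONE_LINES):
        print(f"{name}: free {free_kb}kB <= min {min_kb}kB")
    # free:894836 pages, to compare with the "Free pages:" line
    print(f"free pages in kB: {pages_to_kb(894836)}")
```

Run against the lines above, this flags the DMA and Normal zones and
leaves HighMem alone, and 894836 pages come out as 3579344 kB, the same
figure as on the "Free pages:" line. It only reads the numbers back; it
does not say why those zones are at their minimum.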

and again moments later...

May 31 22:31:30 tcs kernel: oom-killer: gfp_mask=0xd0
...

and this repeated two dozen or more times for different processes.

Cheers,
MaZe.