Hello,
I've just run out of memory on a dual Xeon with 5GB RAM. Considering there should have been around 4GB free (not counting buffers and cache), this is unusual.
After it OOM'ed I ran top, and memory usage looked fine (around 1GB used out of 5GB, and none of the 12GB of swap in use).
So I thought it was a temporary thing, but processes kept getting OOM-killed for no understandable reason, while swap was empty and memory continued to show 3GB free (and look at the weird log messages...)
A reboot helped though, but still... :)
[this is the normal CentOS4 i686 SMP kernel 2.6.9-5.0.5.ELsmp]
May 31 22:31:25 tcs kernel: oom-killer: gfp_mask=0xd0
May 31 22:31:25 tcs kernel: DMA per-cpu:
May 31 22:31:25 tcs kernel: cpu 0 hot: low 2, high 6, batch 1
May 31 22:31:25 tcs kernel: cpu 0 cold: low 0, high 2, batch 1
May 31 22:31:25 tcs kernel: cpu 1 hot: low 2, high 6, batch 1
May 31 22:31:25 tcs kernel: cpu 1 cold: low 0, high 2, batch 1
May 31 22:31:25 tcs kernel: cpu 2 hot: low 2, high 6, batch 1
May 31 22:31:25 tcs kernel: cpu 2 cold: low 0, high 2, batch 1
May 31 22:31:25 tcs kernel: cpu 3 hot: low 2, high 6, batch 1
May 31 22:31:25 tcs kernel: cpu 3 cold: low 0, high 2, batch 1
May 31 22:31:25 tcs kernel: Normal per-cpu:
May 31 22:31:25 tcs kernel: cpu 0 hot: low 32, high 96, batch 16
May 31 22:31:25 tcs kernel: cpu 0 cold: low 0, high 32, batch 16
May 31 22:31:26 tcs postfix: Process did not exit cleanly, returned 0 with signal 9
May 31 22:31:26 tcs kernel: cpu 1 hot: low 32, high 96, batch 16
May 31 22:31:26 tcs kernel: cpu 1 cold: low 0, high 32, batch 16
May 31 22:31:26 tcs kernel: cpu 2 hot: low 32, high 96, batch 16
May 31 22:31:26 tcs kernel: cpu 2 cold: low 0, high 32, batch 16
May 31 22:31:26 tcs kernel: cpu 3 hot: low 32, high 96, batch 16
May 31 22:31:26 tcs kernel: cpu 3 cold: low 0, high 32, batch 16
May 31 22:31:26 tcs kernel: HighMem per-cpu:
May 31 22:31:26 tcs kernel: cpu 0 hot: low 32, high 96, batch 16
May 31 22:31:26 tcs kernel: cpu 0 cold: low 0, high 32, batch 16
May 31 22:31:26 tcs kernel: cpu 1 hot: low 32, high 96, batch 16
May 31 22:31:26 tcs kernel: cpu 1 cold: low 0, high 32, batch 16
May 31 22:31:26 tcs kernel: cpu 2 hot: low 32, high 96, batch 16
May 31 22:31:26 tcs kernel: cpu 2 cold: low 0, high 32, batch 16
May 31 22:31:26 tcs kernel: cpu 3 hot: low 32, high 96, batch 16
May 31 22:31:26 tcs kernel: cpu 3 cold: low 0, high 32, batch 16
May 31 22:31:26 tcs kernel:
May 31 22:31:26 tcs kernel: Free pages: 3579344kB (3578432kB HighMem)
May 31 22:31:26 tcs kernel: Active:96900 inactive:20481 dirty:0 writeback:0 unstable:0 free:894836 slab:212200 mapped:86012 pagetables:1438
May 31 22:31:26 tcs kernel: DMA free:16kB min:16kB low:32kB high:48kB active:16kB inactive:0kB present:16384kB
May 31 22:31:26 tcs kernel: protections[]: 0 0 0
May 31 22:31:26 tcs kernel: Normal free:896kB min:936kB low:1872kB high:2808kB active:240kB inactive:416kB present:901120kB
May 31 22:31:26 tcs kernel: protections[]: 0 0 0
May 31 22:31:26 tcs kernel: HighMem free:3578432kB min:512kB low:1024kB high:1536kB active:387344kB inactive:81508kB present:4325372kB
May 31 22:31:26 tcs kernel: protections[]: 0 0 0
May 31 22:31:26 tcs kernel: DMA: 0*4kB 0*8kB 1*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 16kB
May 31 22:31:26 tcs kernel: Normal: 0*4kB 0*8kB 2*16kB 1*32kB 1*64kB 0*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 896kB
May 31 22:31:26 tcs kernel: HighMem: 1560*4kB 5190*8kB 4683*16kB 1632*32kB 2304*64kB 1512*128kB 781*256kB 349*512kB 157*1024kB 66*2048kB 583*4096kB = 3578432kB
May 31 22:31:26 tcs kernel: Swap cache: add 0, delete 0, find 0/0, race 0+0
May 31 22:31:26 tcs kernel: Out of Memory: Killed process 11250 (MailScanner).
and again moments later...
May 31 22:31:30 tcs kernel: oom-killer: gfp_mask=0xd0 ...
and this repeated two dozen or more times for different processes.
Cheers, MaZe.
Could you please specify the exact kernel you are using? Have you run a hardware memory test? Did you notice anything responding slowly? Try taking out 3GB and letting it run with 2GB: does it still give the same errors?
On Wed, 2005-06-01 at 00:17 +0200, Maciej Żenczykowski wrote:
OK ... there are known memory leak issues on that kernel.
Your problem is similar to, but not exactly like:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=150971
I can give you a kernel (the beta release U1 kernel), or you can install a beta kernel from a Red Hat developer to test whether it fixes your issue:
http://people.redhat.com/davej/kernels/RHEL4/RPMS.kernel/
(The kernel I have built is 2.6.9-6.37 ... it has its own issues. The kernel issues seem to be what is holding up Update 1.)
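
For anyone reading an OOM dump like the one earlier in this thread, here is a rough sketch (not something anyone in the thread posted) that pulls out the per-zone summary lines and flags any zone whose free figure is below its min watermark. The function name zones_below_min and the regex are my own, and it assumes the 2.6.9-style "Zone free:NkB min:NkB" line format shown in that log; other kernels print this differently.

```python
import re

# Matches per-zone summary lines such as:
#   "Normal free:896kB min:936kB low:1872kB high:2808kB ..."
# Assumption: 2.6.9-style format, as in the log posted in this thread.
ZONE_RE = re.compile(r"\b(DMA|Normal|HighMem) free:(\d+)kB min:(\d+)kB")


def zones_below_min(log_text):
    """Return {zone: (free_kB, min_kB)} for every zone with free < min."""
    flagged = {}
    for zone, free_kb, min_kb in ZONE_RE.findall(log_text):
        free_kb, min_kb = int(free_kb), int(min_kb)
        if free_kb < min_kb:
            flagged[zone] = (free_kb, min_kb)
    return flagged


# The three zone lines copied from the OOM dump in this thread.
SAMPLE = """\
May 31 22:31:26 tcs kernel: DMA free:16kB min:16kB low:32kB high:48kB active:16kB inactive:0kB present:16384kB
May 31 22:31:26 tcs kernel: Normal free:896kB min:936kB low:1872kB high:2808kB active:240kB inactive:416kB present:901120kB
May 31 22:31:26 tcs kernel: HighMem free:3578432kB min:512kB low:1024kB high:1536kB active:387344kB inactive:81508kB present:4325372kB
"""

if __name__ == "__main__":
    for zone, (free_kb, min_kb) in zones_below_min(SAMPLE).items():
        print(f"{zone}: free {free_kb}kB is below min {min_kb}kB")
```

On the posted log this flags only the Normal zone (896kB free against a 936kB minimum), even though HighMem shows about 3.5GB free. The same dump reports slab:212200 pages, which at 4kB per page is roughly 829MB against 901120kB present in Normal. That would fit the leak mentioned above, though nothing in this thread confirms it.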