[CentOS] Page allocation failure and slow system

Thu Sep 21 17:33:23 UTC 2006
Steve Bergman <steve at rueb.com>

Hi all.

Every few days I am getting a message that looks like the one I have cut
and pasted below.

It is accompanied by a severe slowdown of the system.  In fact, it's
practically locked.

The machine runs gnome-desktops for 40 users plus about a hundred
sessions of a curses based point of sale software, plus a few other
functions, so it's a busy machine. Dual Xeon 3.2GHz processors.  4096MB
ram.  8192MB swap.

During this period, I'm seeing lots of swapout.  sar reports 266 pages
per second averaged over that 10 minute period.

I'm thinking that I have a runaway process, but have not been able to
see one of these instances for myself at the time it is happening.

I've set up a monitor to record the highest rss size process minute by
minute, which should tell me more about what is happening.

But I don't really understand what the message below is saying.  It
looks like I still have plenty of swap left, so why the page allocation
failure?

Thanks for any enlightenment.  VM isn't my strong suit.

-Steve


=======

Sep 20 13:34:33 hagar kernel: metacity: page allocation failure.
order:0, mode:0x50
Sep 20 13:34:33 hagar kernel:  [<c0144257>] __alloc_pages+0x28b/0x29d
Sep 20 13:34:33 hagar kernel:  [<c01408da>] find_lock_page+0x96/0x9d
Sep 20 13:34:33 hagar kernel:  [<c014091a>] find_or_create_page
+0x39/0x72
Sep 20 13:34:33 hagar kernel:  [<c015cde3>] grow_dev_page+0x26/0x102
Sep 20 13:34:33 hagar kernel:  [<c015cf8e>] __getblk_slow+0xcf/0xf3
Sep 20 13:34:33 hagar kernel:  [<c015d285>] __getblk+0x3f/0x49
Sep 20 13:34:33 hagar kernel:  [<c015d2ca>] __bread+0x9/0x1e
Sep 20 13:34:33 hagar kernel:  [<f88ef15b>] ext3_get_branch+0x63/0xc6
[ext3]
Sep 20 13:34:33 hagar kernel:  [<f88ef705>] ext3_get_block_handle
+0x9e/0x27d [ext3]
Sep 20 13:34:33 hagar kernel:  [<f88838f8>] __split_bio+0xfd/0x103
[dm_mod]
Sep 20 13:34:33 hagar kernel:  [<f88ef948>] ext3_get_block+0x64/0x6c
[ext3]
Sep 20 13:34:33 hagar kernel:  [<c015d664>] __block_write_full_page
+0x107/0x2ea
Sep 20 13:35:06 hagar kernel:  [<f88ef8e4>] ext3_get_block+0x0/0x6c
[ext3]
Sep 20 13:35:11 hagar kernel:  [<c015eabc>] block_write_full_page
+0xc5/0xce
Sep 20 13:35:14 hagar kernel:  [<f88ef8e4>] ext3_get_block+0x0/0x6c
[ext3]
Sep 20 13:35:21 hagar kernel:  [<f88f025a>] ext3_ordered_writepage
+0xce/0x13a [ext3]
Sep 20 13:35:23 hagar kernel:  [<f88f016c>] bget_one+0x0/0x7 [ext3]
Sep 20 13:35:25 hagar kernel:  [<c01499b5>] pageout+0x8d/0xcc
Sep 20 13:35:25 hagar kernel:  [<c0149bfb>] shrink_list+0x207/0x3ed
Sep 20 13:35:26 hagar kernel:  [<c0148e8c>] __pagevec_release+0x15/0x1d
Sep 20 13:35:27 hagar kernel:  [<c0149fbe>] shrink_cache+0x1dd/0x34d
Sep 20 13:35:28 hagar kernel:  [<c014a67c>] shrink_zone+0xa7/0xb6
Sep 20 13:35:29 hagar kernel:  [<c014a6d7>] shrink_caches+0x4c/0x57
Sep 20 13:35:30 hagar kernel:  [<c014a7ce>] try_to_free_pages+0xc3/0x1a7
Sep 20 13:35:30 hagar kernel:  [<c0144181>] __alloc_pages+0x1b5/0x29d
Sep 20 13:35:30 hagar kernel:  [<c0144281>] __get_free_pages+0x18/0x24
Sep 20 13:35:31 hagar kernel:  [<c016b3a2>] __pollwait+0x2d/0x95
Sep 20 13:35:35 hagar kernel:  [<c029d7c1>] tcp_poll+0x31/0x146
Sep 20 13:35:35 hagar kernel:  [<c0278741>] sock_poll+0x12/0x14
Sep 20 13:35:36 hagar kernel:  [<c016b645>] do_select+0x196/0x2c6
Sep 20 13:35:37 hagar kernel:  [<c016b375>] __pollwait+0x0/0x95
Sep 20 13:35:37 hagar kernel:  [<c016ba68>] sys_select+0x2e0/0x43a
Sep 20 13:35:38 hagar kernel:  [<c02d47bf>] syscall_call+0x7/0xb
Sep 20 13:35:38 hagar kernel:  [<c02d007b>] packet_rcv+0x18a/0x307
Sep 20 13:35:38 hagar kernel: Mem-info:
Sep 20 13:35:40 hagar kernel: DMA per-cpu:
Sep 20 13:35:41 hagar kernel: cpu 0 hot: low 2, high 6, batch 1
Sep 20 13:35:42 hagar kernel: cpu 0 cold: low 0, high 2, batch 1
Sep 20 13:35:42 hagar kernel: cpu 1 hot: low 2, high 6, batch 1
Sep 20 13:35:42 hagar kernel: cpu 1 cold: low 0, high 2, batch 1
Sep 20 13:35:43 hagar kernel: cpu 2 hot: low 2, high 6, batch 1
Sep 20 13:35:44 hagar kernel: cpu 2 cold: low 0, high 2, batch 1
Sep 20 13:35:45 hagar kernel: cpu 3 hot: low 2, high 6, batch 1
Sep 20 13:35:45 hagar kernel: cpu 3 cold: low 0, high 2, batch 1
Sep 20 13:35:45 hagar kernel: Normal per-cpu:
Sep 20 13:35:46 hagar kernel: cpu 0 hot: low 32, high 96, batch 16
Sep 20 13:35:46 hagar kernel: cpu 0 cold: low 0, high 32, batch 16
Sep 20 13:35:46 hagar kernel: cpu 1 hot: low 32, high 96, batch 16
Sep 20 13:35:46 hagar kernel: cpu 1 cold: low 0, high 32, batch 16
Sep 20 13:35:46 hagar kernel: cpu 2 hot: low 32, high 96, batch 16
Sep 20 13:35:46 hagar kernel: cpu 2 cold: low 0, high 32, batch 16
Sep 20 13:35:47 hagar kernel: cpu 3 hot: low 32, high 96, batch 16
Sep 20 13:35:47 hagar kernel: cpu 3 cold: low 0, high 32, batch 16
Sep 20 13:35:47 hagar kernel: HighMem per-cpu:
Sep 20 13:35:48 hagar kernel: cpu 0 hot: low 32, high 96, batch 16
Sep 20 13:35:48 hagar kernel: cpu 0 cold: low 0, high 32, batch 16
Sep 20 13:35:49 hagar kernel: cpu 1 hot: low 32, high 96, batch 16
Sep 20 13:35:50 hagar kernel: cpu 1 cold: low 0, high 32, batch 16
Sep 20 13:35:50 hagar kernel: cpu 2 hot: low 32, high 96, batch 16
Sep 20 13:35:50 hagar kernel: cpu 2 cold: low 0, high 32, batch 16
Sep 20 13:35:50 hagar kernel: cpu 3 hot: low 32, high 96, batch 16
Sep 20 13:35:51 hagar kernel: cpu 3 cold: low 0, high 32, batch 16
Sep 20 13:35:51 hagar kernel:
Sep 20 13:35:51 hagar kernel: Free pages:       13396kB (832kB HighMem)
Sep 20 13:35:52 hagar kernel: Active:868328 inactive:54456 dirty:31
writeback:18483 unstable:0 free:3349 slab:48649 mapped:902103
pagetables:28104
Sep 20 13:35:52 hagar kernel: DMA free:12564kB min:16kB low:32kB
high:48kB active:0kB inactive:0kB present:16384kB pages_scanned:56093
all_unreclaimable? yes
Sep 20 13:35:52 hagar kernel: protections[]: 0 0 0
Sep 20 13:35:52 hagar kernel: Normal free:0kB min:928kB low:1856kB
high:2784kB active:441184kB inactive:155496kB present:901120kB
pages_scanned:4059 all_unreclaimable? no
Sep 20 13:35:52 hagar kernel: protections[]: 0 0 0
Sep 20 13:35:52 hagar kernel: HighMem free:832kB min:512kB low:1024kB
high:1536kB active:3032128kB inactive:62328kB present:3866624kB
pages_scanned:99 all_unreclaimable? no
Sep 20 13:35:52 hagar kernel: protections[]: 0 0 0
Sep 20 13:35:53 hagar kernel: DMA: 5*4kB 4*8kB 4*16kB 3*32kB 3*64kB
1*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 2*4096kB = 12564kB
Sep 20 13:35:53 hagar kernel: Normal: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB
0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 0kB
Sep 20 13:35:53 hagar kernel: HighMem: 32*4kB 12*8kB 6*16kB 0*32kB
0*64kB 0*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 832kB
Sep 20 13:35:53 hagar kernel: Swap cache: add 2602136, delete 2559582,
find 710669/889146, race 2+44
Sep 20 13:35:54 hagar kernel: 936 bounce buffer pages
Sep 20 13:35:54 hagar kernel: Free swap:       7862256kB
Sep 20 13:35:54 hagar kernel: 1196032 pages of RAM
Sep 20 13:35:54 hagar kernel: 802444 pages of HIGHMEM
Sep 20 13:35:55 hagar kernel: 175029 reserved pages
Sep 20 13:35:55 hagar kernel: 1585426 pages shared
Sep 20 13:35:55 hagar kernel: 42690 pages swap cached