[CentOS-virt] Xen crash

Tue Oct 9 09:25:51 UTC 2007
Nicolas Sahlqvist <nicco77 at gmail.com>

Hi Daniel,

I also have a full-time job, so I know how it is. At what URL can I
find the bug tracker?

I think there are two bugs: one is the CPU lockup, and the second is
Xen hanging under high load. This is what I saw when it hung last time:

printk: 220 messages suppressed.
httpd invoked oom-killer: gfp_mask=0x201d2, order=0, oomkilladj=0

Call Trace:
 [<ffffffff802aee7d>] out_of_memory+0x4e/0x1d3
 [<ffffffff8020efe8>] __alloc_pages+0x229/0x2b2
 [<ffffffff8021298b>] __do_page_cache_readahead+0xd0/0x21c
 [<ffffffff802284d8>] sync_page+0x0/0x42
 [<ffffffff88082c61>] :dm_mod:dm_any_congested+0x38/0x3f
 [<ffffffff80213240>] filemap_nopage+0x148/0x322
 [<ffffffff80208b94>] __handle_mm_fault+0x3da/0xf46
 [<ffffffff802606e9>] _spin_lock_irqsave+0x9/0x14
 [<ffffffff80262fc8>] do_page_fault+0xe48/0x11dc
 [<ffffffff8022c50c>] mntput_no_expire+0x19/0x89
 [<ffffffff80245fbe>] sys_chdir+0x55/0x62
 [<ffffffff8025cb6f>] error_exit+0x0/0x6e

DMA per-cpu:
cpu 0 hot: high 186, batch 31 used:170
cpu 0 cold: high 62, batch 15 used:55
cpu 1 hot: high 186, batch 31 used:28
cpu 1 cold: high 62, batch 15 used:37
cpu 2 hot: high 186, batch 31 used:24
cpu 2 cold: high 62, batch 15 used:47
cpu 3 hot: high 186, batch 31 used:14
cpu 3 cold: high 62, batch 15 used:52


Call Trace:
 [<ffffffff802aee7d>] out_of_memory+0x4e/0x1d3
 [<ffffffff8020efe8>] __alloc_pages+0x229/0x2b2
 [<ffffffff8021298b>] __do_page_cache_readahead+0xd0/0x21c
 [<ffffffff8025f528>] __wait_on_bit_lock+0x5b/0x66
 [<ffffffff88082c61>] :dm_mod:dm_any_congested+0x38/0x3f
 [<ffffffff80213240>] filemap_nopage+0x148/0x322
 [<ffffffff80208b94>] __handle_mm_fault+0x3da/0xf46
 [<ffffffff802606e9>] _spin_lock_irqsave+0x9/0x14
 [<ffffffff80262fc8>] do_page_fault+0xe48/0x11dc
 [<ffffffff80233a31>] do_setitimer+0x45f/0x4c7
 [<ffffffff80245fbe>] sys_chdir+0x55/0x62
 [<ffffffff8025cb6f>] error_exit+0x0/0x6e

DMA per-cpu:
cpu 0 hot: high 186, batch 31 used:170
cpu 0 cold: high 62, batch 15 used:55
cpu 1 hot: high 186, batch 31 used:28
cpu 1 cold: high 62, batch 15 used:37
cpu 2 hot: high 186, batch 31 used:24
cpu 2 cold: high 62, batch 15 used:47
cpu 3 hot: high 186, batch 31 used:14
cpu 3 cold: high 62, batch 15 used:52
cpu 4 hot: high 186, batch 31 used:25
cpu 4 cold: high 62, batch 15 used:13
cpu 5 hot: high 186, batch 31 used:21
cpu 5 cold: high 62, batch 15 used:54
cpu 6 hot: high 186, batch 31 used:17
cpu 6 cold: high 62, batch 15 used:45
cpu 7 hot: high 186, batch 31 used:17
cpu 7 cold: high 62, batch 15 used:45
DMA32 per-cpu: empty
Normal per-cpu: empty
HighMem per-cpu: empty
Free pages:        6020kB (0kB HighMem)
Active:300020 inactive:204495 dirty:0 writeback:0 unstable:0 free:1505
slab:7395 mapped:2 pagetables:24274
DMA free:6020kB min:6020kB low:7524kB high:9028kB active:1200080kB
inactive:817980kB present:2265088kB pages_scanned:19172661
all_unreclaimable? yes
lowmem_reserve[]: 0 0 0 0
DMA32 free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB
present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
Normal free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB
present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
HighMem free:0kB min:128kB low:128kB high:128kB active:0kB
inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
DMA: 65*4kB 0*8kB 0*16kB 8*32kB 2*64kB 0*128kB 1*256kB 0*512kB
1*1024kB 0*2048kB 1*4096kB = 6020kB
DMA32: empty
Normal: empty
HighMem: empty
Swap cache: add 217269, delete 217269, find 2978/10987, race 0+1
Free swap  = 0kB
Total swap = 557048kB

All swap is used and there is no free memory. On a non-virtual box
this would make the machine very slow (almost dead), but it would not
crash like this. Could something be going wrong when swap pages are
moved back and forth, and could that be causing this problem?
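
As a sanity check on the numbers in the report above, here is a minimal
Python sketch that reads the DMA zone and swap figures and confirms that
the zone sits at its min watermark with swap fully used. The `report`
string is a trimmed copy of the log lines, and the helper name `kb` is
only illustrative; nothing here is part of any Xen or kernel tooling.

```python
import re

# Figures copied from the OOM report above (all values in kB).
report = """\
DMA free:6020kB min:6020kB low:7524kB high:9028kB
Free swap  = 0kB
Total swap = 557048kB
"""

def kb(pattern, text):
    """Return the first integer kB value captured by the pattern."""
    match = re.search(pattern, text)
    if match is None:
        raise ValueError(f"pattern not found: {pattern}")
    return int(match.group(1))

zone_free = kb(r"DMA free:(\d+)kB", report)
zone_min = kb(r"min:(\d+)kB", report)
free_swap = kb(r"Free swap\s*=\s*(\d+)kB", report)
total_swap = kb(r"Total swap\s*=\s*(\d+)kB", report)

# The zone has no headroom left once free memory is at or below "min".
at_min_watermark = zone_free <= zone_min
# Swap is configured but none of it is free.
swap_exhausted = total_swap > 0 and free_swap == 0

# Buddy-allocator line for the DMA zone: (block count, block size in kB).
buddy = [(65, 4), (0, 8), (0, 16), (8, 32), (2, 64), (0, 128),
         (1, 256), (0, 512), (1, 1024), (0, 2048), (1, 4096)]
buddy_free = sum(count * size for count, size in buddy)

print(at_min_watermark, swap_exhausted, buddy_free)
```

The buddy-list total should agree with the "DMA free" value, which is a
quick way to confirm the log lines were copied without damage.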


Regards,
Nicolas Sahlqvist
CouchSurfing.com


On 10/9/07, Daniel de Kok <danieldk at pobox.com> wrote:
> On Tue, 2007-10-09 at 10:47 +0200, Nicolas Sahlqvist wrote:
> > So you are saying that we should run more zones with less CPU's to
> > avoid the problem? Well, the budget for RAM would not go too well with
> > that so is there any work done on a fix that you are aware of?
>
> Well, I was rather wondering if it is the same bug. Even if so, it
> should really be filed in our and the upstream bug trackers. I should
> have done that myself, but have had little time to do so.
>
> -- Daniel