[CentOS] XFS Issues

Wed Nov 8 17:17:40 UTC 2006
Stephen C. Rigler <srigler at marathonoil.com>

We are in the process of migrating XFS filesystems from one storage
array to another.  Both arrays are mounted locally on the same
CentOS 4.4 system (x86_64).  We are running kernel 2.6.9-42.0.2.ELsmp
along with kernel-module-xfs-2.6.9-42.0.2.ELsmp-0.1-3.

The issue we are having is that while the copy is running (using rsync)
the system periodically logs these messages:

kernel: XFS: possible memory allocation deadlock in kmem_alloc
(mode:0x250)
kernel: XFS: possible memory allocation deadlock in kmem_alloc
(mode:0x2d0)

Eventually the system panics.  This looks similar to SGI bug 410,
http://oss.sgi.com/bugzilla/show_bug.cgi?id=410, which was apparently
fixed in April.

Is there any chance that the fix will make it into the centosplus
kernel-module-xfs?

Thanks,
Steve

The output of the crash is below:
Nov  7 12:50:31 houla0  Unable to handle kernel paging request at
0000001116a20188 RIP:  
Nov  7 12:50:31 houla0  <ffffffff80309ec6>{schedule+2164} 
Nov  7 12:50:31 houla0  PML4 38432b067 PGD 0  
Nov  7 12:50:31 houla0  Oops: 0000 [1] SMP  
Nov  7 12:50:31 houla0  CPU 2  
Nov  7 12:50:31 houla0  Modules linked in: nfsd exportfs nfs lockd
nfs_acl md5 ipv6 netconsole netdump autofs4 i2c_dev i2c_core sunrpc
ipt_REJECT ipt_state iptable_filter ipt_MASQUERADE iptable_nat
ip_conntrack ip_tables xfs_quota(U) xfs(U) dm_mirror dm_round_robin
dm_multipath button battery ac ohci_hcd shpchp tg3 e1000 bonding(U)
floppy qla2322 st ext3 jbd dm_mod qla2400 qla2xxx scsi_transport_fc
sata_nv libata sd_mod scsi_mod 
Nov  7 12:50:31 houla0  Pid: 31847, comm: rsync Not tainted
2.6.9-42.0.2.ELsmp 
Nov  7 12:50:31 houla0  RIP: 0010:[<ffffffff80309ec6>]
<ffffffff80309ec6>{schedule+2164} 
Nov  7 12:50:31 houla0  RSP: 0018:00000102b89283d8  EFLAGS: 00010016 
Nov  7 12:50:31 houla0  RAX: 000000119653b000 RBX: 0000010001067b00 RCX:
0000000000000080 
Nov  7 12:50:31 houla0  RDX: ffffffff8051ad60 RSI: 00000100dffca7f0 RDI:
0000010105329030 
Nov  7 12:50:31 houla0  RBP: 00000102b8928498 R08: ffffffff80467770 R09:
0000000000000008 
Nov  7 12:50:31 houla0  R10: 0000000000000080 R11: 0000000000000008 R12:
0000000000000246 
Nov  7 12:50:31 houla0  R13: 0000010105329030 R14: 00000100010667e0 R15:
0000010037d67fc0 
Nov  7 12:50:31 houla0  FS:  0000002a955763a0(0000) GS:ffffffff804e5280
(0000) knlGS:00000000080e81c0 
Nov  7 12:50:31 houla0  CS:  0010 DS: 0000 ES: 0000 CR0:
000000008005003b 
Nov  7 12:50:31 houla0  CR2: 0000001116a20188 CR3: 00000002188b2000 CR4:
00000000000006e0 
Nov  7 12:50:32 houla0  Process rsync (pid: 31847, threadinfo
00000102b8928000, task 0000010105329030) 
Nov  7 12:50:32 houla0  Stack: 0000000000000212 ffffffffa004f448
0000010001282548 00000100012803c8  
Nov  7 12:50:32 houla0         0000010338438e80 00000002dfcce000
0000010105329030 000000000055772a  
Nov  7 12:50:32 houla0         0002113f79759888 00000100dffca7f0  
Nov  7 12:50:32 houla0  Call
Trace:<ffffffffa004f448>{:qla2xxx:qla2x00_next+422}
<ffffffff8024cc75>{elv_next_request+238}  
Nov  7 12:50:32 houla0         <ffffffff803094ab>{__down+147}
<ffffffff80133da9>{default_wake_function+0}  
Nov  7 12:50:32 houla0         <ffffffff8030af43>{__down_failed+53}
<ffffffffa02133d3>{:xfs:.text.lock.xfs_buf+5}  
Nov  7 12:50:32 houla0         <ffffffffa0212a1d>{:xfs:pagebuf_iostart
+134} <ffffffffa0212a61>{:xfs:xfs_buf_read_flags+64}  
Nov  7 12:50:32 houla0
<ffffffffa01fff4b>{:xfs:xfs_trans_read_buf+428}
<ffffffffa01d7d14>{:xfs:xfs_btree_read_bufs+69}  
Nov  7 12:50:32 houla0
<ffffffffa01d7856>{:xfs:xfs_btree_check_sblock+76}  
Nov  7 12:50:32 houla0         <ffffffffa01c3ede>{:xfs:xfs_alloc_lookup
+249} <ffffffffa01c1fc0>{:xfs:xfs_alloc_ag_vextent+1173}  
Nov  7 12:50:32 houla0         <ffffffffa01c305b>{:xfs:xfs_alloc_vextent
+364} <ffffffffa01d5b75>{:xfs:xfs_bmbt_insert+985}  
Nov  7 12:50:32 houla0
<ffffffffa0200024>{:xfs:xfs_trans_read_buf+645}
<ffffffffa01cdd26>{:xfs:xfs_bmap_add_extent+2454}  
Nov  7 12:50:32 houla0
<ffffffffa01d7a62>{:xfs:xfs_btree_init_cursor+59}
<ffffffffa01d162d>{:xfs:xfs_bmapi+6589}  
Nov  7 12:50:32 houla0
<ffffffffa01cebf5>{:xfs:xfs_bmap_search_extents+92}  
Nov  7 12:50:32 houla0
<ffffffffa020a03f>{:xfs:xfs_iomap_write_allocate+582}  
Nov  7 12:50:32 houla0         <ffffffffa020933a>{:xfs:xfs_iomap+718}
<ffffffffa020a5f6>{:xfs:xfs_map_blocks+50}  
Nov  7 12:50:32 houla0
<ffffffffa020b112>{:xfs:xfs_page_state_convert+694}  
Nov  7 12:50:32 houla0
<ffffffffa00aba25>{:dm_mod:dm_any_congested+56}
<ffffffffa00ad670>{:dm_mod:dm_table_any_congested+68}  
Nov  7 12:50:32 houla0
<ffffffffa00aba25>{:dm_mod:dm_any_congested+56}
<ffffffffa020b8ce>{:xfs:linvfs_writepage+167}  
Nov  7 12:50:33 houla0         <ffffffff80165353>{shrink_zone+3095}
<ffffffff8012065d>{flush_gart+210}  
Nov  7 12:50:33 houla0         <ffffffff8011e884>{flat_send_IPI_mask+0}
<ffffffff8016593d>{try_to_free_pages+303}  
Nov  7 12:50:33 houla0         <ffffffff8013fdf3>{del_timer+107}
<ffffffff8015dfb7>{__alloc_pages+527}  
Nov  7 12:50:33 houla0         <ffffffff801406de>{process_timeout+0}
<ffffffff8015e141>{__get_free_pages+11}  
Nov  7 12:50:33 houla0         <ffffffff8016127c>{kmem_getpages+36}
<ffffffff80161a11>{cache_alloc_refill+609}  
Nov  7 12:50:33 houla0         <ffffffff801616df>{__kmalloc+123}
<ffffffffa0213497>{:xfs:kmem_alloc+91}  
Nov  7 12:50:33 houla0         <ffffffffa01ef63b>{:xfs:xfs_iread_extents
+139} <ffffffffa01cffda>{:xfs:xfs_bmapi+874}  
Nov  7 12:50:33 houla0         <ffffffff80255cda>{cfq_add_crq_rb+128}
<ffffffff8024cad8>{__elv_add_request+65}  
Nov  7 12:50:33 houla0         <ffffffff8024f7dd>{__make_request+1351}
<ffffffffa00ab30c>{:dm_mod:__map_bio+66}  
Nov  7 12:50:33 houla0         <ffffffffa00ab7ef>{:dm_mod:__split_bio
+1026} <ffffffff8015cb38>{mempool_alloc+129}  
Nov  7 12:50:33 houla0         <ffffffff801ea821>{__up_read+16}
<ffffffffa00ab98a>{:dm_mod:dm_request+396}  
Nov  7 12:50:33 houla0         <ffffffff8024f962>{generic_make_request
+355} <ffffffff8030adcc>{__down_write+52}  
Nov  7 12:50:33 houla0         <ffffffffa020923a>{:xfs:xfs_iomap+462}
<ffffffff801ea821>{__up_read+16}  
Nov  7 12:50:33 houla0
<ffffffffa020b51c>{:xfs:__linvfs_get_block+145}
<ffffffffa00ab7ef>{:dm_mod:__split_bio+1026}  
Nov  7 12:50:33 houla0         <ffffffffa020b663>{:xfs:linvfs_get_block
+20} <ffffffff80198b43>{do_mpage_readpage+213}  
Nov  7 12:50:33 houla0         <ffffffffa020b64f>{:xfs:linvfs_get_block
+0} <ffffffff8012065d>{flush_gart+210}  
Nov  7 12:50:33 houla0         <ffffffff801e98e7>{radix_tree_node_alloc
+19} <ffffffff801e9aa3>{radix_tree_insert+254}  
Nov  7 12:50:33 houla0         <ffffffffa020b64f>{:xfs:linvfs_get_block
+0} <ffffffffa020b64f>{:xfs:linvfs_get_block+0}  
Nov  7 12:50:33 houla0         <ffffffff80198e8b>{mpage_readpages+163}
<ffffffff801609f0>{read_pages+57}  
Nov  7 12:50:34 houla0
<ffffffff80160daa>{do_page_cache_readahead+319}
<ffffffff80160f6b>{page_cache_readahead+404}  
Nov  7 12:50:34 houla0         <ffffffff8015a970>{file_read_actor+0}
<ffffffff8015a69d>{do_generic_mapping_read+292}  
Nov  7 12:50:34 houla0         <ffffffff8015a970>{file_read_actor+0}
<ffffffff8015c5c7>{__generic_file_aio_read+385}  
Nov  7 12:50:34 houla0         <ffffffffa020eaf0>{:xfs:xfs_read+547}
<ffffffffa020ba18>{:xfs:linvfs_aio_read+96}  
Nov  7 12:50:34 houla0         <ffffffff801793e4>{do_sync_read+173}
<ffffffff801ea821>{__up_read+16}  
Nov  7 12:50:34 houla0
<ffffffff80135752>{autoremove_wake_function+0}
<ffffffff801794df>{vfs_read+207}  
Nov  7 12:50:34 houla0         <ffffffff80179736>{sys_read+69}
<ffffffff8011026a>{system_call+126}  
Nov  7 12:50:34 houla0          
Nov  7 12:50:34 houla0   
Nov  7 12:50:34 houla0  Code: 48 03 90 88 51 4e 80 48 ff 42 08 48 8b 47
50 48 2b 85 78 ff  
Nov  7 12:50:34 houla0  RIP <ffffffff80309ec6>{schedule+2164} RSP
<00000102b89283d8> 
Nov  7 12:50:34 houla0  CR2: 0000001116a20188 
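
Because the mail client wrapped the trace, the frames are hard to read in
order. A minimal sketch, assuming the oops text has been saved as a string
in exactly the format shown above, that pulls out each frame as
(module, symbol, offset); the names FRAME_RE, parse_frames and SAMPLE are
made up for illustration and are not part of any kernel tool:

```python
import re

# Matches entries such as <ffffffffa0213497>{:xfs:kmem_alloc+91}.
# The \s* before the offset tolerates a line wrap between symbol and "+NNN".
# Frames without a ":module:" prefix are assumed to be core kernel symbols.
FRAME_RE = re.compile(
    r"<(?P<addr>[0-9a-f]{16})>"
    r"\{(?::(?P<module>\w+):)?(?P<symbol>[\w.]+)\s*\+(?P<offset>\d+)\}"
)


def parse_frames(oops_text):
    """Return (module, symbol, offset) tuples in the order they appear."""
    frames = []
    for m in FRAME_RE.finditer(oops_text):
        module = m.group("module") or "kernel"
        frames.append((module, m.group("symbol"), int(m.group("offset"))))
    return frames


# A few lines copied from the trace above, including one wrapped frame.
SAMPLE = """\
Nov  7 12:50:33 houla0         <ffffffff801616df>{__kmalloc+123}
<ffffffffa0213497>{:xfs:kmem_alloc+91}
Nov  7 12:50:33 houla0         <ffffffffa01ef63b>{:xfs:xfs_iread_extents
+139} <ffffffffa01cffda>{:xfs:xfs_bmapi+874}
"""

if __name__ == "__main__":
    for module, symbol, offset in parse_frames(SAMPLE):
        print(f"{module:8s} {symbol}+{offset}")
```

Run over the full trace, this lists the frames innermost first, which makes
it easier to see where the xfs entries sit relative to the memory reclaim
path (try_to_free_pages, shrink_zone).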
