"James A. Peltier" jpeltier@sfu.ca writes:
| I am having an issue with an XFS filesystem shutting down under high
| load with very many small files. Basically, I have around 3.5 - 4
| million files on this filesystem. New files are being written to the
| FS all the time, until I get to 9-11 million small files (35k on
| average).
|
| At some point I get the following in dmesg:
|
| [2870477.695512] Filesystem "sda5": XFS internal error xfs_trans_cancel at line 1138 of file fs/xfs/xfs_trans.c. Caller 0xffffffff8826bb7d
| [2870477.695558]
| [2870477.695559] Call Trace:
| [2870477.695611] [<ffffffff88262c28>] :xfs:xfs_trans_cancel+0x5b/0xfe
| [2870477.695643] [<ffffffff8826bb7d>] :xfs:xfs_mkdir+0x57c/0x5d7
| [2870477.695673] [<ffffffff8822f3f8>] :xfs:xfs_attr_get+0xbf/0xd2
| [2870477.695707] [<ffffffff88273326>] :xfs:xfs_vn_mknod+0x1e1/0x3bb
| [2870477.695726] [<ffffffff80264929>] _spin_lock_irqsave+0x9/0x14
| [2870477.695736] [<ffffffff802230e6>] __up_read+0x19/0x7f
| [2870477.695764] [<ffffffff8824f8f4>] :xfs:xfs_iunlock+0x57/0x79
| [2870477.695776] [<ffffffff80264929>] _spin_lock_irqsave+0x9/0x14
| [2870477.695784] [<ffffffff802230e6>] __up_read+0x19/0x7f
| [2870477.695791] [<ffffffff80209f4c>] __d_lookup+0xb0/0xff
| [2870477.695803] [<ffffffff8020cd4a>] _atomic_dec_and_lock+0x39/0x57
| [2870477.695814] [<ffffffff8022d6db>] mntput_no_expire+0x19/0x89
| [2870477.695829] [<ffffffff80264929>] _spin_lock_irqsave+0x9/0x14
| [2870477.695837] [<ffffffff802230e6>] __up_read+0x19/0x7f
| [2870477.695861] [<ffffffff8824f8f4>] :xfs:xfs_iunlock+0x57/0x79
| [2870477.695887] [<ffffffff882680af>] :xfs:xfs_access+0x3d/0x46
| [2870477.695899] [<ffffffff80264929>] _spin_lock_irqsave+0x9/0x14
| [2870477.695923] [<ffffffff802df4a3>] vfs_mkdir+0xe3/0x152
| [2870477.695933] [<ffffffff802dfa79>] sys_mkdirat+0xa3/0xe4
| [2870477.695953] [<ffffffff80260295>] tracesys+0x47/0xb6
| [2870477.695963] [<ffffffff802602f9>] tracesys+0xab/0xb6
| [2870477.695977]
| [2870477.695985] xfs_force_shutdown(sda5,0x8) called from line 1139 of file fs/xfs/xfs_trans.c. Return address = 0xffffffff88262c46
| [2870477.696452] Filesystem "sda5": Corruption of in-memory data detected. Shutting down filesystem: sda5
| [2870477.696464] Please umount the filesystem, and rectify the problem(s)
|
| # ls -l /store
| ls: /store: Input/output error
| ?--------- 0 root root 0 Jan 1 1970 /store
|
| Filesystem is ~1T in size
|
| # df -hT /store
| Filesystem    Type    Size  Used Avail Use% Mounted on
| /dev/sda5      xfs    910G  142G  769G  16% /store
|
| Using CentOS 5.9 with kernel 2.6.18-348.el5xen
|
| The filesystem is in a virtual machine (Xen) and on top of LVM.
|
| Filesystem was created using mkfs.xfs defaults with
| xfsprogs-2.9.4-1.el5.centos (that's the one that comes with CentOS
| 5.x by default.)
|
| These are the defaults with which the filesystem was created:
|
| # xfs_info /store
| meta-data=/dev/sda5      isize=256    agcount=32, agsize=7454720 blks
|          =               sectsz=512   attr=0
| data     =               bsize=4096   blocks=238551040, imaxpct=25
|          =               sunit=0      swidth=0 blks, unwritten=1
| naming   =version 2      bsize=4096
| log      =internal       bsize=4096   blocks=32768, version=1
|          =               sectsz=512   sunit=0 blks, lazy-count=0
| realtime =none           extsz=4096   blocks=0, rtextents=0
|
| The problem is reproducible and I don't think it's hardware related.
| The problem was reproduced on multiple servers of the same type. So,
| I doubt it's a memory issue or something like that.
|
| Is that a known issue? If it is then what's the fix? I went through
| the kernel updates for CentOS 5.10 (newer kernel), but didn't see
| any xfs related fixes since CentOS 5.9
|
| Any help will be greatly appreciated...
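For reference, the geometry in the quoted xfs_info output can be cross-checked against the df figures. The following is only a rough sketch and not part of the original report: the variable names are illustrative, the values are copied from the quoted output, and it assumes that df -h reports sizes in units of 2^30 bytes and that "35k" means an average of 35 KiB per file.

```python
# Values copied from the quoted xfs_info output.
agcount = 32
agsize_blocks = 7454720
block_size = 4096
data_blocks = 238551040

# The allocation groups should add up to the data section exactly.
assert agcount * agsize_blocks == data_blocks

# Size of the data section in GiB (2**30 bytes).
size_gib = data_blocks * block_size / 2**30
print(f"data section: {size_gib:.1f} GiB")

# Rough space estimate for the reported file counts.
# Assumption: "35k on average" means 35 KiB per file.
avg_file_bytes = 35 * 1024
estimates_gib = {
    n_files: n_files * avg_file_bytes / 2**30
    for n_files in (4_000_000, 11_000_000)
}
for n_files, gib in estimates_gib.items():
    print(f"{n_files:>10,} files ~ {gib:.0f} GiB")
```

The data section works out to the 910G that df shows, and even the upper estimate of 11 million files stays well below that capacity.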
Is this filesystem mounted with the inode64 option?
No, since the filesystem is slightly smaller than 1T. From my understanding, inode64 is only required for XFS filesystems larger than 1T?
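The 1T figure can be checked against the geometry in the quoted xfs_info output. An XFS inode number combines the allocation group number, the block within that group, and the inode's slot within the block. The sketch below estimates how many bits that takes for this filesystem. It is an illustration only: the helper name inode_number_bits is hypothetical, and it assumes the block field is sized as log2 of agsize rounded up.

```python
def ceil_log2(n: int) -> int:
    """Smallest k such that 2**k >= n, for n >= 1."""
    return (n - 1).bit_length()

def inode_number_bits(agcount: int, agsize_blocks: int,
                      block_size: int, inode_size: int) -> int:
    """Bits needed for the largest inode number (hypothetical helper)."""
    inodes_per_block = block_size // inode_size
    slot_bits = ceil_log2(inodes_per_block)   # inode slot within a block
    block_bits = ceil_log2(agsize_blocks)     # block within an allocation group
    ag_bits = ceil_log2(agcount)              # allocation group number
    return ag_bits + block_bits + slot_bits

# Values copied from the quoted xfs_info output.
bits = inode_number_bits(agcount=32, agsize_blocks=7454720,
                         block_size=4096, inode_size=256)
print(f"inode numbers need at most {bits} bits")

# With 256-byte inodes, 2**32 inode slots cover 2**40 bytes (1 TiB).
addressable_bytes = 2**32 * 256
print(f"32-bit inode numbers span {addressable_bytes // 2**40} TiB")
```

Under these assumptions the inode numbers on this filesystem fit in 32 bits, which is consistent with inode64 not being needed below 1T.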