[CentOS] gfs2 and quotas - system crash

Mon Mar 10 17:39:48 UTC 2014
stephen.rankin at stfc.ac.uk <stephen.rankin at stfc.ac.uk>

I have tried sending this before, but it did not appear to get through.

Hello,

When using gfs2 with quotas on a SAN that provides storage to two
clustered systems running CentOS 6.5, one of the systems can crash.
The crash appears to be triggered when a user who has already exceeded
their quota on a SAN disk tries to write to that disk. Sometimes a stack
trace is written to /var/log/messages, and it appears to indicate that
gfs2 caused the problem. The gfs2 stack trace shows up in the log at the
same time as messages about a user exceeding their quota.
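
As a rough sketch of how the two kinds of message can be matched up in
a log, the Python below pairs each list_debug warning with nearby
"quota exceeded" lines. It assumes the syslog line format seen in the
excerpt further down. The function names, the 120-second window and the
default year are illustrative choices only, not part of any gfs2 tooling.

```python
import re
from datetime import datetime, timedelta

# Patterns modelled on the log excerpt in this message (assumed format).
QUOTA_RE = re.compile(r"GFS2: fsid=(\S+): quota exceeded for user (\d+)")
WARN_RE = re.compile(r"WARNING: at lib/list_debug\.c")
STAMP_RE = re.compile(r"^(\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s")


def parse_stamp(line, year=2014):
    """Parse the leading syslog timestamp; syslog omits the year, so it is assumed."""
    m = STAMP_RE.match(line)
    if not m:
        return None
    # Collapse the padding in e.g. "Mar  5" before parsing.
    stamp = " ".join(m.group(1).split())
    # %b depends on the locale; English month names are assumed here.
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def correlate(lines, window_seconds=120, year=2014):
    """Return [(warning_time, [(quota_time, fsid, uid), ...]), ...].

    Each list_debug warning is paired with every quota message logged
    within window_seconds before or after it.
    """
    quota_events = []
    warnings = []
    for line in lines:
        stamp = parse_stamp(line, year)
        if stamp is None:
            continue
        q = QUOTA_RE.search(line)
        if q:
            quota_events.append((stamp, q.group(1), q.group(2)))
        elif WARN_RE.search(line):
            warnings.append(stamp)

    window = timedelta(seconds=window_seconds)
    result = []
    for w in warnings:
        near = [e for e in quota_events if abs(e[0] - w) <= window]
        result.append((w, near))
    return result
```

Passing an open file such as /var/log/messages to correlate() gives one
entry per warning. A warning with an empty list had no quota message
near it, which would count against the quota theory.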

The stack trace is below.

Has anyone got a solution to this, other than switching off quotas? I have
switched off quotas, which appears to have stabilised the system so far, but I
do need the quotas on.

Your help is appreciated.

Stephen Rankin
STFC, RAL, ISIS

Mar  5 11:40:50 chadwick kernel: GFS2: fsid=analysis:lvol0.1: quota exceeded for user 101355
Mar  5 11:40:50 chadwick nslcd[11420]: [767df3] ldap_explode_dn(usi660) returned NULL: Success
Mar  5 11:40:50 chadwick nslcd[11420]: [767df3] ldap_result() failed: Invalid DN syntax
Mar  5 11:40:50 chadwick nslcd[11420]: [767df3] lookup of user usi660 failed: Invalid DN syntax
Mar  5 11:41:46 chadwick kernel: ------------[ cut here ]------------
Mar  5 11:41:46 chadwick kernel: WARNING: at lib/list_debug.c:26 __list_add+0x6d/0xa0() (Not tainted)
Mar  5 11:41:46 chadwick kernel: Hardware name: PowerEdge R910
Mar  5 11:41:46 chadwick kernel: list_add corruption. next->prev should be prev (ffff8820531518d0), but was ffff884d4c4594d0. (next=ffff884d4c4594d0).
Mar  5 11:41:46 chadwick kernel: Modules linked in: gfs2 dlm configfs bridge autofs4 des_generic ecb md4 nls_utf8 cifs bnx2fc cnic uio fcoe libfcoe libfc 8021q garp stp llc ipv6 microcode power_meter iTCO_wdt iTCO_vendor_support dcdbas serio_raw ixgbe dca ptp pps_core mdio lpc_ich mfd_core sg ses enclosure i7core_edac edac_core bnx2 ext4 jbd2 mbcache dm_round_robin sr_mod cdrom sd_mod crc_t10dif qla2xxx scsi_transport_fc scsi_tgt pata_acpi ata_generic ata_piix megaraid_sas dm_multipath dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
Mar  5 11:41:46 chadwick kernel: Pid: 74823, comm: vncserver Not tainted 2.6.32-431.3.1.el6.x86_64 #1
Mar  5 11:41:46 chadwick kernel: Call Trace:
Mar  5 11:41:46 chadwick kernel: [<ffffffff81071e27>] ? warn_slowpath_common+0x87/0xc0
Mar  5 11:41:46 chadwick kernel: [<ffffffff81071f16>] ? warn_slowpath_fmt+0x46/0x50
Mar  5 11:41:46 chadwick kernel: [<ffffffff812944ed>] ? __list_add+0x6d/0xa0
Mar  5 11:41:46 chadwick kernel: [<ffffffff811a6c02>] ? new_inode+0x72/0xb0
Mar  5 11:41:46 chadwick kernel: [<ffffffffa03f45d5>] ? gfs2_create_inode+0x1b5/0x1150 [gfs2]
Mar  5 11:41:46 chadwick kernel: [<ffffffffa03f3986>] ? gfs2_glock_nq_init+0x16/0x40 [gfs2]
Mar  5 11:41:46 chadwick kernel: [<ffffffffa03ffc74>] ? gfs2_mkdir+0x24/0x30 [gfs2]
Mar  5 11:41:46 chadwick kernel: [<ffffffff8122766f>] ? security_inode_mkdir+0x1f/0x30
Mar  5 11:41:46 chadwick kernel: [<ffffffff81198149>] ? vfs_mkdir+0xd9/0x140
Mar  5 11:41:46 chadwick kernel: [<ffffffff8119ab67>] ? sys_mkdirat+0xc7/0x1b0
Mar  5 11:41:46 chadwick kernel: [<ffffffff8119ac68>] ? sys_mkdir+0x18/0x20
Mar  5 11:41:46 chadwick kernel: [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
Mar  5 11:41:46 chadwick kernel: ---[ end trace e51734a39976a028 ]---
Mar  5 11:41:46 chadwick kernel: GFS2: fsid=analysis:lvol0.1: quota exceeded for user 101355
Mar  5 11:41:47 chadwick abrtd: Directory 'oops-2014-03-05-11:41:47-12194-1' creation detected
Mar  5 11:41:47 chadwick abrt-dump-oops: Reported 1 kernel oopses to Abrt
Mar  5 11:41:47 chadwick abrtd: Can't open file '/var/spool/abrt/oops-2014-03-05-11:41:47-12194-1/uid': No such file or directory
Mar  5 11:41:54 chadwick kernel: GFS2: fsid=analysis:lvol0.1: quota exceeded for user 101355



