[CentOS] gfs2 and quotas - system crash

Mon Mar 10 17:49:59 UTC 2014
Digimer <lists at alteeve.ca>

I've not seen this before, but I am sure the cluster folks would be 
interested to see it. Can you repost/cross-post this to the 
linux-cluster mailing list?

https://www.redhat.com/mailman/listinfo/linux-cluster
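
If it helps when reposting, here is a minimal, untested-on-your-cluster sketch for pulling the quota and list-corruption lines out of syslog so the timing between them is easy to see. The regular expressions are based only on the log lines quoted below and may not match other kernels' wording; the function name `summarize` is hypothetical.

```python
import re

# Patterns derived from the log excerpt in this thread (an assumption:
# other kernel versions may word these messages differently).
QUOTA_RE = re.compile(r"GFS2: fsid=(\S+): quota exceeded for user (\d+)")
CORRUPT_RE = re.compile(r"list_add corruption")
# Classic syslog timestamp at the start of the line, e.g. "Mar  5 11:40:50".
STAMP_RE = re.compile(r"^(\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s")


def summarize(lines):
    """Return (quota_events, corruption_stamps) found in syslog lines.

    quota_events is a list of (timestamp, fsid, uid) tuples;
    corruption_stamps is a list of timestamps of list_add warnings.
    """
    quota_events = []
    corruption_stamps = []
    for line in lines:
        stamp_match = STAMP_RE.match(line)
        if not stamp_match:
            continue  # skip anything that is not a timestamped syslog line
        stamp = stamp_match.group(1)
        quota = QUOTA_RE.search(line)
        if quota:
            quota_events.append((stamp, quota.group(1), int(quota.group(2))))
        elif CORRUPT_RE.search(line):
            corruption_stamps.append(stamp)
    return quota_events, corruption_stamps
```

Calling `summarize(open("/var/log/messages", errors="replace"))` on each node would give two short lists that can be pasted into the report alongside the full trace.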

digimer

On 10/03/14 01:39 PM, stephen.rankin at stfc.ac.uk wrote:
> I have tried sending this before, but it did not appear to get through.
>
> Hello,
>
> When using gfs2 with quotas on a SAN that provides storage to two
> clustered systems running CentOS 6.5, one of the systems
> can crash. The crash appears to happen when a user tries
> to add something to a SAN disk after they have exceeded their
> quota on that disk. Sometimes a stack trace is logged in /var/log/messages
> that appears to indicate that gfs2 caused the problem.
> Messages about the user exceeding their quota appear in the log at
> about the same time as the gfs2 stack trace.
>
> The stack trace is below.
>
> Has anyone got a solution to this, other than switching off quotas? I have
> switched off quotas, which appears to have stabilised the system so far, but I
> do need the quotas on.
>
> Your help is appreciated.
>
> Stephen Rankin
> STFC, RAL, ISIS
>
> Mar  5 11:40:50 chadwick kernel: GFS2: fsid=analysis:lvol0.1: quota exceeded for user 101355
> Mar  5 11:40:50 chadwick nslcd[11420]: [767df3] ldap_explode_dn(usi660) returned NULL: Success
> Mar  5 11:40:50 chadwick nslcd[11420]: [767df3] ldap_result() failed: Invalid DN syntax
> Mar  5 11:40:50 chadwick nslcd[11420]: [767df3] lookup of user usi660 failed: Invalid DN syntax
> Mar  5 11:41:46 chadwick kernel: ------------[ cut here ]------------
> Mar  5 11:41:46 chadwick kernel: WARNING: at lib/list_debug.c:26 __list_add+0x6d/0xa0() (Not tainted)
> Mar  5 11:41:46 chadwick kernel: Hardware name: PowerEdge R910
> Mar  5 11:41:46 chadwick kernel: list_add corruption. next->prev should be prev (ffff8820531518d0), but was ffff884d4c4594d0. (next=ffff884d4c4594d0).
> Mar  5 11:41:46 chadwick kernel: Modules linked in: gfs2 dlm configfs bridge autofs4 des_generic ecb md4 nls_utf8 cifs bnx2fc cnic uio fcoe libfcoe libfc 8021q garp stp llc ipv6 microcode power_meter iTCO_wdt iTCO_vendor_support dcdbas serio_raw ixgbe dca ptp pps_core mdio lpc_ich mfd_core sg ses enclosure i7core_edac edac_core bnx2 ext4 jbd2 mbcache dm_round_robin sr_mod cdrom sd_mod crc_t10dif qla2xxx scsi_transport_fc scsi_tgt pata_acpi ata_generic ata_piix megaraid_sas dm_multipath dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
> Mar  5 11:41:46 chadwick kernel: Pid: 74823, comm: vncserver Not tainted 2.6.32-431.3.1.el6.x86_64 #1
> Mar  5 11:41:46 chadwick kernel: Call Trace:
> Mar  5 11:41:46 chadwick kernel: [<ffffffff81071e27>] ? warn_slowpath_common+0x87/0xc0
> Mar  5 11:41:46 chadwick kernel: [<ffffffff81071f16>] ? warn_slowpath_fmt+0x46/0x50
> Mar  5 11:41:46 chadwick kernel: [<ffffffff812944ed>] ? __list_add+0x6d/0xa0
> Mar  5 11:41:46 chadwick kernel: [<ffffffff811a6c02>] ? new_inode+0x72/0xb0
> Mar  5 11:41:46 chadwick kernel: [<ffffffffa03f45d5>] ? gfs2_create_inode+0x1b5/0x1150 [gfs2]
> Mar  5 11:41:46 chadwick kernel: [<ffffffffa03f3986>] ? gfs2_glock_nq_init+0x16/0x40 [gfs2]
> Mar  5 11:41:46 chadwick kernel: [<ffffffffa03ffc74>] ? gfs2_mkdir+0x24/0x30 [gfs2]
> Mar  5 11:41:46 chadwick kernel: [<ffffffff8122766f>] ? security_inode_mkdir+0x1f/0x30
> Mar  5 11:41:46 chadwick kernel: [<ffffffff81198149>] ? vfs_mkdir+0xd9/0x140
> Mar  5 11:41:46 chadwick kernel: [<ffffffff8119ab67>] ? sys_mkdirat+0xc7/0x1b0
> Mar  5 11:41:46 chadwick kernel: [<ffffffff8119ac68>] ? sys_mkdir+0x18/0x20
> Mar  5 11:41:46 chadwick kernel: [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
> Mar  5 11:41:46 chadwick kernel: ---[ end trace e51734a39976a028 ]---
> Mar  5 11:41:46 chadwick kernel: GFS2: fsid=analysis:lvol0.1: quota exceeded for user 101355
> Mar  5 11:41:47 chadwick abrtd: Directory 'oops-2014-03-05-11:41:47-12194-1' creation detected
> Mar  5 11:41:47 chadwick abrt-dump-oops: Reported 1 kernel oopses to Abrt
> Mar  5 11:41:47 chadwick abrtd: Can't open file '/var/spool/abrt/oops-2014-03-05-11:41:47-12194-1/uid': No such file or directory
> Mar  5 11:41:54 chadwick kernel: GFS2: fsid=analysis:lvol0.1: quota exceeded for user 101355
>


-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without 
access to education?