[CentOS] System Hang on busy NFS server

Fri Aug 23 19:26:05 UTC 2013
James A. Peltier <jpeltier at sfu.ca>

On a busy NFS server I've started receiving the error messages linked below, and the system hangs with high load but no work being done.  The system is a Dell R510 with 12 x 3TB drives in a RAID-50 configuration.  The RAID-50 device is used as a whole-disk LVM physical volume (no partitions) holding one large (36TB) data volume.  The system is running CentOS 6.4, fully patched for OS and firmware.

Does anyone have any hints as to what might be causing this?  It looks to me like the system starts to swap, the swapping fails, and then XFS starts to throw a hissy fit because it can't allocate pages for its buffers.  Eventually the system just deadlocks.  I'm looking for someone to confirm my findings and let me know whether this is a bug or not.
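
A minimal sketch of how the swap theory could be checked ahead of the next hang: parse /proc/meminfo and record free memory and swap in use, so the numbers can be lined up against the timestamps in dmesg.  The function names and output format here are made up for illustration; only the /proc/meminfo field names (MemFree, SwapTotal, SwapFree) come from the kernel.

```python
#!/usr/bin/env python3
"""Snapshot memory and swap headroom from /proc/meminfo (illustrative sketch)."""
import time


def parse_meminfo(text):
    """Return {field name: integer value} from /proc/meminfo-style text.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are counts.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not "Name: value"
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def swap_used_kb(info):
    """Swap currently in use, in kB (SwapTotal minus SwapFree)."""
    return info.get("SwapTotal", 0) - info.get("SwapFree", 0)


def format_snapshot(info):
    """One log line with a timestamp, free memory, and swap in use."""
    return "%s MemFree=%dkB SwapUsed=%dkB" % (
        time.strftime("%Y-%m-%d %H:%M:%S"),
        info.get("MemFree", 0),
        swap_used_kb(info),
    )


def snapshot(path="/proc/meminfo"):
    """Read the live file and return one formatted log line."""
    with open(path) as f:
        return format_snapshot(parse_meminfo(f.read()))
```

Calling print(snapshot()) from cron or a shell loop every few seconds, with the output appended to a file on a different filesystem than the XFS volume, would show whether swap use climbs before the XFS messages start.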

I've posted the dmesg output on pastebin: http://pastebin.com/YQbhsN6a

Any help is very much appreciated!

James A. Peltier
Manager, IT Services - Research Computing Group
Simon Fraser University - Burnaby Campus
Phone   : 778-782-6573
Fax     : 778-782-3045
E-Mail  : jpeltier at sfu.ca
Website : http://www.sfu.ca/itservices

“A successful person is one who can lay a solid foundation from the bricks others have thrown at them.” -David Brinkley via Luke Shaw