UPDATE
I rolled a new kernel that's identical to the stock CentOS 2.6.32-220.el6 kernel with the exception of the new idmapper being enabled. Unfortunately there's been no improvement.
Did you get a chance to try the RHEL kernel?
-Aaron
On Fri, Mar 16, 2012 at 7:01 PM, Ray Van Dolson rayvd@bludgeon.org wrote:
On Fri, Mar 16, 2012 at 01:33:54PM -0700, Aaron Blew wrote:
Hello all,
I'm currently experiencing an issue with an NFS server I've built (a Dell R710 with a Dell PERC H800/LSI 2108 and four external disk trays). It's a backup target for Solaris 10, CentOS 5.5 and CentOS 6.2 servers that mount its data volume via NFS. It has two 10gig NICs set up in a layer2+3 bond for one network, and two more 10gig NICs set up in the same way in another network. The host has a 99T XFS filesystem for the backups. RPCNFSDCOUNT is set to 256.
During backups from clients, the system exhibits odd hangs that interfere with the backup windows of some of our sensitive systems. On the NFS server side we see the following in dmesg. Originally I thought it was related to the dirty writeback cache, but I adjusted dirty_writeback_centisecs and am still seeing the issue.
dmesg during the problem window:
Mar 16 07:01:21 *****store01 kernel: __ratelimit: 11 callbacks suppressed
Mar 16 07:01:21 *****store01 kernel: nfsd: page allocation failure.
<snip>
Has anyone else seen similar issues? I can provide additional details about the server/configuration if anybody needs them. The issue only seems to occur under high write load: we've restored some of these backups and had no trouble reading the data.
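To put a number on how often the failures occur in a given window, the dmesg lines quoted above could be tallied per task. What follows is only a rough sketch and not something from the original report: the function name `count_allocation_failures` and the regular expression are assumptions based solely on the two sample lines shown, and the real log format may differ.

```python
import re
from collections import Counter

# Assumed line shape, inferred from the sample lines in this thread:
#   "<timestamp> <host> kernel: <task>: page allocation failure."
FAILURE_RE = re.compile(r"kernel: (?P<task>[^:]+): page allocation failure")


def count_allocation_failures(lines):
    """Return a Counter mapping task name to number of failure lines seen."""
    counts = Counter()
    for line in lines:
        match = FAILURE_RE.search(line)
        if match:
            counts[match.group("task").strip()] += 1
    return counts


if __name__ == "__main__":
    # The two lines quoted in the report, used here as sample input.
    sample = [
        "Mar 16 07:01:21 *****store01 kernel: __ratelimit: 11 callbacks suppressed",
        "Mar 16 07:01:21 *****store01 kernel: nfsd: page allocation failure.",
    ]
    for task, n in count_allocation_failures(sample).items():
        print(task, n)
```

Because of the "__ratelimit: ... callbacks suppressed" lines, any count obtained this way is a lower bound on the number of failures rather than an exact figure.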
The page allocation failure message made me wonder if your issue could be related to the issue I've run into here[1] on RHEL 6.2.
My issue seems to be related to NFS mounting, but it's possible the root cause could be the same?
A few other links:
https://bugzilla.redhat.com/show_bug.cgi?id=593035
http://www.spinics.net/lists/linux-nfs/msg22248.html
Red Hat has provided me with a test kernel which purportedly will resolve the issue. I haven't had a chance to test it out yet.
Ray
[1] https://bugzilla.redhat.com/show_bug.cgi?id=751992