[CentOS] nfs (or tcp or scheduler) changes between centos 5 and 6?

Wed Apr 29 15:36:53 UTC 2015
Devin Reade <gdr at gno.org>

--On Wednesday, April 29, 2015 08:35:29 AM -0500 Matt Garman 
<matthew.garman at gmail.com> wrote:

> All indications are that CentOS 6 seems to be much more "aggressive"
> in how it does NFS reads.  And likewise, CentOS 5 was very "polite",
> to the point that it basically got starved out by the introduction of
> the 6.5 boxes.

A few things come to mind for investigating the differences.  You don't
have to answer them all here; I'm just making sure you've covered them:

Have you looked at the client-side NFS cache?  Perhaps the C6 cache
is either disabled, has fewer resources, or is invalidating faster?
(I don't think that would explain the C5 starvation, though, unless
it's a secondary effect from retransmits, etc.)
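
A rough way to check the retransmit theory on a client, as a sketch only:
it assumes the client exposes its RPC counters in /proc/net/rpc/nfs with
the usual "rpc <calls> <retrans> <authrefrsh>" line (the same numbers
`nfsstat -rc` reports).  The function name is made up for illustration.

```python
def rpc_retrans_ratio(text):
    """Return (calls, retrans, ratio) from the 'rpc' line of /proc/net/rpc/nfs.

    Assumes the line layout 'rpc <calls> <retrans> <authrefrsh>'.
    """
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "rpc":
            calls, retrans = int(fields[1]), int(fields[2])
            # Avoid dividing by zero on a client that has made no RPC calls.
            ratio = retrans / calls if calls else 0.0
            return calls, retrans, ratio
    raise ValueError("no 'rpc' line found")


# Illustrative input only; real data would come from reading
# /proc/net/rpc/nfs on the client under test.
sample = "net 0 0 0 0\nrpc 1000 25 1000\n"
calls, retrans, ratio = rpc_retrans_ratio(sample)
print(f"calls={calls} retrans={retrans} ratio={ratio:.3%}")
```

Sampling this on a C5 and a C6 client under the same load, and comparing
how fast the retrans count grows, should show whether one side is
retransmitting noticeably more than the other.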

Regarding the cache, do you have multiple mount points on a client
that resolve to the same server filesystem?  If so, do they have
different mount options?  If so, that can result in multiple caches
instead of a single disk cache.  The client cache can also be bypassed
if your application is doing direct I/O on the files.  Perhaps there
is a difference in the application between C5 and C6, including
whether or not it was just recompiled?  (If so, can you try a C5 version
on the C6 machines?)
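
To spot the multiple-mount case, something like the following sketch could
be run against the contents of /proc/mounts on a client.  It assumes the
standard "device mountpoint fstype options ..." layout, and it only catches
mounts whose device string (server:/path) is identical; mounts of different
subdirectories of the same server filesystem would need a manual look.  The
function names and the sample hosts and paths are hypothetical.

```python
from collections import defaultdict


def nfs_mounts_by_export(mounts_text):
    """Map each NFS export (server:/path) to {mount_point: frozenset(options)}."""
    exports = defaultdict(dict)
    for line in mounts_text.splitlines():
        fields = line.split()
        # Skip malformed lines and anything that is not an NFS mount.
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        device, mount_point, _fstype, options = fields[:4]
        exports[device][mount_point] = frozenset(options.split(","))
    return dict(exports)


def exports_with_differing_options(mounts_text):
    """Return exports that are mounted more than once with different options."""
    differing = {}
    for device, mounts in nfs_mounts_by_export(mounts_text).items():
        if len(set(mounts.values())) > 1:
            differing[device] = mounts
    return differing


# Illustrative data only; real input is the text of /proc/mounts.
SAMPLE = """\
filer:/vol/data /mnt/a nfs rw,vers=3,proto=tcp,rsize=32768 0 0
filer:/vol/data /mnt/b nfs ro,vers=3,proto=tcp,rsize=8192 0 0
filer:/vol/home /home nfs rw,vers=3,proto=tcp 0 0
/dev/sda1 / ext4 rw 0 0
"""

for device, mounts in exports_with_differing_options(SAMPLE).items():
    print(device)
    for mount_point, options in sorted(mounts.items()):
        print("   ", mount_point, ",".join(sorted(options)))
```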

If you determine that C6 is doing aggressive caching, does this match
the needs of your application?  That is, do you have the situation
where the client NFS layer does an aggressive read-ahead that is never
used by the application?

Are C5 and C6 using the same NFS protocol version?  How about TCP vs
UDP?  If UDP is in play, have a look at fragmentation stats under load.
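
For the fragmentation stats, one option is to snapshot the IP counters
before and after a load run and look at the difference.  The sketch below
assumes the Linux /proc/net/snmp layout, where an "Ip:" line of counter
names is followed by an "Ip:" line of values; the function names are made
up for illustration and the sample numbers are invented.

```python
FRAG_KEYS = (
    "ReasmReqds", "ReasmOKs", "ReasmFails", "ReasmTimeout",
    "FragOKs", "FragFails", "FragCreates",
)


def ip_counters(snmp_text):
    """Parse the 'Ip:' name/value line pair of /proc/net/snmp into a dict."""
    ip_lines = [
        line.split()[1:]
        for line in snmp_text.splitlines()
        if line.startswith("Ip:")
    ]
    if len(ip_lines) < 2:
        raise ValueError("expected an 'Ip:' header line and an 'Ip:' value line")
    names, values = ip_lines[0], ip_lines[1]
    return dict(zip(names, (int(v) for v in values)))


def fragmentation_delta(before_text, after_text):
    """Return how much each fragmentation counter grew between two snapshots."""
    before, after = ip_counters(before_text), ip_counters(after_text)
    return {k: after[k] - before[k] for k in FRAG_KEYS if k in before and k in after}


# Illustrative snapshots only (trimmed to a few columns); real input is the
# text of /proc/net/snmp read before and after the load test.
HEADER = "Ip: Forwarding DefaultTTL ReasmTimeout ReasmReqds ReasmOKs ReasmFails\n"
before = HEADER + "Ip: 2 64 0 100 50 0\n"
after = HEADER + "Ip: 2 64 3 400 140 12\n"
print(fragmentation_delta(before, after))
```

A ReasmFails or ReasmTimeout count that climbs during the run would
suggest fragments are being lost, which matters most for NFS over UDP
with large rsize/wsize values.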

Are both using the same authentication method (i.e., maybe just
UID-based)?

And, as always, is DNS sane for all your clients and servers?  Does
everything (including clients) have proper PTR records that are consistent
with the A records?  DNS is so fundamental that if it is out of whack you
can get far-reaching symptoms that don't seem to have anything to do
with DNS.
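
A quick forward/reverse consistency check could look like the sketch
below.  It is only an approximation: it uses Python's socket.gethostbyname_ex
and socket.gethostbyaddr, which go through the system resolver (so
/etc/hosts and nsswitch settings count too), it covers IPv4 only, and it
expects fully qualified names.  The function name and the example host
names are hypothetical; the lookup functions are passed in as parameters
so the logic can be tried without touching real DNS.

```python
import socket


def check_forward_reverse(hostname,
                          forward=socket.gethostbyname_ex,
                          reverse=socket.gethostbyaddr):
    """Return a list of problems for hostname; an empty list means consistent."""
    try:
        _name, _aliases, addresses = forward(hostname)
    except OSError as exc:  # socket.gaierror and socket.herror derive from OSError
        return [f"{hostname}: forward lookup failed ({exc})"]

    problems = []
    wanted = hostname.lower().rstrip(".")
    for addr in addresses:
        try:
            ptr_name, ptr_aliases, _addrs = reverse(addr)
        except OSError as exc:
            problems.append(f"{addr}: reverse lookup failed ({exc})")
            continue
        # Accept a match on the primary PTR name or any alias it returns.
        names = {n.lower().rstrip(".") for n in [ptr_name, *ptr_aliases]}
        if wanted not in names:
            problems.append(f"{addr}: PTR gives {ptr_name}, expected {hostname}")
    return problems


# Stand-in resolvers with invented data, so this runs without DNS access.
def fake_forward(host):
    return (host, [], ["10.0.0.5"])


def fake_reverse(addr):
    return ("other.example.com", [], [addr])


print(check_forward_reverse("nfs1.example.com", fake_forward, fake_reverse))
```

Calling check_forward_reverse("some.host.name") with the default arguments
on each client and server, for every machine in the NFS setup, would flag
names whose PTR records are missing or point somewhere unexpected.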

<http://wiki.linux-nfs.org> has helpful information about enabling debug
output on the client end to see what is going on.  I don't know in your
situation if enabling server-side debugging is feasible.
<http://nfs.sourceforge.net> also has useful tuning information.

You may want to look at NFSometer and see if it can help.

Devin