[CentOS] Help needed with NFS issue

Thu Apr 19 13:36:01 UTC 2012
Steve Thompson <smt at vgersoft.com>

On Thu, 19 Apr 2012, Giovanni Tirloni wrote:

> Did you run this command during "the hang" or is it constantly returning
> you that?

It returns the timeout only during the hang; the rest of the time it
works normally.

> If the latter, are you blocking UDP on either the server or the client?

No blocking.

> If you don't specify transport protocol, rpcinfo will use whatever is
> defined in the /etc/netconfig database and that's usually UDP.

Using UDP or TCP makes no difference. "rpcinfo -u host nfs" and
"rpcinfo -t host nfs" both time out during the hang, and both work
normally at other times.

> - Is it happening at the exact same minute (e.g. 2:15, 2:45, 3:15, 3:45)?
> This might help you to identify a script/program that follows that schedule.

It is not related to any script that I can find. It does not happen at
_exactly_ the same time on each occasion, although the times agree to
within a few minutes.

> - Is there any configuration difference between this server and the others?
> /etc/system, root crontab, etc.

No differences that I can find.

> - When you say everything else BUT NFS is working fine, are pings answered
> properly without increased latency during "the hang" ?

Yes. I can even run an iperf server on the host during the hang, and from
a client I run iperf -c and get normal performance.

> - What about other services? Can you set up a monitoring script connecting
> to some other service (eg. ftp, ls, exit or ssh) and reporting the total
> run time?

No other service appears to be impacted at all.
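
For reference, a rough sketch of the kind of timing check suggested above,
assuming Python 3 on the client. It only measures how long a TCP connection
to each port takes to open, not a full ssh or ftp session, and the function
names and port table are illustrative, not something from this thread.

```python
import socket
import time


def connect_time(host, port, timeout=5.0):
    """Return the seconds taken to open a TCP connection, or None on failure."""
    start = time.monotonic()
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        return None


def report(host, ports):
    """Return one timestamped log line per service in the ports mapping."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    lines = []
    for name, port in ports.items():
        elapsed = connect_time(host, port)
        result = "FAILED" if elapsed is None else "%.3fs" % elapsed
        lines.append("%s %s %s/%d %s" % (stamp, host, name, port, result))
    return lines


# Example (hypothetical host name):
# for line in report("nfsserver", {"ssh": 22, "ftp": 21}):
#     print(line)
```

Run from cron or a loop on a client, the output can be lined up against the
times of the hang to see whether connect latency changes at all.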

> - Can you set up a monitoring script running "rpcinfo" on localhost to make
> sure both local and remote communications hang?

Yes, can do.
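
A minimal sketch of such a monitor, assuming Python 3.5 or later and that
rpcinfo is on the PATH. It runs "rpcinfo -u" and "rpcinfo -t" against each
host given and logs the result and elapsed time. The 10-second limit, the
30-second interval and the host name "nfsserver" are placeholders.

```python
import subprocess
import sys
import time


def probe(cmd, timeout=10.0):
    """Run cmd and return (status, seconds).

    status is "ok", "fail" (non-zero exit), "timeout" or "missing".
    """
    start = time.monotonic()
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
        status = "ok" if result.returncode == 0 else "fail"
    except subprocess.TimeoutExpired:
        status = "timeout"
    except FileNotFoundError:
        status = "missing"
    return status, time.monotonic() - start


def rpc_commands(hosts, transports=("-u", "-t")):
    """Build one rpcinfo command per host and transport (UDP, then TCP)."""
    return [["rpcinfo", flag, host, "nfs"] for host in hosts for flag in transports]


def monitor(hosts, interval=30.0, rounds=None):
    """Probe every host each interval; rounds=None loops until interrupted."""
    done = 0
    while rounds is None or done < rounds:
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        for cmd in rpc_commands(hosts):
            status, elapsed = probe(cmd)
            print("%s %s %s %.2fs" % (stamp, " ".join(cmd), status, elapsed))
        sys.stdout.flush()
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval)


# Example, run on the server itself (remote host name is hypothetical):
# monitor(["localhost", "nfsserver"])
```

Running it on the server with "localhost" in the list, and on a client with
the server's name, shows whether local and remote calls stall at the same
moments.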

-Steve