[CentOS] Help needed with NFS issue
Giovanni Tirloni
gtirloni at sysdroid.com
Thu Apr 19 13:22:18 UTC 2012
Jumping in late on this thread, so pardon my ignorance of some details...
On Wed, Apr 18, 2012 at 4:35 PM, Steve Thompson <smt at vgersoft.com> wrote:
> Interesting. It looks like some kind of RPC failure. During the hang, I
> cannot contact the nfs service via RPC:
>
> # rpcinfo -t <server> nfs
> rpcinfo: RPC: Timed out
> program 100003 version 0 is not available
>
Did you run this command during "the hang", or does it always return
that?
If the latter, are you blocking UDP on either the server or the client?
> # rpcinfo -p <server>
> program vers proto port
> 100000 2 tcp 111 portmapper
> 100000 2 udp 111 portmapper
> 100024 1 udp 1007 status
> 100024 1 tcp 1010 status
> 100021 1 udp 35077 nlockmgr
> 100021 3 udp 35077 nlockmgr
> 100021 4 udp 35077 nlockmgr
> 100021 1 tcp 56622 nlockmgr
> 100021 3 tcp 56622 nlockmgr
> 100021 4 tcp 56622 nlockmgr
> 100011 1 udp 1009 rquotad
> 100011 2 udp 1009 rquotad
> 100011 1 tcp 1012 rquotad
> 100011 2 tcp 1012 rquotad
> 100003 2 udp 2049 nfs
> 100003 3 udp 2049 nfs
> 100003 4 udp 2049 nfs
> 100003 2 tcp 2049 nfs
> 100003 3 tcp 2049 nfs
> 100003 4 tcp 2049 nfs
> 100005 1 udp 605 mountd
> 100005 1 tcp 608 mountd
> 100005 2 udp 605 mountd
> 100005 2 tcp 608 mountd
> 100005 3 udp 605 mountd
> 100005 3 tcp 608 mountd
>
> However, I can connect to the service via telnet:
>
> # telnet <server> nfs
> Trying <ipaddr>...
> Connected to <server> (<ipaddr>).
> Escape character is '^]'.
>
If you don't specify a transport protocol, rpcinfo will use whatever is
defined in the /etc/netconfig database, and that's usually UDP.
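A successful telnet only shows that the TCP handshake completes, while
"rpcinfo -t" and "rpcinfo -u" go one step further and call procedure 0 of
the program. To test each transport separately without relying on
rpcinfo's defaults, something along these lines might do. It is only a
rough sketch: it assumes IPv4, AUTH_NONE credentials, and a reply that
fits in a single record fragment, and the function names are mine. The
program number 100003 and port 2049 are taken from the "rpcinfo -p"
output above.

```python
import socket
import struct

NFS_PROGRAM = 100003  # program number listed for nfs in the rpcinfo -p output


def build_null_call(xid, program, version):
    """Build an ONC RPC call message for procedure 0 with AUTH_NONE."""
    return struct.pack(
        ">10I",
        xid,      # transaction id, echoed back in the reply
        0,        # message type: CALL
        2,        # RPC protocol version
        program,
        version,
        0,        # procedure 0 (the NULL procedure)
        0, 0,     # credentials: AUTH_NONE, zero length
        0, 0,     # verifier: AUTH_NONE, zero length
    )


def frame_for_tcp(message):
    """Prefix the record mark used on TCP (last-fragment bit plus length)."""
    return struct.pack(">I", 0x80000000 | len(message)) + message


def recv_exact(sock, count):
    """Read exactly count bytes or raise ConnectionError if the peer closes."""
    data = b""
    while len(data) < count:
        chunk = sock.recv(count - len(data))
        if not chunk:
            raise ConnectionError("connection closed before full reply")
        data += chunk
    return data


def null_call_udp(host, port, program, version, timeout=5.0):
    """Send a NULL call over UDP; return the raw reply or None on failure."""
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.settimeout(timeout)
            sock.sendto(build_null_call(1, program, version), (host, port))
            reply, _ = sock.recvfrom(4096)
            return reply
    except OSError:  # includes timeouts
        return None


def null_call_tcp(host, port, program, version, timeout=5.0):
    """Send a NULL call over TCP; return the raw reply or None on failure."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(frame_for_tcp(build_null_call(1, program, version)))
            mark = struct.unpack(">I", recv_exact(sock, 4))[0]
            return recv_exact(sock, mark & 0x7FFFFFFF)
    except OSError:  # includes timeouts and refused connections
        return None


def call_succeeded(reply, xid=1):
    """True if the reply is an accepted RPC reply with status SUCCESS."""
    if reply is None or len(reply) < 24:
        return False
    rxid, mtype, reply_stat, _flavor, verf_len = struct.unpack(">5I", reply[:20])
    if (rxid, mtype, reply_stat) != (xid, 1, 0):  # 1 = REPLY, 0 = accepted
        return False
    offset = 20 + ((verf_len + 3) // 4) * 4  # skip the padded verifier body
    if len(reply) < offset + 4:
        return False
    return struct.unpack(">I", reply[offset:offset + 4])[0] == 0
```

For example, call_succeeded(null_call_tcp(host, 2049, NFS_PROGRAM, 3)) and
the same with null_call_udp would show whether one transport answers
while the other times out.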
A couple of ideas/questions:
- Is it happening at the exact same minutes past the hour (e.g. 2:15, 2:45,
3:15, 3:45)? This might help you identify a script/program that follows
that schedule.
- Is any configuration different between this server and the others?
/etc/system, root crontab, etc.
- When you say everything else BUT NFS is working fine, are pings answered
properly, without increased latency, during "the hang"?
- What about other services? Can you set up a monitoring script that
connects to some other service (e.g. ftp, ls, exit, or ssh) and reports the
total run time?
- Can you set up a monitoring script that runs "rpcinfo" on localhost, to
check whether both local and remote communications hang?
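The monitoring scripts could be as simple as the following. Treat it as
a sketch under assumptions rather than a finished tool: the host name
"nfs-server.example" is a placeholder, the list of checks is only an
example, and the function names are mine. Each check is timed and logged
with a timestamp, so a hang can be lined up against cron schedules
afterwards.

```python
import subprocess
import time


def timed_run(command, timeout=30.0):
    """Run command and return (elapsed_seconds, status).

    status is "exit=N" for a finished command, "timeout" if it ran longer
    than the timeout, or "error" if it could not be started at all.
    """
    start = time.monotonic()
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
        status = f"exit={result.returncode}"
    except subprocess.TimeoutExpired:
        status = "timeout"
    except OSError:
        status = "error"
    return time.monotonic() - start, status


def format_report(label, elapsed, status, when=None):
    """Format one log line; when is a Unix timestamp (default: now)."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(when))
    return f"{stamp} {label} {status} {elapsed:.3f}s"


# Example checks only; "nfs-server.example" is a placeholder host name.
EXAMPLE_CHECKS = {
    "rpc-local": ["rpcinfo", "-t", "localhost", "nfs"],
    "rpc-remote": ["rpcinfo", "-t", "nfs-server.example", "nfs"],
    "ping": ["ping", "-c", "1", "nfs-server.example"],
}


def monitor(checks, interval=60.0, rounds=None, timeout=30.0):
    """Run every check once per interval; rounds=None means run forever."""
    done = 0
    while rounds is None or done < rounds:
        for label, command in checks.items():
            elapsed, status = timed_run(command, timeout=timeout)
            print(format_report(label, elapsed, status), flush=True)
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval)
```

Running monitor(EXAMPLE_CHECKS) on the NFS server covers the local case,
and running it on a client covers the remote one. Comparing the two logs
should show whether only remote RPC calls stall or both do.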
--
Giovanni