[CentOS] NFSv4 hangs on file open

Pawel Salek pawsa-gpa at theochem.kth.se
Tue Jun 12 09:13:28 UTC 2007


I have a relatively loaded CentOS5 server (64-bit, dual core) and a mixed
bag of Fedora 6/CentOS4 clients, both 32- and 64-bit. NFSv3 works without
problems. Accesses over NFSv4 hang occasionally, in particular when
opening files. Is there anybody here who can help trace this, or who can
suggest a more appropriate forum?

The one hang that I have been able to trace involved only
CentOS5/2.6.18-8.1.4.el5 machines. The server is 64-bit, the client
32-bit. The hang happened when a program was about to be executed from
an NFSv4 share. LD_LIBRARY_PATH included a directory on this share. A gdb
backtrace revealed that the process was still being loaded into memory,
and that an attempt to open a (nonexistent) library file never completed.

strace -p 19289
Process 19289 attached - interrupt to quit
open("/pkg/pgi/5.2-4//linux86/5.2/lib/libg2c.so.0", O_RDONLY <unfinished  

gdb program.x 19289

0x0063cb04 in open () from /lib/ld-linux.so.2
(gdb) where
0  0x0063cb04 in open () from /lib/ld-linux.so.2
1  0x0062d6c5 in open_verify () from /lib/ld-linux.so.2
2  0x0062dc6a in open_path () from /lib/ld-linux.so.2
3  0x0063055f in _dl_map_object () from /lib/ld-linux.so.2
4  0x006340d6 in openaux () from /lib/ld-linux.so.2
5  0x00635b46 in _dl_catch_error () from /lib/ld-linux.so.2
6  0x0063469a in _dl_map_object_deps () from /lib/ld-linux.so.2
7  0x0062b40e in dl_main () from /lib/ld-linux.so.2
8  0x0063b8bb in _dl_sysdep_start () from /lib/ld-linux.so.2
9  0x006292b8 in _dl_start () from /lib/ld-linux.so.2
10 0x00628817 in _start () from /lib/ld-linux.so.2
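
For anyone who wants to check for the same symptom without going through
the dynamic loader, here is a minimal sketch of a probe. It is not
something I ran as part of the trace above. The function name probe_open
and the default 5-second limit are made up for illustration. It assumes a
POSIX shell, and on a hard NFS mount the background subshell may stay
stuck after the probe reports "hang".

```shell
#!/bin/sh
# probe_open PATH [SECONDS]
# Hypothetical helper (not part of any package): try to open PATH
# read-only in a background subshell and print one of
#   ok       the open succeeded
#   missing  the open failed promptly (e.g. ENOENT)
#   hang     the open had not returned after SECONDS seconds
probe_open() {
    path=$1
    limit=${2:-5}
    # The input redirection performs the open(); ':' does nothing else.
    ( : < "$path" ) >/dev/null 2>&1 &
    pid=$!
    waited=0
    # Poll instead of blocking in wait, so a stuck open cannot stall us.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$limit" ]; then
            echo hang
            return 2
        fi
        sleep 1
        waited=$((waited + 1))
    done
    # The subshell has exited; collect its status.
    if wait "$pid"; then
        echo ok
        return 0
    fi
    echo missing
    return 1
}
```

It would be called with a path on the NFSv4 mount, for example the
library path from the strace output above. A healthy mount should answer
"missing" for a file that does not exist.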

The NFSv4 share is automounted (direct mount):

cat /etc/auto.pkg
/pkg    -fstype=nfs4    server:/i32

For what it's worth, I tried NFSv4 with a CentOS4 server but it was
hopeless (the server would stop responding or panic). Older Fedora 6
kernel releases (2.6.19?) were hopeless too, with similar symptoms.

Does anybody know who might be interested in a detailed bug report, or
who could help debug the problem?

The server logs plenty of messages like this:
NFSD: setclientid: string in use by client(clientid 46604ac4/00000016)
but my rpc.idmapd configuration is correct as far as I can tell.
