On Mon, Jun 21, 2010 at 4:38 PM, Nataraj incoming-centos@rjl.com wrote:
[...]
Well, it's been a long time since I've done troubleshooting on large NFS networks, but here's an idea...
Are you seeing any kind of packet loss/retransmissions? Take a look at netstat -s. When I last did this work it was with NFS over UDP, but I think retransmitted packets will cause more performance loss with large packet sizes. I used to find machines with broken Ethernet interfaces that would cause these kinds of problems.
Nataraj
Thanks guys for the feedback. I've done more tests: there are very few retransmits (less than 0.01%), so I don't think that's what's happening. The client still seems to be "waiting" for something between requests, which is very strange.
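For anyone who wants to check the retransmit rate on their own clients, here is a minimal Python sketch. It assumes a Linux NFS client that exposes RPC counters in /proc/net/rpc/nfs with a line of the form "rpc <calls> <retrans> <authrefresh>". The function name retrans_percent is just something made up for illustration, and the figure it reports counts RPC-level retransmissions, which is not the same thing as the TCP counters from netstat -s.

```python
def retrans_percent(rpc_stats_text):
    """Return RPC retransmissions as a percentage of calls, or None if not found."""
    for line in rpc_stats_text.splitlines():
        fields = line.split()
        # Assumed client stats line: "rpc <calls> <retrans> <authrefresh>"
        if len(fields) >= 3 and fields[0] == "rpc":
            calls, retrans = int(fields[1]), int(fields[2])
            if calls == 0:
                return 0.0
            return 100.0 * retrans / calls
    return None


if __name__ == "__main__":
    try:
        with open("/proc/net/rpc/nfs") as f:
            pct = retrans_percent(f.read())
    except OSError:
        # Not an NFS client, or not Linux
        pct = None
    if pct is None:
        print("RPC retransmit rate: unavailable")
    else:
        print("RPC retransmit rate: %.4f%%" % pct)
```

The counters are cumulative since boot, so comparing two readings taken before and after a test run should give a more meaningful rate than a single reading.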
On some servers this behavior returned despite rsize being set to 32k; I had to set it to 8k to get reasonable throughput. So there's definitely something fishy going on. This has been reported on over 20 machines, so I don't think we're seeing faulty hardware.
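Since the rsize actually in effect can differ from what was requested at mount time (the server may negotiate it down), it may be worth confirming the value on each client. A rough Python sketch, assuming Linux clients where /proc/mounts lists the effective mount options; nfs_rsizes is a hypothetical helper name, not part of any tool:

```python
def nfs_rsizes(mounts_text):
    """Map each NFS mount point to its effective rsize in bytes."""
    sizes = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        for opt in fields[3].split(","):
            if opt.startswith("rsize="):
                sizes[fields[1]] = int(opt.split("=", 1)[1])
    return sizes


if __name__ == "__main__":
    try:
        with open("/proc/mounts") as f:
            found = nfs_rsizes(f.read())
    except OSError:
        found = {}
    for mount_point, rsize in sorted(found.items()):
        print("%s rsize=%d" % (mount_point, rsize))
```

Running this on a fast client and a slow one and comparing the output would at least rule out a mismatch between the configured and the effective rsize.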
Any thoughts or ideas on how to debug this?
Best,
Alex