[CentOS] NFS performance - default rsize

Thu Jun 24 13:08:39 UTC 2010
Alex Still <alex.ranskis at gmail.com>

On Tue, Jun 22, 2010 at 6:40 PM, Ross Walker <rswwalker at gmail.com> wrote:
> On Jun 22, 2010, at 11:44 AM, Alex Still <alex.ranskis at gmail.com> wrote:
>
>> [...]
>>
>>>> On some servers this behavior returned despite rsize being set to 32k;
>>>> I had to set it to 8k to get reasonable throughput. So there's
>>>> definitely something fishy going on. This has been reported on over 20
>>>> machines, so I don't think it's faulty hardware we're seeing.
>>>>
>>>> Any thoughts or ideas on how to debug this?
>>>
>>> Can you explain the network environment and the connectivity between the client and server some more.
>>
>> Clients are blade servers. The blade chassis have integrated Cisco
>> switches, which are plugged into a Cisco 6509. The NFS server is at
>> another site 40 km away, directly connected to another 6509. These
>> datacenters are linked via DWDM.
>> Latency between a client and the NFS server is about half a
>> millisecond. Jumbo frames are enabled.
>>
>> The blades have 1 Gb links.
>> The NFS server has multiple 1 Gb links, used for different shares.
>> Neither is close to full utilization: maybe 100 Mb/s of traffic and
>> 20,000 packets/s at the server end.
>
> I have seen non-standard jumbo frames cause problems in the past.
>
> Can you try unmounting the shares on one client, setting the MTU to 1500, re-mounting the shares and seeing how it works?
>
> TCP between the server and the client will negotiate down to the client's MSS, so there's no need to change the server's MTU.
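
For reference, the test on one of the blades amounted to roughly the
following (eth0, server:/export and /mnt/share are placeholders, not the
real interface or export names):

    # drop the client MTU to 1500 and re-mount the share
    umount /mnt/share
    ifconfig eth0 mtu 1500
    mount -t nfs -o tcp server:/export /mnt/share

    # and, with jumbo frames back on, a sanity check of the path MTU:
    # 8972 bytes of ICMP payload + 28 bytes of headers = 9000, DF set
    ping -M do -s 8972 -c 3 server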

It doesn't seem to help. But we missed something during the first
tests: there are indeed some retransmits at the TCP level, on the
server side. These go away when we set rsize to 32k. So it's probably
a network issue we'll have to figure out.
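
For anyone wanting to look at the same thing, something like the
following shows the TCP retransmit counters on the server and the mount
options in question (server:/export and /mnt/share are placeholders
again):

    # on the NFS server: overall TCP statistics, including
    # "segments retransmitted"
    netstat -s | grep -i retrans

    # on a client: the share mounted with the smaller transfer sizes
    mount -t nfs -o tcp,rsize=32768,wsize=32768 server:/export /mnt/share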

Thanks for your feedback,

Alex