[CentOS] CentOS7 and NFS

Fri May 15 07:26:20 UTC 2020
Patrick Bégou <Patrick.Begou at legi.grenoble-inp.fr>

On 13/05/2020 at 15:36, Patrick Bégou wrote:
> On 13/05/2020 at 07:32, Simon Matter via CentOS wrote:
>>> On 12/05/2020 at 16:10, James Pearson wrote:
>>>> Patrick Bégou wrote:
>>>>> Hi,
>>>>>
>>>>> I need some help with NFSv4 setup/tuning. I have a dedicated nfs server
>>>>> (2 x E5-2620  8cores/16 threads each, 64GB RAM, 1x10Gb ethernet and 16x
>>>>> 8TB HDD) used by two servers and a small cluster (400 cores). All the
>>>>> servers are running CentOS 7, the cluster is running CentOS6.
>>>>>
>>>>> From time to time on the server I get:
>>>>>
>>>>>       kernel: NFSD: client xxx.xxx.xxx.xxx testing state ID with
>>>>>      incorrect client ID
>>>>>
>>>>> And the client xxx.xxx.xxx.xxx freezes with:
>>>>>
>>>>>       kernel: nfs: server xxxxx.legi.grenoble-inp.fr not responding,
>>>>>      still trying
>>>>>       kernel: nfs: server xxxxx.legi.grenoble-inp.fr OK
>>>>>       kernel: nfs: server xxxxx.legi.grenoble-inp.fr not responding,
>>>>>      still trying
>>>>>       kernel: nfs: server xxxxx.legi.grenoble-inp.fr OK
>>>>>
>>>>> There is a discussion about this on the Red Hat 7 support site, but it
>>>>> is only open to subscribers. Other Google searches do not turn up
>>>>> useful information.
>>>>>
>>>>> Do you have any idea how to solve these freezes?
>>>>>
>>>>> More generally, I would be really interested in some advice/tutorials
>>>>> to improve NFS performance in this dedicated context. There are so
>>>>> many [different] things about tuning NFS available on the web that I'm
>>>>> a little bit lost (the opposite of the previous question). So if
>>>>> someone has "the tutorial"...;-)
>>>> How many nfsd threads are you running on the server? - current count
>>>> will be in /proc/fs/nfsd/threads
>>>>
>>>> James Pearson
>>> Hi James,
>>>
>>> Thanks for your answer. I've configured 24 threads (for 16 hardware
>>> cores / 32 threads on the NFS server with these processors).
>>>
>>> But it seems that there are also buffer settings to modify when
>>> increasing the number of threads... That has not been done.
>>>
>>> Load average on the server is below 1....
>> I'd be very careful with running more threads than physical cores. NFS
>> threads and so-called CPU hyper/simultaneous threads are quite different
>> things and it can hurt performance if not configured correctly.
>>
> So you suggest limiting the setup to 16 daemons? I'll try this evening.
>
Setting 16 daemons (the number of physical cores) does not solve this
problem. Moreover, I saw an (old) document provided by Dell on
optimizing NFS server performance in an HPC context, and they suggest
using... 128 daemons on a dedicated PowerEdge server. :-\
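
For comparing the running nfsd thread count with the CPU count on the
server, a minimal Python sketch could look like the one below. It reads
/proc/fs/nfsd/threads, the file James mentioned; the function name
read_nfsd_threads and its path parameter are only illustrative, and the
script merely reports the numbers without recommending a value.

```python
import os


def read_nfsd_threads(path="/proc/fs/nfsd/threads"):
    """Return the number of running nfsd threads, or None if unavailable.

    The default path is the one mentioned in this thread; the parameter
    exists only so the function can be pointed at another file.
    """
    try:
        with open(path) as f:
            return int(f.read().split()[0])
    except (OSError, ValueError, IndexError):
        # File missing (nfsd not running or not an NFS server) or unparsable.
        return None


if __name__ == "__main__":
    threads = read_nfsd_threads()
    cpus = os.cpu_count()  # logical CPUs, so hyper-threads are included
    print(f"nfsd threads: {threads}, logical CPUs: {cpus}")
```

Note that os.cpu_count() reports logical CPUs (32 on this server), not
the 16 physical cores.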

I saw that it is always the same client showing the problem (a large fat
node), so maybe I should investigate the client side more than the
server side.
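
To see how often a client hits these timeouts, one option is to count
the "not responding" kernel messages per server in the client's log. The
sketch below is only a rough illustration: the regular expression is
modelled on the log lines quoted in this thread, count_timeouts is a
hypothetical helper name, and nfs.example.org is a placeholder hostname.

```python
import re
from collections import Counter

# Pattern modelled on the client messages quoted in this thread, e.g.
# "kernel: nfs: server <host> not responding, still trying"
PATTERN = re.compile(r"nfs: server (\S+) not responding")


def count_timeouts(lines):
    """Count 'not responding' events per NFS server in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Placeholder sample; real input would be the client's kernel log.
    sample = [
        "kernel: nfs: server nfs.example.org not responding, still trying",
        "kernel: nfs: server nfs.example.org OK",
        "kernel: nfs: server nfs.example.org not responding, still trying",
    ]
    for server, n in count_timeouts(sample).items():
        print(f"{server}: {n} timeout message(s)")
```

On the real client, the same function can be given an open log file
object instead of the sample list.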

Patrick