Our large and complex build system is having very sporadic failures as we try to update to CentOS 4.5. This takes the form of files that exist - and have existed for some time - not being found:
file.whatever: No such file: No such file or directory
This happens with both source files and .o's, meaning that the errors come from both the compiler and the linker.
The build storage is on NFS and the errors don't happen when we target local disk. We also have Solaris and CentOS 3.x building the same code base in a similar fashion with no problem.
We are currently using mount options of:
rw,nosuid,bg,timeo=50,retry=1
And have also tried:
rw,intr,bg,proto=tcp,nfsvers=3,rsize=32768,wsize=32768
without success.
Any suggestions welcome.
-Mark
Mark Belanger wrote:
> We are currently using mount options of: rw,nosuid,bg,timeo=50,retry=1 And have also tried: rw,intr,bg,proto=tcp,nfsvers=3,rsize=32768,wsize=32768 without success.
This might be a silly question, but does /proc/mounts show the "hard" option? You need this unless you like getting R/W errors when you get a lot of traffic. I can't remember what the default for CentOS 4.5 is.
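A minimal sketch of that check, assuming Python is available on the client. The helper name nfs_mounts_missing_hard is hypothetical; it takes the text of /proc/mounts and lists the NFS mount points whose option field does not contain "hard":

```python
def nfs_mounts_missing_hard(mounts_text):
    """Return mount points of NFS entries whose options lack 'hard'.

    mounts_text is the content of /proc/mounts, where each line reads:
    device mount_point fs_type options dump pass
    """
    missing = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, fs_type, options = fields[1], fields[2], fields[3]
        if fs_type not in ("nfs", "nfs4"):
            continue  # only NFS client mounts are of interest
        if "hard" not in options.split(","):
            missing.append(mount_point)
    return missing
```

On the build client it could be called as nfs_mounts_missing_hard(open("/proc/mounts").read()); an empty list means every NFS mount reports "hard".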
Jeremy
Jeremy Sanders wrote:
> Mark Belanger wrote:
>> We are currently using mount options of: rw,nosuid,bg,timeo=50,retry=1 And have also tried: rw,intr,bg,proto=tcp,nfsvers=3,rsize=32768,wsize=32768 without success.
> This might be a silly question, but does /proc/mounts show the "hard" option? You need this unless you like getting R/W errors when you get a lot of traffic. I can't remember what the default for CentOS 4.5 is
Yes. It shows:
rw,nosuid,v3,rsize=8192,wsize=8192,hard,tcp,lock,proto=tcp,timeo=50,retrans=5,addr=myhost 0 0
This really seems like a client-side issue. The targeted NFS dirs range from CentOS 3.x hosts to NetApp filers. All exhibit the same problem, and all work fine when our builds are run on CentOS 3.x or Solaris.
-Mark
On Thu, 10 Jan 2008, Mark Belanger wrote:
> This really seems like a client side issue. The targeted NFS dirs range from CentOS 3.x hosts to NetApp filers. All exhibit the same problem and all work fine when our builds are run on CentOS3.x or Solaris
I have also regularly seen this issue and agree that it appears to be client-side. It has occurred for every version of Linux NFS server that I have used, from RH 5.0 up to CentOS 5.1. I also do not see it for any non-Linux client (Tru64, OS X, Solaris). I also have not seen it for CentOS 3/4/5 64-bit clients, but I do see it for CentOS 3 32-bit clients.
Steve
Steve Thompson wrote:
> On Thu, 10 Jan 2008, Mark Belanger wrote:
>> This really seems like a client side issue. The targeted NFS dirs range from CentOS 3.x hosts to NetApp filers. All exhibit the same problem and all work fine when our builds are run on CentOS3.x or Solaris
> I have also regularly seen this issue and agree that it appears to be client-side. It has occurred for every version of Linux NFS server that I have used, from RH 5.0 up to CentOS 5.1. I also do not see it for any non-Linux client (Tru64, OS X, Solaris). I also have not seen it for CentOS 3/4/5 64-bit clients, but I do see it for CentOS 3 32-bit clients.
I have no problems on 3.x - and we use NFS on a massive scale. We did have to set some mount options to get solid performance.
By comparing /proc/mounts I saw that my CentOS 4.5 machine was using tcp and proto=tcp in its mount options, whereas the CentOS 3.5 machines are using udp,proto=udp. I have made that change on the CentOS 4.5 machine and it "seems" to have solved the problem. I'll report back later if the failures continue.
In the meantime, I welcome any thoughts on Linux NFS and offer my current mount options for CentOS 3.5:
-rw,bg,nosuid,timeo=50,retry=1
The output of /proc/mounts for a typical nfs volume is:
rw,v3,rsize=8192,wsize=8192,hard,udp,lock
On CentOS 4 - this seems to work:
-rw,bg,nosuid,timeo=50,retry=1,udp,proto=udp
The output of /proc/mounts for a typical nfs volume is:
nfs rw,nosuid,v3,rsize=8192,wsize=8192,hard,udp,lock,proto=udp,timeo=50,retrans=5
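For anyone wanting to repeat the /proc/mounts comparison between two clients, here is a rough sketch in Python. The function name diff_mount_options is hypothetical, and the two strings are the option fields reported earlier in this thread (the CentOS 3.5 one and the original, failing CentOS 4.5 one):

```python
def diff_mount_options(opts_a, opts_b):
    """Compare two comma-separated mount option strings.

    Returns (only_in_a, only_in_b), each a sorted list of options
    present in one string but not in the other.
    """
    set_a = set(opts_a.split(","))
    set_b = set(opts_b.split(","))
    return sorted(set_a - set_b), sorted(set_b - set_a)


# Option fields as reported in this thread.
centos3 = "rw,v3,rsize=8192,wsize=8192,hard,udp,lock"
centos4 = ("rw,nosuid,v3,rsize=8192,wsize=8192,hard,tcp,lock,"
           "proto=tcp,timeo=50,retrans=5,addr=myhost")

only_centos3, only_centos4 = diff_mount_options(centos3, centos4)
print("only on CentOS 3.5:", only_centos3)
print("only on CentOS 4.5:", only_centos4)
```

The transport shows up directly: udp appears only on the CentOS 3.5 side, tcp and proto=tcp only on the CentOS 4.5 side. The remaining differences (nosuid, timeo, retrans, addr) may just reflect what each kernel chooses to print in /proc/mounts.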
-Mark