Hi, I was hoping someone would have an idea of what's going on here...
We have two NFS issues. One is certainly CentOS-based; the other we're not sure about.
The first issue: as of CentOS 5, we can't access a directory simultaneously over NFS. To reproduce, I cd into a share in two windows, copy a 1 GB file in the first window, and run ls in the other. The ls hangs until the write is done.
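The two-window reproduction described above can be scripted so the stall is measurable. The following is only a minimal sketch, not something from the original post: the function name, the test file name, and the default sizes are assumptions, and it stats every directory entry on the assumption that the ls in question does the same (as `ls -l` would).

```python
import os
import threading
import time


def time_listing_during_write(directory, size_bytes=1 << 30, chunk_bytes=1 << 20):
    """Write a large file into `directory` while repeatedly listing it.

    Returns the duration in seconds of each listing that was started
    before the write finished. Name and defaults are hypothetical.
    """
    target = os.path.join(directory, "nfs_hang_test.bin")  # hypothetical file name
    done = threading.Event()

    def writer():
        chunk = b"\0" * chunk_bytes
        try:
            with open(target, "wb") as f:
                written = 0
                while written < size_bytes:
                    n = min(chunk_bytes, size_bytes - written)
                    f.write(chunk[:n])
                    written += n
        finally:
            done.set()

    thread = threading.Thread(target=writer)
    thread.start()

    timings = []
    while True:
        start = time.monotonic()
        # Read the directory and stat each entry, roughly what `ls -l` does.
        with os.scandir(directory) as entries:
            for entry in entries:
                entry.stat()
        timings.append(time.monotonic() - start)
        if done.is_set():
            break
        time.sleep(0.1)

    thread.join()
    os.remove(target)
    return timings
```

Running it against a directory on the NFS mount and looking at `max(timings)` gives a rough number to compare between CentOS 4.x and 5.x clients; a listing that takes about as long as the whole copy would match the hang described here.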
Turning off APIC seems to help a little, but there's still a very significant hang.
This problem does not appear in CentOS 4.x, even when mounting from the same NFS server.
The other problem: we're seeing tremendous slowdowns when going through an Acopia NFS virtualization server. These slowdowns got much more severe when we moved to CentOS 5.2. If we connect to the NFS appliance directly, the slowdowns don't exist.
Has anyone seen problems like this? How did you solve them?
Thanks.
--Russell
On Mon, Dec 1, 2008 at 10:32 AM, Russell Miller duskglow@gmail.com wrote:
Hi, I was hoping someone would have an idea of what's going on here...
We have two NFS issues. One of which is certainly centos based, one of which we're not sure of.
First issue is: As of Centos 5, we can't make simultaneous access to a directory via NFS. To duplicate, I cd into a share in two windows, copy a 1G file in the first window, and just do an ls in the other. The ls will hang until the write is done.
Turning off apic seems to help a little, but there's still a very significant hang.
This problem is not apparent in centos 4.x, even when mounting to the same NFS server.
The other problem is - we're seeing tremendous slowdowns when going through an Acopia NFS virtualization server. These slowdowns got much more severe when we moved to Centos 5.2. If we connect to the NFS appliance directly, these slowdowns don't exist.
Has anyone seen problems like this? How did you solve them?
What is the version of your kernel? There are (used to be) known issues with NFS in certain versions of the CentOS-5 kernels.
Akemi
All 5.2 versions have this problem.
--Russell
On Mon, Dec 1, 2008 at 11:03 AM, Akemi Yagi amyagi@gmail.com wrote:
On Mon, Dec 1, 2008 at 10:32 AM, Russell Miller duskglow@gmail.com wrote:
Hi, I was hoping someone would have an idea of what's going on here...
We have two NFS issues. One of which is certainly centos based, one of which we're not sure of.
First issue is: As of Centos 5, we can't make simultaneous access to a directory via NFS. To duplicate, I cd into a share in two windows, copy a 1G file in the first window, and just do an ls in the other. The ls will hang until the write is done.
Turning off apic seems to help a little, but there's still a very significant hang.
This problem is not apparent in centos 4.x, even when mounting to the same NFS server.
The other problem is - we're seeing tremendous slowdowns when going through an Acopia NFS virtualization server. These slowdowns got much more severe when we moved to Centos 5.2. If we connect to the NFS appliance directly, these slowdowns don't exist.
Has anyone seen problems like this? How did you solve them?
What is the version of your kernel? There are (used to be) known issues with NFS in certain versions of the CentOS-5 kernels.
Akemi
On Mon, Dec 1, 2008 at 11:10 AM, Russell Miller duskglow@gmail.com wrote:
All 5.2 versions have this problem.
--Russell
You might want to look into upstream bugzilla reports:
https://bugzilla.redhat.com/show_bug.cgi?id=436004
and
https://bugzilla.redhat.com/show_bug.cgi?id=448130
and see if your issue is related to any of them. If it looks like it, try a workaround or a test kernel offered in there.
Akemi
On Mon, Dec 1, 2008 at 11:27 AM, Akemi Yagi amyagi@gmail.com wrote:
On Mon, Dec 1, 2008 at 11:10 AM, Russell Miller duskglow@gmail.com wrote:
All 5.2 versions have this problem.
--Russell
You might want to look into upstream bugzilla reports:
https://bugzilla.redhat.com/show_bug.cgi?id=436004
and
Thanks, I'm reading them now.
--Russell
Russell Miller wrote:
Hello.
directory via NFS. To duplicate, I cd into a share in two windows, copy a 1G file in the first window, and just do an ls in the other. The ls will hang until the write is done. Has anyone seen problems like this? How did you solve them?
Your nfs settings could be helpful. I am using nfs4 with no such problems.
Example settings from my /etc/exports:
/exports/<DIR> 192.168.0.0/2 (ro,nohide,insecure,no_subtree_check,async)
Example settings from my /etc/auto.* (automount):
* -fstype=nfs4,rw,tcp,port=2049,soft,intr,rsize=8192,wsize=8192,nosuid <SERVER>:/<PATH>/&
Maybe the options *async* and the settings for rsize and wsize could be helpful for you?
regards Olaf
Example settings from my /etc/auto.* (automount):
-fstype=nfs4,rw,tcp,port=2049,soft,intr,rsize=8192,wsize=8192,nosuid <SERVER>:/<PATH>/&
Maybe the options *async* and the settings for rsize and wsize could be helpful for you?
We're using nfs3 over tcp. rsize and wsize are 32768. Async is default, though I've tried sync.
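Since the options requested at mount time and the options the client actually negotiated can differ, it may help to read them back from /proc/mounts when comparing setups. This is a small sketch under assumptions: the function name is made up, and the sample line in the usage comment is invented for illustration rather than taken from the systems in this thread.

```python
def nfs_mount_options(mounts_text):
    """Parse /proc/mounts-style text and return the options of NFS mounts.

    Result maps mount point -> {option: value}; flag options such as
    "rw" map to True. Only the "nfs" and "nfs4" filesystem types are kept.
    """
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        opts = {}
        for item in fields[3].split(","):
            key, sep, value = item.partition("=")
            opts[key] = value if sep else True
        result[fields[1]] = opts
    return result


# Usage on a live client (hypothetical):
#     with open("/proc/mounts") as f:
#         for mount_point, opts in nfs_mount_options(f.read()).items():
#             print(mount_point, opts.get("rsize"), opts.get("wsize"))
```

Posting the resulting rsize, wsize, and protocol values alongside the kernel version makes it easier for others on the list to compare against a working configuration.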
Funny thing is, turning on nfs debug and trying to trigger this problem seems to cause data corruption. Once it even managed to corrupt the local disk writes to the point where the journals aborted and I had to reboot.
--Russell
Russell Miller wrote:
Hello.
We're using nfs3 over tcp. rsize and wsize are 32768. Async is default, though I've tried sync.
Funny thing is, turning on nfs debug and trying to trigger this problem seems to cause data corruption. Once it even managed to corrupt the local disk writes to the point where the journals aborted and I had to reboot.
Is portmap installed on server and client?
regards Olaf
On Mon, Dec 1, 2008 at 11:48 AM, Olaf Mueller daily-planet@istari.de wrote:
Funny thing is, turning on nfs debug and trying to trigger this problem seems to cause data corruption. Once it even managed to corrupt the local disk writes to the point where the journals aborted and I had to reboot.
Is portmap installed on server and client?
On the client, yes. On the server, I don't know, as these are Acopia virtualization servers and onstor/bluearc filers.
--Russell
Russell Miller wrote:
Hello.
We're using nfs3 over tcp. rsize and wsize are 32768. Async is default, though I've tried sync.
Try rsize=8192 and wsize=8192. And my settings for /etc/hosts.allow, maybe helpful?
portmap: 127.0.0. 192.168.0.
lockd: 127.0.0. 192.168.0.
rquotad: 127.0.0. 192.168.0.
mountd: 127.0.0. 192.168.0.
statd: 127.0.0. 192.168.0.
nfsd: 127.0.0. 192.168.0.
regards Olaf
On Mon, Dec 1, 2008 at 11:51 AM, Olaf Mueller daily-planet@istari.de wrote:
Russell Miller wrote:
Hello.
We're using nfs3 over tcp. rsize and wsize are 32768. Async is default, though I've tried sync.
Try rsize=8192 and wsize=8192. And my settings for /etc/hosts.allow, maybe helpful?
I can try it. But there's a reason we're using 32768: apparently the Acopias don't like 8192.
--Russell
Russell Miller wrote:
Hello.
We're using nfs3 over tcp. rsize and wsize are 32768. Async is default, though I've tried sync.
Try rsize=8192 and wsize=8192. And my settings for /etc/hosts.allow, maybe helpful?
I can try it. But there's a reason we're using 32768: apparently the Acopias don't like 8192.
OK, but the portmap service has to run on the server. Correct me if I'm wrong, but the CentOS clients ask the server which port to use, and that is what portmap does.
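When the server is an appliance that cannot be inspected directly, one rough check from the client side is whether anything answers on the portmapper's well-known TCP port, 111. The sketch below is an illustration only: the function name is an assumption, and an open port shows reachability, not that the RPC services are registered correctly (`rpcinfo -p <SERVER>` is the usual tool for that).

```python
import socket

PORTMAP_PORT = 111  # well-known port for portmap/rpcbind


def tcp_port_open(host, port=PORTMAP_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


# Usage (the host name is a placeholder):
#     print(tcp_port_open("<SERVER>"))
```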
regards Olaf
OK, but the portmap service has to run on the server. Correct me if I'm wrong, but the CentOS clients ask the server which port to use, and that is what portmap does.
Actually I have a little more information, but I'm having a hard time putting the pieces together. It looks like when you turn attribute caching off, the first problem goes completely away, at the expense of slowing things way down. The attribute caching appears to be blocking for some reason - if you set the timeout to 1, the ls will finish after 1 second.
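For anyone wanting to repeat the attribute-caching comparison, the two cases can be expressed as NFS mount options. This is a sketch under assumptions: the message does not say which options were used, so it assumes "attribute caching off" corresponds to `noac` and "timeout to 1" to `actimeo=1` (both documented in nfs(5)); the function name, server path, and mount point are hypothetical.

```python
def nfs_mount_command(server_path, mount_point, base_opts, attr_cache_timeout=None):
    """Build a mount(8) argument list for an NFS mount.

    attr_cache_timeout=None -> leave attribute caching at its defaults
    attr_cache_timeout=0    -> disable attribute caching (noac)
    attr_cache_timeout=N>0  -> cache attributes for N seconds (actimeo=N)
    """
    opts = list(base_opts)
    if attr_cache_timeout is not None:
        if attr_cache_timeout < 0:
            raise ValueError("attr_cache_timeout must be >= 0")
        if attr_cache_timeout == 0:
            opts.append("noac")
        else:
            opts.append("actimeo=%d" % attr_cache_timeout)
    return ["mount", "-t", "nfs", "-o", ",".join(opts), server_path, mount_point]


# Usage (paths are placeholders; the options mirror the nfs3/tcp/32768 setup
# mentioned earlier in the thread):
#     base = ["nfsvers=3", "tcp", "rsize=32768", "wsize=32768"]
#     print(" ".join(nfs_mount_command("<SERVER>:/<PATH>", "/mnt/test", base, 1)))
```

Note that `noac` also makes application writes synchronous, which may account for part of the slowdown seen with caching turned off; `actimeo=0` disables attribute caching without that side effect and might be worth testing separately.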
--Russell