Hi all!
Just a word of warning: after updating a few of our x86_64 based web frontend boxes to the new kernel, we began to get weird MySQL timeouts. The problem went away again when we downgraded to the previous kernel-2.6.18-53.1.6.el5.x86_64.rpm
regards, Bent Terp
Bent Terp wrote:
Hi all!
Just a word of warning: after updating a few of our x86_64 based web frontend boxes to the new kernel, we began to get weird MySQL timeouts. The problem went away again when we downgraded to the previous kernel-2.6.18-53.1.6.el5.x86_64.rpm
A bit more info / context would be nice !
On 1/24/08, Karanbir Singh mail-lists@karan.org wrote:
A bit more info / context would be nice !
We upgraded our web front servers to kernel 2.6.18-53.1.6, and suddenly sites wouldn't load. It seemed to be that the connections from php to the backend sql servers timed out, so we immediately downgraded back to 2.6.18-53.1.4
Now that we've had more time to look at the problem, it is not related to mysql, sorry about that. Rather, it looks as if the set of nfs patches does not agree with our EMC Celerra NAS server. Backing out that bunch and rebuilding makes the problem go away.
The patches that give us problems result in a kernel which makes something like 2000 times more "NFS V3 LOOKUP Call" and "NFS V3 LOOKUP Reply" than without.
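One way to put a number on the LOOKUP increase is to snapshot the NFS client counters before and after the same workload on each kernel and compare the deltas. The following is a minimal sketch, not anything used in this thread: it assumes the client counters are exposed in /proc/net/rpc/nfs with a "proc3" line holding the per-operation NFSv3 counts in protocol order, and the function names are made up for illustration.

```python
# Sketch: read NFSv3 client call counters and compute the change between
# two snapshots. Assumes the "proc3" line in /proc/net/rpc/nfs looks like
#   proc3 <number of ops> <null> <getattr> <setattr> <lookup> ...
# All function names here are hypothetical helpers.

NFS3_OPS = [
    "null", "getattr", "setattr", "lookup", "access", "readlink",
    "read", "write", "create", "mkdir", "symlink", "mknod",
    "remove", "rmdir", "rename", "link", "readdir", "readdirplus",
    "fsstat", "fsinfo", "pathconf", "commit",
]


def parse_nfs3_client_stats(text):
    """Return {operation: count} from the proc3 line, or {} if absent."""
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0] == "proc3":
            # fields[1] is the number of counters that follow.
            counts = [int(v) for v in fields[2:2 + len(NFS3_OPS)]]
            return dict(zip(NFS3_OPS, counts))
    return {}


def counter_delta(before, after):
    """Per-operation difference between two snapshots."""
    return {op: after[op] - before.get(op, 0) for op in after}


def read_stats(path="/proc/net/rpc/nfs"):
    """Read and parse the live counters on an NFS client."""
    with open(path) as handle:
        return parse_nfs3_client_stats(handle.read())
```

Calling read_stats() before and after an identical operation on each kernel, then comparing counter_delta(before, after)["lookup"], should show whether the LOOKUP count really differs by orders of magnitude.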
Has something changed with regard to the mount options? We use (rw,noatime,rsize=8192,wsize=8192,hard,udp,context="system_u:object_r:httpd_sys_content_t:s0") which has worked fine until now.
regards, Bent
Bent Terp wrote:
Has something changed with regard to the mount options? We use (rw,noatime,rsize=8192,wsize=8192,hard,udp,context="system_u:object_r:httpd_sys_content_t:s0") which has worked fine until now.
I am trying to duplicate your options ... and noatime is not a valid option.
Could you please double check the /etc/export options again so I can try to duplicate the issue.
Using my standard /etc/exports on 2 i686 test platforms I have no problems at all.
Here are the options I used on my test:
(rw,insecure,sync,no_subtree_check)
Thanks, Johnny Hughes
On Wed, 30 Jan 2008 at 10:18am, Johnny Hughes wrote
Bent Terp wrote:
Has something changed with regard to the mount options? We use (rw,noatime,rsize=8192,wsize=8192,hard,udp,context="system_u:object_r:httpd_sys_content_t:s0") which has worked fine until now.
I am trying to duplicate your options ... and noatime is not a valid option.
Could you please double check the /etc/export options again so I can try to duplicate the issue.
Using my standard /etc/exports on 2 i686 test platforms I have no problems at all.
Here are the options I used on my test:
(rw,insecure,sync,no_subtree_check)
Those are NFS export options. The OP's list is *mount* options (i.e. on the client side). He stated that his NFS server is actually an EMC Celerra.
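To make the export-versus-mount distinction concrete, here is a rough sketch that splits an option string and sorts each option into "generic mount option", "NFS mount option" or "unknown". The two name sets are small, hand-picked samples from man mount and man nfs, not complete lists, and the function names are hypothetical.

```python
# Sketch: classify options from a comma-separated option string.
# The sets below are illustrative samples only, not complete lists.

GENERIC_MOUNT = {"rw", "ro", "atime", "noatime", "nodiratime",
                 "sync", "async", "context"}
NFS_MOUNT = {"rsize", "wsize", "hard", "soft", "udp", "tcp",
             "timeo", "retrans"}


def split_options(text):
    """Split on commas, but keep commas inside double quotes intact."""
    parts, current, in_quotes = [], [], False
    for ch in text.strip("()"):
        if ch == '"':
            in_quotes = not in_quotes
            current.append(ch)
        elif ch == "," and not in_quotes:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    if current:
        parts.append("".join(current))
    return parts


def classify(text):
    """Group option names by where they are documented."""
    result = {"generic": [], "nfs": [], "unknown": []}
    for option in split_options(text):
        name = option.split("=", 1)[0]
        if name in GENERIC_MOUNT:
            result["generic"].append(name)
        elif name in NFS_MOUNT:
            result["nfs"].append(name)
        else:
            result["unknown"].append(name)
    return result
```

Fed the OP's string, every option lands in "generic" or "nfs". Fed (rw,insecure,sync,no_subtree_check), the export-only options insecure and no_subtree_check come back as "unknown", which is the mismatch pointed out above.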
Joshua Baker-LePain wrote:
Those are NFS export options. The OP's list is *mount* options (i.e. on the client side). He stated that his NFS server is actually an EMC Celerra.
AH ... now I see.
In any event, I can not duplicate the problem with an nfs export on c4 or c5 and connecting with a c5 client, regardless of the kernel using i686.
On Wed, 2008-01-30 at 10:25 -0600, Johnny Hughes wrote:
In any event, I can not duplicate the problem with an nfs export on c4 or c5 and connecting with a c5 client, regardless of the kernel using i686.
According to man pages for mount and nfs, *atime is not a supported mount option for NFS. *If* I read correctly.
<snip sig stuff>
On Jan 30, 2008 5:39 PM, William L. Maltby CentOS4Bill@triad.rr.com wrote:
According to man pages for mount and nfs, *atime is not a supported mount option for NFS. *If* I read correctly.
I don't agree. noatime is listed in the general section of man mount, and those options should then exist (but may be ignored) by nfs. man nfs explains the differences between v3 and v4. In Documentation/filesystems there aren't any caveats either.
Anyways, we didn't change mount options when upping the kernel.
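Rather than arguing from the man pages, one can check which options the kernel actually recorded for a mounted NFS share. The sketch below parses text in /proc/mounts format (device, mount point, type, options, two numeric fields) and lists the NFS mount points that carry a given option. The function names are hypothetical, and the plain comma split is a simplification that would mis-split a quoted value containing commas.

```python
# Sketch: find NFS mounts that carry a given option, based on text in
# /proc/mounts format. Function names are hypothetical.

def parse_mounts(text):
    """Return a list of dicts, one per mount entry."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mountpoint, fstype, options = fields[:4]
        entries.append({
            "device": device,
            "mountpoint": mountpoint,
            "fstype": fstype,
            "options": options.split(","),  # simplification, see lead-in
        })
    return entries


def nfs_mounts_with_option(text, option):
    """Mount points of nfs/nfs4 file systems that list the option."""
    return [
        entry["mountpoint"]
        for entry in parse_mounts(text)
        if entry["fstype"] in ("nfs", "nfs4") and option in entry["options"]
    ]


def read_mounts(path="/proc/mounts"):
    """Read the live mount table."""
    with open(path) as handle:
        return handle.read()
```

For example, nfs_mounts_with_option(read_mounts(), "noatime") returns an empty list if the option was silently dropped for the NFS mounts.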
regards, Bent
On Thu, 2008-01-31 at 12:12 +0100, Bent Terp wrote:
On Jan 30, 2008 5:39 PM, William L. Maltby CentOS4Bill@triad.rr.com wrote:
According to man pages for mount and nfs, *atime is not a supported mount option for NFS. *If* I read correctly.
TG I said this: "*If* I read correctly."
I don't agree. noatime is listed in the general section of man mount, and those options should then exist (but may be ignored) by nfs. man nfs explains the differences between v3 and v4. In Documentation/filesystems there aren't any caveats either.
Well, although it's been a millennium, I seem to recall using that option for NFS mounts in the past. "His curiosity piqued, the sleuth is irresistibly drawn into a brief diversion from his pastime to see what can be discovered!".
Hmmm... from man mount
---------------- Edited for readability --------------------------------
-o     Options are specified with a -o flag followed by a comma separated string of options. Some of these options are only useful when they appear in the /etc/fstab file. The following options apply to any file system that is being mounted
--->>> (but not every file system actually honors them - e.g., the sync option today has effect only for ext2, ext3, fat, vfat and ufs):
------------------------------------------------------------------------
From man nfs, following "Options for the nfs file system type" and for the NFS4 section as well, there is no mention of (no)atime.
Johnny's attempted tests, IIRC, erred when attempting to use it, but there were other options so it is not conclusive.
Logical, but possibly erroneous conclusion: unsupported. However, I know that man pages can be incomplete, inaccurate and ambiguous. So what does one do?
$ grep -irl atime /usr/share/system-config-nfs
And as a check I didn't goofus the previous command:
$ grep -irl sync /usr/share/system-config-nfs
/usr/share/system-config-nfs/propertiesWindow.py
/usr/share/system-config-nfs/nfsBackend.pyc
/usr/share/system-config-nfs/nfsBackend.py
/usr/share/system-config-nfs/nfs-export.pyc
/usr/share/system-config-nfs/propertiesWindow.pyc
/usr/share/system-config-nfs/nfsData.py
/usr/share/system-config-nfs/nfsData.pyc
/usr/share/system-config-nfs/nfs-export.py
$
Ahhh... but still, maybe it's just not included in the default setup scripts.
$ (cd /usr/share/doc/; grep -irl atime system-config-nfs* libnfsidmap*)
$ (cd /usr/share/doc/; grep -irl sync system-config-nfs* libnfsidmap*)
system-config-nfs-1.2.8/config.html
$
Anyways, we didn't change mount options when upping the kernel.
Still, we can't believe that an option you've never changed, and apparently worked before, and is not specifically (in)excluded as (un)supported is unsupported now. :-(
I'm *not* being a wise-ass, just acknowledging that sometimes all locally available (excluding "read the SOURCE Luke!") references do not tell us all that we may need to know. So I go googling... and see refs all over the place that indicate noatime is being used.
There are admonishments, such as "no -o", "separate options with commas and no spaces", etc.
Maybe there is an answer in the Googleverse? Maybe support for it has been dropped (I hope not).
Last,
$ rpm -q --changelog nfs-utils|grep -i atime
"No joy in Mudville".
Anyway, my curiosity has *not* been satisfied, but I've got other interests pulling at me now.
BTW: man mount points out that some of the options are only useful when they appear in /etc/fstab. JIC.
regards, Bent
<snip sig stuff>
Good luck with it.
On Jan 31, 2008 1:17 PM, William L. Maltby CentOS4Bill@triad.rr.com wrote:
Good luck with it.
Thanks mate - "do not believe in miracles - rely on them!"
On Jan 31, 2008 1:17 PM, William L. Maltby CentOS4Bill@triad.rr.com wrote:
Still, we can't believe that an option you've never changed, and apparently worked before, and is not specifically (in)excluded as (un)supported is unsupported now. :-(
No I suppose not, so we went and rechecked with noatime, and the situation remains the same.
/B
On Jan 31, 2008 2:09 PM, Bent Terp bent@nagstrup.dk wrote:
No I suppose not, so we went and rechecked with noatime, and the situation remains the same.
/B
ehrmn, I meant "rechecked WITHOUT noatime and nodiratime" ;-)
On Jan 30, 2008 5:25 PM, Johnny Hughes johnny@centos.org wrote:
In any event, I can not duplicate the problem with an nfs export on c4 or c5 and connecting with a c5 client, regardless of the kernel using i686.
Good point, thanks Johnny! We've verified that here; problem does not occur when mounting a Linux nfs-share, and does occur when mounting a Celerra nfs-share.
I've opened a Service Request @ EMC, and will post here again when relevant.
Thank you for helping us - with this issue in particular, and with making CentOS happen in general!
BR Bent
Bent Terp wrote:
On Jan 30, 2008 5:25 PM, Johnny Hughes johnny@centos.org wrote:
In any event, I can not duplicate the problem with an nfs export on c4 or c5 and connecting with a c5 client, regardless of the kernel using i686.
Good point, thanks Johnny! We've verified that here; problem does not occur when mounting a Linux nfs-share, and does occur when mounting a Celerra nfs-share.
I've opened a Service Request @ EMC, and will post here again when relevant.
Thank you for helping us - with this issue in particular, and with making CentOS happen in general!
You're welcome.
There seems to be something about the new kernel that causes many more "client rpc calls" and "nfs v3 client lookups" for some (but not all) operations. I have been able to reproduce (as have others) the issues that seem to cause the problem on i686 and x86_64 regardless of the backend server; however, it seems to be more pronounced on x86_64 clients.
Whether or not it has a major effect will depend on the volume of individual actions performed per time. The more actions per second, the bigger the impact (it seems).
I did not see a major impact on performance on i686 (15 seconds on a 3.5 min operation), though I did see the issues in nfsstat ... however on x86_64 it did seem to cause more time issues. Also, I was doing one controlled operation, so if many of these were happening at the same time it might have a different impact.
In any event, I have posted an upstream bug to address this issue:
https://bugzilla.redhat.com/show_bug.cgi?id=431092
Hopefully we can get it resolved.
Thanks, Johnny Hughes
On Feb 1, 2008 10:54 AM, Johnny Hughes johnny@centos.org wrote:
Bent Terp wrote:
Good point, thanks Johnny! We've verified that here; problem does not occur when mounting a Linux nfs-share, and does occur when mounting a Celerra nfs-share.
Turns out that nfsstat wasn't telling us the whole truth... We set up an rsync that only did the directory listing, and the .4 => .6 kernel "upgrade" (and I use the term loosely...) resulted in that rsync command taking 21 secs instead of 4.5 against a Linux nfs backend, and 20 secs instead of 10 against the Celerra.
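For anyone wanting to repeat this kind of measurement without rsync, a rough stand-in is to walk the mounted tree, stat every entry, and time it. This is not the rsync test described above, only a similar metadata-heavy workload; the function names are hypothetical, and client-side attribute caching will affect repeat runs.

```python
# Sketch: time a metadata-only walk of a directory tree (one lstat per
# entry). A stand-in for a listing-only rsync run, not the same thing.
import os
import time


def walk_and_stat(root):
    """lstat every directory entry under root; return how many succeeded."""
    count = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            try:
                os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            count += 1
    return count


def time_listing(root):
    """Return (entries visited, elapsed seconds) for one walk of root."""
    start = time.perf_counter()
    count = walk_and_stat(root)
    return count, time.perf_counter() - start
```

Running time_listing() on the same NFS mount point under each kernel, ideally after a fresh mount so cached attributes do not mask the difference, gives a pair of timings comparable in spirit to the figures quoted above.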
I've opened a Service Request @ EMC, and will post here again when relevant.
Issue remains open, although I'm slightly embarrassed about it now, given that linux backends are also affected.
When we built a .6 kernel without the 5 nfs patches, nfsstat output reverted, but I don't know about the actual performance yet. We can probably rerun those tests Monday.
BR Bent
Bent Terp wrote:
Issue remains open, although I'm slightly embarrassed about it now, given that linux backends are also affected.
When we built a .6 kernel without the 5 nfs patches, nfsstat output reverted, but I don't know about the actual performance yet. We can probably rerun those tests Monday.
If possible try to add your findings to https://bugzilla.redhat.com/show_bug.cgi?id=431092 so upstream can fix that bug.
Thank you,
Ralph
In article 77c4f5c60801240530r759074c0vd52e5729f8688c42@mail.gmail.com, Bent Terp bent@nagstrup.dk wrote:
Just a word of warning: after updating a few of our x86_64 based web frontend boxes to the new kernel, we began to get weird MySQL timeouts. The problem went away again when we downgraded to the previous kernel-2.6.18-53.1.6.el5.x86_64.rpm
Your subject line says kernel-2.6.18-53.1.6.el5.x86_64.rpm causes issues with MySQL, but the body of your message says that is the version of kernel that you had to revert to to solve the problems!
Could you clarify which kernel version gives problems and which doesn't?
Cheers Tony