On 03/22/2012 08:24 AM, m.roth@5-cent.us wrote:
mark wrote:
On 03/21/12 19:50, Adam Wead wrote:
On Wed, Mar 21, 2012 at 4:40 PM,m.roth@5-cent.us wrote:
I just updated one of our servers to 5.8, and rebooted. In the logs, I saw a bunch of
Mar 21 16:29:02 <server> rpc.statd[9783]: recv_rply: can't decode RPC message!
Mar 21 16:29:33 <server> last message repeated 442 times
Mar 21 16:30:34 <server> last message repeated 835 times
Mar 21 16:31:36 <server> last message repeated 884 times
Mar 21 16:32:38 <server> last message repeated 856 times
Mar 21 16:32:44 <server> last message repeated 111 times
I tried restarting nfslock, and that *appears* to have fixed it. Googling, I found a thread about that at http://nerdbynature.de/s9y/archives/2009/08.html, which suggests that it's starting too early, possibly before portmap is running.
Anyone else see this? Has an old bug snuck back in?
There's an NFS bug with the latest kernel:
https://bugzilla.redhat.com/show_bug.cgi?id=798809
Reboot into your previous kernel and that should fix it.
Great - but I've just updated a server I've missed, that's been "we're too busy to let you do it" until now, and it would take it back to 5.7, at least. I suppose I can yum downgrade....
Following up on myself: I didn't look at the bugzilla link earlier (I updated t-bird at home the other day, and clicking a link to open it in the browser doesn't work). I've now looked at it here, and it doesn't seem to be related: this is a backup server, and it only had a home directory mounted when I ssh'd in. It does appear to have been the case suggested in the thread I mentioned: there's no entry in the logfile after I restarted nfslock.
mark
CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
I run into these startup timing issues all the time on many Linux distributions. Upstart was supposed to address them in Red Hat/CentOS 6, but the hybrid startup process that has resulted from a partial transition to upstart is confusing and sometimes makes the problem worse. I suspect the timing issues are also related to the speed and number of processors on your system.
I've solved these problems in several different ways:
For CentOS 5, if you don't mind changing the number on the init script, you can cause it to start later in the startup process. Sometimes this isn't enough. In some cases I've solved the problem by creating my own init script which has a sleep command in it and then either starts or restarts the selected component after a fixed time delay. Note that the init script must fire up a shell that runs in the background and then runs the restart command after the specified time. Maybe not so elegant, but it works.
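To make the delayed-restart idea concrete, here is a minimal sketch of such an init script. It is an illustration only, not the script I actually run: the script name (delayed-nfslock), the 60-second delay, the chkconfig priorities, and the choice of nfslock as the service to restart are all assumptions you would adjust.

```shell
#!/bin/sh
# chkconfig: 345 99 01
# description: Restart nfslock a fixed time after boot (illustrative sketch).
#
# Hypothetical script name: /etc/init.d/delayed-nfslock

# Both values are assumptions; adjust them for your system.
DELAY="${DELAY:-60}"                                        # seconds to wait
RESTART_CMD="${RESTART_CMD:-/sbin/service nfslock restart}" # what to run afterwards

delayed_restart() {
    # Run the sleep and the restart in a background subshell so the
    # boot sequence is not held up while we wait.
    (
        sleep "$DELAY"
        $RESTART_CMD
    ) >/dev/null 2>&1 &
}

case "${1:-}" in
    start)
        echo "Scheduling nfslock restart in $DELAY seconds"
        delayed_restart
        ;;
    stop|status)
        # Nothing to stop or report; the background shell exits by itself.
        ;;
    *)
        # A production script would normally also exit non-zero here.
        echo "Usage: $0 {start|stop|status}" >&2
        ;;
esac
```

Assuming it is installed under /etc/init.d with that name and made executable, "chkconfig --add delayed-nfslock" should register it using the priorities in the header. The background subshell is the important part: without it, the sleep would stall the rest of the boot.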
In CentOS 6 you can just create an upstart job with the correct dependencies.
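As a rough sketch of what such a job might look like: most CentOS 6 services are still started by the SysV rc scripts rather than by upstart jobs of their own, so this example waits for the rc job to finish instead of depending on one particular service. The file name and the choice of nfslock are assumptions for illustration.

```
# /etc/init/nfslock-late.conf  (hypothetical file name)
description "Restart nfslock once the SysV rc scripts have finished"

# The rc job runs the /etc/rc.d scripts; its "stopped" event fires
# when they have all completed for the given runlevel.
start on stopped rc RUNLEVEL=[345]

# Run once and exit rather than supervising a long-lived daemon.
task

exec /sbin/service nfslock restart
```

If the service you depend on does have a native upstart job, a "start on started <jobname>" condition would express the dependency more directly.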
Nataraj