On 05/09/2011 06:53 PM, Brandon Ooi wrote:
On Mon, Apr 25, 2011 at 12:47 PM, Denniston, Todd A CIV NAVSURFWARCENDIV Crane <todd.denniston@navy.mil> wrote:
> -----Original Message-----
> From: centos-bounces@centos.org [mailto:centos-bounces@centos.org] On
> Behalf Of Mailing List
> Sent: Monday, April 25, 2011 13:57
> To: CentOS mailing list
> Subject: Re: [CentOS] CentOs 5.6 and Time Sync
>
> List,
>
> I was not able to resolve my issue with the time on this machine. I
> went ahead and rolled the update back to 5.5 and disabled the update to
> 5.6.
>
> What I would like to know is if CentOS 6 might be ok when it rolls
> out, or am I just going to have to keep with 5.5 till EOL?
>
> Thanks to all with their help.
>

1) I hope you are only talking about having rolled back to the last
   working-for-you kernel from 5.5, not the whole distribution.

2) If I was in your position and had time, my method would be[1]
   a) get the srpm for the last known working kernel (2.6.18-194.32 ???)
   b) get the srpm for the first known not working kernel (2.6.18-238 ???)
   c) expand each of the above srpms into their own rpm build tree, i.e.,
      rpmdev-setuptree;rpm -i kern1; mv rpmbuild rpmbuild.kern1;
      rpmdev-setuptree;rpm -i kern2; mv rpmbuild rpmbuild.kern2
   d) start looking at the differences in the patches applied in kern1 vs.
      those in kern2, i.e., read/diff the kernel.spec files and see if there
      were any new ones that seemed likely to be causing the problem...
      RTFS if necessary to make better guesses.

Rebuild kernel 2 with patches taken out/modified based on my investigations,
test them, and see if I guessed right.

If no luck, think about opening a TUV bug with lots of the info you have sent
here; they may be interested even if you don't have a subscription.

[1] Been there, done that: http://www.gossamer-threads.com/lists/drbd/users/9616
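For step d), a minimal sketch of one way to list the patches that appear only in the newer spec file. The helper names `list_patches` and `new_patches` are made up for illustration, and the sketch assumes the spec declares its patches on `PatchNNN: file` lines; if the spec is laid out differently, a plain `diff -u` of the two files is the fallback.

```shell
#!/bin/sh
# Hypothetical helper: print the patch file names declared in a spec file,
# one per line, sorted. Assumes lines of the form "Patch123: some-fix.patch".
list_patches() {
    awk '/^Patch[0-9]+:/ { print $2 }' "$1" | sort
}

# Hypothetical helper: print the patches declared in the second spec file
# but not in the first.
new_patches() {
    old_list=$(mktemp) && new_list=$(mktemp) || return 1
    list_patches "$1" > "$old_list"
    list_patches "$2" > "$new_list"
    # comm -13 keeps only the lines unique to the second sorted input.
    comm -13 "$old_list" "$new_list"
    rm -f "$old_list" "$new_list"
}

# Example call (the directory names come from step c; the SPECS path and
# the spec file name are assumptions and may differ on your system):
# new_patches rpmbuild.kern1/SPECS/kernel.spec rpmbuild.kern2/SPECS/kernel.spec
```

The output is only a starting list of candidates; which of the new patches touch timekeeping still has to be judged by reading them.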
At first I figured this was misconfigured NTP, but I actually see this happening on one of my machines as well. There is nothing unusual about that machine, but I verified that rolling back to the previous kernel (2.6.18-194.32.1.el5) solves the problem entirely. It happens whether NTP is enabled or disabled. I get the following messages in dmesg, which are possibly related:
time.c: can't update CMOS clock from 59 to 0
time.c: can't update CMOS clock from 59 to 0
time.c: can't update CMOS clock from 59 to 0
time.c: can't update CMOS clock from 59 to 0
The time drift is far higher than normal. Because rolling back the kernel completely solves the issue, this must be a kernel bug.
[root@nexus4 ~]# date; ntpdate -u pool.ntp.org
Mon May 9 16:51:03 PDT 2011
9 May 16:50:21 ntpdate[22117]: step time server 207.182.243.123 offset -42.418572 sec

[root@nexus4 ~]# date; ntpdate -u pool.ntp.org
Mon May 9 16:50:33 PDT 2011
9 May 16:50:35 ntpdate[22127]: step time server 207.182.243.123 offset -0.692146 sec
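To put a rough number on the drift, here is a small sketch that turns an ntpdate offset and an elapsed time into parts per million. The helper names `ntpdate_offset` and `drift_ppm` are made up for illustration, and the 14-second interval is only read off the two log timestamps above (16:50:21 and 16:50:35), so treat the result as an estimate rather than a measurement.

```shell
#!/bin/sh
# Hypothetical helper: read ntpdate output on stdin and print the offset
# in seconds, e.g. "-0.692146".
ntpdate_offset() {
    sed -n 's/.* offset \(-\{0,1\}[0-9.]*\) sec.*/\1/p'
}

# Hypothetical helper: drift_ppm OFFSET_SEC ELAPSED_SEC
# Prints the magnitude of the drift in parts per million, rounded.
drift_ppm() {
    awk -v off="$1" -v dt="$2" \
        'BEGIN { printf "%.0f\n", (off < 0 ? -off : off) / dt * 1e6 }'
}

# Example with the figures from the second run above:
# drift_ppm -0.692146 14
```

With those figures the estimate comes out near 49,000 ppm, i.e. the clock gaining roughly 5% between the two runs, which is orders of magnitude beyond what NTP is designed to discipline.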
Yes, this is obviously a problem with the kernel interacting with the clock on some machines. If we can figure out which ones and why, we can get upstream to fix it.