I had a similar problem on a different server that I fixed last night. Evidently it had a BIOS-level feature that tried to modify the CPU clock rate, much like cpufreq does within the kernel, and it did so by messing with the system clock, which in turn affected the RTC. My clock was drifting all over the place until I found and disabled that feature (Foxconn board; I believe the BIOS calls it something like foxstep). Not sure if your Lenovo boards have that feature, but I know that some ASUS boards do.
Jason www.cyborgworkshop.org
Paul Heinlein wrote:
On Tue, 20 May 2008, Alfred von Campe wrote:
I have 30 identical Lenovo desktop systems running CentOS 5.1. On one of those systems the clock is running slow (5+ minutes from yesterday to this morning and another minute since this morning) despite the fact that NTP is running on all of them and they all have the exact same /etc/ntp.conf file (I compared the MD5 sums of that file on all the systems). Here is the output of "grep ntp /var/log/messages" on the system with the problem since I restarted the NTP daemon earlier today:
A slew of 5 min/24 hrs should be in the range of fixable.
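For scale, here is a rough conversion of the reported drift into parts per million, the unit ntpd uses in its drift file. This is only a sketch: it assumes "5+ minutes from yesterday to this morning" rounds to 5 minutes over 24 hours, and the name `drift_ppm` is made up for illustration. The 500 PPM ceiling is ntpd's limit on frequency correction by slewing. Offsets beyond the default 128 ms step threshold are corrected by stepping the clock instead.

```python
def drift_ppm(drift_seconds: float, elapsed_seconds: float) -> float:
    """Return the drift as parts per million of the elapsed interval."""
    return drift_seconds / elapsed_seconds * 1_000_000


# Assumption: roughly 5 minutes lost over roughly 24 hours.
observed = drift_ppm(5 * 60, 24 * 60 * 60)

# ntpd caps its frequency (slew) correction at 500 PPM.
NTPD_MAX_SLEW_PPM = 500.0

print(f"observed drift: {observed:.0f} PPM")
print(f"within slew limit: {observed <= NTPD_MAX_SLEW_PPM}")
```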
May 20 11:35:38 hepdsw03 ntpd[31792]: frequency initialized 0.000 PPM from /var/lib/ntp/drift
This is very suspect. Are there any SELinux or other log messages suggesting that ntpd isn't able to write to its drift file? Your local clock is definitely drifting, so a 0.000 value is bogus. It may indicate that there's a disconnect between ntpd and the filesystem.
I'd be interested in the output of "ntpdc -c kerninfo"; on most systems the 'pll frequency' value is a close match to the figure in the drift file.
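A minimal sketch of that comparison follows. It assumes the kerninfo output contains a line of the form "pll frequency: <value> ppm" and that the drift file holds a single PPM value. The sample strings are made-up placeholders, not output from any real machine, and the helper names and the 1 PPM tolerance are arbitrary choices for illustration.

```python
import re
from typing import Optional


def parse_pll_frequency(kerninfo_text: str) -> Optional[float]:
    """Extract the 'pll frequency' value (PPM) from kerninfo-style text."""
    match = re.search(
        r"^pll frequency:\s*(-?\d+(?:\.\d+)?)", kerninfo_text, re.MULTILINE
    )
    return float(match.group(1)) if match else None


def parse_drift_file(drift_text: str) -> float:
    """Read the first number in the drift file contents as a PPM value."""
    return float(drift_text.split()[0])


def frequencies_agree(pll_ppm: float, drift_ppm: float,
                      tolerance_ppm: float = 1.0) -> bool:
    """True when the two figures differ by no more than the tolerance."""
    return abs(pll_ppm - drift_ppm) <= tolerance_ppm


# Placeholder data (hypothetical values, assumed layout).
SAMPLE_KERNINFO = "pll offset:           0.000213 s\npll frequency:        -17.352 ppm\n"
SAMPLE_DRIFT = "-17.349\n"

pll = parse_pll_frequency(SAMPLE_KERNINFO)
drift = parse_drift_file(SAMPLE_DRIFT)
print("agree:", frequencies_agree(pll, drift))
# A drift file stuck at 0.000 would not match a nonzero pll frequency.
print("agree with 0.000:", frequencies_agree(pll, 0.0))
```

On a real system the two inputs would come from the output of "ntpdc -c kerninfo" and the contents of /var/lib/ntp/drift.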
May 20 11:38:55 hepdsw03 ntpd[31792]: synchronized to LOCAL(0), stratum 10
May 20 11:38:55 hepdsw03 ntpd[31792]: kernel time sync disabled 0001
May 20 11:39:59 hepdsw03 ntpd[31792]: synchronized to 10.101.32.104, stratum 3
This is ungood. Syncing to LOCAL(0), the local clock, before your network time server suggests that your machine doesn't want to believe your server. You should also see a "kernel time sync enabled" message once the machine has synced with the time server.
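To check a longer log for the same pattern, something like the sketch below could pull out the sync events in order. It uses the three log lines quoted above as input. The regular expressions and the `summarize` helper are assumptions based only on those lines, not on a specification of ntpd's log format.

```python
import re

LOG_LINES = [
    "May 20 11:38:55 hepdsw03 ntpd[31792]: synchronized to LOCAL(0), stratum 10",
    "May 20 11:38:55 hepdsw03 ntpd[31792]: kernel time sync disabled 0001",
    "May 20 11:39:59 hepdsw03 ntpd[31792]: synchronized to 10.101.32.104, stratum 3",
]

SYNC_RE = re.compile(r"ntpd\[\d+\]: synchronized to (\S+), stratum (\d+)")
KERNEL_RE = re.compile(r"ntpd\[\d+\]: kernel time sync (enabled|disabled)")


def summarize(lines):
    """Return (kind, value, stratum) tuples in the order they were logged."""
    events = []
    for line in lines:
        match = SYNC_RE.search(line)
        if match:
            events.append(("sync", match.group(1), int(match.group(2))))
            continue
        match = KERNEL_RE.search(line)
        if match:
            events.append(("kernel", match.group(1), None))
    return events


events = summarize(LOG_LINES)
sync_sources = [value for kind, value, _ in events if kind == "sync"]
kernel_enabled_seen = any(
    kind == "kernel" and value == "enabled" for kind, value, _ in events
)

print("sync order:", sync_sources)
print("kernel time sync enabled seen:", kernel_enabled_seen)
```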
You said the machines are identical. Could there be any variation in the BIOS revision level or its settings? Sometimes ACPI stuff can mess up ntp.
Also -- the log messages you provide have no "step time server" reference. Do you have a valid /etc/ntp/step-tickers file?