[CentOS] hwclock problem

Mon Nov 15 20:45:18 UTC 2010
Todd Denniston <Todd.Denniston at tsb.cranrdte.navy.mil>

Jobst Schmalenbach wrote, On 11/11/2010 07:41 PM:
> Hi.
> 
> I run periodically (from cron) on all of my machines
> 
>   30 * * * * root /sbin/hwclock --systohc
> 

Why?
AFAIK, once ntpd considers the system clock reasonably synced to the NTP server, the kernel writes
the system time to the hardware clock every _eleven_ minutes, and you can't stop it without
modifying the kernel or ntpd.

> All of those machines in question take their time via NTP
> from the same local server, and that server gets its time
> from a ntp pool.
> 

That is a reasonable NTP setup.

> Now I had to reboot a couple of them two days ago and to my surprise
> all had problems with the time upon booting.
> 
> Here are the important files:
> 
> [root at XXXXXX ~] #>l /etc/adjtime 
> 0.001687 1289518202 0.000000
> 1289518202
> LOCAL
> 
> [root at XXXXXXX ~] #>l /etc/sysconfig/clock 
> ZONE="Australia/Melbourne"
> UTC=false
> ARC=false
> 
> So from my understanding the hwclock should contain the local time.
> 
> [root at XXXXXX ~] #>date
> Fri Nov 12 11:26:23 EST 2010
> [root at XXXXXX ~] #>hwclock
> Fri 12 Nov 2010 11:26:42 EST  -0.167976 seconds
> [root at XXXXXX ~] #>
> 

Is 'EST' the time zone abbreviation you expect for Melbourne?
As I am based in the US, I expect 'EST' to be "Eastern Standard Time" for New York/New York, so I
ask for your help in understanding.

We might be able to see a different pattern if we take the TZ out of the equation:
date -u ; hwclock --show --utc; date -u
date ; hwclock --show ; date



> However on boot I get the following:
> 
> Nov 10 19:08:37 XXXXXX syslogd 1.4.1: restart.
> Nov 10 19:08:37 XXXXXX kernel: klogd 1.4.1, log source = /proc/kmsg started.
> Nov 10 19:08:37 XXXXXX kernel: Linux version 2.6.18-164.11.1.el5 (mockbuild at builder10.centos.org) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-46)) #1 SMP Wed Jan 20 07:32:21 EST 2010
> Nov 10 19:08:37 XXXXXX kernel: Command line: ro root=/dev/sda2 vga=791
> Nov 10 19:08:37 XXXXXX kernel: BIOS-provided physical RAM map:
> ...
> ...
> Nov 10 19:08:51 XXXXXX kernel: IPv6 over IPv4 tunneling driver
> Nov 10 08:08:52 XXXXXX ntpdate[2464]: step time server 192.168.1.1 offset -39599.950905 sec
> Nov 10 08:08:52 XXXXXX xinetd[2447]: xinetd Version 2.3.14 started with libwrap loadavg labeled-networking options compiled in.
> 
> and of course dovecot falls over too "Time just moved backwards by 39599 seconds."
> 
> Now, 39600s is 11 hours, which is (inc DST) *MY* offset from Greenwich.
> 
> 
> So what am I doing wrong?

Running a Linux _Server_ as if it had to dual boot with Windows.
I.e., the hardware clock should be kept in UTC unless you need to boot the same machine with Windows.
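
As a quick sanity check on the arithmetic, here is a minimal sketch that turns the ntpdate step
offset from your boot log into hours. The helper name offset_hours is made up for illustration; it
is not part of ntpdate or any other tool.

```shell
#!/bin/sh
# offset_hours is a hypothetical helper (an assumption for this sketch).
# It converts an offset in seconds into hours, rounded to two decimals.
offset_hours() {
    awk -v s="$1" 'BEGIN { printf "%.2f\n", s / 3600 }'
}

# Offset copied from the ntpdate line in the boot log quoted above.
offset_hours -39599.950905    # prints -11.00
```

A step of a whole number of hours like this points at a UTC/local time mix-up rather than at
ordinary hardware clock drift, which would be a much smaller and less tidy number.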

> The idea of running hwclock is to make sure that 
> exactly the problem with dovecot does NOT occur, 
> and ntp does not have a coughing fit when the hardware 
> clock is not close to the correct time upon booting.

The standard start script (/etc/rc.d/init.d/ntpd) runs ntpdate before starting ntpd (which is what
you see in your log above) to keep ntp from "coughing".


> The last time I booted some of those machine was more than 200 days ago, 
> so the hwclock will be skewed if I do not update it.

I *WAS* beginning to think, like the others, that the TZ files used by hwclock and by date don't match.

However, I now *believe* I know the source of the delta!
IIRC the kernel magic I wrote about earlier (write system time to the hardware clock every eleven
minutes) does not take the local TZ into account, i.e., it ALWAYS works in UTC. I would have to read
the kernel source again to prove it; alternatively, you could try the following:

1) *remove* your cron job that calls hwclock, because it is causing problems and will keep doing so.
2) let the machine sync with the NTP server
i.e.,
ntpdc -c kern |grep status
returns something like:
status:               0009  pll fll
2a) wait 12 minutes.
3) run:
date -u ; hwclock --show --utc; date -u ; \
date ; hwclock --show ; date
4) run
hwclock --systohc; \
date -u ; hwclock --show --utc; date -u ; \
date ; hwclock --show ; date
5) wait 23 more minutes
6) run
date -u ; hwclock --show --utc; date -u ; \
date ; hwclock --show ; date
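
For step 2, the status word can also be checked mechanically. The following is only a sketch:
kernel_synced is a hypothetical helper name, and it relies on my understanding that the kernel's
periodic hardware clock update only runs while the STA_UNSYNC bit (0x0040 in <linux/timex.h>) is
clear. The value 0041 below is an illustrative unsynced status, not taken from your machines.

```shell
#!/bin/sh
# STA_UNSYNC bit as defined in the kernel's <linux/timex.h> (0x0040).
STA_UNSYNC=64

# kernel_synced is a hypothetical helper (an assumption for this sketch).
# It takes the hex status word shown by "ntpdc -c kern" and returns
# success when the STA_UNSYNC bit is clear.
kernel_synced() {
    status=$(( 0x$1 ))
    [ $(( status & STA_UNSYNC )) -eq 0 ]
}

# 0009 is the example status from step 2 (pll fll).
if kernel_synced 0009; then echo "synced"; else echo "not synced"; fi
# 0041 is an illustrative value with the unsync bit set.
if kernel_synced 0041; then echo "synced"; else echo "not synced"; fi
```

On a live machine the word could be pulled out with something like
ntpdc -c kern | awk '/^status:/ { print $2 }'
assuming the output looks like the example in step 2.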

If at steps 3 and 6 the UTC versions of date and hwclock are in sync, then it is the ntpd-synced
kernel that is setting a UTC time into the hwclock, and you need to change the last line in
/etc/adjtime to UTC instead of LOCAL.

Otherwise a bit more thinking is in order.
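
For the UTC case, a minimal sketch of the /etc/adjtime check and change follows. The helper names
adjtime_mode and adjtime_as_utc are made up for illustration, and the sketch works on a scratch copy
built from the values you posted rather than on the real file.

```shell
#!/bin/sh
# Both helpers are hypothetical names (assumptions for this sketch).

# Print the clock mode, i.e. the third line of an adjtime-style file
# (UTC or LOCAL).
adjtime_mode() {
    sed -n '3p' "$1"
}

# Print the file with a LOCAL third line rewritten to UTC.
# Output goes to stdout only; the input file is left untouched.
adjtime_as_utc() {
    sed '3s/^LOCAL$/UTC/' "$1"
}

# Demonstrate on a scratch copy of the contents quoted in this thread.
tmp=$(mktemp)
printf '0.001687 1289518202 0.000000\n1289518202\nLOCAL\n' > "$tmp"
adjtime_mode "$tmp"                    # prints LOCAL
adjtime_as_utc "$tmp" | sed -n '3p'    # prints UTC
rm -f "$tmp"
```

I would expect UTC=false in /etc/sysconfig/clock to need changing to UTC=true as well, but I have
not verified which of the two files the init scripts honor on this release. After the change,
hwclock --systohc --utc should leave the hardware clock holding UTC.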

good luck.
-- 
Todd Denniston
Crane Division, Naval Surface Warfare Center (NSWC Crane)
Harnessing the Power of Technology for the Warfighter