NTP and hardware clock


Dag Wieers

11 Oct 2006

10:26 p.m.

Hi,

I had the following problem today. Because of a misconfigured network switch one system suddenly didn't have any network.

After a reboot (with the network still unavailable) NTPD refused to start. Most likely because the initial ntpdate failed to work. I find this troubling, because when the network was restored, NTPD could have resumed working (like I'd expect from a true daemon).

Now, what was more peculiar was that the hardware clock was completely off. I also had assumed that somehow the hardware clock was kept in sync, but now after rebooting without network, the system clock was skewed.

Is there some way to:

+ Make ntpd run, even when no ntp-server could be contacted
+ Make ntpd synchronise the hardware clock automatically

PS Yes, I know I can run ntpdate from cron or run hwclock to synchronize my hardware clock. But shouldn't this be part of the infrastructure (either ntpd or the initscripts) ?

Maybe this is useful to have fixed upstream, but I prefer to hear second opinions before trying to be smart :)

Kind regards,
-- dag wieers, dag@wieers.com, http://dag.wieers.com/ --
[all I want is a warm bed and a kind word and unlimited power]


mike.redan＠bell.ca

11 Oct

10:32 p.m.

This was partially fixed upstream recently: https://rhn.redhat.com/errata/RHSA-2006-0393.html

At least the initscripts will sync the hwclock now out of the box.

As for NTPD not starting if its server is not reachable. Hrm. I haven't really seen that happen in my setup. And it is quite often that the NTP server is not reachable at startup. Perhaps it is just a change needed in ntp.conf?

Mike


Dag Wieers

10:50 p.m.

On Wed, 11 Oct 2006, mike.redan@bell.ca wrote:

> This was partially fixed upstream recently: https://rhn.redhat.com/errata/RHSA-2006-0393.html
>
> At least the initscripts will sync the hwclock now out of the box.

That's great, but it seems only for RHEL4, not RHEL3 or RHEL2.1 :(

> As for NTPD not starting if its server is not reachable. Hrm. I haven't really seen that happen in my setup. And it is quite often that the NTP server is not reachable at startup. Perhaps it is just a change needed in ntp.conf?

Not that I have found. It first tries to do an ntpdate, if that one fails it doesn't start ntpd. In my opinion that is not correct behaviour.

The initial ntpdate is to make sure the local time is not totally skewed, but if you periodically synchronize the hwclock (which in my opinion ntpd should do), then it is not really required.

I'm going to open it as a bug on bugzilla.

Thanks a lot for the reference !

Kind regards,
-- dag wieers, dag@wieers.com, http://dag.wieers.com/ --
[all I want is a warm bed and a kind word and unlimited power]

Kirk Bocek

10:36 p.m.

Dag,

Do you have the following lines in your /etc/ntp.conf:

server 127.127.1.0 # local clock
fudge 127.127.1.0 stratum 10

They identify your local clock as a low-priority (stratum 10) time server.

Kirk Bocek


Dag Wieers

10:52 p.m.

On Wed, 11 Oct 2006, Kirk Bocek wrote:


> Do you have the following lines in your /etc/ntp.conf:
>
> server 127.127.1.0 # local clock
> fudge 127.127.1.0 stratum 10
>
> They identify your local clock as a low-priority (stratum 10) time server.

I fail to see how that is relevant, since the local clock is wrong after a reboot without network (so I rather not want to use it as a source :)) and ntpd is not even started because ntpdate fails.

But yes, I do have something like that (stratum 13 though).

Kind regards,
-- dag wieers, dag@wieers.com, http://dag.wieers.com/ --
[all I want is a warm bed and a kind word and unlimited power]

Kirk Bocek

10:57 p.m.

Dag Wieers wrote:

> I fail to see how that is relevant, since the local clock is wrong after a reboot without network (so I rather not want to use it as a source :)) and ntpd is not even started because ntpdate fails.
>
> But yes, I do have something like that (stratum 13 though).

I somehow thought that ntpdate might be trying to access an unavailable time source. With these lines, you always have a time source available, even if it is inaccurate.

Kirk Bocek

Dag Wieers

11:11 p.m.

On Wed, 11 Oct 2006, Kirk Bocek wrote:


> I somehow thought that ntpdate might be trying to access an unavailable time source. With these lines, you always have a time source available, even if it is inaccurate.

The whole point of ntpdate is to synchronize the local clock with another source (i.e. not the local clock).

And I guess the main reason why they do not start ntpd if ntpdate fails, is because they have to protect other ntp clients from being poisoned by a wrong system clock upstream (because its source is unavailable).

So maybe using ntpdate from cron is better than using ntpd. If only there was some infrastructure so configuring ntpdate would be done in a standardized way.

Maybe that would be a nice feature request for Red Hat. Add a cron-job that works if some /etc/sysconfig/ stuff is available.
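A rough sketch of what such a cron job could look like. Everything specific here is an assumption made for illustration: the file /etc/sysconfig/ntpdate-cron and the NTPDATE_SERVERS variable do not exist in any shipped package. The script only prints the ntpdate command it would run, and prints nothing when the configuration file is absent.

```shell
#!/bin/sh
# Sketch of the proposed cron job. The config file name and the
# NTPDATE_SERVERS variable are hypothetical, not part of any package.
CONFIG=${CONFIG:-/etc/sysconfig/ntpdate-cron}

ntpdate_command() {
    # Stay silent unless the admin has created the config file.
    [ -r "$CONFIG" ] || return 0
    . "$CONFIG"
    [ -n "${NTPDATE_SERVERS:-}" ] || return 0
    # -s sends ntpdate's output to syslog, which suits cron jobs.
    echo "/usr/sbin/ntpdate -s $NTPDATE_SERVERS"
}

# Print the command instead of running it, so the sketch is harmless.
ntpdate_command
```

A real job would execute the command rather than echo it, and would need to make sure ntpd is not running at the same time, since both want the NTP port.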

Kind regards,
-- dag wieers, dag@wieers.com, http://dag.wieers.com/ --
[all I want is a warm bed and a kind word and unlimited power]

Kirk Bocek

11:16 p.m.

Dag Wieers wrote:

> The whole point of ntpdate is to synchronize the local clock with another source (i.e. not the local clock).

Yes, I understand that. I've always seen the use of the local clock as a 'fudge', if you will. From NTP's point of view, the battery-driven quartz clock on the motherboard is just one of many possible time sources.

One more thing, have you checked the time set in the local clock? NTP will give up if it is more than about 1000 seconds (roughly 17 minutes) different from your time sources. Although I don't think it should hang...

mike.redan＠bell.ca

11:25 p.m.

> The whole point of ntpdate is to synchronize the local clock with another source (i.e. not the local clock).
>
> And I guess the main reason why they do not start ntpd if ntpdate fails, is because they have to protect other ntp clients from being poisoned by a wrong system clock upstream (because its source is unavailable).

Hrm. What does your initscript look like for ntpd? I.e., which version? I just had a look around at some RHEL3.x and 4.x (as well as some CentOS ones) and none of them will refuse to start ntpd if the ntpdate run fails. The only thing they do when ntpdate fails is add the "-g" option, which will let ntpd jump the clock by more than 1000 s. But no matter what, ntpd will start.
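The behaviour described above can be sketched roughly as follows. This is not the actual Red Hat initscript: the function name ntpd_start_options and the default option string are made up for illustration. It takes the exit status of the initial ntpdate run and prints the options ntpd would be started with; only "-g" is taken from the description above.

```shell
#!/bin/sh
# Hypothetical helper, not copied from any shipped initscript.
# $1: exit status of the initial ntpdate run (0 = clock was set).
# $2: options ntpd would normally get (illustrative default below).
ntpd_start_options() {
    ntpdate_status=$1
    options=${2:-"-u ntp:ntp"}
    if [ "$ntpdate_status" -ne 0 ]; then
        # ntpdate failed: allow ntpd one large initial correction itself.
        options="$options -g"
    fi
    # ntpd is started in both cases; only the options differ.
    echo "$options"
}

ntpd_start_options 0   # ntpdate succeeded
ntpd_start_options 1   # ntpdate failed, e.g. no network at boot
```

The point of the sketch is that a failed ntpdate changes the options, not the decision to start the daemon.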

Mike

Dag Wieers

11:53 p.m.

On Wed, 11 Oct 2006, mike.redan@bell.ca wrote:


> Hrm. What does your initscript look like for ntpd? I.e., which version? I just had a look around at some RHEL3.x and 4.x (as well as some CentOS ones) and none of them will refuse to start ntpd if the ntpdate run fails. The only thing they do when ntpdate fails is add the "-g" option, which will let ntpd jump the clock by more than 1000 s. But no matter what, ntpd will start.

You're correct. Then I have no idea why ntpd was not running.

Kind regards,
-- dag wieers, dag@wieers.com, http://dag.wieers.com/ --
[all I want is a warm bed and a kind word and unlimited power]

Stephen John Smoogen

12 Oct

12:11 a.m.

On 10/11/06, Dag Wieers dag@wieers.com wrote:


> You're correct. Then I have no idea why ntpd was not running.

/me puts on his ntpd hat.

ntpd will not start running if it finds it can't make a gradual change to the clock to bring it into sync. This occurs when the clock is off by more than 1000 s or the TOY chip is not responding in a way that ntpd knows how to handle. [If hwclock --systohc says it sets the clock but you see it doesn't, then it can be a hardware problem of many types.]

The most common reason I found for ntpd not running is that it suddenly found its time off by more than 1000 s for some reason it didn't know about (changing timezone on the box or a bad HZ rate from hardware).
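A minimal sketch of the 1000 s rule, assuming the offset between the system clock and the time source is already known. The function name offset_action and the labels it prints are invented for this example; ntpd makes this decision internally and has no such interface.

```shell
#!/bin/sh
# Hypothetical illustration of the 1000 s limit mentioned above.
# $1: absolute clock offset in whole seconds.
PANIC_THRESHOLD=1000   # seconds, the figure quoted in this thread

offset_action() {
    offset=$1
    if [ "$offset" -gt "$PANIC_THRESHOLD" ]; then
        echo "exit"      # ntpd gives up (unless it was started with -g)
    else
        echo "correct"   # ntpd brings the clock into sync itself
    fi
}

offset_action 3600   # e.g. a one-hour timezone or DST mistake
offset_action 5      # ordinary drift
```

A one-hour error from a timezone change lands well above the limit, which matches the failures described in this thread.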

--
Stephen J Smoogen. -- CSIRT/Linux System Administrator
How far that little candle throws his beams!
So shines a good deed in a naughty world. = Shakespeare. "The Merchant of Venice"

Grant McChesney

11 Oct

11:27 p.m.

On 10/11/06, Dag Wieers dag@wieers.com wrote:

> Hi,
>
> I had the following problem today. Because of a misconfigured network switch one system suddenly didn't have any network.
>
> After a reboot (with the network still unavailable) NTPD refused to start. Most likely because the initial ntpdate failed to work. I find this troubling, because when the network was restored, NTPD could have resumed working (like I'd expect from a true daemon).

I too have similar complaints with NTPD on CentOS 3. If any of my CentOS 3 servers lose power, NTPD refuses to start on next boot. If I check the status on the ntpd process, it says process is dead but pid file exists. Server time changes to hwclock, which is usually off 1 hour thanks to daylight savings. Interestingly enough I have never had the problem on a CentOS 4 server.

> Now, what was more peculiar was that the hardware clock was completely off. I also had assumed that somehow the hardware clock was kept in sync, but now after rebooting without network, the system clock was skewed.
>
> Is there some way to:
>
> + Make ntpd run, even when no ntp-server could be contacted
> + Make ntpd synchronise the hardware clock automatically
>
> PS Yes, I know I can run ntpdate from cron or run hwclock to synchronize my hardware clock. But shouldn't this be part of the infrastructure (either ntpd or the initscripts) ?

That would be a nice feature in the initscript. I've settled for the cron fix for now to keep my hwclock in sync.
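For reference, a cron fix of this kind could look like the sketch below. The script path /etc/cron.daily/hwclock-sync is an assumption, not something from this thread. The script defaults to a dry run that only prints the command; running it as root with DRY_RUN=0 would actually copy the system time to the hardware clock.

```shell
#!/bin/sh
# Sketch of a daily cron script; the path /etc/cron.daily/hwclock-sync
# is an assumption. Defaults to a dry run that only prints the command.
# Run with DRY_RUN=0 (as root) to really write the hardware clock.
DRY_RUN=${DRY_RUN:-1}

sync_hwclock() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: /sbin/hwclock --systohc"
    else
        # Copy the system clock, which ntpd keeps accurate, to the hardware clock.
        /sbin/hwclock --systohc
    fi
}

sync_hwclock
```

This only helps while ntpd is keeping the system clock correct; if the system clock is already wrong, the job copies the wrong time to the hardware clock.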

Grant

John Newbigin

12 Oct

7:49 a.m.

If you want to use NTP, then you should store GMT in the hardware clock, otherwise you might end up with Windows-style 'my clock is an hour out' bugs. Needless to say, this does not work with a dual-boot Linux/Windows box.
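As a sketch of how that choice is usually expressed, assuming a Red Hat style /etc/sysconfig/clock with a UTC=true or UTC=false setting: the helper name hwclock_flag is made up, and the example only prints the resulting hwclock command rather than running it.

```shell
#!/bin/sh
# Sketch: pick the hwclock flag from a UTC setting such as the one in
# /etc/sysconfig/clock (assumed layout). hwclock_flag is a hypothetical name.
hwclock_flag() {
    case "$1" in
        yes|true)  echo "--utc" ;;        # hardware clock holds UTC (GMT)
        *)         echo "--localtime" ;;  # hardware clock holds local time
    esac
}

UTC=true
echo "hwclock --systohc $(hwclock_flag "$UTC")"
```

With the hardware clock in UTC, a daylight saving change only affects how the time is displayed, so the stored value does not end up an hour out.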

The RH approach of setting the time with ntpdate before starting ntpd is IMHO wrong but it does not cause me enough trouble to worry about fixing it.

As for the original problem, ntpd probably relies on DNS to find the servers and if it can't find any servers it may fail to start. This is normal daemon behavior, normal for apache anyway.

John.


--
John Newbigin
Computer Systems Officer
Faculty of Information and Communication Technologies
Swinburne University of Technology
Melbourne, Australia
http://www.ict.swin.edu.au/staff/jnewbigin
