We have a local time server and all of our machines are pointed at it for the time.
How can the clock drift by a day and a half?
[root@devserver21 ~]# date
Fri Aug 13 14:43:29 EDT 2010
[root@devserver21 ~]# rdate -s 192.168.1.67
[root@devserver21 ~]# date
Thu Aug 12 07:02:39 EDT 2010
[root@devserver21 ~]# cat /etc/ntp.conf | grep -v ^# | grep -v ^$
restrict default nomodify notrap noquery
restrict 127.0.0.1
server 192.168.1.67
server 192.168.1.66
server 192.168.1.65
server 127.127.1.0 # local clock
fudge 127.127.1.0 stratum 10
driftfile /var/lib/ntp/drift
broadcastdelay 0.008
keys /etc/ntp/keys
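For reference, the size of the jump can be worked out from the two `date` readings in the session above. This is only a sketch: the function name `clock_offset_seconds` is made up for illustration, it assumes GNU date (for the `-d` option), and EDT is written out as the numeric offset -0400.

```shell
#!/bin/sh
# Offset in seconds between two timestamps (first minus second).
# Assumes GNU date, whose -d option parses a timestamp string.
clock_offset_seconds() {
    before=$(date -d "$1" +%s) || return 1
    after=$(date -d "$2" +%s) || return 1
    echo $((before - after))
}

# Reading before rdate, then reading after rdate (EDT is UTC-4).
offset=$(clock_offset_seconds "2010-08-13 14:43:29 -0400" "2010-08-12 07:02:39 -0400")
echo "clock was ahead by ${offset} s"
awk -v s="$offset" 'BEGIN { printf "%.2f days\n", s / 86400 }'
```

The difference is 114050 seconds, i.e. 1 day 7:40:50, so "a day and a half" is closer to a day and a third.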
-- -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- - - - Jason Pyeron PD Inc. http://www.pdinc.us - - Principal Consultant 10 West 24th Street #100 - - +1 (443) 269-1555 x333 Baltimore, Maryland 21218 - - - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- This message is copyright PD Inc, subject to license 20080407P00.
Jason Pyeron sent a missive on 2010-08-12:
We have a local time server and all of our machines are pointed at it for the time.
How can the clock drift by a day and a half?
Hi,
It is unlikely that the machine in question drifted forward in time if ntpd was running. Have a look at /var/log/messages; it should contain the ntpd log messages, which will help you determine what happened to the time. Also check that ntpd is running:
"service ntpd status" shows whether it is running, and "chkconfig --list ntpd" shows the runlevels in which ntpd starts.
HTH
Simon.
-----Original Message----- From: centos-bounces@centos.org [mailto:centos-bounces@centos.org] On Behalf Of Simon Billis Sent: Thursday, August 12, 2010 7:36 To: 'CentOS mailing list' Subject: Re: [CentOS] Date drift and ntpd
Hi,
It is unlikely that the machine in question drifted forward in time if ntpd was running. Have a look at the logs /var/log/messages it should contain the ntpd log messages
[root@devserver21 ~]# grep ntpd /var/log/messages
</snip>
Jul 28 20:34:41 devserver21 ntpd[3475]: synchronized to 192.168.1.65, stratum 3
Jul 28 21:08:00 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
Jul 28 21:08:00 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
Jul 28 21:08:11 devserver21 ntpd[3475]: synchronized to 192.168.1.66, stratum 3
Jul 28 21:24:58 devserver21 ntpd[3475]: synchronized to 192.168.1.65, stratum 3
Jul 28 21:41:26 devserver21 ntpd[3475]: synchronized to 192.168.1.67, stratum 3
Jul 28 21:42:16 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
Jul 28 21:42:16 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
Jul 28 21:42:34 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
Jul 28 21:43:37 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
Jul 28 21:44:41 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
Jul 28 21:45:44 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
Jul 28 21:46:45 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
Jul 28 21:47:50 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
Jul 28 21:48:55 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
Jul 28 21:49:57 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
Jul 28 21:50:59 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
Jul 28 21:52:03 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
Jul 28 21:53:05 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
Jul 28 21:54:06 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
Jul 28 21:55:10 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
Jul 28 21:56:13 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
Jul 28 21:57:16 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
Jul 28 21:58:20 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
Jul 28 21:59:23 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
Jul 28 22:00:28 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
Jul 28 22:01:32 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
Jul 28 22:02:35 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
Jul 28 22:03:38 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
Jul 28 22:04:41 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
Jul 28 22:05:44 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
Jul 28 22:06:49 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
Jul 28 22:07:53 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
Jul 28 22:08:57 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
Jul 28 22:10:00 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
Jul 28 22:11:03 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
Jul 28 22:12:07 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
Jul 28 22:13:13 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
Jul 28 22:14:17 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
Jul 28 22:15:11 devserver21 ntpd[3475]: synchronized to 192.168.1.66, stratum 3
Jul 28 22:31:41 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
Jul 28 22:31:41 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
Jul 28 22:31:59 devserver21 ntpd[3475]: synchronized to 192.168.1.65, stratum 3
Jul 28 23:05:10 devserver21 ntpd[3475]: synchronized to 192.168.1.67, stratum 3
Jul 28 23:05:10 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
Jul 28 23:06:05 devserver21 ntpd[3475]: time reset +0.554019 s
Jul 28 23:10:14 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
Jul 28 23:17:36 devserver21 ntpd[3475]: synchronized to 192.168.1.67, stratum 3
Jul 28 23:20:46 devserver21 ntpd[3475]: synchronized to 192.168.1.66, stratum 3
Jul 28 23:22:52 devserver21 ntpd[3475]: synchronized to 192.168.1.65, stratum 3
Jul 28 23:33:28 devserver21 ntpd[3475]: synchronized to 192.168.1.65, stratum 3
Jul 28 23:34:37 devserver21 ntpd[3475]: time reset -0.866445 s
Jul 28 23:38:49 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
Jul 28 23:43:01 devserver21 ntpd[3475]: synchronized to 192.168.1.65, stratum 3
Jul 28 23:44:03 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
Jul 29 00:00:57 devserver21 ntpd[3475]: synchronized to 192.168.1.65, stratum 3
Jul 29 00:25:53 devserver21 ntpd[3475]: synchronized to 192.168.1.67, stratum 3
Jul 29 00:41:48 devserver21 ntpd[3475]: synchronized to 192.168.1.66, stratum 3
Jul 29 00:42:45 devserver21 ntpd[3475]: synchronized to 192.168.1.65, stratum 3
Jul 29 00:42:44 devserver21 ntpd[3475]: time reset -0.922073 s
Jul 29 00:46:58 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
Jul 29 00:57:27 devserver21 ntpd[3475]: synchronized to 192.168.1.66, stratum 3
Jul 29 01:07:55 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
Jul 29 01:57:05 devserver21 ntpd[3475]: synchronized to 192.168.1.67, stratum 3
Jul 29 02:13:48 devserver21 ntpd[3475]: synchronized to 192.168.1.65, stratum 3
Jul 29 02:13:52 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
Jul 29 02:30:31 devserver21 ntpd[3475]: synchronized to 192.168.1.67, stratum 3
Jul 29 03:03:59 devserver21 ntpd[3475]: synchronized to 192.168.1.66, stratum 3
Jul 29 03:04:00 devserver21 ntpd[3475]: synchronized to 192.168.1.65, stratum 3
Jul 29 03:04:00 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
Jul 29 03:37:21 devserver21 ntpd[3475]: synchronized to 192.168.1.66, stratum 3
Jul 29 04:10:46 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
Jul 29 04:44:06 devserver21 ntpd[3475]: synchronized to 192.168.1.67, stratum 3
Jul 29 05:00:48 devserver21 ntpd[3475]: synchronized to 192.168.1.65, stratum 3
Jul 29 05:00:52 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
Jul 29 05:17:30 devserver21 ntpd[3475]: synchronized to 192.168.1.67, stratum 3
Jul 29 05:34:13 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
Jul 29 06:24:16 devserver21 ntpd[3475]: synchronized to 192.168.1.67, stratum 3
Jul 29 06:40:59 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
Jul 29 07:30:59 devserver21 ntpd[3475]: synchronized to 192.168.1.67, stratum 3
Jul 29 07:47:42 devserver21 ntpd[3475]: synchronized to 192.168.1.66, stratum 3
Jul 29 07:47:53 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
Jul 29 08:04:23 devserver21 ntpd[3475]: synchronized to 192.168.1.67, stratum 3
Jul 29 08:37:47 devserver21 ntpd[3475]: synchronized to 192.168.1.66, stratum 3
Jul 29 08:37:58 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
Jul 29 09:11:03 devserver21 ntpd[3475]: synchronized to 192.168.1.67, stratum 3
Jul 29 09:27:43 devserver21 ntpd[3475]: synchronized to 192.168.1.65, stratum 3
Jul 29 09:27:44 devserver21 ntpd[3475]: synchronized to 192.168.1.66, stratum 3
Jul 29 09:28:00 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
Jul 29 10:17:40 devserver21 ntpd[3475]: synchronized to 192.168.1.65, stratum 3
Jul 29 10:50:57 devserver21 ntpd[3475]: time reset -1.638135 s
Jul 29 10:55:08 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
Jul 29 10:58:17 devserver21 ntpd[3475]: synchronized to 192.168.1.66, stratum 3
Jul 29 10:59:16 devserver21 ntpd[3475]: synchronized to 192.168.1.65, stratum 3
Jul 29 11:07:41 devserver21 ntpd[3475]: synchronized to 192.168.1.67, stratum 3
Jul 29 11:11:57 devserver21 ntpd[3475]: synchronized to 192.168.1.66, stratum 3
Jul 29 11:13:58 devserver21 ntpd[3475]: synchronized to 192.168.1.65, stratum 3
Jul 29 11:19:00 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
Jul 29 11:21:19 devserver21 ntpd[3475]: synchronized to 192.168.1.67, stratum 3
Jul 29 11:29:46 devserver21 ntpd[3475]: synchronized to 192.168.1.65, stratum 3
Jul 29 11:39:57 devserver21 ntpd[3475]: synchronized to 192.168.1.65, stratum 3
Jul 29 11:41:56 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
Jul 29 11:44:23 devserver21 ntpd[3475]: synchronized to 192.168.1.65, stratum 3
Jul 29 12:03:03 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
Jul 29 12:05:19 devserver21 ntpd[3475]: synchronized to 192.168.1.65, stratum 3
Jul 29 12:23:28 devserver21 ntpd[3475]: synchronized to 192.168.1.66, stratum 3
Jul 29 12:27:48 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
Jul 29 13:34:13 devserver21 ntpd[3475]: synchronized to 192.168.1.66, stratum 3
Jul 29 13:51:08 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
Jul 29 14:03:08 devserver21 ntpd[3475]: synchronized to 192.168.1.67, stratum 3
Jul 29 14:07:42 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
Jul 29 14:23:56 devserver21 ntpd[3475]: synchronized to 192.168.1.66, stratum 3
Jul 29 14:40:29 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
Jul 29 14:57:07 devserver21 ntpd[3475]: synchronized to 192.168.1.66, stratum 3
Jul 29 14:57:27 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
Jul 29 15:13:40 devserver21 ntpd[3475]: synchronized to 192.168.1.67, stratum 3
Jul 29 15:14:01 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
Jul 29 15:26:05 devserver21 ntpd[3475]: synchronized to 192.168.1.65, stratum 3
Jul 29 15:59:17 devserver21 ntpd[3475]: time reset -1.599691 s
Jul 29 16:03:31 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
Jul 29 16:05:38 devserver21 ntpd[3475]: synchronized to 192.168.1.67, stratum 3
Jul 29 16:08:46 devserver21 ntpd[3475]: synchronized to 192.168.1.66, stratum 3
Jul 29 16:11:55 devserver21 ntpd[3475]: synchronized to 192.168.1.65, stratum 3
Jul 29 16:12:59 devserver21 ntpd[3475]: synchronized to 192.168.1.67, stratum 3
Jul 29 16:15:00 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
Jul 29 16:28:32 devserver21 ntpd[3475]: synchronized to 192.168.1.67, stratum 3
Jul 29 16:41:10 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
Jul 29 16:57:35 devserver21 ntpd[3475]: synchronized to 192.168.1.66, stratum 3
Jul 29 17:23:57 devserver21 ntpd[3475]: synchronized to 192.168.1.67, stratum 3
Jul 29 17:24:59 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
Jul 29 17:30:46 devserver21 ntpd[3475]: synchronized to 192.168.1.65, stratum 3
Jul 29 17:47:24 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
Aug 12 22:48:29 devserver21 ntpd[3475]: sendto(192.168.1.66): Operation not permitted
[root@devserver21 ~]# uptime
 08:10:19 up 164 days, 9:56, 2 users, load average: 0.20, 0.54, 0.81
[root@devserver21 ~]#
which will help you determine what happened to the time. Also check that ntpd is running with:
"service ntpd status" and also "chkconfig ntpd --list" will show the startup position of ntpd
It is/was up.
Hi,
[root@devserver21 ~]# grep ntpd /var/log/messages
/SNIP
Jul 29 17:47:24 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
Aug 12 22:48:29 devserver21 ntpd[3475]: sendto(192.168.1.66): Operation not permitted
[root@devserver21 ~]# uptime
 08:10:19 up 164 days, 9:56, 2 users, load average: 0.20, 0.54, 0.81
[root@devserver21 ~]#
What happened between July 29 and now? Is there nothing in the logs for that period?
Rgds
S.
-----Original Message----- From: centos-bounces@centos.org [mailto:centos-bounces@centos.org] On Behalf Of Simon Billis Sent: Thursday, August 12, 2010 8:14 To: 'CentOS mailing list' Subject: Re: [CentOS] Date drift and ntpd
What happened between July 29 and now? Is there nothing in the logs for that period?
Nothing of note, I do have full logs from those days...
[root@devserver21 ~]# grep ' 00:00:.. devserver21 arpwatch: bogon 192.168.1.67 0:30:67:0:cf:fd' /var/log/messages
Jul 26 00:00:37 devserver21 arpwatch: bogon 192.168.1.67 0:30:67:0:cf:fd
Jul 27 00:00:47 devserver21 arpwatch: bogon 192.168.1.67 0:30:67:0:cf:fd
Jul 28 00:00:44 devserver21 arpwatch: bogon 192.168.1.67 0:30:67:0:cf:fd
Jul 29 00:00:41 devserver21 arpwatch: bogon 192.168.1.67 0:30:67:0:cf:fd
Jul 30 00:00:08 devserver21 arpwatch: bogon 192.168.1.67 0:30:67:0:cf:fd
Jul 31 00:00:08 devserver21 arpwatch: bogon 192.168.1.67 0:30:67:0:cf:fd
Aug 1 00:00:14 devserver21 arpwatch: bogon 192.168.1.67 0:30:67:0:cf:fd
Aug 2 00:00:22 devserver21 arpwatch: bogon 192.168.1.67 0:30:67:0:cf:fd
Aug 3 00:00:11 devserver21 arpwatch: bogon 192.168.1.67 0:30:67:0:cf:fd
Aug 4 00:00:58 devserver21 arpwatch: bogon 192.168.1.67 0:30:67:0:cf:fd
Aug 5 00:00:06 devserver21 arpwatch: bogon 192.168.1.67 0:30:67:0:cf:fd
Aug 6 00:00:06 devserver21 arpwatch: bogon 192.168.1.67 0:30:67:0:cf:fd
Aug 7 00:00:03 devserver21 arpwatch: bogon 192.168.1.67 0:30:67:0:cf:fd
Aug 8 00:00:43 devserver21 arpwatch: bogon 192.168.1.67 0:30:67:0:cf:fd
Aug 9 00:00:16 devserver21 arpwatch: bogon 192.168.1.67 0:30:67:0:cf:fd
Aug 10 00:00:08 devserver21 arpwatch: bogon 192.168.1.67 0:30:67:0:cf:fd
Aug 11 00:00:34 devserver21 arpwatch: bogon 192.168.1.67 0:30:67:0:cf:fd
Aug 12 00:00:19 devserver21 arpwatch: bogon 192.168.1.67 0:30:67:0:cf:fd
Aug 13 00:00:20 devserver21 arpwatch: bogon 192.168.1.67 0:30:67:0:cf:fd
[root@devserver21 ~]#
Hi,
[root@devserver21 ~]# grep ntpd /var/log/messages
</snip>
Jul 28 20:34:41 devserver21 ntpd[3475]: synchronized to 192.168.1.65, stratum 3
Jul 28 21:08:00 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
Jul 28 21:08:00 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
This indicates that the hardware clock's frequency error (-512 PPM) exceeds the rate the kernel can correct (500 PPM). This could be a hardware or a kernel problem.
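To put the PPM figures in the log message into everyday units, here is a small sketch that converts a frequency error into seconds of drift per day. The function name `ppm_to_seconds_per_day` is made up for illustration; the only inputs are the two numbers from the log line.

```shell
#!/bin/sh
# Convert a frequency error in parts per million (PPM) into the drift
# it would accumulate over one day (86400 seconds) if left uncorrected.
ppm_to_seconds_per_day() {
    awk -v ppm="$1" 'BEGIN { printf "%.1f\n", ppm * 86400 / 1000000 }'
}

ppm_to_seconds_per_day 500   # the tolerance quoted in the log
ppm_to_seconds_per_day 512   # the magnitude of the reported error
```

On those numbers an uncorrected 512 PPM error accumulates roughly 44 seconds per day, or about ten minutes over two weeks, which is far short of a day and a half.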
/SNIP
Jul 28 23:06:05 devserver21 ntpd[3475]: time reset +0.554019 s
Jul 28 23:10:14 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
Jul 28 23:17:36 devserver21 ntpd[3475]: synchronized to 192.168.1.67, stratum 3
Jul 28 23:20:46 devserver21 ntpd[3475]: synchronized to 192.168.1.66, stratum 3
Jul 28 23:22:52 devserver21 ntpd[3475]: synchronized to 192.168.1.65, stratum 3
Jul 28 23:33:28 devserver21 ntpd[3475]: synchronized to 192.168.1.65, stratum 3
Jul 28 23:34:37 devserver21 ntpd[3475]: time reset -0.866445 s
/SNIP
Jul 29 00:42:44 devserver21 ntpd[3475]: time reset -0.922073 s
/SNIP
Jul 29 10:50:57 devserver21 ntpd[3475]: time reset -1.638135 s
/SNIP
Jul 29 15:59:17 devserver21 ntpd[3475]: time reset -1.599691 s
/SNIP
The above lines show that the clock on this machine was gaining slightly. That could be caused by the stratum 3 server losing time slightly, perhaps due to load, or by a local hardware fault.
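The "time reset" entries can be totalled straight from the log rather than by eye. This is a sketch only: the function name `sum_time_resets` is made up, and it assumes the ntpd line format shown in the excerpts above, with the step value as the second-to-last field.

```shell
#!/bin/sh
# Sum the step corrections ("time reset") in ntpd log lines read from stdin.
# Assumes lines ending in "... ntpd[PID]: time reset <signed seconds> s".
sum_time_resets() {
    awk '/ntpd\[[0-9]+\]: time reset/ { total += $(NF - 1); n++ }
         END { printf "%d resets, net %.6f s\n", n, total }'
}

# Typical use:
#   grep ntpd /var/log/messages | sum_time_resets
```

For the five resets quoted in this thread the net step is about -4.47 seconds over roughly a day, so ntpd was repeatedly pulling the clock back by small amounts.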
Aug 12 22:48:29 devserver21 ntpd[3475]: sendto(192.168.1.66): Operation not permitted
I suspect that you have a firewall in place that has been blocking ntpd's outgoing packets from this point on.
Rgds
S.
Jason Pyeron wrote, On 08/12/2010 08:01 AM:
<SNIP>
Assumption: this is not from any kind of virtual machine.
Assumption: your local time server is NOT a GPS with an ovenized crystal or even a cell phone time source, i.e. it is NOT very stable.
Assumption: the time servers that you are following (192.168.1.6[567]) are:
a) each following the same timeserver(s), or at least have one in common.
b) peering with one another.
c) following time servers that are reasonably stable.
Assumption: the time farm is on real, non-busy hardware (an old Cisco router serving as the internet connection to 1000+ computers does not qualify as non-busy) and is configured to achieve maxpoll 10 or higher.
One problem that you have is that your timeserver farm (192.168.1.6[567]) is occasionally losing its servers, i.e. we see "synchronized to LOCAL(0)" every few minutes. With a well-configured time farm that should happen only once in hours to days, not minutes.
the second problem is that a machine which is not intended to be a time server is configured with a local clock with a stratum better than 15.
suggestion 1: 65 should have its local clock at stratum 13, and 66 and 67 should have their local clocks at stratum 14 or 15. All other machines should either have no local clock or have one with a stratum no better than 15. Yes, after reading the ntp documentation, I disagree with RedHat's default. The net result should be that you don't get any local clock loops in the setup, because you have a defined leader; and even if the defined leader is lost, the other machines should drift stably.
suggestion 2: 65, 66 & 67 should ALL peer with one another for added stability in the time farm.
suggestion 3: client machines should 'prefer' one of your servers over the others.
suggestion 4: see if someone has been messing with the kernel ticks on the machine: run `tickadj` (see file:///usr/share/doc/ntp-4.2.2p1/tickadj.html). I had one computer where I needed to tweak the default value up or down by one (I don't remember which) to make it really stable. This should be a last resort.
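As a rough illustration of suggestions 1 and 3 from the client's side, the sketch below prints an ntp.conf in which one server is marked `prefer` and no local clock (127.127.1.0) is configured. The function name `client_ntp_conf` is made up, the choice of 65 as the preferred server is only an assumption, and the restrict and driftfile lines are copied from the config posted earlier in the thread.

```shell
#!/bin/sh
# Print a client-side ntp.conf sketch.
# $1 is the preferred server; any remaining arguments are fallback servers.
# No "server 127.127.1.0" / "fudge" lines are emitted, so a pure client
# never falls back to its own undisciplined clock.
client_ntp_conf() {
    preferred=$1
    shift
    echo "restrict default nomodify notrap noquery"
    echo "restrict 127.0.0.1"
    echo "server $preferred prefer"
    for s in "$@"; do
        echo "server $s"
    done
    echo "driftfile /var/lib/ntp/drift"
}

client_ntp_conf 192.168.1.65 192.168.1.66 192.168.1.67
```

The output would be reviewed by hand before replacing /etc/ntp.conf, and ntpd restarted afterwards.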
-----Original Message----- From: Todd Denniston Sent: Thursday, August 12, 2010 9:07 Jason Pyeron wrote, On 08/12/2010 08:01 AM:
Assumption: this is not from any kind of virtual machine.
Correct.
Assumption: Your local time server is NOT a GPS with an ovenized crystal or even a cell phone time source, i.e. NOT very stable.
Correct.
Assumption: the time servers that you are following (192.168.1.6[567]) are: a) each following the same timeserver(s), or at least have one in common.
192.168.1.6[567] are one machine. Time on that one is/has been good. Other machines in the enterprise follow it accurately.
b) peering with one another
n/a
c) following time servers that are reasonably stable.
Appears to be so.
Assumption: the time farm is on real, non busy (an old cisco router serving as the internet connection to 1000+ computers does not qualify as non busy), hardware and is configured to achieve maxpoll 10 or higher.
Unknown, assuming the latency is negligible. The important detail here is that all the machines on the LAN have the same time. There is no unusual latency there.
one problem that you have is that your timeserver farm (192.168.1.6[567]) is occasionally losing its servers, i.e. we see "synchronized to LOCAL(0)" occasionally, which should
That was on an ntp client, not the ntp server. Am I misunderstanding you?
not happen with a well configured time farm for hours to days, not minutes.
Agreed, see above.
the second problem is that a machine which is not intended to be a time server is configured with a local clock with a stratum better than 15.
I don't understand, I will have to read up more.
suggestion 1: 65 should have local clock at stratum 13, 66 and 67 should have local clock at stratum
They are presently one machine.
14 or 15, all other machines should not have a local clock or should not have one with a stratum better than 15. Yes I, after reading the ntp documentation, disagree with RedHat's default.
Ok.
net result should be that you don't get any local clock loops in the setup because you have a defined leader, but if even the defined leader is lost the other machines should do a stable drift.
suggestion 2: 65, 66 & 67 should ALL peer with one another for added stability in the time farm.
suggestion 3: client machines should 'prefer' one of your servers over the others.
suggestion 4: see if someone has been messing with the kernel ticks on the machine... run `tickadj` file:///usr/share/doc/ntp-4.2.2p1/tickadj.html
[root@devserver21 ~]# tickadj
tick = 10000
I had one computer where I needed to tweak the default value up or down one (I don't remember) to have it be real stable, this should be a last resort.
-- Todd Denniston Crane Division, Naval Surface Warfare Center (NSWC Crane) Harnessing the Power of Technology for the Warfighter
Jason Pyeron wrote, On 08/12/2010 09:27 AM:
-----Original Message----- From: Todd Denniston Sent: Thursday, August 12, 2010 9:07 Jason Pyeron wrote, On 08/12/2010 08:01 AM:
<SNIP>
Assumption: the time servers that you are following (192.168.1.6[57]) are: a) each following the same timeserver(s), or at least have one in common.
192.168.1.6[567] are one machine.
I am not sure how much trouble that fact alone is going to give you.
It at least explains why you constantly see the following repeating in your log:
00:00:01 devserver21 ntpd[3475]: synchronized to 192.168.1.65, stratum 3
00:00:08 devserver21 ntpd[3475]: synchronized to 192.168.1.66, stratum 3
00:00:14 devserver21 ntpd[3475]: synchronized to 192.168.1.67, stratum 3
As each one of the VIPs "becomes better", ntp switches to it instead of stabilizing on one and then stabilizing the devserver21 system clock.
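To see how often a client hops between sources, the "synchronized to" lines can be tallied per peer. This is only a minimal sketch, assuming the syslog format shown in this thread; `count_sync_sources` is a made-up helper name, not something shipped with ntp.

```shell
# Count how many times ntpd selected each source, reading log lines on stdin.
count_sync_sources() {
    awk '/synchronized to/ {
             # Find every "synchronized to <peer>," on the line and
             # strip the trailing comma from the peer name.
             for (i = 1; i <= NF - 2; i++)
                 if ($i == "synchronized" && $(i+1) == "to") {
                     peer = $(i+2); sub(/,$/, "", peer); n[peer]++
                 }
         }
         END { for (p in n) print n[p], p }' | sort -k1,1nr -k2
}

# Sample input copied from the log excerpt in this thread.
count_sync_sources <<'EOF'
Jul 29 17:24:59 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
Jul 29 17:30:46 devserver21 ntpd[3475]: synchronized to 192.168.1.65, stratum 3
Jul 29 17:47:24 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
EOF
```

On a real client the input would presumably come from something like `grep ntpd /var/log/messages | count_sync_sources`. A high count for LOCAL(0), or counts spread evenly over .65, .66 and .67, would match the flapping described above.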
Time on that one is/has been good. Other machines in the enterprise follow it accurately.
yes&no... I suspect they would all do a better job following it if you picked only one of its IPs for them to use. By querying the same host by different IPs, I think you are messing up the integration/differentiation routines ntp tries to use.
one problem that you have is that your timeserver farm (192.168.1.6[57]) is occasionally losing its servers, i.e. we see "synchronized to LOCAL(0)" occasionally, which should
That was on an ntp client, not the ntp server. Am I misunderstanding you?
Because the *client* was going back to "synchronized to LOCAL(0)", we then know the *server* is losing its servers and thus refuses to answer time requests, either that or
a) the network between *this* client (devserver21) and the server (192.168.1.6[567]) is unreliable: hardware, cables, network stacks, local RF generators...
b) the triplet of IPs referring to one machine confuses the ntp client.
on the client try
for i in 65 66 67; do
    echo "data for $i"
    # double quotes so that $i is expanded inside the ntpdc command
    /usr/sbin/ntpdc -c "showpeer 192.168.1.$i" | \
        grep -e reach -e stratum
done
and see what the reach, unreach and stratum are, especially during one of the 1 to 5 minute periods devserver21 is using local clock.
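When reading that output, it may help to know that reach is an 8-bit shift register of the most recent polls. The sketch below assumes the value is printed in octal (as `ntpq -p` prints it) and counts how many of the last eight polls got an answer; `reach_count` is a hypothetical helper name.

```shell
# Count the successful polls recorded in an octal reach value.
reach_count() {
    # A leading 0 makes the shell treat the number as octal.
    reg=$((0$1))
    count=0
    while [ "$reg" -gt 0 ]; do
        count=$((count + (reg & 1)))   # add the lowest bit
        reg=$((reg >> 1))              # move on to the next poll
    done
    echo "$count"
}

reach_count 377   # all eight recent polls answered
reach_count 17    # only four of the last eight polls answered
```

A value that keeps dropping below 377 during the periods when devserver21 falls back to the local clock would point at lost polls rather than at the server's own time quality.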
<SNIP>
the second problem is that a machine which is not intended to be a time server is configured with a local clock with a stratum better than 15.
I don't understand, I will have to read up more.
short way to say this: the machine you are asking for help on (devserver21) is intended to be ONLY an ntp client, and it should never offer time to other machines while it is running on its local clock. The way to make that happen is to push the fudged stratum to 15.
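A quick way to see what a client currently has configured is to pull the fudged stratum for the local clock driver (127.127.1.0) out of its ntp.conf. This is a rough sketch only; `local_clock_stratum` is an illustrative name, and it reads the config on stdin so it can be tried without touching a real file.

```shell
# Print the stratum fudged onto the local clock (127.127.1.0), if any.
# Commented-out lines do not match because their first field starts with '#'.
local_clock_stratum() {
    awk '$1 == "fudge" && $2 == "127.127.1.0" {
             for (i = 3; i < NF; i++)
                 if ($i == "stratum") print $(i+1)
         }'
}

# The two relevant lines from the ntp.conf posted at the top of the thread.
printf '%s\n' 'server 127.127.1.0 # local clock' \
              'fudge 127.127.1.0 stratum 10' | local_clock_stratum
```

On a real machine this would be run as `local_clock_stratum < /etc/ntp.conf`. Following the advice above, a pure client would either print nothing (no local clock configured) or print 15.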
suggestion 1: 65 should have local clock at stratum 13, 66 and 67 should have local clock at stratum
They are presently one machine.
Then that one ntp *server* machine (192.168.1.65) should be configured to have a local clock at stratum 13, for when it can not reach external clock but you still want all internal machines synced fairly close.
<SNIP>
In another email I thought you tried to indicate that your client machines refused to pick up time if you only had one ntp server on the network. That _should_ not be true, unless the server:
A) had not yet gotten its own clock disciplined to an external clock, which can take 10 to 15 minutes the first time, and 8 to 10 after the drift file has been built if you are not using the iburst keyword; i.e., on the server `/usr/sbin/ntpdc -c kerninfo |grep ^status` needs to show "0001 pll", or
B) has no external clock available at the time of the test, and the local clock is not defined on the ntp server (192.168.1.65) (or not at a low enough stratum), and it will still take 8 to 10 minutes (of connected to external or local clock time) from ntp startup before the server provides time.
Sorry for the embedded within embedded notes. :]
Hi Jason,
On Aug 12, 2010, at 8:01 AM, Jason Pyeron wrote:
Jul 28 21:42:34 devserver21 ntpd[3475]: frequency error -512 PPM exceeds
This shows that the system clock on devserver21 is drifting too fast for NTP to compensate.
Possible causes could be an out-of-spec crystal on that machine, or an error in the BIOS (bad frequency divider perhaps) for that particular model of machine.
Whatever the cause, it can be compensated for using "adjtimex". This package is not installed by default with Centos 5.5 but you can install it easily with "yum install adjtimex". Then read the man page "man 8 adjtimex" and the README in /usr/share/doc/adjtimex*/README and follow the directions from there (with particular reference to "adjtimex -c" or "adjtimex -a".)
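For a sense of scale, a frequency error in PPM converts to drift per day by multiplying by the 86400 seconds in a day. A quick sketch of that arithmetic follows; `ppm_to_seconds_per_day` is just an illustrative name, and 500 is used for comparison because ntpd cannot correct frequency errors beyond 500 PPM.

```shell
# Convert a frequency error in parts per million to seconds of drift per day.
ppm_to_seconds_per_day() {
    awk -v ppm="$1" 'BEGIN { printf "%.1f\n", ppm * 1e-6 * 86400 }'
}

ppm_to_seconds_per_day 500   # prints 43.2
ppm_to_seconds_per_day 512   # prints 44.2
```

So a clock that is off by 512 PPM gains or loses roughly 44 seconds a day if nothing corrects it.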
Hope it helps!
Rick
On 8/12/2010 5:07 AM, Jason Pyeron wrote:
[root@devserver21 ~]# cat /etc/ntp.conf | grep -v ^# | grep -v ^$
restrict default nomodify notrap noquery
restrict 127.0.0.1
server 192.168.1.67
server 192.168.1.66
server 192.168.1.65
Some HOWTOs tell you that more time servers is better, on a standard knee-jerk redundancy theory, but they're ignoring two things.
First, you already have a fallback: the system's built-in clock. It's perfectly fine to run on that while you ride out your time server's downtime.
Second, ntpd, internally, is built on a phase-locked loop, which is supposed to stabilize its time corrections in the face of jitter and other bad things out in the real world. Like anything based on a negative feedback loop, however, it can be destabilized with certain inputs. Giving ntpd two or more servers is a pretty good way to destabilize its PLL in the real, non-ideal world we find on the modern Internet.
To anyone considering flaming me, please read this first:
http://queue.acm.org/detail.cfm?id=1773943
At minimum, read the section "One server is enough". The bit on PLLs about halfway down is also directly relevant.
-----Original Message----- From: centos-bounces@centos.org [mailto:centos-bounces@centos.org] On Behalf Of Warren Young Sent: Thursday, August 12, 2010 17:41 To: CentOS mailing list Subject: Re: [CentOS] Date drift and ntpd
Okay, I only have one timeserver, but the ntp clients cowardly refuse to use less than 3. Back to the man page...
On 8/12/2010 3:43 PM, Jason Pyeron wrote:
Okay, I only have one timeserver,
I meant that your on-site time server should be relying on only one other outside time server, one stratum up.
but the ntp clients cowardly refuse to use less than 3.
Only one server on a given LAN should be running ntpd. It's overkill for every machine to keep themselves synced with such a complex and fussy server. All the others should just call ntpdate or msntp every hour or so as a cron job to keep their own time close to that of the LAN time server's.
On 08/12/10 2:51 PM, Warren Young wrote:
Only one server on a given LAN should be running ntpd. It's overkill for every machine to keep themselves synced with such a complex and fussy server. All the others should just call ntpdate or msntp every hour or so as a cron job to keep their own time close to that of the LAN time server's.
I disagree.
Simply setting a system's time at fixed intervals will result in discontinuities in delta time measurements. If the system's local clock is fast, a given time will occur twice, and a delta between two time readings could be negative. If the clock is slow, a delta between two readings would jump by whatever the correction was.
I do think that having one NTP master server onsite which syncs either to a hardware clock (GPS etc), or to 1 to 3 external NTP servers, then having all the rest of your local servers sync to this one NTP master server is the correct architecture. Once an ntpd syncs and stabilizes to its reference, it's very low overhead.
On 8/12/2010 4:15 PM, John R Pierce wrote:
On 08/12/10 2:51 PM, Warren Young wrote:
Only one server on a given LAN should be running ntpd. It's overkill for every machine to keep themselves synced with such a complex and fussy server. All the others should just call ntpdate or msntp every hour or so as a cron job to keep their own time close to that of the LAN time server's.
I disagree.
Simply setting a system's time at fixed intervals will result in discontinuities in delta time measurements. If the system's local clock is fast, a given time will occur twice, and a delta between two time readings could be negative. If the clock is slow, a delta between two readings would jump by whatever the correction was.
This is one of the points from the paper I referenced: there are three main uses for clocks, and a single implementation isn't appropriate for all uses. Only an ideal absolute time clock would work for all three cases. Since we don't have that, you have to consider your own case before deciding on a clock synchronization strategy.
The strategy I recommended is based on the fact that its worst case behavior (a small negative jump every hour) is not a problem for me. If it is a problem for your application, you need a different design.
Once an ntpd syncs and stabilizes to its reference, it's very low overhead.
True only as long as it's being given stable time input. See figure 5 in the paper for the kind of wild, damped oscillations you get with ntpd when the input is not stable.
The time series plot is crystal clear, but don't overlook the fact that the IQR plots use different axes. There's a 4x difference hiding behind that bad visual display of quantitative information. (Yes, that's my inner Tufte you're seeing poking out there.)
Warren Young wrote:
The strategy I recommended is based on the fact that its worst case behavior (a small negative jump every hour) is not a problem for me. If it is a problem for your application, you need a different design.
It's a bad idea in the general case. If you have scheduled jobs, ntpdate may jump the clock enough to miss the trigger or run them twice, where ntpd always tries to move the clock fractional seconds at a time so as not to let that happen. Plus, ntpdate does no sanity check at all - if the clock source is badly off, the client will follow blindly even if it goes to the wrong century.
On 8/12/2010 8:03 PM, Les Mikesell wrote:
Warren Young wrote:
The strategy I recommended is based on the fact that its worst case behavior (a small negative jump every hour) is not a problem for me. If it is a problem for your application, you need a different design.
It's a bad idea in the general case. If you have scheduled jobs, ntpdate may jump the clock enough to miss the trigger or run them twice, where ntpd always tries to move the clock fractional seconds at a time so as not to let that happen. Plus, ntpdate does no sanity check at all - if the clock source is badly off, the client will follow blindly even if it goes to the wrong century.
Whereas ntpd will simply quietly fail to sync at all if it is more than a few minutes off. ;)
I've used ntpdate to keep exceptionally balky machines in phase before. If you do it frequently enough that the jump is never more than a second or two it works fine as long as you can tolerate the occasional out of order timestamp. Cron is sensitive only to the minute level and if you are paranoid about it, sync it at an odd time (something like 47 minutes after the hour) that just won't conflict with other cronjobs.
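The off-the-hour sync described above could be expressed as an /etc/cron.d style entry. This is a sketch under several assumptions: `ntpdate_cron_entry` is a made-up helper, the /usr/sbin/ntpdate path and the `root` user field are guesses at a typical setup, `-s` sends ntpdate's output to syslog, and 192.168.1.65 is simply the LAN server named earlier in the thread.

```shell
# Print an /etc/cron.d style line that runs ntpdate once an hour.
#   $1 = minute past the hour, $2 = time server to query
ntpdate_cron_entry() {
    printf '%s * * * * root /usr/sbin/ntpdate -s %s\n' "$1" "$2"
}

ntpdate_cron_entry 47 192.168.1.65
```

Running it at minute 47 keeps any small step away from jobs scheduled on the hour or half hour, which is the point being made about cron's minute-level sensitivity.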
On 8/13/2010 4:06 PM, Jerry Franz wrote:
On 8/12/2010 8:03 PM, Les Mikesell wrote:
Warren Young wrote:
The strategy I recommended is based on the fact that its worst case behavior (a small negative jump every hour) is not a problem for me. If it is a problem for your application, you need a different design.
It's a bad idea in the general case. If you have scheduled jobs, ntpdate may jump the clock enough to miss the trigger or run them twice, where ntpd always tries to move the clock fractional seconds at a time so as not to let that happen. Plus, ntpdate does no sanity check at all - if the clock source is badly off, the client will follow blindly even if it goes to the wrong century.
Whereas ntpd will simply quietly fail to sync at all if it is more than a few minutes off. ;)
No, it should sync with up to an hour's difference, but it may take a while since it does fraction-second adjustments. The usual strategy (and the CentOS default) is to use ntpdate once at startup to cover for bad hardware clocks or dead motherboard batteries, then start ntpd to keep the time correct.
I've used ntpdate to keep exceptionally balky machines in phase before.
Clients should never be 'balky' if you have a stable clock source - perhaps with the exception of some virtual machine situations or seriously bad hardware.
If you do it frequently enough that the jump is never more than a second or two it works fine as long as you can tolerate the occasional out of order timestamp. Cron is sensitive only to the minute level and if you are paranoid about it, sync it at an odd time (something like 47 minutes after the hour) that just won't conflict with other cronjobs.
That's better than nothing, but in most situations, ntp should "just work". I usually run it on routers and just use PCs as clients, though.
On Fri, 2010-08-13 at 09:47 -0500, Les Mikesell wrote:
On 8/13/2010 4:06 PM, Jerry Franz wrote:
On 8/12/2010 8:03 PM, Les Mikesell wrote:
Clients should never be 'balky' if you have a stable clock source - perhaps with the exception of some virtual machine situations or seriously bad hardware.
--- The local time source is only as good as the motherboard's oscillator crystal. It depends on the oscillator's frequency tolerance, expressed in PPM (parts per million). Some are as high as 10 ppm for high-end hardware. Some refer to this as jitter/latency. The same nagging problem exists in radio frequency work also. Most radios are now within 2 ppm, as is ntp. The offset is measured in seconds for ntp and in hertz for radio.
http://www.eecis.udel.edu/~ntp/ntpfaq/NTP-s-sw-clocks.htm
John
On 13/08/2010 23:06, Jerry Franz wrote:
On 8/12/2010 8:03 PM, Les Mikesell wrote:
Warren Young wrote:
The strategy I recommended is based on the fact that its worst case behavior (a small negative jump every hour) is not a problem for me. If it is a problem for your application, you need a different design.
It's a bad idea in the general case. If you have scheduled jobs, ntpdate may jump the clock enough to miss the trigger or run them twice, where ntpd always tries to move the clock fractional seconds at a time so as not to let that happen. Plus, ntpdate does no sanity check at all - if the clock source is badly off, the client will follow blindly even if it goes to the wrong century.
Whereas ntpd will simply quietly fail to sync at all if it is more than a few minutes off. ;)
I believe it will not fail if you tell it to "tinker panic 0".
Regards, Markus
On 8/12/2010 9:03 PM, Les Mikesell wrote:
ntpd always tries to move the clock fractional seconds at a time
msntp does that, too, if you give the -a flag. (You have to give either -a or -r for it to change the system time at all.)
ntpdate also does this, as long as the delta is less than half a second, which it'd better be for an hourly sync schedule, since 0.5s/hr is 139 ppm.
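The 139 ppm figure is just the half-second limit spread over an hour. A one-function sketch of that arithmetic (`step_to_ppm` is an illustrative name):

```shell
# Convert an accumulated offset over an interval into a drift rate in ppm.
#   $1 = offset in seconds, $2 = interval in seconds
step_to_ppm() {
    awk -v off="$1" -v secs="$2" 'BEGIN { printf "%.0f\n", off / secs * 1e6 }'
}

step_to_ppm 0.5 3600   # prints 139
```

In other words, a client clock would have to be running more than about 139 ppm fast or slow before an hourly ntpdate would need to step rather than slew.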
ntpdate does no sanity check at all
msntp does. You can configure the min and max error it will correct. Its defaults are sane.
Jason Pyeron wrote:
Okay, I only have one timeserver, but the ntp clients cowardly refuse to use less than 3. Back to the man page...
One server should be fine - you must have something else wrong, like your authoritative server not being a low stratum number - or not having convinced itself that its time is correct.