Re: [CentOS] Date drift and ntpd

12 Aug 2010

-----Original Message-----
From: Todd Denniston
Sent: Thursday, August 12, 2010 9:07
Jason Pyeron wrote, On 08/12/2010 08:01 AM:
...
...
-----Original Message-----
From: Simon Billis
Sent: Thursday, August 12, 2010 7:36
Jason Pyeron sent a missive on 2010-08-12:
...
We have a local time server and all of our machines are pointed at it
for the time.
How can the clock drift by a day and a half?
[root@devserver21 ~]# date
Fri Aug 13 14:43:29 EDT 2010
[root@devserver21 ~]# rdate -s 192.168.1.67
[root@devserver21 ~]# date
Thu Aug 12 07:02:39 EDT 2010
[root@devserver21 ~]# cat /etc/ntp.conf | grep -v ^# | grep -v ^$ 
restrict default nomodify notrap noquery
restrict 127.0.0.1
server 192.168.1.67
server 192.168.1.66
server 192.168.1.65
server  127.127.1.0     # local clock
fudge   127.127.1.0 stratum 10
driftfile /var/lib/ntp/drift
broadcastdelay  0.008
keys            /etc/ntp/keys
Hi,
It is unlikely that the machine in question drifted forward in time
if ntpd was running. Have a look at the logs in /var/log/messages; they
should contain the ntpd log messages.
[root@devserver21 ~]# grep ntpd /var/log/messages
</snip>
Jul 28 20:34:41 devserver21 ntpd[3475]: synchronized to 192.168.1.65, stratum 3
Jul 28 21:08:00 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
Jul 28 21:08:00 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
Jul 28 21:08:11 devserver21 ntpd[3475]: synchronized to 192.168.1.66, stratum 3
Jul 28 21:24:58 devserver21 ntpd[3475]: synchronized to 192.168.1.65, stratum 3
Jul 28 21:41:26 devserver21 ntpd[3475]: synchronized to 192.168.1.67, stratum 3
Jul 28 21:42:16 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
Jul 28 21:42:16 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
Jul 28 21:42:34 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
Jul 28 21:43:37 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
...
Jul 28 22:12:07 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
Jul 28 22:13:13 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
Jul 28 22:14:17 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
Jul 28 22:15:11 devserver21 ntpd[3475]: synchronized to 192.168.1.66, stratum 3
Jul 28 22:31:41 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
Jul 28 22:31:41 devserver21 ntpd[3475]: frequency error -512 PPM exceeds tolerance 500 PPM
...
Jul 29 15:14:01 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
Jul 29 15:26:05 devserver21 ntpd[3475]: synchronized to 192.168.1.65, stratum 3
Jul 29 15:59:17 devserver21 ntpd[3475]: time reset -1.599691 s
Jul 29 16:03:31 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
Jul 29 16:05:38 devserver21 ntpd[3475]: synchronized to 192.168.1.67, stratum 3
Jul 29 16:08:46 devserver21 ntpd[3475]: synchronized to 192.168.1.66, stratum 3
Jul 29 16:11:55 devserver21 ntpd[3475]: synchronized to 192.168.1.65, stratum 3
...
Jul 29 17:23:57 devserver21 ntpd[3475]: synchronized to 192.168.1.67, stratum 3
Jul 29 17:24:59 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
Jul 29 17:30:46 devserver21 ntpd[3475]: synchronized to 192.168.1.65, stratum 3
Jul 29 17:47:24 devserver21 ntpd[3475]: synchronized to LOCAL(0), stratum 10
Aug 12 22:48:29 devserver21 ntpd[3475]: sendto(192.168.1.66): Operation not permitted
...
[root@devserver21 ~]# uptime
 08:10:19 up 164 days,  9:56,  2 users,  load average: 0.20, 0.54, 0.81
[root@devserver21 ~]#
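To put numbers on the point that drift alone is an unlikely explanation, here is a rough back-of-the-envelope sketch. It compares the offset that `rdate -s` corrected (taken from the two `date` outputs earlier in the thread) with the most a clock could drift at the 500 PPM tolerance ntpd reports in the log. The variable and function names are made up for illustration, and it assumes the clock ran at exactly the tolerance limit the whole time, which is a worst case.

```shell
# Convert HH:MM:SS to seconds since midnight (awk reads "07" as decimal 7).
hms_to_s() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Clock readings from the thread, before and after `rdate -s`.
wrong_day=13; wrong_time=14:43:29    # Fri Aug 13 14:43:29 EDT 2010
right_day=12; right_time=07:02:39    # Thu Aug 12 07:02:39 EDT 2010

# Offset that rdate corrected, in seconds.
offset_s=$(( (wrong_day - right_day) * 86400 \
    + $(hms_to_s "$wrong_time") - $(hms_to_s "$right_time") ))

# 500 PPM means 500 microseconds of error per second of elapsed time.
tolerance_ppm=500
drift_per_day_ms=$(( tolerance_ppm * 86400 / 1000 ))

# Days of continuous worst-case drift needed to accumulate the offset.
days_needed=$(( offset_s * 1000 / drift_per_day_ms ))

echo "offset corrected by rdate: ${offset_s} s"
echo "worst-case drift at ${tolerance_ppm} PPM: ${drift_per_day_ms} ms/day"
echo "days of drift needed: ${days_needed}"
```

Under those assumptions the offset is 114050 s, while 500 PPM is only about 43 s per day, so it would take roughly 2640 days of steady drift to get there. The machine has been up 164 days, which suggests the clock was stepped rather than drifting.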
Assumption: this is not from any kind of virtual machine.
Correct.
...
Assumption: Your local time server is NOT a GPS with an 
ovenized crystal or even a cell phone time source, i.e. NOT 
very stable.
Correct.
...
Assumption: the time servers that you are following 
(192.168.1.6[57]) are:
   a) each following the same timeserver(s), or at least 
have one in common.
192.168.1.6[567] are one machine. Time on that one is/has been good. Other
machines in the enterprise follow it accurately.
...
b) peering with one another
n/a
...
c) following time servers that are reasonably stable.
Appears to be so.
...
Assumption: the time farm is on real, non busy (an old cisco 
router serving as the internet connection to 1000+ computers 
does not qualify as non busy), hardware and is configured to 
achieve maxpoll 10 or higher.
Unknown, assuming the latency is negligible. The important detail here is that
all the machines in the LAN have the same time. There is no unusual latency
there.
...
one problem that you have is that your timeserver farm 
(192.168.1.6[57]) is occasionally losing its servers, i.e. 
we see "synchronized to LOCAL(0)" occasionally, which should 
not happen with a well configured time farm for hours to 
days, not minutes.
That was on an ntp client, not the ntp server. Am I misunderstanding you?
Agreed, see above.
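For anyone wanting to check how often a client falls back to LOCAL(0), a small sketch along these lines could summarize the ntpd messages. The function names are hypothetical, and it assumes the syslog line format shown in the log output earlier in the thread.

```shell
# Count how many times ntpd selected each source, most frequent first.
# $1 is the syslog file to read; defaults to /var/log/messages.
ntpd_sync_summary() {
    log=${1:-/var/log/messages}
    grep 'ntpd\[[0-9]*\]: synchronized to' "$log" |
        sed 's/.*synchronized to \([^,]*\),.*/\1/' |
        sort | uniq -c | sort -rn
}

# Count fallbacks to the local clock driver.
count_local_fallbacks() {
    grep -c 'synchronized to LOCAL(0)' "${1:-/var/log/messages}"
}

# Count "frequency error ... exceeds tolerance" messages.
count_freq_errors() {
    grep -c 'frequency error .* exceeds tolerance' "${1:-/var/log/messages}"
}

# Example use (as root, on the affected client):
#   ntpd_sync_summary /var/log/messages
#   count_local_fallbacks /var/log/messages
```

A high LOCAL(0) count relative to the real servers would point at the client losing its upstream sources, which is the symptom being discussed here.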
...
the second problem is that a machine which is not intended to 
be a time server is configured with a local clock with a 
stratum better than 15.
I don't understand, I will have to read up more.
...
suggestion 1: 65 should have local clock at stratum 13, 66 
and 67 should have local clock at stratum 
14 or 15, all other machines should not have a local clock or 
should not have one with a stratum better than 15. Yes I, 
after reading the ntp documentation, disagree with RedHat's default.
They are presently one machine.
Ok.
...
net result should be that you don't get any local clock loops 
in the setup because you have a defined leader, but if even 
the defined leader is lost the other machines should do a 
stable drift.
suggestion 2: 65, 66 & 67 should ALL peer with one another 
for added stability in the time farm.
suggestion 3: client machines should 'prefer' one of your 
servers over the others.
suggestion 4: see if someone has been messing with the kernel 
ticks on the machine...
run `tickadj` (see file:///usr/share/doc/ntp-4.2.2p1/tickadj.html)
[root@devserver21 ~]# tickadj
tick = 10000
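As a rough guide to what that number means: `tick` is the number of microseconds added to the system clock per timer tick, so with 100 ticks per second the nominal value is 10000, as reported above. The sketch below converts a tick value into a rate offset; the function names are made up for illustration, and the 100 ticks per second figure is an assumption that should be checked for the kernel in question.

```shell
# Rate offset in PPM for a given tick value.
# $1 = tick in microseconds, $2 = nominal tick (defaults to 10000).
tick_ppm_offset() {
    tick=$1
    nominal=${2:-10000}
    # (tick - nominal) / nominal, scaled to parts per million
    echo $(( (tick - nominal) * 1000000 / nominal ))
}

# Convert a PPM rate offset into milliseconds gained or lost per day.
ppm_to_ms_per_day() {
    echo $(( $1 * 86400 / 1000 ))
}

echo "tick 10000: $(tick_ppm_offset 10000) PPM"
echo "tick 10001: $(tick_ppm_offset 10001) PPM"
echo "tick  9999: $(tick_ppm_offset 9999) PPM"
echo "100 PPM is $(ppm_to_ms_per_day 100) ms/day"
```

So a single step of the tick value shifts the clock rate by 100 PPM (about 8.6 s per day), which is coarse next to the 500 PPM tolerance ntpd complains about in the log.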
...
I had one computer where I needed to tweak the default value 
up or down one (I don't remember) to have it be real stable; 
this should be a last resort.
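Applied to the client ntp.conf posted earlier in the thread, suggestions 1 and 3 might look something like the fragment below. This is only a sketch: which address gets `prefer` is an arbitrary choice here, and since the three addresses are reportedly one machine, listing all three adds no real redundancy.

```
restrict default nomodify notrap noquery
restrict 127.0.0.1
# suggestion 3: prefer one server over the others (choice of .65 is arbitrary)
server 192.168.1.65 prefer
server 192.168.1.66
server 192.168.1.67
# suggestion 1: no local clock on a pure client, so these two lines are dropped:
#   server  127.127.1.0     # local clock
#   fudge   127.127.1.0 stratum 10
driftfile /var/lib/ntp/drift
broadcastdelay  0.008
keys            /etc/ntp/keys
```

After restarting ntpd, `ntpq -p` should list the configured servers and mark the currently selected one with `*`.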
--
Todd Denniston
Crane Division, Naval Surface Warfare Center (NSWC Crane) 
Harnessing the Power of Technology for the Warfighter 
_______________________________________________
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos
--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-                                                               -
- Jason Pyeron                      PD Inc. http://www.pdinc.us -
- Principal Consultant              10 West 24th Street #100    -
- +1 (443) 269-1555 x333            Baltimore, Maryland 21218   -
-                                                               -
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
This message is copyright PD Inc, subject to license 20080407P00.
