Jason Pyeron wrote, On 08/12/2010 09:27 AM:
-----Original Message-----
From: Todd Denniston
Sent: Thursday, August 12, 2010 9:07

Jason Pyeron wrote, On 08/12/2010 08:01 AM:
<SNIP>
Assumption: the time servers that you are following (192.168.1.6[57]) are: a) each following the same timeserver(s), or at least have one in common.
192.168.1.6[567] are one machine.
I am not sure how much trouble that fact alone is going to give you.
It at least explains why you constantly see the following repeating in your log:

00:00:01 devserver21 ntpd[3475]: synchronized to 192.168.1.65, stratum 3
00:00:08 devserver21 ntpd[3475]: synchronized to 192.168.1.66, stratum 3
00:00:14 devserver21 ntpd[3475]: synchronized to 192.168.1.67, stratum 3

As each one of the VIPs "becomes better", ntp switches to it instead of settling on one and then stabilizing the devserver21 system clock.
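A quick way to see how often the client hops between the three addresses is to tally the "synchronized to" lines per peer. This is only a sketch: the function name count_sync_targets is made up for illustration, and it assumes the log lines look like the ones quoted above.

```shell
# Hypothetical helper: reads ntpd log lines on stdin and prints
# "<count> <peer>" for every peer named in a "synchronized to" line.
count_sync_targets() {
    sed -n 's/.*synchronized to \([^,]*\),.*/\1/p' |
        sort | uniq -c | awk '{print $1, $2}'
}
```

Something like `grep ntpd /var/log/messages | count_sync_targets` would feed it, where the log path is an assumption about your syslog setup. Roughly equal counts for .65, .66 and .67 would fit the idea that ntp never settles on one of them.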
Time on that one is/has been good. Other machines in the enterprise follow it accurately.
Yes and no... I suspect they would all do a better job of following it if you picked only one of its IPs for them to use. By querying the same host via different IPs, I think you are messing up the integration/differentiation routines ntp tries to use.
one problem that you have is that your timeserver farm (192.168.1.6[57]) is occasionally losing its servers, i.e. we see "synchronized to LOCAL(0)" occasionally, which should
That was on an ntp client, not the ntp server. Am I misunderstanding you?
Because the *client* was going back to "synchronized to LOCAL(0)", we know that either the *server* is losing its servers and thus refuses to answer time requests, or:
a) the network between *this* client (devserver21) and the server (192.168.1.6[567]) is unreliable: hardware, cables, network stacks, local RF generators...
b) the triplet of IPs referring to one machine confuses the ntp client.
On the client, try:

for i in 65 66 67; do
    echo "data for $i"
    # double quotes so the shell expands $i inside the ntpdc command
    /usr/sbin/ntpdc -c "showpeer 192.168.1.$i" | grep -e reach -e stratum
done

and see what the reach, unreach and stratum are, especially during one of the 1 to 5 minute periods devserver21 is using local clock.
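To make the reach value easier to read: as far as I know it is printed in octal and is an 8-bit shift register with one bit per poll, so 377 means the last eight polls were all answered. A small sketch that turns it into a count (reach_successes is a made-up name, and the octal assumption is mine, so check it against your ntpdc output):

```shell
# Hypothetical helper: given an octal reach value such as 377,
# print how many of the last 8 polls were answered.
reach_successes() {
    v=$((0$1))      # leading 0 makes the shell parse the value as octal
    n=0
    while [ "$v" -gt 0 ]; do
        n=$((n + (v & 1)))   # count the lowest bit
        v=$((v >> 1))        # move on to the next older poll
    done
    echo "$n"
}
```

For example, `reach_successes 17` prints 4. A value that drops well below 8 during the periods when devserver21 falls back to its local clock would point at lost polls rather than a bad server clock.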
<SNIP>
the second problem is that a machine which is not intended to be a time server is configured with a local clock with a stratum better than 15.
I don't understand, I will have to read up more.
Short way to say this: the machine you are asking for help on (devserver21) is intended to be ONLY an ntp client, and it should never offer time to other machines while it is running on its local clock. The way to make that happen is to push the fudged stratum to 15.
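In ntp.conf on the client that would look roughly like the fragment below. It is a sketch, not your actual file: it assumes the stock local clock driver address 127.127.1.0 and a single server line, and any other lines you already have stay as they are.

```
# devserver21 (client only): follow ONE address of the time server
server 192.168.1.65

# local clock as a last resort, fudged to stratum 15 so that this
# machine would advertise stratum 16, which other hosts will not follow
server 127.127.1.0
fudge  127.127.1.0 stratum 15
```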
suggestion 1: 65 should have local clock at stratum 13, 66 and 67 should have local clock at stratum
They are presently one machine.
Then that one ntp *server* machine (192.168.1.65) should be configured to have a local clock at stratum 13, for when it cannot reach an external clock but you still want all internal machines synced fairly closely.
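On the server, the matching ntp.conf fragment would be something like this (again a sketch assuming the stock local clock driver address 127.127.1.0; the existing lines for the external servers are not shown and stay unchanged):

```
# 192.168.1.65 (the one real ntp server)
# ... existing external server lines stay as they are ...

# local clock fallback at stratum 13, used only when no external
# clock is reachable, so internal machines still stay close together
server 127.127.1.0
fudge  127.127.1.0 stratum 13
```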
<SNIP>
In another email I thought you tried to indicate that your client machines refused to pick up time if you only had one ntp server on the network. That _should_ not be true, unless the server:
A) had not yet gotten its own clock disciplined to an external clock, which can take 10 to 15 minutes the first time, and 8 to 10 minutes after the drift file has been built, if you are not using the iburst keyword. I.e., on the server, `/usr/sbin/ntpdc -c kerninfo |grep ^status` needs to show "0001 pll", or
B) has no external clock available at the time of the test, and local clock is not defined on the ntp server (192.168.1.65) (or not at a low enough stratum), and it will still take 8 to 10 minutes (of time connected to an external or local clock) from ntp startup before the server provides time.
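If you want to script the check from A), something along these lines should do. The function name pll_engaged is made up, and the exact spacing of the status line is an assumption, so the pattern allows any whitespace between "0001" and "pll".

```shell
# Hypothetical helper: reads `ntpdc -c kerninfo` output on stdin and
# succeeds only if the status line reports "0001 pll".
pll_engaged() {
    grep '^status' | grep -Eq '0001[[:space:]]+pll'
}
```

On the server you could then run `/usr/sbin/ntpdc -c kerninfo | pll_engaged && echo "clock disciplined"` a few times after startup and see how long it takes before the message appears.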
Sorry for the embedded within embedded notes. :]