[c-nsp] NTP synchronization problems C2801

Tue Jun 29 18:37:18 EDT 2010

You need 3, preferably 4, NTP sources for clients to work correctly. With only 2, how does a client know which one is the better source of time? 3 gives you a quorum, but if 1 fails, you are back to 2. Four is the magic number.

Since you have two set up as stratum 1, set up two more boxes that use NTP sources on the Internet as stratum 2 devices. Then configure your Cisco boxes as clients of all four. That should make them more stable.
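
To make the quorum argument concrete, here is a rough Python sketch of a majority vote over offset intervals. It is a simplified illustration in the spirit of interval-intersection clock selection, not the actual selection code in IOS or ntpd. The function name `majority_interval`, the sample offsets, and the 1 ms error bounds are all made up for the example.

```python
def majority_interval(sources):
    """Return the (lo, hi) offset range that a strict majority of sources
    agree on, or None if no majority exists.

    sources: list of (offset_ms, error_ms) pairs; each source claims the
    true offset lies within offset_ms +/- error_ms.
    """
    events = []
    for offset, err in sources:
        events.append((offset - err, 0))  # interval opens (0 sorts before 1 on ties)
        events.append((offset + err, 1))  # interval closes
    events.sort()

    best = count = 0
    best_range = None
    for i, (point, kind) in enumerate(events):
        if kind == 0:
            count += 1
            if count > best:
                best = count
                # The overlap lasts until the next endpoint in sorted order.
                best_range = (point, events[i + 1][0])
        else:
            count -= 1

    return best_range if best * 2 > len(sources) else None


# Hypothetical sources: two or three that agree, one that is 150 ms off.
good_a, good_b, bad = (0.0, 1.0), (0.5, 1.0), (150.0, 1.0)

print(majority_interval([good_a, bad]))                      # two sources that disagree
print(majority_interval([good_a, good_b, bad]))              # three sources, one wrong
print(majority_interval([good_a, (0.2, 1.0), good_b, bad]))  # four sources, one wrong
```

With two sources that disagree the function returns None, because neither side has a majority. With three or four sources the bad one is outvoted. Take one good server away from the three-source case and you are back in the two-source situation, which is why four gives you room for a failure.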

-----Original Message-----
From: cisco-nsp-bounces at puck.nether.net [mailto:cisco-nsp-bounces at puck.nether.net] On Behalf Of Peter Rathlev
Sent: Tuesday, June 29, 2010 6:13 PM
To: cisco-nsp at puck.nether.net
Subject: [c-nsp] NTP synchronization problems C2801

I have two devices where one keeps itself synchronized to within 5-10
usec of the NTP servers but the other one varies _wildly_, sometimes
having an offset of ~150 ms. The NTP servers are two CentOS 5.4 servers,
themselves using two Meinberg M300 GPS devices as stratum 1 sources.

Working device: C2801 running 12.4(24)T3 Enterprise Base, 128 MB RAM.
Connected to a 3560E as an access device, Fa0/0 has an IP address and a
default route pointing at the gateway for that access VLAN. NTP update
source is this interface.

Non-working device: C2801 running 15.0(1)M2 Enterprise Service, 256 MB
RAM. Connected redundantly to two 6500/Sup720s on L3 interfaces. Running
IS-IS and MPLS. NTP update source is a Loopback interface.

None of the devices are CPU or traffic loaded in any serious way (30%
peaks about once per hour, otherwise <3%). Both devices are supposed to
be doing IP SLA collection.

NTP configuration is the same, with just the two servers configured and
an update source for one device. (I've cleared the auth config to see if
that was it, no change.)

The non-working device:

 non_working#sh ntp status
 Clock is synchronized, stratum 3, reference is 10.85.247.20  
 nominal freq is 250.0000 Hz, actual freq is 250.0181 Hz, precision is 2**24
 reference time is CFD496F6.08BA94AC (17:59:50.034 CEST Tue Jun 29 2010)
 clock offset is -27.2854 msec, root delay is 3.51 msec
 root dispersion is 50.75 msec, peer dispersion is 1.26 msec
 loopfilter state is 'CTRL' (Normal Controlled Loop), drift is -0.000072382 s/s
 system poll interval is 16, last update was 70 sec ago.

 non_working#sh ntp ass 

   address         ref clock       st   when   poll reach  delay  offset   disp
 *~10.85.247.20    10.83.8.130      2      8     16   377  1.778 -27.285  1.116
 +~10.83.247.20    10.83.8.130      2      0     16   377  0.877 -26.844  1.302
  * sys.peer, # selected, + candidate, - outlyer, x falseticker, ~ configured
 non_working#

And the working device:

 working#sh ntp status
 Clock is synchronized, stratum 3, reference is 10.83.247.20
 nominal freq is 250.0000 Hz, actual freq is 250.0274 Hz, precision is 2**24
 reference time is CFD49516.7C7CB895 (17:51:50.486 CEST Tue Jun 29 2010)
 clock offset is -0.0051 msec, root delay is 0.00 msec
 root dispersion is 0.04 msec, peer dispersion is 0.00 msec
 loopfilter state is 'CTRL' (Normal Controlled Loop), drift is -0.000109809 s/s
 system poll interval is 256, last update was 564 sec ago.
 working#sh ntp ass

   address         ref clock       st   when   poll reach  delay  offset   disp
 +~10.85.247.20    10.83.8.130      2    233    256   377  1.759  -5.567  8.367
 *~10.83.247.20    10.83.8.130      2    232    256   377  0.877  -5.194  7.547
  * sys.peer, # selected, + candidate, - outlyer, x falseticker, ~ configured
 working#
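
For comparing the two status outputs, the frequency and drift fields can be converted into more familiar units. The Python sketch below does only that arithmetic; the helper names are invented for the example and the numbers are copied from the outputs above. It says nothing about the sign convention IOS uses, only that the magnitudes of the two fields agree.

```python
SECONDS_PER_DAY = 86400


def freq_error_ppm(nominal_hz, actual_hz):
    """Relative frequency error in parts per million."""
    return (actual_hz - nominal_hz) / nominal_hz * 1e6


def drift_per_day(drift_s_per_s):
    """Seconds gained or lost per day if the drift went uncorrected."""
    return drift_s_per_s * SECONDS_PER_DAY


# (nominal freq, actual freq, drift) from the two "sh ntp status" outputs.
devices = {
    "non_working": (250.0000, 250.0181, -0.000072382),
    "working": (250.0000, 250.0274, -0.000109809),
}

for name, (nominal, actual, drift) in devices.items():
    print(name,
          round(freq_error_ppm(nominal, actual), 1), "ppm,",
          round(drift_per_day(drift), 2), "s/day")
```

This works out to roughly 72 ppm (about 6.3 s/day) for the non-working device and roughly 110 ppm (about 9.5 s/day) for the working one. Going by these numbers alone, the working device has the larger raw frequency error, so oscillator quality by itself does not seem to explain the difference.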

I tried many things, among them fixing the maxpoll interval, adjusting
scheduler allocation, disabling authentication, resetting the
configuration completely (with "no ntp <cr>") and resetting the clock to
when Prince was still young to force a "stepping". Still, the 15.0(1)M2
Ent Serv. device simply cannot keep precise time.

Reachability is perfect (377 all the time) and the device can even see
the root offset! It just cannot adjust correctly. I know it takes time
(why can't I have an explicit "step" command in IOS?) but the offset
varies and also gets worse over time.
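
For reference, the "reach" column is an 8-bit shift register printed in octal: each poll shifts it left by one and sets the lowest bit if a reply came back. A small Python sketch of the decoding follows; `decode_reach` is a made-up helper name, and "376" is a hypothetical value shown only for contrast.

```python
def decode_reach(reach_octal):
    """Return the last eight poll results, most recent first
    (True = reply received)."""
    register = int(reach_octal, 8) & 0xFF
    return [bool((register >> bit) & 1) for bit in range(8)]


print(decode_reach("377"))  # every one of the last eight polls answered
print(decode_reach("376"))  # most recent poll missed, the seven before it answered
```

So 377 means none of the last eight polls was lost, which matches the description of perfect reachability and points away from packet loss as the cause.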

Is it because of the extra MPLS encapsulation? The newer IOS version?
Another feature set? Or should I just give up? :-)

-- 
Peter
