[outages] Fwd: NTP Issues Today

Jeremy Chadwick jdc at koitsu.org
Mon Nov 19 23:01:41 EST 2012


It would help if folks could provide actual timestamps from their logs
showing when the time hit odd values (either the epoch, i.e. 1970, or
year 2000), along with entries from just before and after.  The more
granular the better.
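
For anyone pulling those timestamps out of logs, here is a minimal
sketch of one way to spot the jump.  It assumes the log timestamps have
already been parsed into datetime objects, in log order; the function
name, the one-day threshold, and the sample values are all made up for
illustration and are not data from this incident.

```python
from datetime import datetime, timedelta

def find_time_jumps(stamps, threshold=timedelta(days=1)):
    """Return (previous, current) pairs whose gap exceeds `threshold`.

    `stamps` is a list of datetime objects in the order they were logged.
    Both forward and backward jumps are reported.
    """
    jumps = []
    for prev, cur in zip(stamps, stamps[1:]):
        if abs(cur - prev) > threshold:
            jumps.append((prev, cur))
    return jumps

if __name__ == "__main__":
    # Hypothetical sample values, NOT real log data.
    sample = [
        datetime.fromisoformat("2012-11-19T14:29:58"),
        datetime.fromisoformat("2000-11-19T14:30:03"),
        datetime.fromisoformat("2012-11-19T14:31:10"),
    ]
    for prev, cur in find_time_jumps(sample):
        print(f"{prev.isoformat()} -> {cur.isoformat()}")
```

The pairs it reports are the "before and after" entries worth posting.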

This is making me wonder if some NTP implementation somewhere has a
protocol field-length limitation that was hit and rolled over.  Yes,
something akin to an epoch rollover, but not quite the same.

I've seen this kind of design oddity happen in the past, just not with
NTP.  It happened with FreeBSD's ZFS port, where performance would turn
to crap at about the 24 day mark due to something called "LBOLT":

http://lists.freebsd.org/pipermail/freebsd-fs/2011-May/011584.html

I never took the time to read the entire code and understand it -- these
internals are way too complex -- but I'm left with the impression it
pertains to high-resolution clock ticks or actual hertz (vs. per-second
granularity).
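
A roughly 24 day mark is at least consistent with a signed 32-bit
counter of millisecond ticks wrapping, since 2^31 ms is about 24.86
days.  A rough sketch of that arithmetic follows.  The 1000 Hz tick
rate and the signed 32-bit counter are assumptions here, not something
confirmed from the ZFS code, and the function names are made up for
illustration.

```python
INT32_MAX = 2**31 - 1
SECONDS_PER_DAY = 86_400

def days_until_wrap(hz: int) -> float:
    """Days before a signed 32-bit tick counter overflows at `hz` ticks/sec."""
    return (INT32_MAX + 1) / hz / SECONDS_PER_DAY

def wrap_int32(ticks: int) -> int:
    """Emulate two's-complement wraparound of a signed 32-bit counter."""
    return (ticks + 2**31) % 2**32 - 2**31

if __name__ == "__main__":
    for hz in (100, 1000):
        print(f"hz={hz}: wraps after {days_until_wrap(hz):.2f} days")
    # One tick past the maximum lands on the most negative value.
    print(wrap_int32(INT32_MAX + 1))
```

Code that compares such a counter against a deadline tends to misbehave
once the value goes negative, which is the general shape of this class
of bug.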

I don't want people getting spun up thinking NTP has some protocol
design issue or an issue similar to this -- I have no way to prove that
-- but I imagine NTP is designed similarly, i.e. with sub-second
granularity.  I know that much to be true because ntpd normally adjusts
clocks gradually in very small sub-second increments, not in steps of
1+ seconds.  The latter would be catastrophic to way too much software,
which is why you *do not* sync your clock by calling ntpdate from
crontab!!
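
To make the sub-second point concrete: as described in RFC 5905, an NTP
timestamp is 64 bits, with 32 bits of seconds since 1900-01-01 and 32
bits of binary fraction.  The sketch below only illustrates that
layout; the function names are made up, and nothing here is a claim
about what went wrong today.  Note that the seconds field does not wrap
until 2036, so that rollover by itself would not explain a jump to 2000.

```python
from datetime import datetime, timezone

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_UNIX_OFFSET = 2_208_988_800

def ntp_to_unix(seconds: int, fraction: int) -> float:
    """Convert an NTP timestamp (32-bit seconds, 32-bit fraction) to Unix time."""
    return (seconds - NTP_UNIX_OFFSET) + fraction / 2**32

def era_rollover_utc() -> datetime:
    """UTC instant at which the 32-bit seconds field wraps back to zero."""
    return datetime.fromtimestamp(2**32 - NTP_UNIX_OFFSET, tz=timezone.utc)

if __name__ == "__main__":
    # Smallest step the 32-bit fraction can represent, in seconds.
    print(f"fraction resolution: {2**-32:.3e} s")
    # A fraction of 2**31 is exactly half a second.
    print(ntp_to_unix(NTP_UNIX_OFFSET, 2**31))
    print(era_rollover_utc().isoformat())
```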

I don't see anything recent on the NTP mailing lists that indicates
there was a major incident of sorts:

http://lists.ntp.org/listinfo

-- 
| Jeremy Chadwick                                   jdc at koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Mountain View, CA, US                                            |
| Making life hard for others since 1977.             PGP 4BD6C0CB |

On Mon, Nov 19, 2012 at 10:38:17PM -0500, Josh Luthman wrote:
> Ntpd crashed for me early afternoon.  Log said it wanted to go back 12
> years.
> 
> Josh Luthman
> Office: 937-552-2340
> Direct: 937-552-2343
> 1100 Wayne St
> Suite 1337
> Troy, OH 45373
> On Nov 19, 2012 6:05 PM, "Jeremy Chadwick" <jdc at koitsu.org> wrote:
> 
> > Quite scary, for the following reason:
> >
> > $ ntpq -c peers tick.usno.navy.mil
> >      remote           refid      st t when poll reach   delay   offset  jitter
> > ==============================================================================
> > *REFCLK(45,0)    .IRIG.           0 l    9   16  377    0.000   -0.001   0.001
> > +REFCLK(45,1)    .IRIG.           0 l    8   16  377    0.000   -0.006   0.001
> > +10.1.4.40       .IRIG.           1 u   22   64  377    0.146    0.024   0.013
> > -10.1.4.51       0.0.0.0          2 u   48   64  377    1.516   -0.228   0.301
> >
> > $ ntpq -c peers tock.usno.navy.mil
> >      remote           refid      st t when poll reach   delay   offset  jitter
> > ==============================================================================
> > *REFCLK(45,0)    .IRIG.           0 l    3   16  377    0.000   -0.001   0.000
> > +REFCLK(45,1)    .IRIG.           0 l    2   16  377    0.000   -0.007   0.000
> > +10.1.4.40       .IRIG.           1 u    6   64  377    0.116    0.029   0.017
> > -10.1.4.51       0.0.0.0          2 u   65   64  377    1.467    0.080   0.182
> >
> > Note that both of their servers prefer to sync off of local hardware
> > time devices (specifically dedicated hardware with IRIG/AFNOR TCRs in
> > them).
> >
> > I'd love if one of their admins could explain what happened, for
> > educational benefit at bare minimum.
> >
> > --
> > | Jeremy Chadwick                                   jdc at koitsu.org |
> > | UNIX Systems Administrator                http://jdc.koitsu.org/ |
> > | Mountain View, CA, US                                            |
> > | Making life hard for others since 1977.             PGP 4BD6C0CB |
> >
> > On Mon, Nov 19, 2012 at 03:49:56PM -0800, George Herbert wrote:
> > > Apparently the US Navy was partying like it's 1999 earlier today...
> > >
> > > -george
> > >
> > >
> > > ---------- Forwarded message ----------
> > > From: Clay Haynes <chaynes at centracomm.net>
> > > Date: Mon, Nov 19, 2012 at 3:37 PM
> > > Subject: Re: NTP Issues Today
> > > To: "surfer at mauigateway.com" <surfer at mauigateway.com>,
> > > "nanog at nanog.org" <nanog at nanog.org>
> > >
> > >
> > > Scott,
> > > I can confirm this had happened on one of my test servers - it was
> > > pointing to tick.usno.navy.mil and tock.usno.navy.mil at the time.
> > >
> > >
> > > - Clay
> > >
> > >
> > >
> > > On 11/19/12 6:32 PM, "Scott Weeks" <surfer at mauigateway.com> wrote:
> > >
> > > >
> > > >
> > > >--- vanwolfe at gmail.com wrote:
> > > >From: Van Wolfe <vanwolfe at gmail.com>
> > > >
> > > >Did anyone else experience issues with NTP today?  We had our server
> > > > >times update to the year 2000 at around 3:30 MT, then revert back to 2012.
> > > >-----------------------------------------
> > > >
> > > >
> > > >You need to provide more information.  For example, what NTP
> > > >source are you using?
> > > >
> > > >scott
> > > >
> > >
> > >
> > >
> > >
> > > --
> > > -george william herbert
> > > george.herbert at gmail.com
> > > _______________________________________________
> > > Outages mailing list
> > > Outages at outages.org
> > > https://puck.nether.net/mailman/listinfo/outages