[c-nsp] Reasons for "random" ISIS flapping?

Saku Ytti saku at ytti.fi
Wed Aug 7 04:47:27 EDT 2013


On (2013-08-07 09:40 +0200), Peter Rathlev wrote:

>    1) The "CLNS-5-ADJCHANGE" states a reason at the end of the log
>       message. This reason seems to be "hold time expired" for the
>       device at the center of the event and "neighbor forgot us"
>       for all the neighbors. What's the difference between these
>       two? The system message documentation[0] isn't really
>       helpful, but maybe I'm looking in the wrong place.

'Hold time expired' is, I guess, the obvious one: the local timer you keep
for your neighbour ran out, so you declared that neighbour dead.

'Neighbour forgot us' is a consequence of the above. Consider A-B-C-A. A
experiences 'hold time expired' towards B; it will then reflood to C an LSP
in which it no longer reports having connectivity to B.
Once B gets this LSP from C, B notices that it still has a session to A,
but A claims it has no session to B, so B complies with A's view of the
situation and tears down the session to A.
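
The sequence above can be sketched as a toy model. This is only an
illustration of the ordering of the two log reasons as described here, not
the real IS-IS adjacency state machine: the Router class, the 30 second hold
time and the learn_neighbour_view() method (which collapses "B receives A's
LSP via C" into one call) are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Router:
    name: str
    hold_time: float = 30.0  # seconds; assumed value for the sketch
    last_hello: dict = field(default_factory=dict)  # neighbour -> time of last hello
    adjacencies: set = field(default_factory=set)

    def receive_hello(self, neighbour, now):
        # A hello from a neighbour refreshes its hold timer.
        self.last_hello[neighbour] = now
        self.adjacencies.add(neighbour)

    def expire(self, now):
        # Drop every neighbour whose hold timer has run out.
        events = [(n, "hold time expired")
                  for n in sorted(self.adjacencies)
                  if now - self.last_hello[n] > self.hold_time]
        for n, _ in events:
            self.adjacencies.discard(n)
        return events

    def learn_neighbour_view(self, neighbour, neighbours_adjacencies):
        # If the neighbour no longer reports us, follow its view.
        if neighbour in self.adjacencies and self.name not in neighbours_adjacencies:
            self.adjacencies.discard(neighbour)
            return [(neighbour, "neighbor forgot us")]
        return []


a, b = Router("A"), Router("B")
a.receive_hello("B", now=0.0)
a.receive_hello("C", now=0.0)
b.receive_hello("A", now=0.0)
b.receive_hello("C", now=0.0)

# C's hellos keep reaching A, B's hellos do not; B still hears A fine.
a.receive_hello("C", now=25.0)
b.receive_hello("A", now=25.0)

events_a = a.expire(now=31.0)
events_b = b.learn_neighbour_view("A", a.adjacencies)
print(events_a)
print(events_b)
```

The point of the sketch is that only A's timer actually expires; B never
missed a hello, it just reacts to what A now advertises.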

>    2) The column "duration" from "show isis spf-log" is
>       milliseconds, right? Not seconds? This column normally shows
>       0 for PERIODIC events and maybe 4 or 8 for any event on
>       other devices. On the affected device this show 20 for the
>       "DELADJ TLVCONTENT" event. Is that bad enough to warrant
>       further investigation?

It's milliseconds, and those are all very small values.


So mostly we need to worry about why A is not receiving ISIS packets from
its neighbours:
 a) are you seeing input drops in the hold-queue? (try 1k or even 4k hold-queue input)
 b) is it busy running some other process? (try process-max-time 60)
 c) is it a software defect?
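
For point a), a rough way to check many interfaces is to pull the drop
counter out of saved "show interfaces" output. The sketch below assumes the
usual IOS "Input queue: size/max/drops/flushes" line; the SAMPLE text is an
illustrative line written for this example, not output from the affected
device, and input_queue_stats() is a made-up helper name.

```python
import re

# Illustrative line in the style of IOS "show interfaces" output (assumed, not real data).
SAMPLE = "  Input queue: 0/75/1432/0 (size/max/drops/flushes); Total output drops: 0"

INPUT_QUEUE = re.compile(
    r"Input queue:\s*(\d+)/(\d+)/(\d+)/(\d+)\s*\(size/max/drops/flushes\)"
)


def input_queue_stats(text):
    """Return the input queue counters as a dict, or None if the line is absent."""
    match = INPUT_QUEUE.search(text)
    if match is None:
        return None
    size, maximum, drops, flushes = map(int, match.groups())
    return {"size": size, "max": maximum, "drops": drops, "flushes": flushes}


stats = input_queue_stats(SAMPLE)
print(stats)
```

A non-zero and growing "drops" value on the interfaces facing the flapping
neighbours would point towards the hold-queue rather than CPU or a defect.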


I also couldn't help noticing you're running L1. Why is this? It seems to
be quite rare these days. Do you really have a separate L2 core and various
L1 islands?

-- 
  ++ytti

