[c-nsp] IS-IS LSP Generation/Expiry + Database Optimization - Issue
Mark Tinka
mtinka at globaltransit.net
Sun Feb 22 10:30:42 EST 2009
On Sunday 22 February 2009 06:41:35 pm Oliver Boehmer
(oboehmer) wrote:
Appreciate the feedback, Oli.
Comments inline.
> I've "worked" with the increased lifetime/refresh
> intervals in several large networks for the last 8 years,
> and I've not seen an issue with it. Do you have any
> indication that the problem you've been experiencing is
> caused by "corrupt" LSPs?
Admittedly, we haven't sat down to really analyze and debug
the flow of LSPs (or lack thereof), as each time it
happens, we can't afford this luxury; the router has to be
online in the shortest time possible (and I can't replicate
this exactly in the lab as we don't have enough of the exact
spare kit to do so at the moment).
That said, we only see the issue on recovering routers. We
do not see it on new routers that are being connected to the
network for the first time (i.e., they didn't have pre-
existing LSPs in the DIS's link state database), which
makes sense.
One would imagine that a recovering router is tantamount to
hard resetting the IS-IS process, thereby flooding fresh
copies of the LSPs to the DIS, but this seems NOT to be the
case. A manual hard reset is still required to update the
local link state database.
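
For reference, the behaviour one would expect here can be
sketched as follows. This is a minimal illustration of the
ISO 10589 rules for comparing LSPs and for a router that
receives a stale copy of its own LSP; it is not a claim
about what the code on these routers actually does. The
names Lsp, is_newer and on_own_lsp_received are hypothetical
and exist only for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Lsp:
    lsp_id: str              # e.g. "R1.00-00" (hypothetical ID)
    seq: int                 # 32-bit sequence number
    remaining_lifetime: int  # seconds; 0 means purged/expired


def is_newer(a: Lsp, b: Lsp) -> bool:
    """Return True if LSP a should replace LSP b in the database.

    Higher sequence number wins; on a tie, a copy with zero
    remaining lifetime (a purge) beats one that is still alive.
    """
    if a.seq != b.seq:
        return a.seq > b.seq
    return a.remaining_lifetime == 0 and b.remaining_lifetime != 0


def on_own_lsp_received(local: Lsp, received: Lsp) -> Lsp:
    """What a restarted router should do with a copy of its own LSP.

    If a neighbour still holds a pre-restart copy with a higher
    sequence number, the router must reissue its LSP with a
    sequence number above the stale one so the fresh contents win.
    """
    if is_newer(received, local):
        return Lsp(local.lsp_id, received.seq + 1, local.remaining_lifetime)
    return local


# A router restarts and originates sequence number 1, while the
# network still holds its pre-restart LSP at 0x2A7.
fresh = Lsp("R1.00-00", 1, 1200)
stale = Lsp("R1.00-00", 0x2A7, 900)
reissued = on_own_lsp_received(fresh, stale)
print(hex(reissued.seq))
```

If the restarted router never gets past this step, the
other routers keep preferring the stale copy, which would
be consistent with prefixes going missing until a manual
reset. That is only a hypothesis until the LSP flow is
actually debugged.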
> It is strange that you only
> seem to see the problem on some routers, and not on
> others, which makes a "corrupt" LSP advertised by the
> restarting router a bit unlikely..
We've only seen the issue on recovering routers that were
previously part of the IS-IS domain. As mentioned, routers
that are new to the domain come up fine the first time.
Whether it is a v4 address or a v6 address that goes missing
from the network is not consistent (it's random). What has
been consistent so far is that a recovering router will have
a problem establishing all 4 iBGP sessions to the route
reflectors (2x for v4 + 2x for v6).
Suffice it to say, all IS's and DIS's are running the same
code. When we see the issue, it's almost always that only
75% of the iBGP sessions have formed - either one v4 session
or one v6 session is down, due to lack of reachability
information for it in IS-IS.
> I would still recommend the higher lifetime values,
> however the original reason (reducing the "chatter") is
> certainly much less important these days with high-speed
> CPU and links, so I'm not passionate about it..
Clearly, even though we did reduce the lifetime and refresh
timers, we would still need to wait "that long" before the
link state database is cleaned out. And since we need the
restarting router to be firing on all cylinders when it
returns to the network, it doesn't matter whether the
database will be refreshed in 18 minutes or 18 hours - we
need uptime the moment the router is able to start
processing frames/packets.
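
To put the "18 minutes or 18 hours" range in numbers, here
is a small sketch. The 18-hour end corresponds to the
largest value the 16-bit Remaining Lifetime field can carry
(65535 seconds); the 1080-second figure is simply 18 minutes
expressed in seconds and is an assumption for illustration,
not a value taken from our configuration. The helper names
are hypothetical.

```python
# Remaining Lifetime is a 16-bit field, so 65535 s is the ceiling.
MAX_LIFETIME_FIELD_S = 0xFFFF


def to_minutes(seconds: int) -> float:
    """Convert an LSP lifetime in seconds to minutes."""
    return seconds / 60


def to_hours(seconds: int) -> float:
    """Convert an LSP lifetime in seconds to hours."""
    return seconds / 3600


# Illustrative lifetimes only (assumed, not from a real config).
for label, lifetime_s in (
    ("short lifetime", 18 * 60),
    ("field maximum", MAX_LIFETIME_FIELD_S),
):
    print(f"{label}: {lifetime_s} s = "
          f"{to_minutes(lifetime_s):.0f} min = "
          f"{to_hours(lifetime_s):.1f} h")
```

Either way, the wait is far longer than a recovering router
can afford, which is why the timers themselves are not the
fix.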
So in that respect, keeping these values at "wherever"
they need to be to scale IS-IS is fine. We just need to
figure out why the recovering router does not "properly"
signal the DIS to refresh its link state database upon a
successful initialization of the IS-IS process.
I will say that we have the 'ignore-lsp-errors' feature
enabled. Given its purpose, could that have an adverse
effect on a recovering router's capability to effectively
get its new LSPs out to the DIS?
Cheers,
Mark.