[c-nsp] IS-IS LSP Generation/Expiry + Database Optimization - Issue

Mark Tinka mtinka at globaltransit.net
Sun Feb 22 10:30:42 EST 2009


On Sunday 22 February 2009 06:41:35 pm Oliver Boehmer 
(oboehmer) wrote:

Appreciate the feedback, Oli.

Comments inline.

> I've "worked" with the increased lifetime/refresh
> intervals in several large networks for the last 8 years,
> and I've not seen an issue with it. Do you have any
> indication that the problem you've been experiencing is
> caused by "corrupt" LSPs?

Admittedly, we haven't sat down to really analyze and debug 
the flow of LSP's (or lack thereof), as each time it 
happens, we can't afford this luxury; the router has to be 
online in the shortest time possible (and I can't replicate 
this exactly in the lab as we don't have enough of the exact 
spare kit to do so at the moment).

That said, we only see the issue on recovering routers. We 
do not see it on new routers that are being connected to the 
network for the first time (i.e., they didn't have pre-
existing LSP's in the DIS's link state database), which 
makes sense.

One would imagine that a recovering router is tantamount to 
hard resetting the IS-IS process, thereby flooding fresh 
copies of the LSP's to the DIS, but this seems NOT to be the 
case. A manual hard reset is still required to update the 
local link state database.
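
For context, here is a minimal sketch of how a receiver decides whether an incoming LSP replaces its stored copy, assuming the simplified ISO 10589 rule that the higher sequence number wins. It shows one way a restarted router's fresh LSP could be treated as older than the stale copy already held by the DIS. This is only an illustration of the general mechanism, not a diagnosis of the problem described here. The names LspHeader, is_newer and next_own_sequence are hypothetical, and the checksum tie-break is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LspHeader:
    """Hypothetical, stripped-down view of an LSP header."""
    lsp_id: str
    sequence: int
    remaining_lifetime: int  # seconds


def is_newer(received: LspHeader, stored: Optional[LspHeader]) -> bool:
    """Return True if `received` should replace `stored`.

    Simplified rule: higher sequence number wins; on a tie, a copy with
    zero remaining lifetime (a purge) wins over a live copy.
    """
    if stored is None:
        return True
    if received.sequence != stored.sequence:
        return received.sequence > stored.sequence
    return received.remaining_lifetime == 0 and stored.remaining_lifetime != 0


def next_own_sequence(local_seq: int, seen_in_network: int) -> int:
    """Sequence number the originator should use once it has seen a copy
    of its own LSP in the network with sequence `seen_in_network`."""
    return max(local_seq, seen_in_network + 1)


# Stale copy left in the DIS's database from before the restart.
stale = LspHeader("r1.00-00", sequence=0x2A, remaining_lifetime=900)
# First LSP generated by the router after it comes back up.
fresh = LspHeader("r1.00-00", sequence=1, remaining_lifetime=1199)

print(is_newer(fresh, stale))                 # fresh LSP does not replace the stale one
print(hex(next_own_sequence(1, stale.sequence)))
```

Under this rule, the restarted router only displaces the stale copy once it learns the old sequence number back from a neighbour and re-originates with a higher one.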

> It is strange that you only
> seem to see the problem on some routers, and not on
> others, which makes a "corrupt" LSP advertised by the
> restarting router a bit unlikely..

We've only seen the issue on recovering routers that were 
previously part of the IS-IS domain. As mentioned, routers 
that are new to the domain come up fine the first time.

Which address goes missing from the network, v4 or v6, is 
not consistent (it appears random). What has been consistent 
so far is that every recovering router fails to establish 
all 4 iBGP sessions to the route reflectors (2x for v4 + 2x 
for v6).

Suffice it to say, all IS's and DIS's are running the same 
code. When we see the issue, it's almost always that only 
75% of the iBGP sessions have formed - either one v4 session 
or one v6 session is down, due to lack of reachability 
information for it in IS-IS.

> I would still recommend the higher lifetime values,
> however the original reason (reducing the "chatter") is
> certainly much less important these days with high-speed
> CPU and links, so I'm not passionate about it..

Clearly, even though we did reduce the lifetime and refresh 
timers, we would still need to wait "that long" before the 
link state database is cleaned out. And since we need the 
restarting router to be firing on all cylinders when it 
returns to the network, it doesn't matter whether the 
database will be refreshed in 18 minutes or 18 hours - we 
need uptime the moment the router is able to start 
processing frames/packets.
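
To put the two figures side by side, here is a small worked conversion. It assumes the "18 hours" case corresponds to the largest value that fits in the 16-bit Remaining Lifetime field (65535 seconds) and takes "18 minutes" as 1080 seconds. Both mappings are assumptions made for illustration, and format_seconds is a hypothetical helper.

```python
# Largest value of the 16-bit Remaining Lifetime field, in seconds.
MAX_REMAINING_LIFETIME = 0xFFFF

# Assumed stand-in for the "18 minutes" figure mentioned above.
SHORT_LIFETIME = 18 * 60


def format_seconds(total: int) -> str:
    """Render a number of seconds as hours, minutes and seconds."""
    hours, rem = divmod(total, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{hours}h {minutes:02d}m {seconds:02d}s"


for label, value in (("short", SHORT_LIFETIME), ("max", MAX_REMAINING_LIFETIME)):
    print(f"{label:>5}: {value:>5} s = {format_seconds(value)}")
```

Either way, a stale LSP that is never overwritten or purged stays in the database for up to the full lifetime, which is far longer than a recovering router can afford to wait.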

So in that respect, keeping these values wherever they need 
to be to scale IS-IS is fine. We just need to figure out why 
the recovering router does not "properly" signal the DIS to 
refresh its link state database upon a successful 
initialization of the IS-IS process.

I will say that we have the 'ignore-lsp-errors' feature 
enabled. Given its purpose, could that have an adverse 
effect on a recovering router's capability to effectively 
get its new LSP's out to the DIS?

Cheers,

Mark.