[c-nsp] MPLS "Tag Control" process - what does this do?
Rodney Dunn
rodunn at cisco.com
Fri Aug 3 08:33:08 EDT 2007
> >
> >'sh ip route summ'.
>
> 370flinders-r200.32-pe01#show ip route summary
> IP routing table name is Default-IP-Routing-Table(0)
> Route Source    Networks    Subnets     Overhead    Memory (bytes)
> connected       0           58          4228        8816
> static          1           79          5808        14200
> eigrp 9176      0           0           0           0
> bgp 9176        42568       8972        6430852     7838160
>   External: 51540 Internal: 0 Local: 0
> ospf 9176       24          573         38336       94824
>   Intra-area: 8 Inter-area: 5 External-1: 583 External-2: 1
>   NSSA External-1: 0 NSSA External-2: 0
> internal        92                                  107824
> Total           42685       9682        6479224     8063824
> 370flinders-r200.32-pe01#
Ok...not that many IGP routes that labels would have to be allocated
for.
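As a rough check of that count, the per-protocol rows from the output
above can be summed with a short script. This is only a sketch: the
function names are made up for illustration, and it assumes that every
non-BGP prefix is a candidate for a label, which may not match the label
allocation filters configured on this box.

```python
# Per-protocol rows copied from the 'show ip route summary' output above.
SUMMARY = """\
connected 0 58 4228 8816
static 1 79 5808 14200
eigrp 9176 0 0 0 0
bgp 9176 42568 8972 6430852 7838160
ospf 9176 24 573 38336 94824
"""


def prefixes_by_source(text):
    """Return {route source: networks + subnets} for each per-protocol row."""
    counts = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5 or fields[0] == "Total":
            continue  # header, detail, 'internal' and 'Total' rows
        try:
            # The last four columns are networks, subnets, overhead, memory.
            networks, subnets = int(fields[-4]), int(fields[-3])
        except ValueError:
            continue  # not a numeric row (e.g. 'External: ... Internal: ...')
        counts[fields[0]] = networks + subnets
    return counts


def non_bgp_prefixes(counts):
    """Sum the prefixes from every source except BGP (hypothetical helper)."""
    return sum(n for source, n in counts.items() if source != "bgp")


counts = prefixes_by_source(SUMMARY)
print(counts["bgp"], non_bgp_prefixes(counts))
```

With the numbers above this gives 51540 BGP prefixes against 735 from
everything else, so the label space itself is small.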
Now, if you learned the full BGP feed over a MPLS path you would
have to recurse all those BGP prefixes to the IGP next hop which
would put it in the MPLS path. That recursion can sometimes drive
some of the CEF and Tag Control processes up.
But if the BGP feed is over an IP path it would not impact the MPLS side.
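To illustrate what I mean by recursion, here is a minimal sketch. It is
not how CEF is implemented; the table layout, the addresses, the
interface name and the label value are all made up for the example. The
point is only that each BGP prefix has to be walked down to an IGP route
that carries an outgoing interface and label, and that walk is repeated
for every prefix whenever the next hop changes.

```python
import ipaddress


def longest_match(table, addr):
    """Return the most specific prefix in table that covers addr, or None."""
    ip = ipaddress.ip_address(addr)
    best = None
    for prefix in table:
        net = ipaddress.ip_network(prefix)
        if ip in net and (best is None or net.prefixlen > best.prefixlen):
            best = net
    return str(best) if best is not None else None


def resolve(table, prefix, max_depth=8):
    """Follow next hops until a route with an outgoing interface is found."""
    entry = table[prefix]
    for _ in range(max_depth):
        if "interface" in entry:
            return entry  # resolved: directly usable for forwarding
        covering = longest_match(table, entry["next_hop"])
        if covering is None:
            return None  # next hop is unreachable
        entry = table[covering]
    return None  # gave up: too many levels of recursion


# Hypothetical table: one BGP prefix whose next hop is reached via a
# labelled IGP host route.
table = {
    "203.0.113.0/24": {"next_hop": "10.0.0.2"},
    "10.0.0.2/32": {"interface": "ATM1/0.1", "label": 17},
}
print(resolve(table, "203.0.113.0/24"))
```

If the IGP route for 10.0.0.2 flaps, every BGP prefix pointing at it has
to be resolved again, which is where the CPU goes.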
>
> There's a full BGP internet feed coming in but it's filtered to the 42000
> routes as seen above (I don't know why, it's an ISP network we have
> recently taken over management of so there are still some unanswered
> questions). The 7200 has 1G of DRAM.
Ok..
>
> LDP. There are two routers attached to the far end peer which are running
> TDP still but these are not directly connected to this router I sent the
> logs from.
>
> I am intending to change these over to LDP soon so that the whole network
> is just using LDP throughout and not a mixture of the two.
>
> >What is the peer and code?
>
> It is also a 7200 NPE-G1 with 12.3(19). There are three core routers - all
> the same - linked via ATM in a triangle/meshed MPLS topology. The other
> two seemed to be OK.
hmm...I wonder if they changed it in LDP and not TDP to make sure a withdraw
is sent before a new advertisement. But the above message seems to imply
you got a label and the route was missing.
Sounds like the routes may possibly have been flapping.
'sh ip route' would tell you how old they were.
>
> >I looked around a bit to try and understand that message.
> >Seems it has to do with getting a label for a prefix we don't have.
>
> One of the support guys said something about the router dropping its CEF
> table. Unfortunately I didn't check that at the time, my concern was more
> on why so much CPU was being burnt up on that one process.
>
>
> >Problem Description:
> >====================
> >Problem: LDP does not withdraw label before announcing a new label for
> > the same FEC.
> >
> >Solution:
> >To fix this problem a new config command is introduced:
> > [no] mpls ldp neighbor A.B.C.D implicit-withdraw-label
> >
> >default Behavior:
> >It will follow the LDP standard i.e. LDP will withdraw previously
> >advertised label before advertising a new label for a FEC. When
> >"mpls ldp neighbor A.B.C.D implicit-withdraw-label" is configured
> >LDP will not withdraw the previous label before advertising a new
> >label.
> >
> >default behavior is changed. Now when there is a need to change
> >label for a FEC:
> >1. LDP will send a Label withdraw and then after receiving Label
> >Release, it will advertise the new binding with a Label Mapping. If
> >after sending Label Withdraw, no Label Release is received from a peer(s)
> >and sufficient time (Currently set to 5 minutes) has passed, then LDP
> >will assume that peer(s) is not capable of sending a label release and it
> >will send Label Mapping to the peer.
> >
> >2. LDP maintains a list of previous labels for which a Label Release is
> >awaited from any peer.
> >A new Label Mapping for a FEC is not announced to a peer if a Label
> >release for the same FEC is pending from the peer.
> >
> >Those were the changes that went in under:
> >
> >CSCdv74248
> >Externally found enhancement defect: Resolved (R)
> >LDP session drop after receiving a new label for the same FEC
> >
> >that you would have in 12.3(19). But what about the peering router?
>
> See above.
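For reference, the withdraw-then-advertise behavior described in that
bug note can be sketched as below. This is my reading of the quoted
text, not the IOS code: the class and method names are invented, it
models a single peer only, and the 300 second value is simply the
"5 minutes" mentioned in the note.

```python
RELEASE_TIMEOUT = 300  # seconds; the note says "currently set to 5 minutes"


class LabelAdvertiser:
    """Toy model of label changes toward one LDP peer (hypothetical names)."""

    def __init__(self, implicit_withdraw=False):
        self.implicit_withdraw = implicit_withdraw
        self.advertised = {}  # fec -> label currently advertised
        self.pending = {}     # fec -> (new label, time the withdraw was sent)
        self.sent = []        # log of (message, fec, label)

    def change_label(self, fec, new_label, now):
        if fec in self.pending:
            # A release is still awaited: hold the mapping, keep the timer.
            self.pending[fec] = (new_label, self.pending[fec][1])
            return
        old = self.advertised.get(fec)
        if old is None or self.implicit_withdraw:
            self._send_mapping(fec, new_label)
            return
        # Default behavior: withdraw first, advertise only after the release.
        self.sent.append(("withdraw", fec, old))
        del self.advertised[fec]
        self.pending[fec] = (new_label, now)

    def on_release(self, fec):
        if fec in self.pending:
            new_label, _ = self.pending.pop(fec)
            self._send_mapping(fec, new_label)

    def tick(self, now):
        """Advertise anyway once the peer has been silent for the timeout."""
        for fec in list(self.pending):
            new_label, sent_at = self.pending[fec]
            if now - sent_at >= RELEASE_TIMEOUT:
                del self.pending[fec]
                self._send_mapping(fec, new_label)

    def _send_mapping(self, fec, label):
        self.advertised[fec] = label
        self.sent.append(("mapping", fec, label))
```

A peer that never sends a Label Release (which may be the case for the
routers still on TDP) would hold each new mapping back until the timer
fires, unless implicit withdraw is configured for that neighbor.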
>
> >Rodney
>
> I've also just noticed that the 3550 behind this router, and another one
> connected to that switch via an Ethernet link, are both pretty sick,
> and both nearly ran out of memory today. Both have logged a lot of
> messages about "Aug 3 11:51:39: %FIB-2-FIBDOWN: CEF has been disabled due
> to a low memory condition. It can be re-enabled by configuring "ip cef
> [distributed]"
Oops...that means you probably are low or have a leak. And during the
convergence event you used enough transient memory to tip it over the
edge. That's a problem you need to get fixed.
>
> 'show ip cef' on these switches shows that CEF is now not running. My
> thinking is that this is a rather big problem in itself (!) and will try
> to get these switches reloaded later tonight.
Yeah..after you reload look at the free memory. If it's low right after
a reload you don't have enough memory. If it decreases over time you have
a leak that needs to be debugged.
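If you collect the free memory figure by hand or by polling, that rule
of thumb can be written down as follows. This is a sketch only: the
function name and both thresholds are arbitrary values I picked for the
example, not Cisco guidance, and you would tune them for the platform.

```python
def classify_free_memory(samples, total_bytes,
                         low_fraction=0.10, leak_fraction=0.05):
    """Classify free-memory readings taken over time.

    samples: free bytes in time order, the first taken right after reload.
    low_fraction / leak_fraction: illustrative thresholds (assumptions).
    """
    if not samples:
        raise ValueError("need at least one sample")
    first, last = samples[0], samples[-1]
    if first < low_fraction * total_bytes:
        return "low after reload"  # not enough memory for this role
    if first - last > leak_fraction * first:
        return "possible leak"     # free memory keeps shrinking
    return "stable"


MB = 1024 * 1024
print(classify_free_memory([60 * MB, 55 * MB, 48 * MB], 128 * MB))
```

Take the samples at similar times of day so that normal route churn does
not look like a leak.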
>
> The reason I am bringing this up is that I'm considering if the problem may
> not have been directly caused by this 7200, but may be caused by some other
> external factor. The only thing which strikes me as a possibility is that
> someone or something flooded/redistributed an entire BGP feed into OSPF.
> Does that sound like a possibility?
Yep. Been there seen that more than once. :)
Without snapshots of the routing table it's hard to say.
If you didn't reboot this 72xx, what does the lowest show in 'sh mem stat'?
That would show you if you ran it really low at some point.
>
> Still doesn't answer quite why the 7200 was chewing so much cpu though
> <scratches head>.
Got a ton of routes or the routes were churning a lot. Or you got a slew
of label advertisements from the tdp/ldp peers most likely.
Rodney
>
> Reuben