[c-nsp] Sporadic loss of LDP neighbor ...

Robert Raszuk robert at raszuk.net
Mon Dec 12 03:16:50 EST 2011


Garry,

Do you see the same with "mpls ldp targeted-sessions" enabled (even for
normal LDP p2p peers)? At least this is something I would try first ...
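
Roughly what I have in mind, as a sketch only. I'm assuming the usual
IOS syntax for targeted LDP here ("mpls ldp neighbor ... targeted ldp"
plus "mpls ldp discovery targeted-hello accept"), and older images that
still show tag-switching commands may spell it differently. The
192.0.2.x addresses are placeholders for the LDP router IDs of BB1 and
BB2, not taken from your configs:

```
! Sketch only: 192.0.2.1 and 192.0.2.2 are placeholder LDP router IDs
! (e.g. loopbacks) for BB1 and BB2; substitute your own.
!
! On BB2: request a targeted LDP session to BB1
mpls ldp neighbor 192.0.2.1 targeted ldp
!
! On BB1: either mirror the command toward BB2 ...
mpls ldp neighbor 192.0.2.2 targeted ldp
! ... or accept targeted hellos from any peer
mpls ldp discovery targeted-hello accept
```

The idea is that the session is then also kept alive by unicast
targeted hellos, so a short loss of the multicast link hellos on the
LAN should not by itself tear the neighbor down.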

Thx,
R.

> Hi *,
>
> I've been fighting this problem for quite a while, need some ideas from
> the collective intelligence ...
>
> One of our backbone locations has multiple routers that have worked fine
> for quite a while ... during the last couple of months, we've been
> experiencing some sporadic failures in the LAN which I've not been able
> to pin-point any logical reason for ...
>
> Basic setup is this ... currently, three 7200 routers (2x NPE300 VXR
> [BB1 & 2], 1x NPE150 [BB3] for a couple of L2TP wireless links). We've
> added an ASR1002F [Core1] to that as new primary router for the location
> about a year ago (running a 300M link to our core uplink, 1G dark fiber
> link to another backbone location). All of our backbone is running with
> MPLS enabled (multiple VRFs for MPLS-VPNs). Everything fine up until
> something like 2-3 months ago (don't have an exact date, otherwise it
> might be easier to get some correlations to other changes in the configs
> or infrastructure). Then it started with sporadic losses of the LAN
> interconnections, like this: (log excerpt from BB2)
>
> Dec 11 22:59:31: %LDP-5-NBRCHG: LDP Neighbor [BB1]:0 is DOWN (Received
> error notification from peer: Holddown time expired)
> Dec 11 22:59:52: %LDP-5-NBRCHG: LDP Neighbor [BB3]:0 is DOWN (Discovery
> Hello Hold Timer expired)
> Dec 11 23:00:00: %LDP-5-NBRCHG: LDP Neighbor [BB3] is UP
> Dec 11 23:00:27: %LDP-5-NBRCHG: LDP Neighbor [BB1]:0 is UP
>
> These interruptions (at least the timestamps between down and up)
> sometimes only last 3-4 seconds, the BB1 one above with almost a minute
> is just about the longest I've seen to date. Of course this disrupts
> routing to a certain degree ... sometimes even bad enough to take down
> iBGP/eBGP multihop connections.
>
> Now, at two other backbone locations, we have more or less the identical
> setup, without any of these problems. I've already compared interface
> configs, but everything seems identical (apart from IP addresses of
> course). The problem here is that it's impossible to analyze any of the
> causes: for one, the problems occur without any predictable
> interval, and they're too short to react to the loss of connection in
> time ... I've tried activating some debugs on the router, but couldn't
> get any helpful information out of them (at least nothing I could identify).
>
> We've recently added an ASR1001 to the site, which (together with the
> 1002F) will be used to replace two 7200 routers, and already moved about
> half of the existing VLANs of the site (~20 of the 40+) to the ASRs.
> Didn't change much, though the interval of the interruptions went to
> maybe once every 2 or 3 days (from 1-2 per day). One thing I did notice
> is that mostly BB1 router is involved, with 1-2 times out of three BB2
> also losing LDP connection at the same time, and BB3 usually not showing
> any problems reaching either of the Core routers. BB1 and BB2 will also
> lose connectivity to each other most of the time, albeit not always. In
> attempting to locate the cause, we already moved BB1 to the same switch
> as Core1&2, with no results. Needless to say that there are no
> disruptions on Layer 2, at least not as far as could be seen in the logs.
>
> If these problems had manifested themselves when we installed the first
> ASR, I'd say it's something in the IOS versions that might be
> incompatible, but everything ran fine for something like 9 months, so
> that shouldn't be it. I've tried going through config diffs from 4-6
> months ago and now, but couldn't find any changes that should break MPLS
> on the LAN layer.
>
> Anybody have any idea what might be causing this, or what I should
> check into to get to the cause of this problem?
>
> Here are some excerpts from the router configs:
>
> BB1:
> interface GigabitEthernet3/0
>   mtu 1500
>   no ip redirects
>   ip route-cache flow
>   negotiation auto
>   mpls label protocol ldp
>   tag-switching mtu 1520
>   tag-switching ip
>
> BB2: identical settings
>
> Core1:
> interface GigabitEthernet0/0/0
>   no ip redirects
>   ip flow ingress
>   negotiation auto
>   mpls ip
>   mpls label protocol ldp
>   mpls mtu 1520
>
> Thanks, Garry