[c-nsp] BGP Path Selection and next-hop reachability (IGP vs BGP)

John Neiberger jneiberger at gmail.com
Fri Nov 30 22:15:28 EST 2012


I thought I'd post an update since I found my answer. Marko Milivojevic
answered on another mailing list. As it turns out, the router still
compares the metric to the next hop even when the next hops are not both
resolved via an IGP. So, the path whose next hop has an OSPF metric of 101
is losing out to the path whose next hop is resolved via a BGP-learned
route with a MED of 0. I wouldn't have expected that behavior at all!
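
For anyone who wants to see the logic spelled out, below is a minimal
Python sketch of that comparison. It is only an illustration under stated
assumptions, not the IOS implementation: the names RibRoute, BgpPath,
resolve and best_by_next_hop_metric are hypothetical, the metric of the
BGP-learned default route is assumed to be 0 (its MED), and every other
best-path step is assumed to have tied already. The OSPF metrics 101 and
102 come from the "show ip bgp" output quoted below.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class RibRoute:
    prefix: str    # e.g. "4.4.4.4/32"
    protocol: str  # "ospf" or "bgp"; informational only in this sketch
    metric: int    # RIB metric (assumed: for a BGP route, its MED)


@dataclass(frozen=True)
class BgpPath:
    next_hop: str


def resolve(rib, address):
    """Longest-prefix match of address against the RIB; None if no route covers it."""
    addr = ipaddress.ip_address(address)
    matches = [r for r in rib if addr in ipaddress.ip_network(r.prefix)]
    if not matches:
        return None
    return max(matches, key=lambda r: ipaddress.ip_network(r.prefix).prefixlen)


def best_by_next_hop_metric(rib, paths):
    """Return the path whose next hop resolves with the lowest RIB metric.

    The protocol that supplied the covering route is deliberately ignored,
    which is the behavior described above.
    """
    candidates = []
    for path in paths:
        route = resolve(rib, path.next_hop)
        if route is not None:  # skip paths with an unresolvable next hop
            candidates.append((route.metric, path))
    if not candidates:
        return None
    return min(candidates, key=lambda c: c[0])[1]


paths = [BgpPath("4.4.4.4"), BgpPath("5.5.5.5")]

# Before the outage: both loopbacks are in OSPF, plus the eBGP default route.
rib_before = [
    RibRoute("4.4.4.4/32", "ospf", 101),
    RibRoute("5.5.5.5/32", "ospf", 102),
    RibRoute("0.0.0.0/0", "bgp", 0),  # assumed metric 0
]

# After the outage: 5.5.5.5/32 is gone, so 5.5.5.5 only matches 0.0.0.0/0.
rib_after = [r for r in rib_before if r.prefix != "5.5.5.5/32"]

print(best_by_next_hop_metric(rib_before, paths).next_hop)  # 4.4.4.4
print(best_by_next_hop_metric(rib_after, paths).next_hop)   # 5.5.5.5
```

In the second case 5.5.5.5 matches only the default route, so its metric
of 0 beats the OSPF metric of 101 for 4.4.4.4, and the path through the
isolated router wins.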


On Fri, Nov 30, 2012 at 6:55 PM, John Neiberger <jneiberger at gmail.com> wrote:

> I've been doing some more testing and I even talked to a couple of guys
> from Cisco Advanced Services and I still don't understand exactly what is
> happening.
>
> To summarize, a router has two iBGP paths available for a particular
> prefix. The next hop for both paths is learned via OSPF, so the router
> selects the path with the lowest IGP metric. Then a network change occurs
> such that the next hop for one of the paths is no longer learned via OSPF.
> Instead, it is learned via BGP. The router switches to using that path, but
> I can't figure out why. It appears that a path with a next hop reachable
> via BGP is preferred, but I can't find any documentation that says that.
>
> At first I thought I'd follow the usual path selection criteria for
> choosing between two iBGP paths. At that point in the process, the first
> question is router ID. Is it switching to the path with the lowest router
> ID? Nope. What about cluster length? Nope, it's the same. Lastly, is it
> choosing this new path because the neighbor's IP address is lowest?
> Again....NO!
>
> So, what the heck? I'm really stumped.
>
>
> On Fri, Nov 30, 2012 at 2:42 PM, John Neiberger <jneiberger at gmail.com> wrote:
>
>> I ran into an interesting situation where I think I understand what is
>> happening, but I can't find any documentation about the path selection
>> process that specifically addresses this.
>>
>> We have a router--let's call it Router A--that has learned a prefix via
>> iBGP from two route reflector clients. The next hop addresses in those
>> advertisements are the loopback addresses of the advertising routers. Those
>> loopback addresses are being advertised into OSPF. So, this router has two
>> available paths for this prefix:
>>
>> 1: 4.4.4.4 (loopback address of first RR client, learned via OSPF)
>> 2: 5.5.5.5 (loopback address of second RR client, learned via OSPF)
>>
>> Now, the weirdness happens when the second router experiences
>> unidirectional traffic and stops advertising anything at all to its
>> upstream neighbor. Within just a few seconds, OSPF times out, so 5.5.5.5
>> disappears from OSPF because that router is now isolated.
>>
>> Now, you must know that Router A also has learned a default route via
>> eBGP. So, the available paths in the BGP table for a particular prefix now
>> look like this:
>>
>> 1: 4.4.4.4 (learned via OSPF)
>> 2: 5.5.5.5 (recursively reachable via 0/0 learned from eBGP)
>>
>> The router switches to the second path, errantly sending packets with a
>> next hop of 5.5.5.5--which is actually unreachable--out to the upstream
>> router advertising the default route.
>>
>> Here is the BGP table before the "outage":
>>
>> R2#show ip bgp 100.100.100.0/24
>> BGP routing table entry for 100.100.100.0/24, version 12
>> Paths: (3 available, best #3, table Default-IP-Routing-Table)
>> Flag: 0x900
>>   Advertised to update-groups:
>>         1    2    3
>>   Local, (Received from a RR-client)
>>     5.5.5.5 (metric 102) from 5.5.5.5 (100.100.100.100)
>>       Origin incomplete, metric 0, localpref 100, valid, internal
>>   Local
>>     5.5.5.5 (metric 102) from 23.23.23.3 (35.35.35.3)
>>       Origin incomplete, metric 0, localpref 100, valid, internal
>>       Originator: 100.100.100.100, Cluster list: 35.35.35.3
>>   Local, (Received from a RR-client)
>>     4.4.4.4 (metric 101) from 4.4.4.4 (4.4.4.4)
>>       Origin incomplete, metric 0, localpref 100, valid, internal, best
>>
>> The best path is--correctly--through 4.4.4.4. Here is what the BGP table
>> looked like when 5.5.5.5 disappeared from OSPF:
>>
>> R2#show ip bgp 100.100.100.0/24
>> BGP routing table entry for 100.100.100.0/24, version 13
>> Paths: (3 available, best #1, table Default-IP-Routing-Table)
>> Flag: 0x900
>>   Advertised to update-groups:
>>         1    2    3
>>   Local, (Received from a RR-client)
>>     5.5.5.5 from 5.5.5.5 (100.100.100.100)
>>       Origin incomplete, metric 0, localpref 100, valid, internal, best
>>   Local
>>     5.5.5.5 from 23.23.23.3 (35.35.35.3)
>>       Origin incomplete, metric 0, localpref 100, valid, internal
>>       Originator: 100.100.100.100, Cluster list: 35.35.35.3
>>   Local, (Received from a RR-client)
>>     4.4.4.4 (metric 101) from 4.4.4.4 (4.4.4.4)
>>       Origin incomplete, metric 0, localpref 100, valid, internal
>>
>>
>> My question is this: Why specifically does this router switch to the path
>> with a next hop learned via BGP? I'm assuming that a next hop reachable by
>> eBGP is preferred to a next hop reachable via an IGP, but I don't see
>> anything in the stated path selection process that would account for that.
>> Nothing else about the paths changed, since the BGP attributes had not changed. The only
>> thing that changed was how the second loopback was reachable. It originally
>> was OSPF and now is recursively reachable through the default route.
>>
>> Once the BGP session to the failed router times out, the second path is
>> removed and all is well again.
>>
>> Any thoughts? I feel like I'm missing something obvious. I've been
>> staring at this too long.  :)
>>
>
>

