[c-nsp] BGP Path Selection and next-hop reachability (IGP vs BGP)

Oliver Boehmer (oboehmer) oboehmer at cisco.com
Sat Dec 1 17:37:36 EST 2012


Ack. BGP in IOS will use any valid route to resolve the next-hop,
including a BGP route (which can be perfectly valid in some specific
scenarios).
To prevent this, you need to configure selective address tracking.
With an appropriate "bgp nexthop route-map .." config, for example one
that only considers loopbacks, or only OSPF or connected routes, BGP would
consider 5.5.5.5 unreachable and ignore this path in the best-path
calculation.
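
For illustration, a minimal sketch of such a config follows. The route-map
name NH-VALIDATE, the AS number 65000 and the OSPF process ID 1 are
placeholders, not taken from the setup discussed here, and depending on the
release the "bgp nexthop route-map" command may go directly under
"router bgp" rather than under the address family.

```
! Sketch only: NH-VALIDATE, AS 65000 and OSPF process 1 are assumed names/values.
! Accept a next-hop only if the route resolving it comes from OSPF ...
route-map NH-VALIDATE permit 10
 match source-protocol ospf 1
! ... or is a connected route.
route-map NH-VALIDATE permit 20
 match source-protocol connected
! Anything else (e.g. a BGP-learned default) hits the implicit deny.
!
router bgp 65000
 address-family ipv4 unicast
  bgp nexthop route-map NH-VALIDATE
```

With something like this in place, a next-hop that only resolves through the
eBGP-learned 0/0 should fail the route-map and be treated as inaccessible.
Restricting to loopbacks instead would typically use "match ip address
prefix-list" with a list that permits only /32s.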

	oli

On 01/12/2012 04:15, "John Neiberger" <jneiberger at gmail.com> wrote:

>I thought I'd post an update since I found my answer. Marko Milivojevic
>answered on another mailing list. As it turns out, the router still
>compares metrics for the next hop even if they're not both learned from
>IGPs. So, the path with an OSPF metric of 101 is losing out to a path with
>a BGP-learned next hop with a MED of 0. I wouldn't have expected that
>behavior at all!
>
>
>On Fri, Nov 30, 2012 at 6:55 PM, John Neiberger <jneiberger at gmail.com> wrote:
>
>> I've been doing some more testing and I even talked to a couple of guys
>> from Cisco Advanced Services and I still don't understand exactly what is
>> happening.
>>
>> To summarize, a router has two iBGP paths available for a particular
>> prefix. The next hop for both paths is learned via OSPF, so the router
>> selects the path with the lowest IGP metric. Then a network change occurs
>> such that the next hop for one of the paths is no longer learned via OSPF.
>> Instead, it is learned via BGP. The router switches to using that path, but
>> I can't figure out why. It appears that a path with a next hop reachable
>> via BGP is preferred, but I can't find any documentation that says that.
>>
>> At first I thought I'd follow the usual path selection criteria for
>> choosing between two iBGP paths. At that point in the process, the first
>> question is router ID. Is it switching to the path with the lowest router
>> ID? Nope. What about cluster length? Nope, it's the same. Lastly, is it
>> choosing this new path because the neighbor's IP address is lowest?
>> Again....NO!
>>
>> So, what the heck? I'm really stumped.
>>
>>
>> On Fri, Nov 30, 2012 at 2:42 PM, John Neiberger <jneiberger at gmail.com> wrote:
>>
>>> I ran into an interesting situation where I think I understand what is
>>> happening, but I can't find any documentation about the path selection
>>> process that specifically addresses this.
>>>
>>> We have a router--let's call it Router A--that has learned a prefix via
>>> iBGP from two route reflector clients. The next hop addresses in those
>>> advertisements are the loopback addresses of the advertising routers. Those
>>> loopback addresses are being advertised into OSPF. So, this router has two
>>> available paths for this prefix:
>>>
>>> 1: 4.4.4.4 (loopback address of first RR client, learned via OSPF)
>>> 2: 5.5.5.5 (loopback address of second RR client, learned via OSPF)
>>>
>>> Now, the weirdness happens when the second router experiences
>>> unidirectional traffic and stops advertising anything at all to its
>>> upstream neighbor. Within just a few seconds, OSPF times out, so 5.5.5.5
>>> disappears from OSPF because that router is now isolated.
>>>
>>> Now, you must know that Router A also has learned a default route via
>>> eBGP. So, the available paths in the BGP table for a particular prefix now
>>> look like this:
>>>
>>> 1: 4.4.4.4 (learned via OSPF)
>>> 2: 5.5.5.5 (recursively reachable via 0/0 learned from eBGP)
>>>
>>> The router switches to the second path, errantly sending packets with a
>>> next hop of 5.5.5.5--which is actually unreachable--out to the upstream
>>> router advertising the default route.
>>>
>>> Here is the BGP table before the "outage":
>>>
>>> R2#show ip bgp 100.100.100.0/24
>>> BGP routing table entry for 100.100.100.0/24, version 12
>>> Paths: (3 available, best #3, table Default-IP-Routing-Table)
>>> Flag: 0x900
>>>   Advertised to update-groups:
>>>         1    2    3
>>>   Local, (Received from a RR-client)
>>>     5.5.5.5 (metric 102) from 5.5.5.5 (100.100.100.100)
>>>       Origin incomplete, metric 0, localpref 100, valid, internal
>>>   Local
>>>     5.5.5.5 (metric 102) from 23.23.23.3 (35.35.35.3)
>>>       Origin incomplete, metric 0, localpref 100, valid, internal
>>>       Originator: 100.100.100.100, Cluster list: 35.35.35.3
>>>   Local, (Received from a RR-client)
>>>     4.4.4.4 (metric 101) from 4.4.4.4 (4.4.4.4)
>>>       Origin incomplete, metric 0, localpref 100, valid, internal, best
>>>
>>> The best path is--correctly--through 4.4.4.4. Here is what the BGP table
>>> looked like when 5.5.5.5 disappeared from OSPF:
>>>
>>> R2#show ip bgp 100.100.100.0/24
>>> BGP routing table entry for 100.100.100.0/24, version 13
>>> Paths: (3 available, best #1, table Default-IP-Routing-Table)
>>> Flag: 0x900
>>>   Advertised to update-groups:
>>>         1    2    3
>>>   Local, (Received from a RR-client)
>>>     5.5.5.5 from 5.5.5.5 (100.100.100.100)
>>>       Origin incomplete, metric 0, localpref 100, valid, internal, best
>>>   Local
>>>     5.5.5.5 from 23.23.23.3 (35.35.35.3)
>>>       Origin incomplete, metric 0, localpref 100, valid, internal
>>>       Originator: 100.100.100.100, Cluster list: 35.35.35.3
>>>   Local, (Received from a RR-client)
>>>     4.4.4.4 (metric 101) from 4.4.4.4 (4.4.4.4)
>>>       Origin incomplete, metric 0, localpref 100, valid, internal
>>>
>>>
>>> My question is this: Why specifically does this router switch to the path
>>> with a next hop learned via BGP? I'm assuming that a next hop reachable by
>>> eBGP is preferred to a next hop reachable via an IGP, but I don't see
>>> anything in the stated path selection process that would account for that.
>>> Nothing else about the paths changed since BGP hadn't changed. The only
>>> thing that changed was how the second loopback was reachable. It originally
>>> was OSPF and now is recursively reachable through the default route.
>>>
>>> Once the BGP session to the failed router times out, the second path is
>>> removed and all is well again.
>>>
>>> Any thoughts? I feel like I'm missing something obvious. I've been
>>> staring at this too long.  :)
>>>
>>
>>