[c-nsp] BGP Path Selection and next-hop reachability (IGP vs BGP)

Jason Lixfeld jason at lixfeld.ca
Fri Nov 30 22:13:43 EST 2012


Just a stab in the dark here, but when 5.5.5.5's next hop starts resolving through the eBGP-learned default route (0.0.0.0), the IGP metric gets turfed (is that the same thing as getting reset to 0?), and 0 is less than 101 (the IGP metric to 4.4.4.4).  Based on http://www.cisco.com/en/US/tech/tk365/technologies_tech_note09186a0080094431.shtml, step 6 says it will prefer the path with the lowest MED, which it might be matching if the turfed IGP metric turns into 0.  One of the bullets in step 6 may support that theory:  "Paths received with no MED are assigned a MED of 0, unless you have enabled bgp bestpath med missing-as-worst."  Mind you, MED and the IGP metric to the next hop are separate attributes, and the IGP metric is compared at a later step, so this only holds if a missing IGP metric gets the same treat-it-as-0 handling.
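
To make the guess concrete, here is a rough Python sketch of just the "lowest IGP metric to the next hop" comparison, using the metrics from the show output quoted below. The Path class and both function names are made up for illustration, and the rule that a missing metric counts as 0 is exactly the unconfirmed assumption being tested, not documented IOS behaviour.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Path:
    next_hop: str
    # None models a path whose "(metric N)" field is absent in the show output.
    igp_metric: Optional[int]


def effective_metric(path: Path) -> int:
    # ASSUMPTION (the theory in this post, not confirmed behaviour):
    # a missing IGP metric is compared as if it were 0.
    return 0 if path.igp_metric is None else path.igp_metric


def best_by_igp_metric(paths: List[Path]) -> Path:
    # Only models the IGP-metric comparison; every earlier tie-breaker
    # (local preference, AS path, origin, MED, ...) is taken as equal.
    return min(paths, key=effective_metric)


before = [Path("5.5.5.5", 102), Path("4.4.4.4", 101)]
after = [Path("5.5.5.5", None), Path("4.4.4.4", 101)]

print(best_by_igp_metric(before).next_hop)
print(best_by_igp_metric(after).next_hop)
```

If the assumption holds, the first call picks 4.4.4.4 and the second picks 5.5.5.5, which matches the switch seen in the two outputs. If a missing metric were instead treated as worst, the result would stay on 4.4.4.4.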

Like I said, stab in the dark...  I might be talking directly out of my ass.

On 2012-11-30, at 8:55 PM, John Neiberger <jneiberger at gmail.com> wrote:

> I've been doing some more testing and I even talked to a couple of guys
> from Cisco Advanced Services and I still don't understand exactly what is
> happening.
> 
> To summarize, a router has two iBGP paths available for a particular
> prefix. The next hop for both paths is learned via OSPF, so the router
> selects the path with the lowest IGP metric. Then a network change occurs
> such that the next hop for one of the paths is no longer learned via OSPF.
> Instead, it is learned via BGP. The router switches to using that path, but
> I can't figure out why. It appears that a path with a next hop reachable
> via BGP is preferred, but I can't find any documentation that says that.
> 
> At first I thought I'd follow the usual path selection criteria for
> choosing between two iBGP paths. At that point in the process, the first
> question is router ID. Is it switching to the path with the lowest router
> ID? Nope. What about cluster length? Nope, it's the same. Lastly, is it
> choosing this new path because the neighbor's IP address is lowest?
> Again....NO!
> 
> So, what the heck? I'm really stumped.
> 
> 
> On Fri, Nov 30, 2012 at 2:42 PM, John Neiberger <jneiberger at gmail.com> wrote:
> 
>> I ran into an interesting situation where I think I understand what is
>> happening, but I can't find any documentation about the path selection
>> process that specifically addresses this.
>> 
>> We have a router--let's call it Router A--that has learned a prefix via
>> iBGP from two route reflector clients. The next hop addresses in those
>> advertisements are the loopback addresses of the advertising routers. Those
>> loopback addresses are being advertised into OSPF. So, this router has two
>> available paths for this prefix:
>> 
>> 1: 4.4.4.4 (loopback address of first RR client, learned via OSPF)
>> 2: 5.5.5.5 (loopback address of second RR client, learned via OSPF)
>> 
>> Now, the weirdness happens when the second router experiences
>> unidirectional traffic and stops advertising anything at all to its
>> upstream neighbor. Within just a few seconds, OSPF times out, so 5.5.5.5
>> disappears from OSPF because that router is now isolated.
>> 
>> Now, you must know that Router A also has learned a default route via
>> eBGP. So, the available paths in the BGP table for a particular prefix now
>> look like this:
>> 
>> 1: 4.4.4.4 (learned via OSPF)
>> 2: 5.5.5.5 (recursively reachable via 0/0 learned from eBGP)
>> 
>> The router switches to the second path, errantly sending packets with a
>> next hop of 5.5.5.5--which is actually unreachable--out to the upstream
>> router advertising the default route.
>> 
>> Here is the BGP table before the "outage":
>> 
>> R2#show ip bgp 100.100.100.0/24
>> BGP routing table entry for 100.100.100.0/24, version 12
>> Paths: (3 available, best #3, table Default-IP-Routing-Table)
>> Flag: 0x900
>>  Advertised to update-groups:
>>        1    2    3
>>  Local, (Received from a RR-client)
>>    5.5.5.5 (metric 102) from 5.5.5.5 (100.100.100.100)
>>      Origin incomplete, metric 0, localpref 100, valid, internal
>>  Local
>>    5.5.5.5 (metric 102) from 23.23.23.3 (35.35.35.3)
>>      Origin incomplete, metric 0, localpref 100, valid, internal
>>      Originator: 100.100.100.100, Cluster list: 35.35.35.3
>>  Local, (Received from a RR-client)
>>    4.4.4.4 (metric 101) from 4.4.4.4 (4.4.4.4)
>>      Origin incomplete, metric 0, localpref 100, valid, internal, best
>> 
>> The best path is--correctly--through 4.4.4.4. Here is what the BGP table
>> looked like when 5.5.5.5 disappeared from OSPF:
>> 
>> R2#show ip bgp 100.100.100.0/24
>> BGP routing table entry for 100.100.100.0/24, version 13
>> Paths: (3 available, best #1, table Default-IP-Routing-Table)
>> Flag: 0x900
>>  Advertised to update-groups:
>>        1    2    3
>>  Local, (Received from a RR-client)
>>    5.5.5.5 from 5.5.5.5 (100.100.100.100)
>>      Origin incomplete, metric 0, localpref 100, valid, internal, best
>>  Local
>>    5.5.5.5 from 23.23.23.3 (35.35.35.3)
>>      Origin incomplete, metric 0, localpref 100, valid, internal
>>      Originator: 100.100.100.100, Cluster list: 35.35.35.3
>>  Local, (Received from a RR-client)
>>    4.4.4.4 (metric 101) from 4.4.4.4 (4.4.4.4)
>>      Origin incomplete, metric 0, localpref 100, valid, internal
>> 
>> 
>> My question is this: Why specifically does this router switch to the path
>> with a next hop learned via BGP? I'm assuming that a next hop reachable by
>> eBGP is preferred to a next hop reachable via an IGP, but I don't see
>> anything in the stated path selection process that would account for that.
>> Nothing else about the paths changed since BGP hadn't changed. The only
>> thing that changed was how the second loopback was reachable. It originally
>> was OSPF and now is recursively reachable through the default route.
>> 
>> Once the BGP session to the failed router times out, the second path is
>> removed and all is well again.
>> 
>> Any thoughts? I feel like I'm missing something obvious. I've been staring
>> at this too long.  :)
>> 
> _______________________________________________
> cisco-nsp mailing list  cisco-nsp at puck.nether.net
> https://puck.nether.net/mailman/listinfo/cisco-nsp
> archive at http://puck.nether.net/pipermail/cisco-nsp/

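The recursive next-hop resolution described in the quoted message (5.5.5.5 falling back to the eBGP default once its OSPF route is gone) can be sketched as a longest-prefix match. This is a minimal illustration only: the resolve function is hypothetical, and the /32 prefix lengths for the loopbacks are assumed, since the thread does not state them.

```python
import ipaddress


def resolve(next_hop, rib):
    """Return (prefix, source) of the longest match for next_hop, or None.

    rib maps ipaddress network objects to the protocol that installed them.
    """
    addr = ipaddress.ip_address(next_hop)
    matches = [net for net in rib if addr in net]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)
    return str(best), rib[best]


# ASSUMPTION: loopbacks are carried in OSPF as /32 host routes.
rib_before = {
    ipaddress.ip_network("4.4.4.4/32"): "ospf",
    ipaddress.ip_network("5.5.5.5/32"): "ospf",
    ipaddress.ip_network("0.0.0.0/0"): "ebgp",
}

# After the failure the 5.5.5.5 host route has aged out of OSPF.
rib_after = {
    net: src
    for net, src in rib_before.items()
    if net != ipaddress.ip_network("5.5.5.5/32")
}

print(resolve("5.5.5.5", rib_before))
print(resolve("5.5.5.5", rib_after))
```

With the host route present the next hop resolves through OSPF; without it, the only covering route is the default, so the next hop still counts as reachable even though the router behind it is isolated.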


