[j-nsp] Prefix independent convergence and FIB backup path

Olivier Benghozi olivier.benghozi at wifirst.fr
Thu Feb 8 08:42:34 EST 2018


Hi Mark,


Here (VPNv4/v6, BGP PIC Core + PIC Edge, enabled roughly as sketched below; no add-path, since it isn't supported in the VPN AFIs) we can see that, when possible:
an active eBGP path is backed up by an iBGP path
an active iBGP path is backed up by another iBGP path

We don't see:
an active iBGP path backed up by an inactive eBGP path
an active eBGP path backed up by another inactive eBGP path
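For reference, a minimal sketch of the knob involved on our side (the instance name is hypothetical; only "protect core" itself is the actual statement, and per-VRF placement may depend on the release):

set routing-options protect core
set routing-instances CUST-A routing-options protect core

The first line pre-installs backup next hops in the main instance FIB; the second does the same inside a VRF.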


I understand it works the way it was advertised:
PIC Core: protects against the failure of another PE, as detected via the IGP
PIC Edge: protects against a local PE-CE link failure (the CE being an IP transit provider, for example)
The key idea seems to be to react immediately to a link loss that can be detected quickly, that is, locally or via the IGP, not to something signalled (potentially slowly) via iBGP; see the BFD sketch below.
However, protecting iBGP routes with eBGP paths would also make sense, since the loss of a remote PE can be detected quickly.
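To illustrate the "quickly detected" part: on the PE-CE side this usually means running BFD on the eBGP session, something like this minimal sketch (the group name and neighbor address are hypothetical):

set protocols bgp group EBGP-TRANSIT neighbor 192.0.2.1 bfd-liveness-detection minimum-interval 100

With the default multiplier of 3 that gives roughly 300 ms of detection time, after which PIC just flips traffic to the pre-installed backup next hop instead of reprogramming the whole table.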

It would be interesting to check whether "protecting iBGP routes with eBGP ones" or "active eBGP with inactive eBGP" is implemented in the BGP PIC implementation on Cisco IOS-XR gear.


Note that in your case (plain inet.0) the BGP PIC Edge feature doesn't come into play: as I understand it, it is just a special PIC variant needed for labelled paths towards the outside, and you can see that BGP PIC Core for inet already covers your eBGP routes in inet.0.
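(You can see that in the forwarding table: a protected prefix shows a unilist next hop whose backup member carries Weight: 0x4000 in

show route forwarding-table destination <prefix> extensive

exactly as in your PE1 output below, whereas PE2 only has a plain indirect next hop, i.e. no pre-installed backup.)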


Also note that in your case PE2 (at least when using NHS) cannot quickly detect the loss of TRA1 anyway, so in fact there's no use case here...

Of course you already know that having both TRA1 and TRA2 with the same localpref does the trick (even without add-path), but that's not what you intended to test :)
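It works because, with equal localpref (and equal AS-path length), each PE prefers its own eBGP path over the iBGP one, so each box forwards via its local transit and the loop cannot form. A minimal sketch of such an import policy, with a hypothetical name, applied identically on both PEs:

set policy-options policy-statement TRANSIT-IN then local-preference 100
set policy-options policy-statement TRANSIT-IN then accept
set protocols bgp group TRANSIT import TRANSIT-IN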


Olivier

> On 8 Feb 2018, at 13:02, Mark Smith <markrefresh12 at gmail.com> wrote:
> 
> Hi list,
> 
> Test topology below: 2x MX80 with dual IP transit (full table, ~600k
> prefixes). TRA1 is preferred over TRA2 (localpref 200 set by PE1's
> import policy). Plain unlabeled inet.0, no MPLS in use. In the lab
> topology both transits belong to the same AS, 65502.
> 
> What I'm trying to accomplish is a somewhat faster failover time in
> case of a primary transit failure. Without any tuning, the failover
> (FIB reprogramming) can take up to 10 minutes.
> 
> 
> --------        --------
> | TRA1 |        | TRA2 |   AS65502
> --------        --------
>   | xe-1/3/0      | xe-1/3/0
> -------         -------
> | PE1 | --ae0-- | PE2 |    AS65501
> -------         -------
>   |
> -----------
> | test pc |
> -----------
> 
> In the lab, PE1 and PE2 are MX80s running 15.1R6.7.
> I have configured BGP add-path and PIC edge (routing-options protect
> core) on both PEs.
> Everything looks OK on PE1: both primary and backup paths are installed
> in the FIB, and PE1 converges fast.
> The backup path is missing from PE2's FIB. When the PE1-TRA1 cable is
> cut, PE1 quickly switches to the backup path but PE2 does not, and the
> result is a temporary routing loop between PE1 and PE2.
> If I switch the active transit to PE2 (set localpref 220 on the TRA2
> import on PE2, no other changes), the situation is reversed: everything
> looks OK on PE2 but not on PE1. So it looks like PIC only works on the
> box connected to the primary transit (i.e. where the eBGP route is
> better than the iBGP route). NHS/no-NHS on the iBGP export has no
> effect. Is this a bug, a feature, or am I doing something wrong?
> 
> I know that a better solution could be to get rid of the full table and
> just use 2x default routes from the upstreams... anyway, I would like
> to get more familiar with PIC.
> 
> Stable situation, all ok on PE1:
> admin at PE1> show route table inet.0 8.8.8.8
> 
> inet.0: 607797 destinations, 1823329 routes (607797 active, 0 holddown, 0 hidden)
> @ = Routing Use Only, # = Forwarding Use Only
> + = Active Route, - = Last Active, * = Both
> 
> 8.8.8.0/24         @[BGP/170] 05:03:44, localpref 200
>                       AS path: 65502 65200 25091 15169 I, validation-state: unverified
>                     > to 10.100.100.133 via xe-1/3/0.0
>                     [BGP/170] 05:05:55, localpref 100, from 10.100.100.40
>                       AS path: 65502 65200 25091 15169 I, validation-state: unverified
>                     > to 10.100.100.137 via ae0.0
>                    #[Multipath/255] 05:02:54
>                     > to 10.100.100.133 via xe-1/3/0.0
>                       to 10.100.100.137 via ae0.0
> 
> admin at PE1> show route forwarding-table destination 8.8.8.8 table default extensive
> Routing table: default.inet [Index 0]
> Internet:
> 
> Destination:  8.8.8.0/24
>  Route type: user
>  Route reference: 0                   Route interface-index: 0
>  Multicast RPF nh index: 0
>  Flags: sent to PFE, rt nh decoupled
>  Next-hop type: unilist               Index: 1048575  Reference: 607767
>  Nexthop: 10.100.100.133
>  Next-hop type: unicast               Index: 826      Reference: 4
>  Next-hop interface: xe-1/3/0.0    Weight: 0x1
>  Nexthop: 10.100.100.137
>  Next-hop type: unicast               Index: 827      Reference: 3
>  Next-hop interface: ae0.0         Weight: 0x4000
> 
> 
> But not on PE2:
> admin at PE2> show route table inet.0 8.8.8.8
> 
> inet.0: 607798 destinations, 1215564 routes (607798 active, 607766 holddown, 0 hidden)
> @ = Routing Use Only, # = Forwarding Use Only
> + = Active Route, - = Last Active, * = Both
> 
> 8.8.8.0/24         *[BGP/170] 00:02:10, localpref 200, from 10.100.100.30
>                       AS path: 65502 65200 25091 15169 I, validation-state: unverified
>                     > to 10.100.100.136 via ae0.0
>                     [BGP/170] 1d 01:54:47, localpref 100
>                       AS path: 65502 65200 25091 15169 I, validation-state: unverified
>                     > to 10.100.100.134 via xe-1/3/0.0
> 
> admin at PE2> show route forwarding-table destination 8.8.8.8 table default extensive
> Routing table: default.inet [Index 0]
> Internet:
> 
> Destination:  8.8.8.0/24
>  Route type: user
>  Route reference: 0                   Route interface-index: 0
>  Multicast RPF nh index: 0
>  Flags: sent to PFE
>  Next-hop type: indirect              Index: 1048574  Reference: 607767
>  Nexthop: 10.100.100.136
>  Next-hop type: unicast               Index: 790      Reference: 11
>  Next-hop interface: ae0.0
> 
> 
> 
> During TRA1 failure before PE2 convergence
> --------------------------------------------
> 
> [root at test-pc ~]# traceroute -n 8.8.8.8
> traceroute to 8.8.8.8 (8.8.8.8), 30 hops max, 60 byte packets
> 1  192.168.23.1  0.542 ms  0.574 ms  0.543 ms
> 2  10.100.100.137  0.289 ms  0.274 ms  0.250 ms
> 3  10.100.100.136  0.533 ms  0.521 ms  0.508 ms
> 4  10.100.100.137  0.299 ms  0.283 ms  0.276 ms
> 5  10.100.100.136  0.442 ms  0.401 ms  0.388 ms
> 6  10.100.100.137  0.325 ms  0.271 ms  0.257 ms
> 7  10.100.100.136  0.298 ms  0.297 ms  0.314 ms
> 8  10.100.100.137  0.316 ms  0.310 ms  0.288 ms
> 9  10.100.100.136  0.264 ms  0.382 ms  0.303 ms
> 10  10.100.100.137  0.339 ms  0.326 ms  0.315 ms
> 11  10.100.100.136  0.348 ms  0.331 ms  0.306 ms
> 12  10.100.100.137  0.297 ms  0.353 ms  0.330 ms
> 13  10.100.100.136  0.347 ms  0.338 ms  0.316 ms
> 14  10.100.100.137  0.346 ms  0.324 ms  0.300 ms
> 15  10.100.100.136  0.329 ms  0.352 ms  0.334 ms
> 16  10.100.100.137  0.381 ms  0.363 ms  0.353 ms
> 17  10.100.100.136  0.328 ms  0.329 ms  0.317 ms
> 18  10.100.100.137  0.475 ms  0.386 ms  0.370 ms
> 19  10.100.100.136  0.392 ms  0.373 ms  0.369 ms
> 20  10.100.100.137  0.394 ms  0.463 ms  0.407 ms
> 21  10.100.100.136  0.368 ms  0.374 ms  0.404 ms
> 22  10.100.100.137  0.457 ms  0.416 ms  0.404 ms
> 23  10.100.100.136  0.353 ms  1.448 ms  1.405 ms
> 24  10.100.100.137  0.468 ms  0.455 ms  0.475 ms
> 25  10.100.100.136  1.240 ms  1.276 ms  1.256 ms
> 26  10.100.100.137  0.438 ms  0.475 ms  0.414 ms
> 27  10.100.100.136  1.106 ms  1.097 ms  1.082 ms
> 28  10.100.100.137  0.475 ms  0.452 ms  0.415 ms
> 29  10.100.100.136  0.924 ms  0.880 ms  0.827 ms
> 30  10.100.100.137  0.459 ms  0.443 ms  0.423 ms
> 
> (Note about the lab: 8.8.8.8 is a loopback address on the MX router
> acting as both TRA1 and TRA2. The DFZ is fed to that box with the
> bgp_simple script.)
> 
> 
> Thanks


