[c-nsp] Sanity check OSPF/BGP

Tue Oct 13 14:55:54 EDT 2020

Hello,

For clarity's sake:

Each of the 5 edge/peering routers has the full routing table installed, and they are totally unaware of one another as far as BGP is concerned.

Downstream from there, the view from any of the four core routers is this:

Neighbor         V          MsgRcvd   MsgSent  InQ OutQ  Up/Down State   PfxRcd PfxAcc
  192.168.222.25 4         45082982    171835    0    0    6d05h Estab   812608 812608
  192.168.222.26 4         56573846    173130    0    0   18d04h Estab   812623 812623
  192.168.222.27 4         45082982    171835    0    0    6d02h Estab   812609 812609
  192.168.222.28 4         56573846    173130    0    0  118d04h Estab   812625 812625
  192.168.222.29 4         45082982    171835    0    0    7d02h Estab   812607 812607

The issue I was running into and asking about is the delay between when OSPF closes (the next-hop is no longer reachable) and when that unreachable next-hop stops being used as a route to a destination.

Not only is the next hop unreachable once OSPF closes, there isn't even a route to that next hop anymore. 

So the reason I asked the question was to validate my thinking that if there is no longer a route to the next-hop, then the router shouldn't be waiting for the hold timer to expire prior to selecting a different path.
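
To make that expectation concrete, here is a minimal sketch in Python. It is not any vendor's implementation; the names (Path, best_path, igp_routes) are made up for illustration, next-hop resolution is simplified to an exact-match lookup rather than a longest-prefix match, and the tie-break is reduced to local-pref. The only point is that a path whose next-hop no longer resolves should drop out of best-path selection as soon as the IGP route is gone, independent of the BGP hold timer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    prefix: str
    next_hop: str
    local_pref: int = 100


def usable_paths(paths, igp_routes):
    """Keep only paths whose next hop still resolves via the IGP.

    Assumption: igp_routes is a set of host routes and resolution is an
    exact match; a real router does a longest-prefix match here.
    """
    return [p for p in paths if p.next_hop in igp_routes]


def best_path(paths, igp_routes):
    """Pick a best path among the usable ones, or None if there are none.

    Simplified tie-break (an assumption): highest local-pref wins, then
    the lowest next-hop string.
    """
    candidates = usable_paths(paths, igp_routes)
    if not candidates:
        return None
    return sorted(candidates, key=lambda p: (-p.local_pref, p.next_hop))[0]
```

With two paths for the same prefix, removing the preferred next-hop from igp_routes makes best_path return the other path immediately; no session timer is involved in the sketch.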

But I still need to validate a few things.

Thanks,
-Drew

-----Original Message-----
From: adamv0025 at netconsultings.com <adamv0025 at netconsultings.com> 
Sent: Monday, October 12, 2020 10:40 AM
To: Drew Weaver <drew.weaver at thenap.com>; cisco-nsp at puck.nether.net
Subject: RE: [c-nsp] Sanity check OSPF/BGP

> Drew Weaver
> Sent: Thursday, October 8, 2020 2:01 PM
>
> What I expect to happen is:
> 
>               The route to the peering edge router's loopback
> interface is withdrawn when OSPF/OSPFv3 closes.
>               The core router will close the BGP session when the
> route to the dead peering edge router is withdrawn and will begin
> using one of the 5 other copies of the same route that it has.
>

A number of things come to mind, since you provided no details regarding the setup:

Case A)
If the 5 peering points are not advertising best-external prefixes, then there's only a single path for each of the 700K prefixes in the entire AS, via one of the 5 peering points.
If one peering point fails, all prefixes it offered a best path for will be withdrawn from all BGP speakers in the AS at OSPF convergence speeds, but then the remaining 4 peering points need to realize they now have the overall best path for a given prefix and start advertising it to all BGP speakers in the AS. This is a tedious process that converges at "BGP-speed".

Case B)
If all 5 peering points are advertising best-external prefixes and all BGP speakers in the AS already have all 5 paths available in RIB, but none of the BGP speakers has a hierarchical FIB, then there's a direct correlation between a prefix and its NH. If one peering point fails, all prefixes it offered a best path for will be withdrawn from all BGP speakers in the AS at OSPF convergence speeds, but now each BGP speaker will need to painstakingly update its FIB on a prefix-by-prefix basis for each of the 700K prefixes.

Case C)
If all 5 peering points are advertising best-external prefixes and all BGP speakers in the AS already have all 5 paths available in RIB, and all BGP speakers not only have hierarchical FIBs but also have PIC-CORE enabled, where a backup path for each prefix is programmed in FIB, then if one peering point fails, all prefixes it offered a best path for will be withdrawn from all BGP speakers in the AS at OSPF convergence speeds, and each BGP speaker will then just need to change 5 NH pointers to point to the remaining 4 peering points in FIB.
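
The difference between Case B and Case C can be sketched roughly as below. This is a toy model in Python, not how any real FIB is laid out: the function names, the "pl-1" path-list id and the 10.x.y.0/24 prefixes are all made up, and the backup next-hop is assumed to be already known. It only counts how many entries have to be rewritten when one next-hop dies.

```python
def repoint_flat(fib, dead_nh, backup_nh):
    """Flat FIB: each prefix maps straight to a next hop.

    Returns how many entries had to be rewritten.
    """
    writes = 0
    for prefix, nh in fib.items():
        if nh == dead_nh:
            fib[prefix] = backup_nh  # one write per affected prefix
            writes += 1
    return writes


def repoint_hierarchical(path_lists, dead_nh, backup_nh):
    """Hierarchical FIB: prefixes point at a shared path-list id, and
    only the path list holds the next hop.

    Returns how many path lists had to be rewritten.
    """
    writes = 0
    for pl_id, nh in path_lists.items():
        if nh == dead_nh:
            path_lists[pl_id] = backup_nh  # one write per shared pointer
            writes += 1
    return writes


def demo(n_prefixes=1000):
    """Return (flat_writes, hierarchical_writes) for one failed next hop."""
    dead, backup = "192.168.222.25", "192.168.222.26"
    # Made-up prefixes, all initially resolved via the next hop that fails.
    prefixes = [f"10.{i // 256}.{i % 256}.0/24" for i in range(n_prefixes)]

    flat = {p: dead for p in prefixes}
    prefix_to_pl = {p: "pl-1" for p in prefixes}  # never touched on failure
    path_lists = {"pl-1": dead}

    flat_writes = repoint_flat(flat, dead, backup)
    hier_writes = repoint_hierarchical(path_lists, dead, backup)

    # Every prefix now resolves via the backup in both models.
    assert all(nh == backup for nh in flat.values())
    assert all(path_lists[prefix_to_pl[p]] == backup for p in prefixes)
    return flat_writes, hier_writes
```

In the flat model the number of writes grows with the number of prefixes behind the failed next-hop; in the hierarchical model it grows only with the number of shared pointers, which is why Case C converges so much faster than Case B.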

Note,
The above assumes a full mesh between all BGP speakers, or otherwise assumes the RR infrastructure emulates a full mesh with regard to prefix distribution to all BGP speakers in the AS via one of the several available mechanisms.

Adam