[c-nsp] Faster BGP Failover

Robert Raszuk robert at raszuk.net
Thu Oct 13 03:37:18 EDT 2011


> Failure detection is only the first step in BGP reconvergence - and if
> you have two links to different ISPs, reconvergence times will be in the
> order of minutes anyway, so tuning down fault detection from "30s" to "1s"
> will not make failover instant anyway.

Very interesting discussion ... sorry for the late reply, but I was in
long-haul transit :)

A few observations ... There are a few components to fast failover:

- preparation
- detection
- reaction

Let me explain.

- preparation ... this really means that you should have a backup path in
place via a different exit point (or multiple local paths, which may not
be easy). If your other exit is via a different ASBR, I recommend, at the
current state of the technology, using Diverse-Path on the RR to send the
backup path towards the best path's ASBR. This is a shipping feature on
Cisco *well yes I am biased .. I have designed that one :)* When all
routers talk add-paths you could switch to that.

- detection ... reducing the hold time in BGP is a bad idea. If the goal
is to converge in 1-3 sec max, I recommend (if BFD is not an option - not
even single-sided BFD) using a little-known IOS feature called "Object
Tracking". It is a piece of excellent code written by Cisco EMEA
developers which can invalidate your next hop based on a periodic ping
(every interval X) to the peering ASBR. Very cool tool.

- reaction ... yes, PIC (Prefix Independent Convergence) is the best way.
You preprogram the backup paths into the FIB, then at failure time you
just switch over to the backup, not at per-prefix speed but at
per-next-hop speed, at any BGP switching node ... assuming no intra-AS
tunneling. If there is tunneling you just need to switch once.
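
To make the detection and reaction steps concrete, here is a rough Python
sketch of the idea, not of any IOS internals. All names (PathList, Fib,
NextHopTracker) and the addresses are hypothetical, and the "N consecutive
failed probes" threshold is an assumption added for illustration. The
point it tries to show is that many prefixes share one forwarding object
holding a primary and a preinstalled backup, so invalidating a next hop
costs one write per shared object rather than one write per prefix.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class PathList:
    """Shared forwarding object: a primary next hop plus a preinstalled backup."""
    primary: str
    backup: str
    primary_up: bool = True

    def active(self) -> str:
        return self.primary if self.primary_up else self.backup


@dataclass
class Fib:
    # prefix -> shared PathList (many prefixes point at the same object)
    prefixes: Dict[str, PathList] = field(default_factory=dict)
    # primary next hop -> the PathList objects that depend on it
    by_next_hop: Dict[str, List[PathList]] = field(default_factory=dict)

    def install(self, prefix: str, path_list: PathList) -> None:
        self.prefixes[prefix] = path_list
        users = self.by_next_hop.setdefault(path_list.primary, [])
        if not any(p is path_list for p in users):
            users.append(path_list)

    def invalidate_next_hop(self, next_hop: str) -> int:
        """Switch every dependent PathList to its backup; return the write count."""
        writes = 0
        for path_list in self.by_next_hop.get(next_hop, []):
            if path_list.primary_up:
                path_list.primary_up = False
                writes += 1
        return writes

    def lookup(self, prefix: str) -> str:
        return self.prefixes[prefix].active()


class NextHopTracker:
    """Periodic-probe tracker; the failure threshold is an assumed parameter."""

    def __init__(self, next_hop: str, probe: Callable[[str], bool],
                 fib: Fib, threshold: int = 3) -> None:
        self.next_hop = next_hop
        self.probe = probe          # stand-in for a ping to the peering ASBR
        self.fib = fib
        self.threshold = threshold
        self.failures = 0
        self.down = False

    def poll(self) -> bool:
        """Run one probe; return True only if this poll triggered the switchover."""
        if self.down:
            return False
        if self.probe(self.next_hop):
            self.failures = 0
            return False
        self.failures += 1
        if self.failures >= self.threshold:
            self.down = True
            self.fib.invalidate_next_hop(self.next_hop)
            return True
        return False
```

Calling poll() once per interval X with a real probe function would give
a detection time of roughly X times the threshold. In this sketch, a
thousand prefixes installed against one PathList all move to the backup
with a single write.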

Best,
R.

