[j-nsp] Optimal BFD settings for BGP-signaled VPLS?

Clarke Morledge chmorl at wm.edu
Fri Jan 14 21:39:47 EST 2011


I am trying to determine the optimal Bidirectional Forwarding Detection 
(BFD) settings for BGP auto-discovery and layer-2 signaling in a VPLS 
application.

To simplify things, assume I am running LDP for building dynamic-only 
LSPs, as opposed to RSVP.  Assume I am running IS-IS as the IGP with BFD 
enabled on that, too, interconnecting all of the P and PE routers in the 
MPLS cloud.  I am following the Juniper recommendation of a 300 ms minimum 
interval with 3 misses before declaring a BFD down event.
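
For reference, the detection time implied by those IGP settings is simply 
the interval multiplied by the miss count.  A minimal sketch in Python, 
assuming both ends of the session use the same interval and multiplier 
(the function name is illustrative only, not anything from JUNOS):

```python
def detection_time_ms(min_interval_ms: int, multiplier: int) -> int:
    """Approximate BFD detection time in milliseconds.

    Assumption: both peers run the same minimum interval and the same
    multiplier (miss count), so detection time = interval * multiplier.
    """
    if min_interval_ms <= 0 or multiplier <= 0:
        raise ValueError("interval and multiplier must be positive")
    return min_interval_ms * multiplier


# The IS-IS BFD settings described above: 300 ms interval, 3 misses.
print(detection_time_ms(300, 3))  # 900
```

So a neighbor is declared down after roughly 900 ms of silence, which is 
just inside a one-second target.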

The network design has a small set of core routers, each of which serves 
as a BGP route reflector.  The core routers are interconnected in a mesh, 
so each core router is only one hop away from the others.

On the periphery, I have perhaps dozens of distribution routers.  Each 
distribution router is directly connected to two or more core routers. 
Each distribution router has a BGP session to these core routers; 
therefore, each distribution router is connected to two route reflectors 
for redundancy.

Given the above, what BFD minimum interval and miss count would you 
configure for the BGP sessions?  I want to get to sub-second convergence 
during a node or link failure, but I do not want to tune BFD too tightly 
and potentially introduce unnecessary flapping.  I am willing to suffer 
some sporadic loss of layer-2 connectivity within the VPLS cloud for a 
few seconds in the event of a catastrophe, but I don't want to 
unnecessarily tear down BGP sessions and then wait some 20 to 60 seconds 
until BGP rebuilds and redistributes the L2 information.

For some time now, I have been playing with a 3000 ms interval with 3 
misses (that's 9 seconds) as what I thought was a conservative estimate. 
However, I have run into cases where there has been enough router churn 
for various reasons to unnecessarily trip a BFD down event.  My hunch is 
that this "router churn" is due to buggy JUNOS code, but I don't have 
proof of that yet.  Nevertheless, I want the BGP infrastructure to stay 
solid and ride through transient events in a redundant network.
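
To make the trade-off concrete, here is a rough Python sketch comparing 
the two settings mentioned in this message against a one-second budget. 
It assumes symmetric settings on both peers and treats BFD detection time 
as a lower bound on convergence; the names and the 1000 ms budget are 
illustrative assumptions, not measured values.

```python
SUB_SECOND_BUDGET_MS = 1000  # assumed target: detect a failure in under 1 s


def detection_time_ms(min_interval_ms: int, multiplier: int) -> int:
    """Interval * miss count, assuming both peers use the same values."""
    return min_interval_ms * multiplier


def meets_budget(min_interval_ms: int, multiplier: int,
                 budget_ms: int = SUB_SECOND_BUDGET_MS) -> bool:
    """True if the failure is detected strictly within the budget."""
    return detection_time_ms(min_interval_ms, multiplier) < budget_ms


# (interval in ms, miss count) pairs taken from this message.
candidates = [(300, 3), (3000, 3)]

for interval, misses in candidates:
    detect = detection_time_ms(interval, misses)
    verdict = "within" if meets_budget(interval, misses) else "exceeds"
    print(f"{interval} ms x {misses} = {detect} ms ({verdict} budget)")
```

The 300 ms setting detects in 900 ms, while the 3000 ms setting takes 
9000 ms, so the conservative value I have been using cannot deliver 
sub-second detection even before BGP itself reacts.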

Are there any factors that I am missing or not thinking thoroughly enough 
about when considering optimal BFD settings?

Thanks.

Clarke Morledge
College of William and Mary
Information Technology - Network Engineering
Jones Hall (Room 18)
Williamsburg VA 23187

