[c-nsp] OSPF / BFD timer advice

Sat Jul 14 04:56:03 EDT 2007

James Worley wrote on Thursday, July 12, 2007 4:05 PM:

> Hi Phil
> 
> We have about 40 6509-sup720 running
> 's72033-advipservicesk9_wan-mz.122-18.SXF5' this particular site is
> configured in a 4 core model.

SXF's BFD implementation is still subject to false alarms in case of
CPUHOG or sustained high-CPU situations. You definitely want to do
"process-max-time 50" to work around some of the issues, but the
implementation is much more robust in SRA (and will be in SXH as well).
250x4 is not that aggressive, but it can still fail.
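
For reference, a rough sketch of the pieces mentioned above. The
interface name is made up; the timer values are the ones from this
thread. With 250 ms x 4, BFD declares the neighbor down after roughly
one second without control packets.

```
! global configuration
process-max-time 50
!
! GigabitEthernet1/1 is a hypothetical core-facing interface
interface GigabitEthernet1/1
 bfd interval 250 min_rx 250 multiplier 4
```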

> I would agree with you that a 1sec OSPF dead timer is overkill,
> especially as we have sub-second timers on BFD. My only question now
> is what people suggest we set the OSPF timers to?

Default. Running tuned OSPF hello/dead timers on a BFD-protected link is
over-engineered IMHO. BFD is much better suited to sending and
processing fast hellos than any IGP will ever be.
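
A minimal sketch of what "default" could look like on one link,
assuming a hypothetical interface GigabitEthernet1/1 and that OSPF is
registered with BFD per interface via "ip ospf bfd" (it can also be
done with "bfd all-interfaces" under the OSPF process):

```
interface GigabitEthernet1/1
 ! drop the fast-hello setting so OSPF falls back to its default timers
 no ip ospf dead-interval
 ! leave failure detection to BFD
 ip ospf bfd
 bfd interval 250 min_rx 250 multiplier 4
```

On broadcast and point-to-point links the defaults are a 10 second
hello and a 40 second dead interval.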

> As I understand BFD, it's able to detect link failure. The OSPF timers
> still need to be quick enough to cause convergence should the problem
> not be link failure, or put another way, as a backup to BFD.

You generally tune OSPF hello/dead to detect neighbor failures. That is
already done via BFD, so what's the point of running fast hellos?

Did you also tune SPF/LSA throttle and the like? Quick neighbor
detection doesn't help you achieve fast convergence if your OSPF nodes
wait 5 seconds before even starting to calculate the new topology.
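
To illustrate, the throttle timers live under the OSPF process. The
process ID and all values below are placeholders for illustration only,
not a recommendation, and command availability and syntax should be
checked against your release:

```
! process ID 1 is a placeholder
router ospf 1
 ! spf-start, spf-hold, spf-max-wait in milliseconds
 timers throttle spf 50 200 5000
 ! start-interval, hold-interval, max-interval in milliseconds
 timers throttle lsa all 50 200 5000
 ! minimum interval for accepting the same LSA, in milliseconds
 timers lsa arrival 100
```

Keep the LSA arrival timer at or below the neighbors' LSA hold
interval, otherwise a re-originated LSA can be dropped on arrival.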

> We are not entirely sure what is causing the issue. The syslog only
> shows the OSPF neighbour as down. Strangely the outage can last a few
> minutes before syslog reports the neighbour as back up and the M-VPNs
> are back up.

Well, your OSPF-5-ADJCHG should show something like "Neighbor Down: BFD
node down" at the end if the error was detected by BFD, and "Neighbor
Down: Dead timer expired" if OSPF detected this itself.
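
Something along these lines should help tell the two cases apart. The
filter string is just the message mnemonic quoted above, and the exact
output depends on the release:

```
! adjacency changes, including the "Neighbor Down: ..." reason
show logging | include ADJCHG
! BFD session state and negotiated timers per neighbor
show bfd neighbors details
! OSPF view of the same neighbors
show ip ospf neighbor detail
```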

	oli

> 
>> What device and what IOS are you running this on?  Are the
>> adjacencies lost due to a BFD event or an OSPF dead timer
>> expiration?  Using a 1 second dead timer and BFD-enabled OSPF is
>> overkill imho, but you need to find out what is causing the failures.
>> 
>> Phil
>> 
>> 
>> 
>> On Jul 12, 2007, at 8:38 AM, James Worley wrote:
>> 
>>> 
>>> 
>>> Ola Gents
>>> 
>>> We are having a problem with OSPF neighbour relationships being
>>> taken down between core and distribution. This is causing our
>>> MPLS-VPNs to fall over.
>>> 
>>>  ip ospf dead-interval minimal hello-multiplier 4
>>> 
>>>  bfd interval 250 min_rx 250 multiplier 4
>>> 
>>> Looking at the config above, it looks like my company is running
>>> sub-second timers on both BFD and OSPF, which is a possible reason
>>> for an overly sensitive network.
>>> 
>>> My feeling is we need to let off on the OSPF timers and allow BFD
>>> to do its job. Possibly configure a 1 sec hello and 3 sec dead
>>> timer for OSPF?
>>> 
>>> Anyone fancy commenting?
>>> 
>>> 
>>> Kindest Regards
>>> James
>>> 
>>> _______________________________________________
>>> cisco-nsp mailing list  cisco-nsp at puck.nether.net
>>> https://puck.nether.net/mailman/listinfo/cisco-nsp
>>> archive at http://puck.nether.net/pipermail/cisco-nsp/
>
> Kindest Regards
> James