[c-nsp] bgp keepalive/hold timers on ethernet links

Niels Bakker niels=cisco-nsp at bakker.net
Wed Aug 31 12:37:33 EDT 2005


* mrz at velvet.org (matthew zeier) [Wed 31 Aug 2005, 16:59 CEST]:
>BGP's default 60/180 timers are too long to go before dropping the peer. 
>The PHB wanted 1 second keepalives and a 3 second hold timer.  However, 
>as soon as I started pulling in traffic (and only 50Mbps), I began 
>frequently dropping the peer.

Fire the PHB because he's meddling with engineering tasks he obviously
has no clue about.

BGP keepalive and hold timers are as long as they are because routers
sometimes experience short-term CPU starvation and fall behind on
processing keepalives and updates for a while:

>Aug 30 17:42:06.016 PDT: %BGP-5-ADJCHANGE: neighbor 216.x.253.233 Up
>Aug 30 17:49:40.612 PDT: %BGP-5-ADJCHANGE: neighbor 216.x.253.233 Down BGP Notification sent

... for obviously longer than the three seconds your PHB made you 
configure.

Some Cisco routers are more susceptible to this than others, because on 
some architectures the CPU sits more directly in the packet forwarding 
path.


>I'm guessing that these timers are too aggressive - anyone have any 
>practical suggestions on how to fix this?

Relax them to saner values.  JunOS uses 30/90 by default.  If the 
peer's router experiences CPU exhaustion even that can be too short, 
and dropping the session is all the more needless when packet 
forwarding itself is barely affected.
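
To put rough numbers on that, here is a minimal sketch in Python.  It
assumes a deliberately simplified model: the peer sends nothing at all
while its CPU is starved, and the stall begins just as a keepalive was
due.  Under that assumption the longest stall a session survives is
the hold time minus the keepalive interval.  The function names are
made up for illustration; real implementations add jitter to
keepalives and also reset the hold timer on updates.

```python
def max_tolerable_stall(keepalive: float, hold: float) -> float:
    """Worst-case peer stall (seconds) that stays below the hold time.

    Simplified model (an assumption, not taken from any router code):
    the last keepalive arrived at t=0, the next was due at t=keepalive,
    and the peer stalls right at that moment. The next keepalive then
    arrives at keepalive + stall, which must stay below the hold time.
    """
    if keepalive <= 0 or hold <= keepalive:
        raise ValueError("need 0 < keepalive < hold")
    return hold - keepalive


def hold_timer_expires(stall: float, keepalive: float, hold: float) -> bool:
    """True if a stall of the given length drops the session in this model."""
    return stall >= max_tolerable_stall(keepalive, hold)


if __name__ == "__main__":
    # Timer pairs mentioned in this thread: (keepalive, hold) in seconds.
    timers = {
        "PHB request": (1, 3),
        "JunOS default": (30, 90),
        "Cisco default": (60, 180),
    }
    stall = 5.0  # hypothetical CPU stall length, in seconds
    for label, (ka, hold) in timers.items():
        margin = max_tolerable_stall(ka, hold)
        dropped = hold_timer_expires(stall, ka, hold)
        print(f"{label}: {ka}/{hold} margin={margin}s "
              f"dropped by a {stall}s stall: {dropped}")
```

In this model 1/3 leaves a margin of two seconds, against 60 seconds
for 30/90 and 120 seconds for 60/180.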

There's something in the works called BFD (Bidirectional Forwarding 
Detection) that may help your situation.  Last I checked it was at the 
Internet-Draft stage.


	-- Niels.
