[c-nsp] Long list of route-maps

Thu Mar 11 12:42:11 EST 2010

On Thu, Mar 11, 2010 at 06:19:16PM +0100, Andy B. wrote:
> I feel desperate: I just turned up a new Transit Session with an
> upstream and my router goes nuts and is dropping other BGP sessions on
> it: 4/0 (hold time expired) 0 bytes
> 
> The situation is like this:
> 
> The router is peering on a public IX with approximatively 150 members.
> Each BGP session has its own route-map, so the list is really BIG!
> 
> When I turned up my transit about an hour ago, CPU went to 100% and is
> still at 100% right now and it drops BGP peers and brings them back,
> and drops and brings them back, ... I'm in a loop and I think the only
> way to get out of that look is to bring up each bgp peer step by step
> - really not an option.
...
> What is eating my router's CPU?
> Is it the big list of route-maps?

At this point it sounds like you have self-sustaining churn, i.e. high
cpu causes bgp to flap, which in turn causes high cpu, rinse, repeat. One
big downside of having a route-map per peer (outbound route-map that is)
is that it breaks update replication, which greatly increases your CPU
load when advertising routes. I do it too though, so it shouldn't be
impossible. :)
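
To make the update replication point concrete, here is a toy model in
Python. It is only a sketch under stated assumptions: it treats the
outbound route-map name as the sole thing that decides whether peers can
share a generated update, which is a simplification of what IOS really
compares, and the peer and route-map names are hypothetical.

```python
from collections import defaultdict


def update_groups(peer_policy):
    """Group peers by outbound policy name.

    Toy assumption: peers with the same outbound route-map name can share
    one formatted update; real routers compare more than the name.
    """
    groups = defaultdict(list)
    for peer, policy in peer_policy.items():
        groups[policy].append(peer)
    return dict(groups)


# 150 IX peers, each with its own outbound route-map (hypothetical names).
per_peer = {f"peer{i}": f"RM-OUT-{i}" for i in range(150)}

# The same 150 peers sharing a single outbound route-map.
shared = {f"peer{i}": "RM-OUT-IX" for i in range(150)}

# Number of times the outbound policy has to be evaluated per route.
print(len(update_groups(per_peer)))
print(len(update_groups(shared)))
```

In this model the per-peer layout means the outbound policy runs 150
times for every route advertised, versus once when the peers share a
policy, which is roughly why a full table arriving from a new transit
hurts so much.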

Start with the following checklist:

* Are you over-rate-limiting BGP in your control plane policers? These
policers don't get applied very smoothly, i.e. they won't just
gracefully slow down the exchange of BGP messages, they will drop
packets and screw up tcp. If you are bumping against the policer during
your normal connection burst, this is likely to trigger the kind of
issue you're seeing.

* Configure "process-max-time 20".

* Make sure you don't have an absurdly low hold timer... We run 30 sec
hold timers and that is just enough to occasionally tickle IOS into
doing bad things. Also be aware that the peer can cause you to negotiate
a lower hold time (the lower of the two sides' values wins, and the
keepalive interval is typically derived from it), and could potentially
be setting it so low that it triggers the churn cycle. You can check with
"sh ip bgp nei | inc Configured hold time".

* Make sure you don't have any funky configuration under scheduler
allocate or scheduler runtime. Many cut-and-paste examples floating
around on the Internet show things like 90% cpu forwarding (interrupts)
10% routing protocols, which is the exact opposite of what you want on a
hardware forwarding router.

* We occasionally hit a weird bug where cpu use in BGP just goes through
the roof and stays there until you reboot the box. Normally we don't get
into a self-sustaining churn cycle with it, but it is definitely bad
enough to break traceroute and make using the cli painful (well, more
painful than normal at any rate :P). We've seen this in all versions of
SRC and SRD, except SRD4 which we don't have enough deployment
experience with yet. Of course if you have a self-sustaining churn
caused by other reasons, rebooting can actually help trigger it, since
now you have all your BGP sessions trying to come up at once. :)
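
On the hold timer item in the checklist above, a minimal Python sketch of
how the negotiated value comes out. It assumes the RFC 4271 rule that
both sides use the smaller of the two proposed Hold Times, and that the
keepalive interval is one third of the result (the RFC's suggested
value; actual implementations may pick differently). The function names
are made up for illustration.

```python
def negotiate_hold_time(local_hold, peer_hold):
    """Return the Hold Time both sides end up using, in seconds.

    RFC 4271: the smaller of the two proposed values is used, and a
    Hold Time must be either 0 (timers disabled) or at least 3 seconds.
    """
    hold = min(local_hold, peer_hold)
    if hold != 0 and hold < 3:
        raise ValueError("Hold Time must be 0 or at least 3 seconds")
    return hold


def keepalive_interval(hold):
    """Assumed keepalive interval: one third of the Hold Time."""
    return hold // 3


# You configure 180 seconds, the peer proposes 9 seconds.
hold = negotiate_hold_time(180, 9)
print(hold, keepalive_interval(hold))
```

With these inputs the session runs with a 9 second hold time and a 3
second keepalive, regardless of the 180 seconds configured locally,
which leaves very little slack for a router whose CPU is already pegged.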

-- 
Richard A Steenbergen <ras at e-gerbil.net>       http://www.e-gerbil.net/ras
GPG Key ID: 0xF8B12CBC (7535 7F59 8204 ED1F CC1C 53AF 4C41 5ECA F8B1 2CBC)