[c-nsp] SRB1 BGP bug?
Justin Shore
justin at justinshore.com
Thu Jan 10 05:54:09 EST 2008
I'm on the tail end of a maintenance window for our 7600s running SRB1.
The window required me to completely reboot both boxes to fix an IPSec
SPA problem. I started with the secondary 7600, killing off one OSPF
connection to our old ISP core. Then I killed off iBGP sessions one by
one. Finally I set the overload bit in IS-IS to make sure all the IS-IS
neighbors ignored the routes from that box. I then removed all the
'crypto engine' config lines, wrote the config and reloaded. The box
came up ok. I made all my crypto config changes, removed the overload
bit from IS-IS, re-enabled all the iBGP peers and finally brought up the
1 OSPF peer. Everything worked fine; no problems so far.
I did the same thing to the primary 7600 next including shutting down
each of the iBGP peers and writing the config (both important). When I
reloaded that 7600 I noticed that the load didn't drop as quickly as it
should have when it came back up. It usually takes 6.5 minutes to boot
and another 3-5 minutes for the load to drop to normal levels as it loads
all the BGP routes. Of course I had all BGP peers shut down and our IGP
takes only a few seconds to load; I should have had nominal load at
6.5-7 minutes. As it turns out, the 7600 brought up 4 of my 5 iBGP
peers *even though they were admin down in the config*. It takes almost
5 minutes for the BGP scanner and RIB update processes to settle down;
that's why the load didn't drop when expected. I verified that all
5 iBGP peers were shut in the config and that 4 of the peers were in
fact up with prefixes received.
When I unshut all 5 iBGP peers, each peer dropped with:
BGP-5-ADJCHANGE: neighbor a.b.c.d Down Capability changed
and came back up 15-30 seconds later like normal. Can anyone tell me
why the 7600 would connect to 4 of my 5 normally-used BGP peers when
they were admin down in the config? I actually have 8 peers configured
but 3 are legacy config that I haven't removed yet and are always admin
down. Perplexing. Both 7600s have single Sup720-3BXLs and are running
SRB1.
Thanks
Justin