[c-nsp] IBGP Routing Trouble on 6509

Richard J. Sears rsears at adnc.com
Mon Jun 26 21:53:29 EDT 2006


Hi - 

I am seeing a very strange routing issue on my 6509s and was hoping
someone may have seen it before.


I have 3 backbone routers connecting to a total of 7 backbone providers.
These are Cisco GSR12000 series routers. I am running BGP with all
providers.

From each of my backbone routers I connect via gig to two 6509
switch/routers running the SUP720 engines with the PFC3A and MSFC3 cards.
I am running Version 12.2(18)SXD3 on both 6509s and each has 512MB RAM.
I am running IOS only with no CAT code.

I have IBGP sessions between all routers.

Next I have a performance routing unit installed on the network which I
have been running for several years with no problems. 

Basically, the performance unit determines the best path for the traffic
and, using IBGP local_pref updates, adjusts the route table on my network
to send the traffic to the correct backbone.

So let's assume that the performance routing hardware says to send
4.36.116.0/24 to AS7911.

When I issue a sh ip ro 4.36.116.0 I get this:

AR01#sh ip ro 4.36.116.0   
Routing entry for 4.36.116.0/24
  Known via "bgp 6130", distance 200, metric 0
  Tag 7911, type internal
  Last update from 206.71.160.254 01:11:36 ago
  Routing Descriptor Blocks:
  * 206.71.160.254, from 206.251.233.245, 01:11:36 ago
      Route metric is 0, traffic share count is 1
      AS Hops 3
      Route tag 7911

And a sh ip bg | i 4.36.116.0:

AR01#sh ip bg | i 4.36.116.0
* i4.36.116.0/24    206.71.160.254                140      0 7911 174 21889 i

(the local_pref of 140 indicates my performance hardware has selected
that route)
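
For anyone less familiar with the mechanism: BGP prefers the path with
the highest local preference before it compares AS-path length, which is
why raising local_pref on one path steers traffic toward it. A minimal
Python sketch of only that comparison step is below. The Path type, the
second path (192.0.2.1, AS 64500) and its local_pref of 100 are
illustrative assumptions; the first entry uses the values from the
output above. The real best-path algorithm has many more tie-breakers.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Path:
    next_hop: str
    local_pref: int
    as_path: List[int]


def best_by_local_pref(paths):
    # Highest local_pref wins; a shorter AS path is used only to break ties.
    return max(paths, key=lambda p: (p.local_pref, -len(p.as_path)))


paths = [
    # Values taken from the sh ip bg line quoted in this post.
    Path("206.71.160.254", 140, [7911, 174, 21889]),
    # Hypothetical competing path, for illustration only.
    Path("192.0.2.1", 100, [64500, 21889]),
]
best = best_by_local_pref(paths)
```

Here the 140 path is chosen even though the hypothetical alternative has
a shorter AS path.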


The route appears correctly in the routing table, and doing a sh ip bgp
shows the correct local_pref, netblock and next-hop IPs.

Now the really weird part: if I traceroute to that netblock from the
6509, it takes the correct path. If I traceroute from a machine connected
to the 6509, it fails, bouncing between my 6509 and one of my backbone
routers.
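
To make the "bouncing" symptom concrete, here is a small Python sketch
that spots a two-router ping-pong in a list of traceroute hops. The
function name and the hop addresses (documentation range 192.0.2.x) are
hypothetical; they are not taken from an actual trace on this network.

```python
def find_ping_pong(hops):
    """Return the pair of addresses a trace alternates between, or None."""
    for i in range(len(hops) - 3):
        a, b = hops[i], hops[i + 1]
        # A loop shows up as the pattern A, B, A, B in consecutive hops.
        if a != b and hops[i + 2] == a and hops[i + 3] == b:
            return (a, b)
    return None


# Hypothetical trace: the packet alternates between two routers.
trace = ["192.0.2.1", "192.0.2.2", "192.0.2.1", "192.0.2.2", "192.0.2.1"]
loop = find_ping_pong(trace)
```

A trace that returns a pair here means each of the two routers believes
the other is the next hop for the destination.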

The only way to clear up this problem that I have discovered is to issue
a clear ip route 4.36.116.0 (or whatever the affected netblock is) or, in
the case where multiple netblocks are affected, a clear ip route *. This
always fixes the problem, and the fix has lasted several weeks at times.

As soon as I do that, the traceroutes work just fine and all of the
routes still appear exactly as they did before.

My question is: why would the routing and BGP tables show the correct
route, yet the router send the traffic to the wrong backbone router
until you issue a clear ip ro x.x.x.x?

Could something in the CEF table/FIB/RIB be getting corrupted, and how
would I troubleshoot the problem?
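
One check that might narrow this down is comparing the next hop the RIB
reports with the next hop the forwarding table is actually using (for
example, as read by hand from show ip cef 4.36.116.0 detail). The Python
sketch below does that comparison, assuming the sh ip ro text has the
layout quoted earlier in this post. The function names and the 192.0.2.1
address in the usage note are hypothetical, and no CEF output is parsed
here because none appears in this post.

```python
import re

# Trimmed copy of the sh ip ro output quoted earlier in this post.
RIB_OUTPUT = """\
Routing entry for 4.36.116.0/24
  Known via "bgp 6130", distance 200, metric 0
  Routing Descriptor Blocks:
  * 206.71.160.254, from 206.251.233.245, 01:11:36 ago
"""


def rib_next_hops(show_ip_route_text):
    # Descriptor lines look like "  * <next-hop>, from <peer>, <age> ago".
    pattern = re.compile(
        r"^\s*\*?\s*(\d+\.\d+\.\d+\.\d+), from ", re.MULTILINE
    )
    return set(pattern.findall(show_ip_route_text))


def forwarding_matches_rib(show_ip_route_text, forwarding_next_hop):
    # True when the forwarding next hop is one the RIB also lists.
    return forwarding_next_hop in rib_next_hops(show_ip_route_text)
```

Calling forwarding_matches_rib(RIB_OUTPUT, "192.0.2.1") returns False,
which is the kind of mismatch that would point at a stale forwarding
entry rather than a BGP or RIB problem.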

Both units have been up for over 17 weeks with no interesting log
entries. Memory utilization remains constant at around 225 MB in use, CPU
usage remains around 4% and only jumps when the BGP scanner runs. We are
running BGP and OSPF.


Thanks for any help or light anyone can shed!!


******************************************
Richard J. Sears
CCNP/CCDP/F5SE



