[c-nsp] 7606 to 6509 [BGP hold time issue]

Scantlebury, Kieron Kieron.Scantlebury at Level3.com
Thu May 3 10:55:09 EDT 2012


Hi Gurus,

I have a Cisco 7606-S with a 1 Gig DIA link to our customer's Cisco 6509 switch.
The two are directly connected, just a few cabinets apart in the same colo.

The link is stable and we have no errors.

The problem we are seeing is that BGP is being torn down after 3 minutes (the hold timer) - see below.

*FYI - The obvious has been checked: hold timers match on each device, and we have tried adjusting the MTU, etc.

May  3 05:00:10.490 BST: %BGP-5-ADJCHANGE: neighbor *** .***.***.*** Up
May  3 05:03:11.302 BST: %BGP-5-ADJCHANGE: neighbor *** .***.***.*** Down BGP Notification sent
May  3 05:00:00.866 BST: %BGP-3-NOTIFICATION: sent to neighbor *** .***.***.*** 4/0 (hold time expired) 0 bytes
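
As a quick sanity check on the timestamps above, here is a minimal Python sketch that compares the time between the Up and Down events with the 3-minute hold timer. The timestamps are copied from the log; the helper name `parse_ts` and the constant `HOLD_TIME_S` are illustrative only.

```python
from datetime import datetime

HOLD_TIME_S = 180  # the 3-minute hold timer mentioned above

def parse_ts(stamp: str) -> datetime:
    # Time-of-day only; both events were logged on the same day (May 3).
    return datetime.strptime(stamp, "%H:%M:%S.%f")

up = parse_ts("05:00:10.490")    # %BGP-5-ADJCHANGE ... Up
down = parse_ts("05:03:11.302")  # %BGP-5-ADJCHANGE ... Down
elapsed = (down - up).total_seconds()

print(f"session lasted {elapsed:.3f}s; hold time is {HOLD_TIME_S}s")
```

The session lasts just over one hold-time interval, which fits the 4/0 (hold time expired) notification rather than some earlier reset.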

I have done a few packet captures. It appears that the customer's 6509 isn't acknowledging the BGP update packets, so our ASR tries to retransmit those packets (the ones larger than 1300 bytes) again and again. The customer is unable to do a PCAP, so I can't guarantee that the update packets are reaching his device.

The customer requires the full routing table, some 39000+ routes. BGP only drops when we send this table. If I limit the advertisements to, say, 1.23.0.0/16 le 32, BGP stays stable. If I advertise a full 1.0.0.0/8 le 32, it becomes unstable again.
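
For anyone unsure what the two filters cover, here is a rough Python sketch of the `le` matching, assuming they are applied as IOS prefix-list entries. The function name `matches_le` and the sample routes are hypothetical and not taken from the real table.

```python
import ipaddress

def matches_le(route: str, base: str, le: int) -> bool:
    """True if `route` sits inside `base` and its prefix length is at most `le`."""
    r = ipaddress.ip_network(route)
    b = ipaddress.ip_network(base)
    # subnet_of() already implies r.prefixlen >= b.prefixlen.
    return r.subnet_of(b) and r.prefixlen <= le

# Hypothetical sample routes, for illustration only.
samples = ["1.23.4.0/24", "1.24.0.0/16", "2.0.0.0/8"]

for route in samples:
    narrow = matches_le(route, "1.23.0.0/16", 32)
    wide = matches_le(route, "1.0.0.0/8", 32)
    print(f"{route}: 1.23.0.0/16 le 32 -> {narrow}, 1.0.0.0/8 le 32 -> {wide}")
```

The wider filter lets through everything under 1.0.0.0/8, so it produces far more (and larger) update packets than the /16 filter.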

We do have a known issue on our 7606 at the moment: TCAM is full. This is being resolved this coming weekend. However, I don't believe it is the cause of our BGP issue, unless of course it is affecting the overall performance of our router. (That's for another team to worry about :))

One idea that's been thrown around is that the customer's 6509 doesn't have enough memory to support the full routing table. Here are some outputs from his switch.

#####show proc mem | i BGP Router
396   0 4160646648 3294746732  284982876          0          0 BGP Router

#####show ip bgp summary
408443 network entries using 48196274 bytes of memory


#####show version
cisco WS-C6503-E (R7000) processor (revision 1.3) with 983008K/65536K bytes of memory.
Processor board ID FOX1350GH8H
SR71000 CPU at 600Mhz, Implementation 0x504, Rev 1.2, 512KB L2 Cache
Last reset from s/w reset
9 Virtual Ethernet interfaces
66 Gigabit Ethernet interfaces
1917K bytes of non-volatile configuration memory.
8192K bytes of packet buffer memory.

65536K bytes of Flash internal SIMM (Sector size 512K).
Configuration register is 0x2102

Also, there is nothing in their logs to indicate a memory issue.
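
To put the memory figures above in proportion, here is a small Python sketch of the arithmetic. It assumes the fifth column of the `show proc mem` line is "Holding" (the usual IOS column order) and that the first figure in `983008K/65536K` is processor memory. The outputs above do not show free memory, so this says nothing about headroom.

```python
# Figures copied from the customer's outputs above.
bgp_router_holding = 284_982_876  # bytes; assumed to be the "Holding" column
network_entries = 408_443         # from show ip bgp summary
network_entry_bytes = 48_196_274  # from show ip bgp summary
processor_mem_kb = 983_008        # first figure of 983008K/65536K (assumption)

bytes_per_entry = network_entry_bytes / network_entries
holding_mib = bgp_router_holding / 2**20
holding_share = bgp_router_holding / (processor_mem_kb * 1024)

print(f"{bytes_per_entry:.0f} bytes per network entry")
print(f"BGP Router holds {holding_mib:.1f} MiB, "
      f"{holding_share:.1%} of processor memory")
```

On these numbers the BGP Router process holds a bit over a quarter of processor memory while already carrying 408443 network entries.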

Any further ideas would be appreciated.

Many Thanks in advance.
Kieron





