[c-nsp] 7606 to 6509 [BGP hold time issue]

Thu May 3 11:13:57 EDT 2012

You have an MTU mismatch. :)

This is my guess, anyway, because it closely matches your issue.

I ran into this with almost the same setup, using larger MTU sizes to allow for Ethernet plus tags. I had to use the "ip mtu" command under the actual interface (or subinterface, depending on your config) and set it to 1500.

You can easily tell with:
show ip bgp nei a.b.c.d | inc data

Look at the max data segment size and make sure it matches the MTU you have set, allowing for overhead.
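
If you want to run that check over saved output from several neighbors, a rough Python sketch like the one below could pull the number out. The line format it matches ("max data segment is N bytes") is what I would expect IOS to print, but treat that as an assumption and adjust the pattern to whatever your box actually shows. The helper name max_data_segment is made up for illustration.

```python
import re
from typing import Optional

# Assumed wording of the relevant line in "show ip bgp neighbors" output.
# Adjust the pattern if your platform prints it differently.
_SEGMENT_RE = re.compile(r"max data segment is (\d+) bytes")


def max_data_segment(cli_output: str) -> Optional[int]:
    """Return the TCP max data segment size found in the CLI text, or None."""
    match = _SEGMENT_RE.search(cli_output)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    # Hypothetical sample line, not taken from the devices in this thread.
    sample = "Datagrams (max data segment is 1460 bytes):"
    print(max_data_segment(sample))
```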

In my case, I was getting a number greater than 1460, which I knew wouldn't fly in my setup.
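
For anyone wondering where 1460 comes from: it is the 1500-byte IP MTU minus a 20-byte IPv4 header and a 20-byte TCP header. Here is a minimal sketch of that arithmetic in Python, assuming IPv4 and no IP or TCP options (options such as MD5 authentication or timestamps would shrink the usable payload further). The function names and the 4430 value are illustrative only.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options


def expected_mss(ip_mtu: int) -> int:
    """Largest TCP payload that fits in one packet of the given IP MTU."""
    return ip_mtu - IPV4_HEADER - TCP_HEADER


def segment_fits(max_data_segment: int, path_ip_mtu: int) -> bool:
    """True if the reported segment size fits within the path's IP MTU."""
    return max_data_segment <= expected_mss(path_ip_mtu)


if __name__ == "__main__":
    print(expected_mss(1500))        # 1460
    print(segment_fits(4430, 1500))  # False: 4430 is a hypothetical oversized value
```

If the second check comes back False for your neighbor, full-size update segments may be getting dropped somewhere on the path while small packets still get through.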

Hope that helps.

Thanks
Scott

On May 3, 2012, at 10:55 AM, Scantlebury, Kieron wrote:

> Hi Gurus,
> 
> I have a Cisco 7606-S with a 1 gig DIA link to our customer's Cisco 6509 switch.
> This is directly connected just a few cabinets down in the same COLO.
> 
> The link is stable and we have no errors.
> 
> The problem we are seeing is that BGP is getting torn down after 3 minutes (the hold timer). See below.
> 
> *FYI - The obvious has been checked. Hold timers match on each device. Tried adjusting MTU etc...
> 
> May  3 05:00:10.490 BST: %BGP-5-ADJCHANGE: neighbor *** .***.***.*** Up
> May  3 05:03:11.302 BST: %BGP-5-ADJCHANGE: neighbor *** .***.***.*** Down BGP Notification sent
> May  3 05:00:00.866 BST: %BGP-3-NOTIFICATION: sent to neighbor *** .***.***.*** 4/0 (hold time expired) 0 bytes
> 
> I have done a few packet captures. It appears that the customer's 6509 isn't acknowledging the BGP update packets, so our ASR tries to retransmit the packets (those larger than 1300 bytes) again and again. The customer is unable to do a PCAP, so I can't guarantee that the update packet is hitting his device.
> 
> The customer requires the full routing table. Some 39000+ routes. BGP only drops when sending this table. If I limit these advertisements to say 1.23.0.0/16 le 32 the BGP stays stable. If I advertise a full 1.0.0.0/8 le 32 then it becomes unstable again.
> 
> We do have a known issue on our 7606 at the moment: TCAM is full. This issue is being resolved this coming weekend. However, I don't believe that this is the cause of our BGP issue, unless of course it is affecting the overall performance of our router. (That's for another team to worry about :))
> 
> An idea that's been thrown around is that the customer's 6509 doesn't have enough memory to support the full routing table. Here are some outputs from his switch.
> 
> #####show proc mem | i BGP Router
> 396   0 4160646648 3294746732  284982876          0          0 BGP Router
> 
> #####show ip bgp summary
> 408443 network entries using 48196274 bytes of memory
> 
> 
> #####show version
> cisco WS-C6503-E (R7000) processor (revision 1.3) with 983008K/65536K bytes of memory.
> Processor board ID FOX1350GH8H
> SR71000 CPU at 600Mhz, Implementation 0x504, Rev 1.2, 512KB L2 Cache
> Last reset from s/w reset
> 9 Virtual Ethernet interfaces
> 66 Gigabit Ethernet interfaces
> 1917K bytes of non-volatile configuration memory.
> 8192K bytes of packet buffer memory.
> 
> 65536K bytes of Flash internal SIMM (Sector size 512K).
> Configuration register is 0x2102
> 
> Also, there is nothing in their logs to indicate a memory issue.
> 
> Any further ideas would be appreciated.
> 
> Many Thanks in advance.
> Kieron
> 
> 
> 
> 
> _______________________________________________
> cisco-nsp mailing list  cisco-nsp at puck.nether.net
> https://puck.nether.net/mailman/listinfo/cisco-nsp
> archive at http://puck.nether.net/pipermail/cisco-nsp/