[c-nsp] slow convergence for full bgp table on a Cisco7613/SUP720-3BXL
Chris Griffin
cgriffin at ufl.edu
Tue Mar 13 14:12:24 EST 2007
We have noticed that peers contributing many bestpath prefixes to the FIB
take a LONG time to converge after a reset, while peers with few
bestpaths converge rapidly. Most of the time is spent in the BGP and CEF
processes. I assume this is because the router computes the bestpath
for what it has learned so far, then does it again, and again, etc. I
thought BGP read-only mode was supposed to help, but I can't find much
on it.
Thanks
Chris
Rodney Dunn wrote:
> Get a sniffer trace when you clear the session.
>
> It's a very hard problem to debug without extensive work because
> it could be in so many places.
>
> You could run a debug ip packet against an ACL for the peers to
> match up with the sniffer trace. That would eliminate CoPP if you
> see all the packets in the debug that are in the trace.
>
> Or span the port going to the RP and compare to the trace (I forgot
> how to do that).
>
> On Tue, Mar 13, 2007 at 07:41:17PM +0200, Emanuel Popa wrote:
>> We can clear the bgp session only tomorrow morning when traffic level
>> is pretty low. This means 14 hours from now. We will monitor SPD drops
>> in the morning but I don't think we are going to notice anything
>> interesting.
>>
>> Regarding tcp stats, do you mean:
>>
>> br01.frankfurt#sh tcp stat
>> Rcvd: 71476208 Total, 2530 no port
>> 385 checksum error, 18 bad offset, 0 too short
>> 44865801 packets (1625121834 bytes) in sequence
>> 1113216 dup packets (38655517 bytes)
>> 982 partially dup packets (341189 bytes)
>> 153829 out-of-order packets (131849235 bytes)
>> 2 packets (1882 bytes) with data after window
>> 145 packets after close
>> 1 window probe packets, 73202 window update packets
>> 3955 dup ack packets, 0 ack packets with unsend data
>> 24945059 ack packets (1360941754 bytes)
>> Sent: 71782281 Total, 1 urgent packets
>> 2023467 control packets (including 1014567 retransmitted)
>> 25824879 data packets (1360984359 bytes)
>> 287631 data packets (19095511 bytes) retransmitted
>> 244 data packets (93857 bytes) fastretransmitted
>> 43188396 ack only packets (38453293 delayed)
>> 7 window probe packets, 457732 window update packets
>> 337116 Connections initiated, 4909 connections accepted, 3852 connections established
>> 342321 Connections closed (including 946 dropped, 336762 embryonic dropped)
>> 1302198 Total rxmt timeout, 0 connections dropped in rxmt timeout
>> 99 Keepalive timeout, 9488 keepalive probe, 0 Connections dropped in keepalive
>>
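The counters in the "sh tcp stat" output above are easier to judge as ratios. What follows is only a minimal sketch: the numbers are copied from that output, while the dictionary keys and the helper name are made up for illustration. These are router-wide totals since boot, so they cover every TCP session on the box, not just the two slow BGP peers.

```python
# Counters copied from the "sh tcp stat" output quoted above.
# Key names are illustrative, not IOS field names.
stats = {
    "data_sent": 25824879,
    "data_rexmit": 287631,
    "control_sent": 2023467,
    "control_rexmit": 1014567,
    "conn_initiated": 337116,
    "embryonic_dropped": 336762,
    "in_sequence_rcvd": 44865801,
    "out_of_order_rcvd": 153829,
}


def ratio(part: int, whole: int) -> float:
    """Return part/whole, or 0.0 when the denominator is zero."""
    return part / whole if whole else 0.0


summary = {
    "data_rexmit": ratio(stats["data_rexmit"], stats["data_sent"]),
    "control_rexmit": ratio(stats["control_rexmit"], stats["control_sent"]),
    "embryonic_dropped": ratio(stats["embryonic_dropped"], stats["conn_initiated"]),
    "out_of_order": ratio(stats["out_of_order_rcvd"], stats["in_sequence_rcvd"]),
}

for name, value in summary.items():
    print(f"{name}: {value:.2%}")
```

To say anything about convergence itself, the same ratios would have to be computed on the difference between two snapshots, one taken just before the session is cleared and one after it has converged.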
>> Both peers changed everything on their ends: equipment, vendor,
>> interface etc. One of them changed from Juniper to Cisco and this
>> becomes pretty confusing. It would be a hell of a coincidence that
>> they both have the same problem with the config towards our machine.
>> I'm positive that the issue is generated on our gear. I just don't
>> know how to deal with it. My colleagues and I have tried everything.
>> Now we are waiting for the case to reach Cisco TAC.
>>
>> Good evening,
>> Emanuel
>>
>>
>> On 3/13/07, Oliver Boehmer (oboehmer) <oboehmer at cisco.com> wrote:
>>> Can you find out if you indeed see any SPD drops when you converge, or
>>> if those SPD drops were from something else (e.g. Internet background
>>> noise or something like this).
>>> But I don't think this is an input/SPD drop issue, if you had this
>>> problem, you would have noticed it with 2x1GE already.
>>> Can you check the TCP stats at both sides? Did your peer change
>>> something on his end except the interface? It's really weird.
>>>
>>> oli
>>>
>>> Emanuel Popa <mailto:emanuel.popa at gmail.com> wrote on Tuesday, March 13,
>>> 2007 6:03 PM:
>>>
>>>> The headroom has the default value.
>>>>
>>>> br01.frankfurt#sh ip spd
>>>> Current mode: normal.
>>>> Queue min/max thresholds: 73/74, Headroom: 1000, Extended Headroom: 10
>>>> IP normal queue: 1, priority queue: 0.
>>>> SPD special drop mode: none
>>>>
>>>> Please tell me in what scenario your commands would help me with my
>>>> issue?
>>>>
>>>> regards,
>>>> emanuel
>>>>
>>>> On 3/13/07, Oliver Boehmer (oboehmer) <oboehmer at cisco.com> wrote:
>>>>> Emanuel Popa <> wrote on Tuesday, March 13, 2007 3:33 PM:
>>>>>
>>>>>> Ytti,
>>>>>>
>>>>>> Here is the output:
>>>>>> br01.frankfurt#sh int te 10/3 | i Input queue
>>>>>> Input queue: 0/75/109/109 (size/max/drops/flushes); Total output
>>>>>> drops: 0
>>>>>>
>>>>>> But:
>>>>>>
>>>>>> - routing protocol packets are not dropped when default hold queue
>>>>>> of 75 is full; they are considered priority packets and they are
>>>>>> dropped after headroom of 1000 is full; please see
>>>>>>
>>>>>> http://www.cisco.com/en/US/products/hw/routers/ps167/products_tech_note09186a008012fb87.shtml
>>>>>> for more details
>>>>>>
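The hold-queue versus headroom behaviour described above can be pictured with a toy model. This is a simplified sketch, not how IOS implements SPD: it assumes priority packets may queue up to hold-queue plus headroom, and it ignores the random-drop region between the min and max thresholds as well as the extended headroom. The class and method names are hypothetical; the defaults 75 and 1000 are taken from the outputs in this thread.

```python
from dataclasses import dataclass


@dataclass
class SpdQueue:
    """Toy input queue: normal packets stop at the hold queue,
    priority (routing protocol) packets may also use the headroom."""
    hold_queue: int = 75   # "Input queue: 0/75/..." from sh int
    headroom: int = 1000   # "Headroom: 1000" from sh ip spd
    depth: int = 0

    def admit(self, is_priority: bool) -> bool:
        # Assumption: priority limit = hold queue + headroom.
        limit = self.hold_queue
        if is_priority:
            limit += self.headroom
        if self.depth >= limit:
            return False  # packet is dropped
        self.depth += 1
        return True


q = SpdQueue()
normal_ok = sum(q.admit(False) for _ in range(200))
priority_ok = sum(q.admit(True) for _ in range(2000))
print(normal_ok, priority_ok)  # prints "75 1000"
```

In this model BGP packets start to be lost only once the queue is 1075 deep, which is why a non-zero input-queue drop counter alone does not show that routing protocol packets were dropped.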
>>>>> how's your headroom? What does "show spd" tell you?
>>>>>
>>>>> ip spd queue max-threshold 999
>>>>> ip spd queue min-threshold 998
>>>>>
>>>>> might help..
>>>>>
>>>>> oli
>> _______________________________________________
>> cisco-nsp mailing list cisco-nsp at puck.nether.net
>> https://puck.nether.net/mailman/listinfo/cisco-nsp
>> archive at http://puck.nether.net/pipermail/cisco-nsp/
--
Chris Griffin cgriffin at ufl.edu
Sr. Network Engineer - CCNP Phone: (352) 392-2061
CNS - Network Services Fax: (352) 392-9440
University of Florida/FLR Gainesville, FL 32611