[c-nsp] 2851 and full BGP
Paul Cosgrove
paul.cosgrove at heanet.ie
Mon Aug 11 10:40:20 EDT 2008
Hi Jay,
PMTUD is not working here. You can see from the command output that a
TCP MSS of 536 bytes is being used rather than the expected 1440 bytes:
> Datagrams (max data segment is 536 bytes):
This limits the size of BGP packets, requiring more to be sent and so
increasing the load on the routers.
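As a rough illustration of the effect, the sketch below counts how many TCP segments are needed to carry the same amount of BGP data at each MSS. The 10,000,000-byte payload is a hypothetical figure chosen only for the arithmetic, not something measured on these routers, and the helper name is made up for this example.

```python
def segments_needed(payload_bytes: int, mss: int) -> int:
    """Return how many TCP segments are needed to carry payload_bytes at a given MSS."""
    if mss <= 0:
        raise ValueError("mss must be positive")
    # Integer ceiling division avoids floating-point rounding.
    return (payload_bytes + mss - 1) // mss


# Hypothetical volume of BGP update data (an assumption, not from the thread).
payload = 10_000_000

at_536 = segments_needed(payload, 536)    # MSS reported in the command output
at_1440 = segments_needed(payload, 1440)  # MSS expected in this message

print(f"MSS 536:  {at_536} segments")
print(f"MSS 1440: {at_1440} segments")
print(f"ratio:    {at_536 / at_1440:.2f}x")
```

Whatever the real volume is, the ratio stays close to 1440/536, so the smaller MSS means a bit more than two and a half times as many segments for the routers to process.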
You seem to have PMTUD enabled at both ends of the link, so perhaps
filtering somewhere along the path is blocking the required ICMP
messages (or there is a bug, as Chuck suggested).
I don't know if this is the cause of your main issues, but I would fix
that and then see if the issue is resolved.
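One way to confirm whether a change has taken effect is to look at the "max data segment" line again after the session re-establishes. The sketch below pulls that value out of saved "sh ip bgp neighbors" text; it assumes the output has been captured to a string by some other means, and the function and constant names are hypothetical. 536 bytes is the TCP default MSS used when nothing larger has been negotiated or discovered.

```python
import re

# Matches the line seen in the output, e.g. "Datagrams (max data segment is 536 bytes):"
MSS_PATTERN = re.compile(r"max data segment is (\d+) bytes")

TCP_DEFAULT_MSS = 536  # default MSS: 576-byte datagram minus 40 bytes of IP and TCP headers


def extract_mss(show_output: str) -> list[int]:
    """Return every 'max data segment' value found in the captured text."""
    return [int(value) for value in MSS_PATTERN.findall(show_output)]


def still_at_default(show_output: str) -> bool:
    """True if any session in the captured text is at or below the default MSS."""
    return any(mss <= TCP_DEFAULT_MSS for mss in extract_mss(show_output))


# Two lines copied from the quoted output, used here as sample input.
sample = """\
Option Flags: nagle, path mtu capable
Datagrams (max data segment is 536 bytes):
"""

print(extract_mss(sample))
print(still_at_default(sample))
```

If the value stays at 536 after the fix, the session is still not benefiting from path MTU discovery.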
Paul.
Jay Nakamura wrote:
> To answer a couple of people's questions,
>
> The MTU on the routers is 1500. I have tested with ping and the df-bit set.
> The provider has a larger frame size to cover that MTU over the WAN link, and
> our switches that connect to them on both ends have a larger frame size (1526
> or higher).
>
> While I am at it, I noticed the 12.4 line of IOS for the 28xx is an MD release,
> and Cisco's link doesn't tell you what that means. I know GD, ED, etc. releases
> but wasn't sure what MD release meant. Mainline deployment?
>
> Anyway, is 12.4 the most stable way to go on 28xx? We are not using any
> fancy features. One router is using NM-1T3/E3 card but that's about it.
>
> Here is some output from both routers while exchanging just internal
> routes.
>
> border2-col#sh ip bgp neighbors Y.Y.Y.Y
> BGP neighbor is Y.Y.Y.Y, remote AS ZZZZ, internal link
> BGP version 4, remote router ID Y.Y.Y.Y
> BGP state = Established, up for 3d03h
> Last read 00:00:41, last write 00:00:49, hold time is 180, keepalive
> interval is 60 seconds
> Neighbor capabilities:
> Route refresh: advertised and received(new)
> Address family IPv4 Unicast: advertised and received
> Message statistics:
> InQ depth is 0
> OutQ depth is 0
>
> Sent Rcvd
> Opens: 7 7
> Notifications: 3 1
> Updates: 171196 105628
> Keepalives: 4581 4586
> Route Refresh: 0 0
> Total: 175787 110226
> Default minimum time between advertisement runs is 0 seconds
>
> For address family: IPv4 Unicast
> BGP table version 887105, neighbor version 887105/0
> Output queue size : 0
> Index 3, Offset 0, Mask 0x8
> 3 update-group member
> Inbound soft reconfiguration allowed
> Outgoing update prefix filter list is COLUMBUS_NET
> Sent Rcvd
> Prefix activity: ---- ----
> Prefixes Current: 7 9 (Consumes 468 bytes)
> Prefixes Total: 8 9
> Implicit Withdraw: 0 0
> Explicit Withdraw: 1 0
> Used as bestpath: n/a 9
> Used as multipath: n/a 0
>
> Outbound Inbound
> Local Policy Denied Prefixes: -------- -------
> prefix-list 535265 0
> Bestpath from this peer: 9 n/a
> Total: 535274 0
> Number of NLRIs in the update sent: max 1024, min 0
>
> Address tracking is enabled, the RIB does have a route to Y.Y.Y.Y
> Connections established 7; dropped 6
> Last reset 3d03h, due to BGP Notification received, illegal header length
> Transport(tcp) path-mtu-discovery is enabled
> Connection state is ESTAB, I/O status: 1, unread input bytes: 0
> Connection is ECN Disabled, Mininum incoming TTL 0, Outgoing TTL 255
> Local host: X.X.X.X, Local port: 51918
> Foreign host: Y.Y.Y.Y, Foreign port: 179
> Connection tableid (VRF): 0
>
> Enqueued packets for retransmit: 0, input: 0 mis-ordered: 0 (0 bytes)
>
> Event Timers (current time is 0x15C86EE0):
> Timer Starts Wakeups Next
> Retrans 4563 31 0x0
> TimeWait 0 0 0x0
> AckHold 4529 4183 0x0
> SendWnd 0 0 0x0
> KeepAlive 0 0 0x0
> GiveUp 0 0 0x0
> PmtuAger 1 1 0x0
> DeadWait 0 0 0x0
> Linger 0 0 0x0
> ProcessQ 0 0 0x0
>
> iss: 3264861958 snduna: 3264948248 sndnxt: 3264948248 sndwnd: 16023
> irs: 3518332904 rcvnxt: 3518419120 rcvwnd: 16118 delrcvwnd: 266
>
> SRTT: 301 ms, RTTO: 308 ms, RTV: 7 ms, KRTT: 0 ms
> minRTT: 4 ms, maxRTT: 2824 ms, ACK hold: 200 ms
> Status Flags: active open
> Option Flags: nagle, path mtu capable
> IP Precedence value : 6
>
> Datagrams (max data segment is 536 bytes):
> Rcvd: 8963 (out of order: 0), with data: 4530, total data bytes: 86215
> Sent: 8919 (retransmit: 31, fastretransmit: 0, partialack: 0, Second
> Congestion: 0), with data: 4532, total data bytes: 86289
> Packets received in fast path: 0, fast processed: 0, slow path: 0
> fast lock acquisition failures: 0, slow path: 0
>
>
> border2-indy#sh ip bgp neighbors X.X.X.X
> BGP neighbor is X.X.X.X, remote AS ZZZZ, internal link
> BGP version 4, remote router ID X.X.X.X
> BGP state = Established, up for 3d04h
> Last read 00:00:39, last write 00:00:31, hold time is 180, keepalive
> interval is 60 seconds
> Neighbor capabilities:
> Route refresh: advertised and received(new)
> Address family IPv4 Unicast: advertised and received
> Message statistics:
> InQ depth is 0
> OutQ depth is 0
>
> Sent Rcvd
> Opens: 9 9
> Notifications: 1 4
> Updates: 144559 224571
> Keepalives: 4590 4585
> Route Refresh: 0 0
> Total: 149155 229172
> Default minimum time between advertisement runs is 0 seconds
>
> For address family: IPv4 Unicast
> BGP table version 2377206, neighbor version 2377206/0
> Output queue size : 0
> Index 2, Offset 0, Mask 0x4
> 2 update-group member
> Inbound soft reconfiguration allowed
> Outgoing update prefix filter list is INDY_NET
> Sent Rcvd
> Prefix activity: ---- ----
> Prefixes Current: 9 7 (Consumes 364 bytes)
> Prefixes Total: 9 8
> Implicit Withdraw: 0 0
> Explicit Withdraw: 0 1
> Used as bestpath: n/a 7
> Used as multipath: n/a 0
>
> Outbound Inbound
> Local Policy Denied Prefixes: -------- -------
> prefix-list 458047 0
> Bestpath from this peer: 9 n/a
> Total: 458056 0
> Number of NLRIs in the update sent: max 1135, min 0
>
> Address tracking is enabled, the RIB does have a route to X.X.X.X
> Connections established 9; dropped 8
> Last reset 3d04h, due to BGP Notification sent, illegal header length
> Transport(tcp) path-mtu-discovery is enabled
> Connection state is ESTAB, I/O status: 1, unread input bytes: 0
> Connection is ECN Disabled, Mininum incoming TTL 0, Outgoing TTL 255
> Local host: Y.Y.Y.Y, Local port: 179
> Foreign host: X.X.X.X, Foreign port: 51918
> Connection tableid (VRF): 0
>
> Enqueued packets for retransmit: 0, input: 0 mis-ordered: 0 (0 bytes)
>
> Event Timers (current time is 0x10A0F458):
> Timer Starts Wakeups Next
> Retrans 4578 46 0x0
> TimeWait 0 0 0x0
> AckHold 4532 4200 0x0
> SendWnd 0 0 0x0
> KeepAlive 0 0 0x0
> GiveUp 0 0 0x0
> PmtuAger 0 0 0x0
> DeadWait 0 0 0x0
> Linger 0 0 0x0
> ProcessQ 0 0 0x0
>
> iss: 3518332904 snduna: 3518419158 sndnxt: 3518419158 sndwnd: 16080
> irs: 3264861958 rcvnxt: 3264948267 rcvwnd: 16004 delrcvwnd: 380
>
> SRTT: 304 ms, RTTO: 335 ms, RTV: 31 ms, KRTT: 0 ms
> minRTT: 4 ms, maxRTT: 468 ms, ACK hold: 200 ms
> Status Flags: passive open, gen tcbs
> Option Flags: nagle, path mtu capable
> IP Precedence value : 6
>
> Datagrams (max data segment is 536 bytes):
> Rcvd: 8953 (out of order: 0), with data: 4533, total data bytes: 86308
> Sent: 8920 (retransmit: 46, fastretransmit: 0, partialack: 0, Second
> Congestion: 0), with data: 4532, total data bytes: 86253
> Packets received in fast path: 0, fast processed: 0, slow path: 0
> fast lock acquisition failures: 0, slow path: 0
>
> On Mon, Aug 11, 2008 at 9:18 AM, Church, Charles <cchurc05 at harris.com> wrote:
>
>> Oh, yeah. Sorry, I didn't catch the 'WAN' part of it the first time.
>> That does make MTU a possibility. But didn't he get like 20% of his
>> routes before the error message? Since it was 12.4(20)T (pretty
>> bleeding edge), I'd lean towards that still. I'd think that an MTU
>> problem would show up way before you got to 20%. Does BGP set the DF
>> bit?
>>
>> Chuck
>>
>> -----Original Message-----
>> From: Paul Cosgrove [mailto:paul.cosgrove at heanet.ie]
>> Sent: Monday, August 11, 2008 4:33 AM
>> To: Church, Charles
>> Cc: mtinka at globaltransit.net; cisco-nsp at puck.nether.net
>> Subject: Re: [c-nsp] 2851 and full BGP
>>
>>
>> Hi Chuck,
>>
>> Jay will be able to clarify, but I took the following to mean that the
>> two are separated via third party infrastructure: "two 2851s connected
>> to each other over gigabit Ethernet WAN".
>>
>> May well be a bug though.
>>
>> Paul.
>>
>> Church, Charles wrote:
>>> Wasn't the original problem the iBGP connection over his own network?
>>> Sounds like a bug more than anything else.
>>> Chuck
>>>
>>> ----- Original Message -----
>>> From: cisco-nsp-bounces at puck.nether.net <cisco-nsp-bounces at puck.nether.net>
>>> To: mtinka at globaltransit.net <mtinka at globaltransit.net>
>>> Cc: cisco-nsp at puck.nether.net <cisco-nsp at puck.nether.net>
>>> Sent: Sun Aug 10 15:52:03 2008
>>> Subject: Re: [c-nsp] 2851 and full BGP
>>>
>>>
>>> Keep in mind that if the peerings are not between directly connected IPs,
>>> disabling PMTUd for BGP will cause it to use an MSS of 536 bytes.
>>>
>>> You could check the achievable MTU using extended pings with the DF bit
>>> set, and compare it with the segment size listed by BGP before you
>>> decide whether to make that change.
>>>
>>> Paul.
>>>
>>> Mark Tinka wrote:
>>>> On Saturday 09 August 2008 10:28:40 Jay Nakamura wrote:
>>>>
>>>>
>>>>> Any ideas on what could be causing this issue? Is there
>>>>> a better IOS version to use?
>>>>>
>>>> Sounds like an MTU issue.
>>>>
>>>> Try disabling TCP PMTUd for BGP and see if that helps:
>>>>
>>>> router bgp 1234
>>>> no bgp transport path-mtu-discovery
>>>>
>>>> If that works, consider checking with your provider on the
>>>> supported MTU, end-to-end, and adjust your interface MTU if
>>>> it helps.
>>>>
>>>> Cheers,
>>>>
>>>> Mark.
>>>>
>>>>
>> ------------------------------------------------------------------------
>>>> _______________________________________________
>>>> cisco-nsp mailing list cisco-nsp at puck.nether.net
>>>> https://puck.nether.net/mailman/listinfo/cisco-nsp
>>>> archive at http://puck.nether.net/pipermail/cisco-nsp/
>>
>> --
>> HEAnet Limited
>> Ireland's Education & Research Network
>> 5 George's Dock, IFSC, Dublin 1, Ireland
>> Tel: +353.1.6609040
>> Web: http://www.heanet.ie
>> Company registered in Ireland: 275301
>>
>> Please consider the environment before printing this e-mail.
>>
>
--
HEAnet Limited
Ireland's Education & Research Network
5 George's Dock, IFSC, Dublin 1, Ireland
Tel: +353.1.6609040
Web: http://www.heanet.ie
Company registered in Ireland: 275301
Please consider the environment before printing this e-mail.