[f-nsp] Netiron AS4 capabilities
Bogdan Rotariu
bogdan at rotariu.ro
Thu Jun 29 13:59:31 EDT 2023
Yes, I did that so I could filter AS paths, but there were too many. I have just enabled it again. To avoid spamming the list, I have included only part of the output; the full output is here: https://pastie.dev/CdnSr8.yaml
[29.06.2023, 8:53:34,297 PM] Jun 29 20:53:34.351 BGP: Incoming TCP connection. peer 10.11.1.15 OKed
[29.06.2023, 8:53:34,299 PM] Jun 29 20:53:34.352 BGP: Rcv incoming TCP connection UP. handle a001143a:1b7fadf4, key 0
[29.06.2023, 8:53:34,299 PM] Jun 29 20:53:34.352 BGP: 10.11.1.15 Connection Collision, connection_up=0
[29.06.2023, 8:53:34,299 PM] Jun 29 20:53:34.352 BGP: 10.11.1.15 Accept incoming TCP connection from peer, local IP 10.11.1.4
[29.06.2023, 8:53:34,299 PM] Jun 29 20:53:34.352 BGP: 10.11.1.15 TCP Connection opened
[29.06.2023, 8:53:34,299 PM] Jun 29 20:53:34.352 BGP: 10.11.1.15 sending MultiProtocol cap, afi/safi=1/1, length 4
[29.06.2023, 8:53:34,299 PM] Jun 29 20:53:34.352 BGP: 10.11.1.15 sending 4-octet ASN cap, asn=56430, length 4
[29.06.2023, 8:53:34,299 PM] Jun 29 20:53:34.352 BGP: 10.11.1.15 fbit is 0, for AFI/SAFI 1/1
[29.06.2023, 8:53:34,299 PM] Jun 29 20:53:34.352 BGP: 10.11.1.15 sending Graceful Restart cap, rbit 0, time 120, length 6
[29.06.2023, 8:53:34,299 PM] Jun 29 20:53:34.352 BGP: 10.11.1.15 sending OPEN, My asn=56430 holdTime=90 route_refresh=1 cooperative= 1, restart 1/0
[29.06.2023, 8:53:34,300 PM] Jun 29 20:53:34.355 BGP: 10.11.1.15 rcv OPEN w/Option parameter length 20, My asn 56430, hold_time 180
[29.06.2023, 8:53:34,300 PM] Jun 29 20:53:34.355 BGP: 10.11.1.15 rcv OPEN w/Option parameter length 20
[29.06.2023, 8:53:34,300 PM] Jun 29 20:53:34.355 BGP: 10.11.1.15 rcv capability 2, len 0
[29.06.2023, 8:53:34,300 PM] Jun 29 20:53:34.355 BGP: 10.11.1.15 rcv 4-octet ASN capability 65, len 4, asn=56430,
[29.06.2023, 8:53:34,300 PM] Jun 29 20:53:34.355 BGP: 10.11.1.15 rcv MP_EXT capability 1, len 4, afi/safi=1/1
[29.06.2023, 8:53:34,300 PM] Jun 29 20:53:34.355 BGP: 10.11.1.15 rcv Graceful Restart capability 64, len 2, rbit 0, time 0
[29.06.2023, 8:53:34,301 PM] Jun 29 20:53:34.357 BGP: 10.11.1.15 Peer went to ESTABLISHED state
[29.06.2023, 8:53:35,495 PM] Jun 29 20:53:35.549 BGP: 10.11.1.15 received invalid AGGREGATOR attribute flag (0xd0)
[29.06.2023, 8:53:35,495 PM] Jun 29 20:53:35.549 BGP: 10.11.1.15 received invalid AGGREGATOR attribute flag (0xd0)
[29.06.2023, 8:53:35,495 PM] Jun 29 20:53:35.549 BGP: 10.11.1.15 sending NOTIFICATION 3/4 (Attribute Flags Error)
[29.06.2023, 8:53:35,495 PM] Jun 29 20:53:35.549 BGP: 10.11.1.15 reset due to BGP notification sent
[29.06.2023, 8:53:35,496 PM] Jun 29 20:53:35.549 BGP: 10.11.1.15 Closing TCP connection 0x00000002
[29.06.2023, 8:53:35,496 PM] Jun 29 20:53:35.550 BGP: 10.11.1.15 BGP connection closed
[29.06.2023, 8:53:35,496 PM] Jun 29 20:53:35.550 BGP: 10.11.1.15 Peer went to IDLE state (Attribute Flags Error)
[29.06.2023, 8:53:35,496 PM] Jun 29 20:53:35.550 BGP: 10.11.1.15 Peer already in IDLE state, stays in IDLE state.
[29.06.2023, 8:53:35,496 PM] Jun 29 20:53:35.550 BGP: Attribute Error: BGP: 10.11.1.15 rcv UPDATE w/attr: Origin=IGP AS_PATH= AS_SEQ(2) 8708 5606 44418 44418 44418 NextHop=10.11.1.15 LOCAL_PREF=100 ATOMIC_AGGREGATE COMMUNITY=8708:100
[29.06.2023, 8:53:35,544 PM] Jun 29 20:53:35.600 BGP: 10.11.1.15 RIB_out peer reset #RIB_out 0 (safi 0)
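
For reference, the flag value in the "invalid AGGREGATOR attribute flag (0xd0)" lines above can be decoded bit by bit. The following is a minimal sketch in Python; the function name `decode_attr_flags` is made up for illustration, and the bit values follow the attribute flags octet described in RFC 4271. It only shows which bits are set. Whether the CER objects to the extended-length bit specifically is an assumption that the log does not confirm.

```python
# Bits of the BGP path attribute flags octet (RFC 4271, section 4.3).
OPTIONAL = 0x80
TRANSITIVE = 0x40
PARTIAL = 0x20
EXTENDED_LENGTH = 0x10  # attribute length field is two octets instead of one


def decode_attr_flags(flags: int) -> list[str]:
    """Return the names of the flag bits set in an attribute flags octet.

    Hypothetical helper for illustration; not part of any router CLI.
    """
    names = []
    if flags & OPTIONAL:
        names.append("optional")
    if flags & TRANSITIVE:
        names.append("transitive")
    if flags & PARTIAL:
        names.append("partial")
    if flags & EXTENDED_LENGTH:
        names.append("extended-length")
    return names


print(decode_attr_flags(0xD0))  # ['optional', 'transitive', 'extended-length']
print(decode_attr_flags(0xC0))  # ['optional', 'transitive']
```

So 0xd0 is an optional transitive attribute with the extended-length bit set, while 0xc0 would be the same attribute with a one-octet length field.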
> On 29 Jun 2023, at 19:01, Jörg Kost <jk at ip-clear.de> wrote:
>
> Have you ever debugged the CER and looked at the BGP capability exchange section and updates?
>
> (Caution: CPU spikes may occur when running many sessions)
>
> debug ip bgp neighbor $X
> debug ip bgp general
> debug ip bgp events
> debug ip bgp updates
> debug destination ssh/telnet (some other session window)
> -> no debug all
>
>
> On 29 Jun 2023, at 16:48, Bogdan Rotariu wrote:
>
>> Thanks again Jörg for your interest in this issue!
>>
>> I will answer both questions here. Yes, that’s me too. I have done days of testing, as we ordered a bunch of Mikrotiks, and Mikrotik support keeps quoting RFCs at me; I was almost convinced that the CERs are the problem.
>> We operate at least 160 BGP sessions on the CERs and have not had any issues before (except memory from time to time). I believe the issue is related only to RouterOS 7; I tested versions from at least 7.3 to 7.10.
>>
>> We have multiple POPs, and each POP has at least one CER2024; given the lack of space and the cost of electricity, this device is awesome! In some locations we need many more 10G ports, and the CCR2216 specs are insane.
>>
>> I have used Mikrotik before, but not much. I did not want to learn their CLI, but in the end it is not that bad.
>>
>> So, regarding our setups: each location has at least one CER2024 with one upstream and some peerings (IX or other local ISPs), and each location is connected to at least one other location via OSPF and iBGP (all CERs support MPLS, but we do not use it). We share all the upstream prefixes between the locations.
>>
>> In one location where we have two CER2024s, we want to replace one CER2024 with a CCR2216. This should be a simple task, but during testing I found out that it is not that simple because of this issue.
>>
>> For testing, I moved one peering and one internal link from another location to the Mikrotik, set up eBGP with the peering, and set up OSPF and iBGP with our location; up to that point everything seemed OK. Sending all the prefixes that the CER2024 at our location receives from upstream to the Mikrotik works, but when sending the prefixes from that single peering to the CER2024, the session closes with "Error: Invalid AGGREGATOR attribute length 8".
>>
>> Testing some more, I placed the CCR2216 in the middle, between two CERs, instead of connecting it directly to the peers:
>>
>> CER2024 (full table) -> CCR2216 (full table) -> CER2024 - the session closes with Error: Invalid AGGREGATOR attribute length 8
>>
>> I have been packet sniffing for some days to see what is going on, and I am unable to find out what the problem really is. So I created a setup in GNS3 where I tried to simulate some routers that aggregate prefixes and announce them. Unfortunately, I was unable to replicate the issue in GNS3, most likely because I cannot feed it the full global table (or I do not know how).
>>
>> Moving along, I created some FRR/Quagga machines and sent them the prefixes from the CCR2216: all good, no errors. Sending the prefixes that Quagga/FRR received from the CCR2216 on to one CER2024 also works. I tried with and without the as4 capability; there is no difference.
>>
>> Doing some more tests, I created test iBGP sessions and sent prefixes to a Huawei router, a Cisco 7600, and an ASR 1001; all of them handled the prefixes received from the Mikrotik router.
>> The only equipment other than the CERs that we tested and that had issues with the BGP session is a stack of Dell 4032F switches, which (I think) do not support as4. Their error was:
>>
>> <187> Jun 18 23:58:11 core-stack-2 BGP[BGP Protocol]: bgpattr.c(1628) 955741 %% ERR [VRF ""] Received UPDATE from peer x invalid AGGREGATOR attribute. Aggregator AS is 65052. Aggregator ID is 0.0.0.0. Resetting peer.
>> <187> Jun 19 00:00:16 core-stack-2 BGP[BGP Protocol]: bgpattr.c(1628) 955842 %% ERR [VRF ""] Received UPDATE from peer x invalid AGGREGATOR attribute. Aggregator AS is 25773. Aggregator ID is 0.0.0.0. Resetting peer.
>>
>> I did not test Mikrotik CHR in a VM; I will do that test today, but I think the outcome will be the same.
>>
>> As for what is needed to test this: an upstream or a full-table BGP peer feeding the RouterOS 7 device (CHR VM or hardware) and a NetIron iBGP peer.
>>
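
As background for the "Invalid AGGREGATOR attribute length 8" error quoted above: the AGGREGATOR value is an AS number followed by a 4-octet IPv4 address, so it is 6 octets long with a 2-octet AS number (RFC 4271) and 8 octets long with a 4-octet AS number (RFC 6793). The sketch below is in Python, and the helper name `parse_aggregator` is hypothetical. It shows how a receiver could tell the two encodings apart by length. It is not a description of what the CER or RouterOS actually do internally.

```python
import ipaddress
import struct


def parse_aggregator(value: bytes) -> tuple[int, str]:
    """Parse the value part of a BGP AGGREGATOR attribute.

    Hypothetical helper for illustration: accepts the 6-octet form
    (2-octet AS) and the 8-octet form (4-octet AS), rejects anything else.
    """
    if len(value) == 6:
        asn, addr = struct.unpack("!H4s", value)  # 2-octet AS + IPv4 address
    elif len(value) == 8:
        asn, addr = struct.unpack("!I4s", value)  # 4-octet AS + IPv4 address
    else:
        raise ValueError(f"invalid AGGREGATOR length {len(value)}")
    return asn, str(ipaddress.IPv4Address(addr))


# Values taken from the Dell log above: aggregator AS 65052, ID 0.0.0.0,
# encoded here in the 8-octet (as4) form.
as4_value = struct.pack("!I4s", 65052, bytes(4))
print(len(as4_value), parse_aggregator(as4_value))  # 8 (65052, '0.0.0.0')
```

A speaker that only expects the 6-octet form on a given session would see the 8-octet value as a length error, which would be consistent with the message above, although the thread does not establish that this is the cause.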