[c-nsp] N3K: "VPC peer keep-alive receive has failed"
Garrett Skjelstad
garrett at skjelstad.org
Thu Dec 27 06:18:21 EST 2018
Different VPC domains, yes?
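
For context: in a back-to-back vPC design the two switch pairs need
different vPC domain IDs, because the system ID each pair presents in LACP
is derived from the domain ID. A minimal sketch of how this could be
checked, and what the second pair might look like. Domain 2 is only an
example value and is not taken from the configuration below:

```
! On each of the four switches: check the domain ID and keepalive state
show vpc brief
show vpc peer-keepalive

! On the second pair only (hypothetical value, pick any unused ID)
vpc domain 2
```

Changing the domain ID on a running pair is likely to disrupt its vPCs, so
it is better treated as a maintenance-window change.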
On Thu, Dec 27, 2018, 02:58 Manuel Guesdon <ml+cisco-nsp at oxymium.net> wrote:
> Hi,
>
> I have a strange problem with Nexus N3K switches and a QinQ tunnel.
>
>
> I've configured 2 Nexus 3064 with VPC. It has worked well for months.
>
> Recently I added a port-channel in dot1q-tunnel mode (the first one in
> this mode). Since then I have been getting this message:
> "%$ VDC-1 %$ %VPC-2-PEER_KEEP_ALIVE_RECV_FAIL: In domain 1, VPC peer
> keep-alive receive has failed" multiple times a day on the 2 switches.
>
> Details:
> BIOS: version 4.1.0
> NXOS: version 7.0(3)I6(1)
>
> new interface & port-channel:
>
> interface Ethernet1/35
> switchport mode dot1q-tunnel
> switchport access vlan 72
> spanning-tree port type edge
> speed 10000
> channel-group 1035
>
> interface port-channel1035
> switchport mode dot1q-tunnel
> switchport access vlan 72
> speed 10000
> vpc 1035
>
> A "sh vlan id 72" only reports the peer-link ports/port-channels and
> eth1/35 / po1035.
>
> For the moment, this tunnel has no other end.
>
> The message appears at different times on each switch (i.e. not at the
> same time on both switches) and not the same number of times per day.
> For example, today: 3 on one switch, 6 on the other.
>
> Switch load seems the same as before this new port-channel was added,
> and there's no load peak around the message timestamps (Cacti 5-minute
> samples).
>
> When I shut the port, the messages stop. When I re-enable it, they
> come back.
>
> I've tried changing the keepalive parameters:
> --Keepalive interval : 500 msec
> --Keepalive timeout : 10 seconds
> --Keepalive hold timeout : 6 seconds
> but the result is the same.
>
> The keepalive link is a dedicated 2-port port-channel; the IPs are set
> directly on the port-channel, in a VRF.
>
> 1st switch:
> vpc domain 1
> role priority 1
> peer-keepalive destination 10.0.6.3 source 10.0.6.2 vrf pkal \
> interval 500 timeout 10 hold-timeout 6
> peer-gateway
> auto-recovery
> ipv6 nd synchronize
> ip arp synchronize
>
> 2nd switch:
> vpc domain 1
> role priority 2
> peer-keepalive destination 10.0.6.2 source 10.0.6.3 vrf pkal \
> interval 500 timeout 10 hold-timeout 6
> peer-gateway
> auto-recovery
> ipv6 nd synchronize
> ip arp synchronize
>
>
> There's nothing in the logs except the "receive has failed" message.
>
> There are no errors on the keepalive interfaces.
>
> In Cacti, I only notice a small drop in outgoing traffic on the
> keepalive ports around the time the message appears, so it seems to be
> a transmit problem rather than a receive problem.
>
> If I configure 2 other N3Ks with the same configuration (back-to-back
> configuration) for the other end of the tunnel and propagate VLAN 72
> toward them, I start getting the same message on those switches, even if
> the QinQ port on them is down. If I stop propagating the VLAN toward
> them, the messages stop on these 2 switches (but continue on the first
> 2 switches).
>
> Any ideas?
>
>
>
> Manuel
>
> --
> ______________________________________________________________________
> Manuel Guesdon - OXYMIUM