[c-nsp] Unique issue which is not making any sense, maybe not even Cisco related...

Blake Pfankuch - Mailing List blake.mailinglist at pfankuch.me
Mon Mar 3 11:49:10 EST 2014


Just an update.  Working with Cisco, we confirmed that the F5 was directing traffic for the HSRP VIP to the MAC address of the SVI.  We implemented peer-gateway and this resolved the issue.
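
For the archives, the change amounted to a single knob in the vPC domain on both Nexus cores. A rough sketch is below; the vPC domain number is a placeholder, and the command goes under whatever domain is already configured:

## Both Nexus 7000 cores (sketch - domain ID is a placeholder)
vpc domain 10
  peer-gateway

With peer-gateway, each vPC peer will locally route frames that arrive addressed to its peer's router MAC (which is effectively where the F5 was sending the traffic), rather than having them discarded.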

Thanks to all for their suggestions, on and off list,
Blake

-----Original Message-----
From: Vitkovský Adam [mailto:adam.vitkovsky at swan.sk] 
Sent: Monday, March 3, 2014 1:35 AM
To: Blake Pfankuch - Mailing List; Mick O'Rourke
Cc: cisco-nsp at puck.nether.net
Subject: RE: [c-nsp] Unique issue which is not making any sense, maybe not even Cisco related...

You said you reused the ip address: 

> .2 and .3 with .1 as an hsrp ip
> The new nexus equipment is configured as .5 and .6 with .2 as an hsrp ip


It sometimes happens that the old MACs get stuck in the end hosts' caches.
Coupled with the F5 using the SVI MAC instead of the virtual IP MAC, this can cause some weird behavior.
A reload helps.
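
If a reload isn't convenient, flushing the stale neighbour entry on the host usually achieves the same thing. A sketch for a Linux host; the interface name and gateway address are only examples (the address is borrowed from the Vlan44 snippets further down the thread):

# inspect, then flush, the cached gateway MAC
ip neigh show dev eth0
ip neigh flush dev eth0
# or remove just the gateway entry (net-tools)
arp -d 192.168.44.1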

adam
-----Original Message-----
From: cisco-nsp [mailto:cisco-nsp-bounces at puck.nether.net] On Behalf Of Blake Pfankuch - Mailing List
Sent: Monday, March 03, 2014 4:39 AM
To: Mick O'Rourke
Cc: cisco-nsp at puck.nether.net
Subject: Re: [c-nsp] Unique issue which is not making any sense, maybe not even Cisco related...

If only we were on 11.x.  Based on our hardware, we are limited to 10.2.4 HF4.  We are not using MST.  I will have a look at the auto last hop functions.  I have confirmed that all HSRP instances are assigned consistently, so that Active is on Core 01 and Standby is on Core 02.
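
For reference, the consistency check is just the standard FHRP summaries on each box - nothing specific to this setup:

## Nexus 7000 cores
show hsrp brief
## legacy 6500s
show standby brief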

Thanks for the suggestions Mick,
Blake

From: Mick O'Rourke [mailto:mkorourke at gmail.com]
Sent: Sunday, March 2, 2014 7:48 PM
To: Blake Pfankuch - Mailing List
Cc: cisco-nsp at puck.nether.net
Subject: Re: [c-nsp] Unique issue which is not making any sense, maybe not even Cisco related...

We've seen similar sounding problems before. A couple of suggestions for things to look at:

The default F5 config is auto last hop, i.e. for return traffic it will by default forward to the MAC address of the SVI from which the traffic originated, not to the HSRP virtual MAC. Look for potential dynamic return path issues, or you could turn off auto last hop - I wouldn't suggest turning it off though, but YMMV.
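
A quick way to see which MAC the return traffic is actually chasing is to compare a capture against the core's view of that VLAN. A sketch; VLAN 44 is borrowed from the config further down the thread, and the exact filter keywords differ slightly between IOS and NX-OS:

## on the Nexus core carrying the SVI
show hsrp brief
show ip arp vlan 44
show mac address-table vlan 44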

F5s, AFAIK, run/participate in MST0 by default. I've experienced similar symptoms after MST0 changes in the past. We disabled MST on the F5 units (on a stick) in question to resolve the problem.

F5 and ARP vs. Linux devices with multiple interfaces:

The default arp_announce value for Linux is '0' in /proc/sys/net/ipv4/conf/eth0/arp_announce; F5 appears to implement something along the lines of '1'. AFAIK you can't adjust the F5 implementation - at least as of TMOS 11.2. The resolution has been to adjust the Linux hosts to a value of '1'.

http://kb.linuxvirtualserver.org/wiki/Using_arp_announce/arp_ignore_to_disable_ARP

http://lxr.linux.no/#linux+v2.6.32.24/Documentation/networking/ip-sysctl.txt#L766

arp_announce - INTEGER
        0 - (default) Use any local address, configured on any interface
        1 - Try to avoid local addresses that are not in the target's
            subnet for this interface
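
In practice that adjustment is just a couple of sysctls on the affected hosts (eth0 below is only an example interface name; apply it to whichever interfaces are in play):

# runtime change on the Linux host
sysctl -w net.ipv4.conf.all.arp_announce=1
sysctl -w net.ipv4.conf.eth0.arp_announce=1
# persist across reboots
echo "net.ipv4.conf.all.arp_announce = 1" >> /etc/sysctl.conf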

On 3 March 2014 13:03, Blake Pfankuch - Mailing List <blake.mailinglist at pfankuch.me> wrote:
First off please excuse if some of this does not make sense... I am working on a 48 hour day, and only got about a 2 hour nap so far...

I currently have 2 Cisco 6500 switches configured as the layer 3 core within my production network.  They are configured with about 30 SVIs in a single VRF, and each SVI has an HSRP version 1 configuration.  We are going through a project migrating to Nexus 7000 devices at the layer 3 core, as well as replacing our aging 6500's and 4500's with Nexus 5500 series switches for client connectivity.  Everything has been going exceptionally well; however, I have one oddity which caused a production outage today, in a way I have never seen.  Currently all SVIs fall into a single VRF on the Nexus core as well.

Last night I migrated 6 SVI's from 6500 to Nexus.  All of them worked exactly as expected, except for 1 network, and specifically 1 pair of devices.  Configured on this VLAN I have a pair of F5 load balancers.  These load balancers exist in 6 different networks.  They appeared to have issues only in this single network.  All these cisco devices are behind a firewall.  There is an "Edge" Network in place.  The legacy 6500's are configured in this network as .2 and .3 with .1 as an hsrp ip.  The new nexus equipment is configured as .5 and .6 with .2 as an hsrp ip.  The firewall is .11.

Each of the 4 Cisco devices has a default route pointed to .11.  There is no dynamic routing at this point beyond EIGRP between the 4 devices, redistributing connected subnets.  The 6500's do not support VSS, so they are standalone devices.  The Nexus devices are configured in a vPC domain.  Uplinks through the production network are double-sided vPCs from the Nexus 7000 core to the Nexus 5000 distribution.  I am migrating from HSRP version 1 to HSRP version 2 to allow for more HSRP instances in the future.  I have a large number of additional networks that need to be spun up soon, and I figured I would do it right the first time...

This is where I am having trouble.  This network is fully integrated and has been working for about 2 months without any issue.  About 75% of our network and server infrastructure is already migrated onto the Nexus infrastructure, including several layer 3 FHRP configurations.  Here is a snip of the existing 6500 config.

## 6500 Core 01

interface Vlan44
ip address 192.168.44.3 255.255.252.0
no ip redirects
no ip unreachables
no ip proxy-arp
standby 83 ip 192.168.44.1
standby 83 timers 1 3
standby 83 priority 125
standby 83 preempt
standby 83 track 1 decrement 50
arp timeout 240
end

## 6500 Core 02

interface Vlan44
ip address 192.168.44.4 255.255.252.0
no ip redirects
no ip unreachables
no ip proxy-arp
standby 83 ip 192.168.44.1
standby 83 timers 1 3
standby 83 priority 90
standby 83 preempt
standby 83 track 1 decrement 50
arp timeout 240
end

The new Nexus configuration lines up very similarly.  This does not include the HSRP track configuration as of yet; we are changing a large amount of the topology, and I did not implement it this evening as I did not want anything unexpected popping up.  (A sketch of how the tracking would carry over follows the Nexus snippets below.)

## Nexus Core 01

interface Vlan44
  no ip redirects
  ip address 192.168.44.3/22
  hsrp version 2
  hsrp 44
    preempt
    priority 125
    timers  1  3

## Nexus Core 02

interface Vlan44
  no ip redirects
  ip address 192.168.44.4/22
  hsrp version 2
  hsrp 44
    preempt
    priority 90
    timers  1  3
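
For completeness, when the track configuration mentioned above does get carried across, the NX-OS equivalent of the 6500's "standby 83 track 1 decrement 50" would look roughly like the sketch below - the track object number and the tracked interface are placeholders only:

## Nexus Core 01 (sketch - object 1 and Ethernet1/1 are placeholders)
track 1 interface Ethernet1/1 line-protocol

interface Vlan44
  hsrp 44
    track 1 decrement 50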

Like I said, this is where it gets weird.  When I move from the 6500s to Nexus, everything looks fine except for the pair of load balancers.  They are configured as 192.168.44.35 and 192.168.44.36 with about 90 VIPs through the network.  When HSRP is on the Nexus, traffic from all VLANs on the Nexus passes properly to the load balancer VIPs, with the exception of traffic sourced from the Edge VLAN - either from other devices in that network, or from behind the firewall in that network.  Here is where it gets really weird... some devices have functional access.  I have 2 workstations on my desk, one identified as .16 and the other as .17; one of them can get to the F5 devices, the other cannot.  I tested from about 10 other points, and they are about 50/50 for functionality.  Again, this is only these 2 devices.  Packet captures from them show they appear to be seeing both physical MACs and the HSRP MAC when connecting to the HSRP VIP, and tripping up on it.  I think...  Nothing else seems to be having issues with it, just these 2 devices...
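
For anyone who wants to reproduce the comparison, it boils down to checking which MAC each workstation has cached for the gateway and the VIPs against what the cores advertise.  For these group numbers the virtual MACs should be 0000.0c07.ac53 (HSRPv1 group 83) and 0000.0c9f.f02c (HSRPv2 group 44) - worth confirming in the HSRP output rather than taking my arithmetic on faith:

## on a working and a broken workstation (Windows)
arp -a
## on the Nexus cores
show hsrp brief
show mac address-table vlan 44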

I am in the process of replacing these devices with a solution from another vendor, but I am at least 3 months from completion.  Any thoughts on this or suggestions of where to look past hsrp states, arp and mac tables?  If additional information is required, please let me know...

Thanks,
Blake
_______________________________________________
cisco-nsp mailing list  cisco-nsp at puck.nether.net
https://puck.nether.net/mailman/listinfo/cisco-nsp
archive at http://puck.nether.net/pipermail/cisco-nsp/



