[c-nsp] Unique issue which is not making any sense, maybe not even Cisco related...

Mick O'Rourke mkorourke at gmail.com
Sun Mar 2 21:47:55 EST 2014


We've seen similar-sounding problems before. A couple of suggestions for
things to look at:

The default F5 config is Auto Last Hop, i.e. for return traffic it will
forward to the MAC address of the SVI the traffic originated from, not to
the HSRP virtual MAC. Look for potential asymmetric return-path issues,
or you could turn Auto Last Hop off - I wouldn't suggest turning it off,
though; YMMV.
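
For reference, Auto Last Hop can be toggled globally (System >
Configuration > Local Traffic > General in the GUI) or per VLAN. From
memory the per-VLAN tmsh knob looks like the below - treat it as a sketch
and verify the property name against your TMOS version; <vlan_name> is a
placeholder:

tmsh list net vlan <vlan_name> auto-lasthop
tmsh modify net vlan <vlan_name> auto-lasthop disabled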

F5s, AFAIK, run/participate in MST instance 0 by default. I've
experienced similar symptoms after MST0 changes in the past. We disabled
MST on the F5 units (on a stick) in question to resolve the problem.
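
If you go down that path, from memory the knob is the global STP mode on
the F5 - the property and values here are from memory, so check your
version's tmsh reference first ('passthru' forwards BPDUs without
participating in spanning tree):

tmsh list net stp-globals
tmsh modify net stp-globals mode passthru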

F5 and ARP, vs. Linux devices with multiple interfaces:

The default arp_announce value for Linux is '0' in
/proc/sys/net/ipv4/conf/eth0/arp_announce; F5 appears to implement
something along the lines of '1'. AFAIK you can't adjust the F5
implementation - at least as of TMOS 11.2. Our resolution has been to
adjust the Linux hosts to a value of '1'.

http://kb.linuxvirtualserver.org/wiki/Using_arp_announce/arp_ignore_to_disable_ARP

http://lxr.linux.no/#linux+v2.6.32.24/Documentation/networking/ip-sysctl.txt#L766

arp_announce - INTEGER
       0 - (default) Use any local address, configured on any interface
       1 - Try to avoid local addresses that are not in the target's
           subnet for this interface.
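
For example, on an affected Linux host (eth0 here is just whichever
interface is relevant on that host):

# apply at runtime
sysctl -w net.ipv4.conf.all.arp_announce=1
sysctl -w net.ipv4.conf.eth0.arp_announce=1
# persist across reboots by adding to /etc/sysctl.conf:
#   net.ipv4.conf.all.arp_announce = 1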



On 3 March 2014 13:03, Blake Pfankuch - Mailing List <
blake.mailinglist at pfankuch.me> wrote:

> First off, please excuse if some of this does not make sense... I am
> working on a 48-hour day and have only had about a 2-hour nap so far...
>
> I currently have 2 Cisco 6500 switches configured as the layer 3 core
> within my production network.  They have been configured with about 30
> SVI's in a single VRF.  Each SVI has an HSRP version 1 configuration.  We
> are going through a project migrating to Nexus 7000 devices at the layer
> 3 core, as well as replacing our aging 6500's and 4500's with Nexus 5500
> series switches for client connectivity.  Everything has been going
> exceptionally well; however, I have 1 oddity which caused a production
> outage today, in a way I have never seen.  Currently all SVI's fall into
> a single VRF on the Nexus core as well.
>
> Last night I migrated 6 SVI's from 6500 to Nexus.  All of them worked
> exactly as expected, except for 1 network, and specifically 1 pair of
> devices.  Configured on this VLAN I have a pair of F5 load balancers.
> These load balancers exist in 6 different networks, and they appeared to
> have issues only in this single network.  All of these Cisco devices are
> behind a firewall, and there is an "Edge" network in place.  The legacy
> 6500's are configured in this network as .2 and .3, with .1 as an HSRP
> IP.  The new Nexus equipment is configured as .5 and .6, with .2 as an
> HSRP IP.  The firewall is .11.
>
> Each of the 4 Cisco devices has a default route pointed to .11.  There
> is no dynamic routing at this point beyond EIGRP between the 4 devices,
> redistributing connected subnets.  The 6500's do not support VSS, so
> they are standalone devices.  The Nexus devices are configured in a vPC
> domain.  Uplinks through the production network are double-sided vPCs
> from the Nexus 7000 core to Nexus 5000 distribution.  I am migrating
> from HSRP version 1 to HSRP version 2 to allow for more HSRP instances
> in the future.  I have a large number of additional networks that need
> to be spun up soon, and I figured I would do it right the first time...
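>
> Worth noting: moving from HSRP version 1 to version 2 (and renumbering
> the group) also changes the virtual MAC address that hosts see, which
> can be confirmed per platform:
>
> ! IOS (6500):   show standby vlan 44
> ! NX-OS (7000): show hsrp group 44
> !
> ! HSRPv1 virtual MAC: 0000.0c07.acXX  (XX  = group number in hex)
> ! HSRPv2 virtual MAC: 0000.0c9f.fXXX  (XXX = group number in hex)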
>
> This is where I am having trouble.  This network is fully integrated and
> has been working for about 2 months without any issue.  About 75% of our
> network and server infrastructure is already migrated onto the Nexus
> infrastructure, including several layer 3 FHRP configurations.  Here is a
> snip of the existing 6500 config.
>
> ## 6500 Core 01
>
> interface Vlan44
> ip address 192.168.44.3 255.255.252.0
> no ip redirects
> no ip unreachables
> no ip proxy-arp
> standby 83 ip 192.168.44.1
> standby 83 timers 1 3
> standby 83 priority 125
> standby 83 preempt
> standby 83 track 1 decrement 50
> arp timeout 240
> end
>
> ## 6500 Core 02
>
> interface Vlan44
> ip address 192.168.44.4 255.255.252.0
> no ip redirects
> no ip unreachables
> no ip proxy-arp
> standby 83 ip 192.168.44.1
> standby 83 timers 1 3
> standby 83 priority 90
> standby 83 preempt
> standby 83 track 1 decrement 50
> arp timeout 240
> end
>
> The new Nexus configuration lines up very similarly.  It does not
> include the HSRP track configuration as of yet; we are changing a large
> amount of the topology, and I did not implement it this evening as I did
> not want anything unexpected popping up.
>
> ## Nexus Core 01
>
> interface Vlan44
>   no ip redirects
>   ip address 192.168.44.3/22
>   hsrp version 2
>   hsrp 44
>     preempt
>     priority 125
>     timers 1 3
>     ip 192.168.44.1
>
> ## Nexus Core 02
>
> interface Vlan44
>   no ip redirects
>   ip address 192.168.44.4/22
>   hsrp version 2
>   hsrp 44
>     preempt
>     priority 90
>     timers 1 3
>     ip 192.168.44.1
>
> Like I said, this is where it gets weird.  When I move from the 6500 to
> the Nexus, everything looks fine except for the pair of load balancers.
> They are configured as 192.168.44.35 and 192.168.44.36, with about 90
> VIP's through the network.  When HSRP is on the Nexus, traffic from all
> VLANs on the Nexus passes properly to the load balancer VIP's, with the
> exception of traffic sourcing from the Edge VLAN - either from other
> devices in that network or from behind the firewall in that network.
> Here is where it gets really weird... some devices have functional
> access.  I have 2 workstations on my desk, one identified as .16 and the
> other as .17; one of them can get to the F5 devices, the other cannot.
> I tested from about 10 other points, and they are about 50/50 for
> functionality.  Again, this is only to these 2 devices.  Packet captures
> from the workstations show they appear to be seeing both physical MACs
> and the HSRP MAC when connecting to the HSRP VIP, and tripping up on it.
> I think...  Nothing else seems to be having issues with it, just these 2
> devices...
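>
> For what it's worth, comparing the cached ARP entry for the gateway on a
> working vs. a broken workstation might show the difference - something
> like the below, where <gateway_ip> is a placeholder and arp -d needs an
> elevated prompt:
>
> C:\> arp -a <gateway_ip>        (Windows: show the cached entry)
> C:\> arp -d <gateway_ip>        (Windows: clear it, then re-test)
> $ ip neigh show <gateway_ip>    (Linux equivalent)
>
> A stale entry pointing at a physical MAC instead of the HSRP virtual MAC
> would line up with the 50/50 behavior.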
>
> I am in the process of replacing these devices with a solution from
> another vendor, but I am at least 3 months from completion.  Any
> thoughts on this, or suggestions of where to look past HSRP states, ARP,
> and MAC tables?  If additional information is required, please let me
> know...
>
> Thanks,
> Blake
> _______________________________________________
> cisco-nsp mailing list  cisco-nsp at puck.nether.net
> https://puck.nether.net/mailman/listinfo/cisco-nsp
> archive at http://puck.nether.net/pipermail/cisco-nsp/
>

