[f-nsp] FWSX eating arp packets

Josh Galvez josh at zevlag.com
Thu May 28 22:17:31 EDT 2015


I've seen ASIC failures in a lot of my old FESXs recently.  Each failure
has had its own unique set of symptoms.

I'd try moving the cross connect to a different tower (port group, ASIC) on
both switches to rule out ASIC failure.
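
For a quick before/after comparison when moving the link, a small script
along these lines could summarize the arping runs from each server.  It is
only a sketch: it assumes the iputils arping summary lines shown in the
quoted output below, and the function names summarize_arping and diagnose,
as well as the diagnosis rule itself, are made up for illustration.

```python
import re


def summarize_arping(output: str) -> dict:
    """Pull probe and response counts out of an iputils arping summary.

    Assumes lines of the form:
        Sent 11 probes (1 broadcast(s))
        Received 1 response(s)
    """
    sent = re.search(r"Sent (\d+) probes? \((\d+) broadcast\(s\)\)", output)
    received = re.search(r"Received (\d+) response\(s\)", output)
    if not sent or not received:
        raise ValueError("no arping summary lines found")
    probes = int(sent.group(1))
    broadcasts = int(sent.group(2))
    return {
        "probes": probes,
        "broadcasts": broadcasts,
        "unicasts": probes - broadcasts,
        "responses": int(received.group(1)),
    }


def diagnose(summary: dict) -> str:
    """Rough heuristic (an assumption, not a definitive test)."""
    if summary["responses"] == 0:
        # arping never saw a reply, so it kept broadcasting.
        return "no replies reached this host"
    if summary["unicasts"] > 0 and summary["responses"] <= summary["broadcasts"]:
        # Replies arrived only while broadcasting; unicast probes went unanswered.
        return "broadcast answered, unicast probes unanswered"
    return "ok"


if __name__ == "__main__":
    server1 = "Sent 11 probes (1 broadcast(s))\nReceived 1 response(s)\n"
    server2 = "Sent 11 probes (11 broadcast(s))\nReceived 0 response(s)\n"
    for name, text in (("server1", server1), ("server2", server2)):
        print(name, diagnose(summarize_arping(text)))
```

Run it against the output from both servers after each port move; if both
sides report "ok" once the link is on a different port group, that points
at the ASIC.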

On Thu, May 28, 2015 at 5:52 PM, Randy McAnally <rsm at fast-serv.com> wrote:

> Hi all, sorry for the long-winded post, but this has been eating away at
> me.   Feel free to reply on or off list.
>
> Everything was fine for almost 2 years; then, out of the blue, a near
> complete black hole appeared in traffic between two FWSX switches.  In case
> you aren't aware, FWSX are just regular FESX switches neutered so they
> can't be upgraded with a PREM layer3 license.   Here's a diagram:
>
>         -----------xc1-----------[FWSX 1]--[server1]
>         |                           |
> [upstream switch]                   xc3
>         |                           |
>         -----------xc2-----------[FWSX 2]--[server2]
>
> Both FWSXs are pure layer2 and form an 802.1w loop with xc2 as the blocking
> link.   No frills, bells or whistles.
>
> After many hours of tcpdumping on servers connected to the pair of FWSX
> (basic layer2) switches, it turns out unicast ARP packets are being dropped
> on the x-connect between the two switches, but only in one direction.
> Below, you'll see the unicast reply to the initial broadcast, but the
> subsequent unicast probes are dropped (thus only a single reply using
> arping).
>
> Traffic between two servers - SERVER 1 (switch1) to SERVER 2 (switch2):
> [root@cl-ash-s1 ~]# arping -I eth1.2 10.11.13.11
> ARPING 10.11.13.11 from 10.11.13.5 eth1.2
> Unicast reply from 10.11.13.11 [A0:36:9F:0E:13:B2]  2.453ms
> Sent 11 probes (1 broadcast(s))
> Received 1 response(s)
>
> And on the other server - SERVER 2 (switch2) to SERVER 1 (switch1):
> [root@localhost ~]# arping 10.11.13.5 -I xenbr2
> ARPING 10.11.13.5 from 10.11.13.11 xenbr2
> Sent 11 probes (11 broadcast(s))
> Received 0 response(s)
>
>
> In a nutshell -- Unicast ARP from server1 to server2 is completely
> dropped.   Broadcast works in both directions, and unicast works only from
> server2 to server1.
>
> MAC tables on the FWSX's are sane.   Every server is shown where it should
> be.
>
> Can reproduce this with any device or operating system.   It's definitely
> NOT a problem with the host configuration(s).
>
> Now the kicker - if I remove the x-connect between the switches (and let
> spanning tree re-converge through the upstream switch both are connected
> to), things work normally.  I tried swapping xc3 to different ports, no
> change.  So as long as I boomerang inter-switch traffic through the
> upstream switch, we're good.   But that is quite a bit of traffic,
> including SAN traffic, so I need to avoid this.   I rebooted both
> switches; no change.  Software is the latest for the platform
> (05.1.00eT1e0), so I can't try upgrading.
>
> So my simple question is -- has anyone ever seen Brocade switches (in pure
> L2 duties) just straight up eat ARP packets?   And not only that -- but
> JUST unicast ARP and in only one direction?
>
>
> --
> Randy McAnally