[f-nsp] FWSX eating arp packets
Randy McAnally
rsm at fast-serv.com
Thu May 28 19:52:07 EDT 2015
Hi all, sorry for the long-winded post, but this has been eating away at
me. Feel free to reply on or off list.
Everything was fine for almost 2 years, then out of the blue a
near-complete black hole appeared in traffic between two FWSX switches.
In case you aren't aware, FWSX are just regular FESX switches neutered so
they can't be upgraded with a PREM layer3 license. Here's a diagram:
      -----------xc1-----------[FWSX 1]--[server1]
      |                            |
[upstream switch]                 xc3
      |                            |
      -----------xc2-----------[FWSX 2]--[server2]
Both FWSX's are pure layer2 and form an 802.1w loop with xc2 as the
blocking link. No frills, bells or whistles.
After many hours of tcpdumping on servers connected to the pair of FWSX
switches, it turns out unicast ARP packets are being dropped on the
x-connect (xc3) between the two switches, but only in one direction.
Below, you'll see the unicast reply to the initial broadcast, but
subsequent unicast probes are dropped (thus only a single reply using
arping).
Traffic between two servers - SERVER 1 (switch1) to SERVER 2 (switch2):
[root@cl-ash-s1 ~]# arping -I eth1.2 10.11.13.11
ARPING 10.11.13.11 from 10.11.13.5 eth1.2
Unicast reply from 10.11.13.11 [A0:36:9F:0E:13:B2] 2.453ms
Sent 11 probes (1 broadcast(s))
Received 1 response(s)
And on the other server - SERVER 2 (switch2) to SERVER 1 (switch1):
[root@localhost ~]# arping 10.11.13.5 -I xenbr2
ARPING 10.11.13.5 from 10.11.13.11 xenbr2
Sent 11 probes (11 broadcast(s))
Received 0 response(s)
In a nutshell -- Unicast ARP from server1 to server2 is completely
dropped. Broadcast works in both directions, and unicast works only
from server2 to server1.
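To spell out why both arping outputs point at the same one-way fault,
here is a rough Python model. It is only a sketch, not anything captured
from the network: the arping() function and the "server1"/"server2"
labels are made up for illustration, and it assumes arping broadcasts
until it gets a reply and then switches to unicast, that ARP replies are
always unicast, and that broadcasts always get through.

```python
def arping(src, dst, dropped_unicast, probes=11):
    """Model an arping run: broadcast until a reply arrives, then unicast.

    dropped_unicast is a set of (sender, receiver) pairs whose unicast
    frames are lost in transit. Broadcasts are assumed to be delivered.
    Returns (probes_sent, broadcasts_sent, responses_received).
    """
    learned = False  # True once we have seen a reply and know dst's MAC
    broadcasts = 0
    responses = 0
    for _ in range(probes):
        if learned:
            # Unicast request: it has to survive the src -> dst direction.
            request_delivered = (src, dst) not in dropped_unicast
        else:
            # Broadcast request: assumed to always reach dst.
            broadcasts += 1
            request_delivered = True
        # The reply is unicast and travels dst -> src.
        if request_delivered and (dst, src) not in dropped_unicast:
            responses += 1
            learned = True
    return probes, broadcasts, responses


# Hypothetical fault: unicast is lost only in the server1 -> server2 direction.
drop = {("server1", "server2")}

print(arping("server1", "server2", drop))  # (11, 1, 1)
print(arping("server2", "server1", drop))  # (11, 11, 0)
```

The first tuple matches "Sent 11 probes (1 broadcast(s)), Received 1
response(s)" and the second matches "Sent 11 probes (11 broadcast(s)),
Received 0 response(s)", so a single one-directional unicast drop is
enough to explain both captures.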
MAC tables on the FWSX's are sane. Every server is shown where it
should be.
Can reproduce this with any device or operating system. It's
definitely NOT a problem with the host configuration(s).
Now the kicker - if I remove the x-connect between the switches (and let
spanning tree re-converge through the upstream switch both are connected
to), things work normally. Tried swapping xc3 to different ports, no
change. So as long as I boomerang inter-switch traffic through the
upstream switch, we're good. But that is quite a bit of traffic,
including SAN traffic, so I need to avoid this. Rebooted both switches,
no change. Software is the latest for the platform (05.1.00eT1e0), so I
can't try upgrading.
So my simple question is -- has anyone ever seen Brocade switches (in
pure L2 duties) just straight up eat ARP packets? And not only that --
but JUST unicast ARP, and in only one direction?
--
Randy McAnally