[f-nsp] FWSX eating arp packets

Randy McAnally rsm at fast-serv.com
Thu May 28 19:52:07 EDT 2015


Hi all, sorry for the long-winded post, but this has been eating away at 
me.   Feel free to reply on or off list.

Everything was fine for almost 2 years; then, out of the blue, traffic 
between two FWSX switches hit a near-complete black hole.  In case you 
aren't aware, FWSX are just regular FESX switches neutered so they can't 
be upgraded with a PREM layer3 license.   Here's a diagram:

         -----------xc1-----------[FWSX 1]--[server1]
         |                           |
[upstream switch]                   xc3
         |                           |
         -----------xc2-----------[FWSX 2]--[server2]

Both FWSXs are pure layer2 and form an 802.1w loop with xc2 as the 
blocking link.   No frills, bells or whistles.

After many hours of tcpdumping on servers connected to the two FWSX 
(basic layer2) switches, it turns out unicast ARP packets are being 
dropped on the x-connect between the switches, but only in one 
direction.   Below, you'll see the unicast reply to the initial 
broadcast, but the subsequent unicast probes are dropped (thus only a 
single reply using arping).

Traffic between two servers - SERVER 1 (switch1) to SERVER 2 (switch2):
[root@cl-ash-s1 ~]# arping -I eth1.2 10.11.13.11
ARPING 10.11.13.11 from 10.11.13.5 eth1.2
Unicast reply from 10.11.13.11 [A0:36:9F:0E:13:B2]  2.453ms
Sent 11 probes (1 broadcast(s))
Received 1 response(s)

And on the other server - SERVER 2 (switch2) to SERVER 1 (switch1):
[root@localhost ~]# arping 10.11.13.5 -I xenbr2
ARPING 10.11.13.5 from 10.11.13.11 xenbr2
Sent 11 probes (11 broadcast(s))
Received 0 response(s)


In a nutshell -- Unicast ARP from server1 to server2 is completely 
dropped.   Broadcast works in both directions, and unicast works only 
from server2 to server1.
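
For anyone comparing runs, here is a minimal Python sketch that tallies 
broadcast versus unicast probes from the two arping summaries above.  It 
assumes the summary format shown in this post; the function name 
summarize_arping is hypothetical and not part of any tool.

```python
import re


def summarize_arping(output: str) -> dict:
    """Tally probes and replies from an arping summary (format assumed
    to match the two-line summary quoted in this post)."""
    sent = re.search(r"Sent (\d+) probes \((\d+) broadcast\(s\)\)", output)
    received = re.search(r"Received (\d+) response\(s\)", output)
    if sent is None or received is None:
        raise ValueError("unrecognized arping summary")
    total, broadcast = int(sent.group(1)), int(sent.group(2))
    return {
        "broadcast_probes": broadcast,
        # Probes sent after the first reply go to the learned MAC (unicast).
        "unicast_probes": total - broadcast,
        "responses": int(received.group(1)),
    }


# Summary lines copied from the two runs in this post.
server1_run = """Sent 11 probes (1 broadcast(s))
Received 1 response(s)"""
server2_run = """Sent 11 probes (11 broadcast(s))
Received 0 response(s)"""

print(summarize_arping(server1_run))
print(summarize_arping(server2_run))
```

From server1 that gives 1 broadcast probe, 10 unicast probes and 1 
response, so only the broadcast probe was answered.  From server2 it 
gives 11 broadcast probes and 0 responses: arping never switched to 
unicast because no reply ever arrived.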

MAC tables on the FWSXs are sane.   Every server is shown where it 
should be.

Can reproduce this with any device or operating system.   It's 
definitely NOT a problem with the host configuration(s).

Now the kicker - if I remove the x-connect between the switches (and let 
spanning tree re-converge through the upstream switch both are connected 
to), things work normally.  Tried swapping xc3 to different ports, no 
change.  So as long as I boomerang inter-switch traffic through the 
upstream switch, we're good.   But there is quite a bit of inter-switch 
traffic -- including SAN traffic -- so I need to avoid that.   Rebooted 
both switches, no change.   Software is the latest for the platform 
(05.1.00eT1e0) so I can't try upgrading.

So my simple question is -- has anyone ever seen Brocade switches (in 
pure L2 duties) just straight up eat ARP packets?   And not only that -- 
but JUST unicast ARP and in only one direction?


-- 
Randy McAnally

