[f-nsp] Setup a WatchGuard Active-Active firecluster on FastIron SX-800 and/or Super-X

Robert Toth rtoth at iperceptions.com
Thu Oct 20 03:57:43 EDT 2011


Hi Raja,

 

Thanks for the quick answer...  I do have some insights here with a new
question:

 

First of all, was your NLB cluster set up as multicast, IGMP multicast,
or unicast?  The reason I ask is that we are running several NLB clusters
set up for UNICAST (we have dedicated NICs for NLB in each server,
plus one for normal LAN activities), and it's working just fine.  No
need for any static ARP or MAC entries in this situation from what we've
seen.  Either that, or I'm missing something here that I need to look into.
We are NOT using the switch to route any NLB traffic; that is handled
solely by our firewalls.

 

We easily run into the 30K to 70K packets per second range on the
clusters, and they all seem to be getting their fair share of load.

 

That being said, I can confirm that a STATIC ARP entry on the SX has to
be associated with a single port (you can't enter it unless you specify
one and only one port on the switch).  Which is kind of dumb actually.
The syntax is 'arp {static entry #} {IP address} {MAC address} eth
{SLOT/PORT}' on an SX-800 or Super-X.  You can NOT get away without
specifying the 'eth SLOT/PORT' parameter.

 

On a Cisco - you simply do an 'arp 192.168.0.1 xxxx.xxxx.xxxx arpa'  to
set the static ARP, then add your static MAC to the port(s) you need.

Same goes for an Extreme Summit -  'configure iparp add 192.168.0.1
xx:xx:xx:xx:xx:xx'  to set the static ARP, then add the static MAC to
each port.

 

I can't understand why Foundry would force a static ARP entry to bind to
a specific PORT as well as a MAC.  Binding it to a specific VLAN I could
understand (the broadcast domain makes sense).

 

Inside a VLAN, I can set a static ARP inspection entry with 'arp
192.168.0.1 xxxx.xxxx.xxxx inspect', but that doesn't really create a
static ARP entry.

 

Over-simplifying it a bit:  IP address  goes to ARP lookup which returns
a MAC address.  MAC lookup returns which port(s) it's been seen on /
been bound to.  Send out the packet for that IP to the port(s) you
found.  Foundry's method circumvents the MAC lookup by forcing the ARP
lookup to do BOTH (saving a MAC lookup).  But then that prevents the box
from playing nice with NLB or other redundant / load-sharing devices
that it's plugged into.  That's a big functional hole in my view.
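
To make the difference concrete, here is a rough Python sketch of the two
lookup models described above.  It is a toy model only, not how the
switch is actually implemented: the function names, the table layouts,
the MAC value and the port numbers are all made up for illustration.

```python
# Toy model of the two forwarding lookups (all names and values are
# hypothetical; this is not real switch code).

def egress_two_stage(ip, arp_table, mac_table):
    """ARP lookup gives a MAC; a separate MAC lookup gives the port(s)."""
    mac = arp_table[ip]
    return set(mac_table[mac])


def egress_port_bound(ip, arp_table):
    """ARP lookup returns MAC and port together, so only one port is possible."""
    _mac, port = arp_table[ip]
    return {port}


# Two-stage model: the static MAC may be bound to several ports.
arp_table = {"192.168.0.1": "0100.5e00.0001"}          # made-up MAC
mac_table = {"0100.5e00.0001": ["1/1", "2/1"]}         # made-up ports

# Port-bound model: the static ARP entry itself names exactly one port.
port_bound_arp = {"192.168.0.1": ("0100.5e00.0001", "1/1")}

print(sorted(egress_two_stage("192.168.0.1", arp_table, mac_table)))
print(sorted(egress_port_bound("192.168.0.1", port_bound_arp)))
```

In the first model both cluster members receive the frame; in the second
only the member behind the single bound port does, which is the
limitation being complained about here.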

 

Anyways, if anyone from Foundry/Brocade looks at this post, and can shed
some light as to why this is done instead of how some of the other big
boys do it, I'd be really interested in hearing the answer..

 

Raja,  thanks for the heads-up on the 07.2.02d  release... I'll be
upgrading soon - I tend to follow the "If it ain't broke, don't fix it"
routine..  But I gather from your comment that something is INDEED
BROKE..  so it needs fixing...  Correct?

 

 

I'd seriously like to see what IronWare 07.3.xx  will bring - and if the
static ARP functionality will now play nice with others and let our
redundant / load-sharing solutions just work the way they should...

 

Best regards,

 

Robert

 

From: Raja Subramanian [mailto:rajasuperman at gmail.com] 
Sent: Wednesday, October 19, 2011 6:56 AM
To: Robert Toth
Cc: foundry-nsp at puck.nether.net
Subject: Re: [f-nsp] Setup a WatchGuard Active-Active firecluster on
FastIron SX-800 and/or Super-X

 

Hi,

We had exactly the same problem with NLB on SX and FCX. All traffic
would go to a single server, and the setup was never stable enough for
production.

After a Brocade TAC case which went on for nearly 12 months, we
concluded that multi port static ARP is not supported on the FastIron
platform. Only multi port static MAC is supported.

You need to run L2 code on FastIron to get NLB, etc to work correctly.

We are hanging dual FCXs stacked and running L2 code. These are
connected redundantly to dual SX800 L3 cores. Servers, firewalls, etc
are connected to the FCX stack.

Suggest you run 7202d on your SX boxes. Don't run any other 7.x code
release at this time.

- Raja

On Oct 19, 2011 2:14 PM, "Robert Toth" <rtoth at iperceptions.com> wrote:

Hi everyone,

 

Was just referred to this list, hopefully somebody can help me with a
very stubborn problem here..

 

I have been fighting for a few months now to get my WatchGuard
FireClusters up and running stably on both a Super-X and an
SX-800.  Here's a quick overview:

 

Datacenter:  SX-800 with V07.2.00 Full Layer 3 software, redundant mgmt
& fabric, and eight  424C modules.

This is our core switch that handles all network connectivity to our
datacenter servers.

 

1 Layer-3 segment (VLAN-9) that is the primary gateway for INTERNAL
remote network routing across a 2GB dedicated fiber trunk to head office.

Multiple VLANs are used to partition both the external and internal
networks as Layer-2 segments, with firewalls being used to manage and
provide controls between them as needed (the FastIron does NOT
participate in Internal/External access).

 

Fireboxes are the Peak 5500e series, with Fireware XTM Pro V11.3.4 -
setup for Firecluster H/A support for load-sharing and redundancy.
These ports on firebox are given MULTICAST MAC addresses, with UNICAST
IP's.

Firewall ports are ALL plugged into various Layer-2 segments.

 

On the primary trusted network, as well as every other Layer-2 segment
(VLAN), the Firebox is the default gateway.  However, there is a Layer-3
router on the SX-800 for that one segment that provides an alternate
lower-cost route to distant internal networks via RIP, so that internal
traffic is sent directly across the 2GB trunk line instead of being sent
through the firewall at slower speed (and putting higher load on the
firewalls themselves, which would handle inter-office traffic via VPNs).
ACLs were in place to limit and control who had access across that link,
but those were removed to facilitate debugging for the moment.

 

Head office:  Similar to the above, but using a Super-X instead of the
SX-800, and a pair of Firebox Core 1250e's - same release of software on
both sets of devices.

 

The instructions from WatchGuard are clear:  We need to set up static ARP
and MAC addresses as needed to correctly direct traffic to/from the
switches and Fireboxes.  The samples given are for a Cisco 3750 and an
Extreme Summit 15040  (ref:
http://www.watchguard.com/help/docs/wsm/11/en-us/content/en-us/ha/cluster_example_cisco_wsm.html).
This is what I have tried to replicate as best I can, but I have had no
luck setting the static ARP address on the switch outside of the single
Layer-3 segment we have, or setting it on more than one port.  Setting
the multicast MAC is fine for both ports needed to support the pair of
firewalls on each interface (ex: 'static-mac xxxx.xxxx.xxxx eth 1/yy eth
2/yy' works fine on every VLAN I need it on).

 

Digging further into Google (which ended up pointing me at this list), I
found this article about "hairpinning" to correctly support a specific
setup of Microsoft NLB clusters (which we also use extensively at the
datacenter) here:  
https://puck.nether.net/pipermail/foundry-nsp/2010-June/002498.html

While I haven't found need to do this for the NLB clusters themselves as
they appear to be working perfectly (up to 5 physical servers per
cluster running IIS7 & Win2K8), it appeared to be the best solution to
adapt for the Firecluster problem...  And getting a static arp assigned
to multiple ports.

 

Anyways, long story short - it's not working well.  It seems that the
traffic is being forced from port to port because the ARP/MAC entry is
only recognized on a single port at a time (and gets kicked around
constantly).  Apparently, if this was a Cisco, this would work
brilliantly.  But because we can't map static ARPs as we should, the
switch is constantly moving things around instead of sending traffic out
both ports at the same time for each interface.
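
As a rough illustration of that "kicked around" behaviour, the Python
sketch below counts how often a binding that can only name one port has
to move when the same address keeps showing up on two ports in turn.
This is a toy model under that assumption, not a trace from the switch;
the function name and port numbers are hypothetical.

```python
# Toy model: a table entry that can only hold ONE port must be rewritten
# every time the same address is seen on a different port.
# (Hypothetical names and port numbers, for illustration only.)

def count_moves(observed_ports):
    """Count how many times a single-port binding has to change port."""
    current = None
    moves = 0
    for port in observed_ports:
        if current is not None and port != current:
            moves += 1
        current = port
    return moves


# Both cluster members answer in turn on ports 1/1 and 2/1.
alternating = ["1/1", "2/1"] * 5
# A single device that always answers on the same port.
steady = ["1/1"] * 10

print(count_moves(alternating))
print(count_moves(steady))
```

With two active members alternating, the entry moves on almost every
frame, whereas a binding that could hold both ports would never need to
move at all.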

 

Has ANYONE seen or tried this kind of setup using similar hardware ?  If
so, did you get it working correctly?  Am I on the right track with the
hairpinning solution?

 

Is Brocade making any changes to IronWare to provide similar
functionality so that I can get this working properly any time soon?  Or
will I have to resort to a couple of dumb Layer-2 switches that
apparently won't need any configuration and will just work, and cobble
everything together into a massive mess?

Are there any issues with WatchGuard FireCluster that I need to know
about that make it work differently than documented, and prevent the
FastIrons from coping or performing as expected?  (I realize that last
question isn't really a Foundry question.  My apologies, but I hope that
someone else with Fireboxes has seen this problem on Foundry, or perhaps
found another solution using other hardware if the FastIrons or other
Foundry products just couldn't do what was needed.  This assumes that
FireCluster will work given the right infrastructure setup; maybe THAT
assumption is wrong?  If so, I'd gladly listen to a solution that DOES
work.)

 

Best regards,

 

Robert Toth

Director of Information Technology Services

iPerceptions Inc

Tel: 514.488.3600 ext. [284] | 877.796.3600 ext. [284]

Fax: 514.484.2600

London - Montreal - New York - Toronto

[rtoth at iperceptions.com] |  www.iperceptions.com

 


 

 

 


_______________________________________________
foundry-nsp mailing list
foundry-nsp at puck.nether.net
http://puck.nether.net/mailman/listinfo/foundry-nsp
