[c-nsp] ARP-cache Timeouts for ASA5520
Fawcett Simon
Simon.Fawcett at uk.fujitsu.com
Wed Jun 11 04:38:05 EDT 2008
Hi,
I would suggest that you use two Linux boxes running keepalived (which works similarly to VRRP), with Nagios checks and Layer 4 load balancing. I have previously used this setup to reliably load-balance 20 proxies, a mail cluster and a DNS cluster.
The best tip I can give you is to use Broadcom gigabit cards with, say, Ubuntu 8.04 LTS; we have found the E1000 less reliable.
During a failover, the load balancer that takes over sends a gratuitous ARP to its upstream router. If the next upstream device is an ASA, the gratuitous ARP will be treated as an attack.
As Alasdair states, drop the ARP timeout to 60 seconds, or put routers or an L3 switch in front of the load balancers.
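For readers unfamiliar with the term: a gratuitous ARP is an ARP packet in which the sender and target IP are the same address, broadcast so that neighbours can refresh their caches. Below is a minimal Python sketch of what such a frame looks like on the wire. The MAC and IP values are hypothetical, the helper names are made up for illustration, and keepalived builds and sends these frames itself, so this is only meant to show the layout.

```python
import socket
import struct

BROADCAST_MAC = b"\xff" * 6
ZERO_MAC = b"\x00" * 6
ETHERTYPE_ARP = 0x0806


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes(int(part, 16) for part in mac.split(":"))


def build_gratuitous_arp(sender_mac: str, virtual_ip: str) -> bytes:
    """Build an Ethernet frame carrying a gratuitous ARP request.

    Sender IP and target IP are both the virtual IP that has just moved.
    """
    src = mac_to_bytes(sender_mac)
    ip = socket.inet_aton(virtual_ip)
    # Ethernet header: destination (broadcast), source, EtherType
    eth_header = BROADCAST_MAC + src + struct.pack("!H", ETHERTYPE_ARP)
    arp_payload = struct.pack(
        "!HHBBH6s4s6s4s",
        1,         # hardware type: Ethernet
        0x0800,    # protocol type: IPv4
        6,         # hardware address length
        4,         # protocol address length
        1,         # opcode: request (one common form of gratuitous ARP)
        src,       # sender hardware address: the new owner of the IP
        ip,        # sender protocol address
        ZERO_MAC,  # target hardware address: not meaningful here
        ip,        # target protocol address: same as the sender address
    )
    return eth_header + arp_payload


# Hypothetical values, for illustration only
frame = build_gratuitous_arp("00:11:22:33:44:55", "192.0.2.10")
print(len(frame), frame.hex())
```

Whether the upstream device accepts such a frame and updates its cache is up to that device, which is the behaviour being discussed in this thread.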
I hope this helps.
Simon
-----Original Message-----
From: cisco-nsp-bounces at puck.nether.net [mailto:cisco-nsp-bounces at puck.nether.net] On Behalf Of Alasdair Gow
Sent: 11 June 2008 09:00
To: Casey, J Bart
Cc: cisco-nsp at puck.nether.net
Subject: Re: [c-nsp] ARP-cache Timeouts for ASA5520
Hi,
Where are you doing the load balancing?
I assume that because you are running the ASAs in Active/Standby, it's not the ASAs, so it must be some other device(s).
In that case, how does the load balancing work on the other devices? Do the load balancers have individual MAC addresses or a floating MAC address? Does the secondary load balancer take over the MAC of the first when it fails (similar to how the ASAs can do failover), or do they even load-balance by rewriting packets?
I think 4 hours is unrealistic; one would expect the device whose MAC address could fail to be ping-monitored, at which point debugging would be carried out and the ARP cache could be cleared manually. (I know that's not the point.)
The ARP timeout is easily changed; the ASA's on-device documentation says:
"ARP Timeout-Sets the amount of time before the security appliance rebuilds the ARP table, between 60 to 4294967 seconds. The default is 14400 seconds. Rebuilding the ARP table automatically updates new host information and removes old host information. You might want to reduce the timeout because the host information changes frequently. Although this parameter appears on the ARP Static Table panel, the timeout applies to the /dynamic/ ARP table."
I suspect this default is the most convenient value for the majority of users, but not for you.
Are you able to lab your setup? Perhaps you can test the impact of changing the ARP timeout, or simulate a load balancer failure to see how the ASAs react.
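To put the timeout values mentioned in this thread side by side: if the backup load balancer comes up with a different MAC address and the ASA learns it only when the old entry expires, the stale entry could in the worst case live for the full ARP timeout. The short Python sketch below converts each value into hours, minutes and seconds and checks it against the 60 to 4294967 second range from the documentation quoted above. The function names are illustrative only.

```python
ASA_ARP_TIMEOUT_MIN = 60        # lower bound from the quoted documentation
ASA_ARP_TIMEOUT_MAX = 4294967   # upper bound from the quoted documentation


def validate_timeout(seconds: int) -> int:
    """Return the value unchanged if it lies in the documented range."""
    if not ASA_ARP_TIMEOUT_MIN <= seconds <= ASA_ARP_TIMEOUT_MAX:
        raise ValueError(
            f"{seconds} s is outside {ASA_ARP_TIMEOUT_MIN}..{ASA_ARP_TIMEOUT_MAX} s"
        )
    return seconds


def humanize(seconds: int) -> str:
    """Format a number of seconds as hours, minutes and seconds."""
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}h {minutes:02d}m {secs:02d}s"


# Values taken from this thread
candidates = [
    ("ASA default", 14400),
    ("SE suggestion", 5000),
    ("5-minute setting from the other thread", 300),
    ("documented minimum", 60),
]

for label, timeout in candidates:
    worst_case = humanize(validate_timeout(timeout))
    print(f"{label}: {timeout} s, worst-case stale entry {worst_case}")
```

On the ASA command line the setting itself is changed with `arp timeout <seconds>` in global configuration mode; the script above only does the arithmetic.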
Kind regards,
Alasdair
Casey, J Bart wrote:
> We are looking at the possibility of purchasing load-balancers for our
> web servers(2) and mail gateways(2). Unfortunately, we don't have a
> lot of money to throw at this solution and are therefore looking at
> the most economic solutions available. As a result, one of our
> options is what I consider to be for a small office and not
> necessarily for an enterprise environment. As a result, there are a
> few features that I consider to be lacking. One in particular is the
> ability to make a pair of load-balancers highly-available.
>
>
>
> The documentation from the manufacturer states that failover depends on
> the upstream device's ARP-cache timing out. In our case, the default
> ARP-cache timeout for
> our ASA5520s is 14400 seconds or 4 hours. That would mean if one of
> the load-balancers failed, it could take up to 4 hours before the ASA
> begins to forward packets to the backup device. In my mind, if we are
> trying to be "highly-available", this is unacceptable. However, I
> understand that this value was most likely arrived at as a result of
> testing and is really more of a best-practice.
>
>
>
> I called my SE to get recommendations/suggestions. He was very
> helpful in answering my questions and confirming my thoughts that
> lowering that timeout would most likely increase CPU load and if the
> CPU load increased enough would potentially affect the stability of
> the ASA and thereby the stability of any network which depends on that
> device. I asked for his advice based on experience for lowering that
> timeout and he mentioned 5000 seconds or approximately 1 hour 23
> minutes (from a previous implementation). This is better but still
> not in an acceptable range for something that's supposed to be "highly-available".
>
>
>
> I have been following the thread titled "Gratuitous ARP and PIX" and
> it seems like David is wrestling with some of the same type issues
> that I am. The only difference is that he mentions that his ARP-cache
> is set to time out at 5 minutes which seems very low but maybe
> appropriate for his environment. Like David, I am hesitant to lower the value.
> However, I'm just curious what the thoughts are from the others on
> this list about how far I can push that value down.
>
>
>
> Here are the facts:
>
>
>
> 1. I am running a pair of 5520s in active/standby with stateful
> failover.
>
> 2. Devices are running with a single context for now. I am
> considering multiple contexts in the future.
>
> 3. The average bandwidth through the device is about 34Mbps with a
> slated increase of 12Mbps/year over the next 4 years (When the devices
> will be replaced). Total approaching approximately 84Mbps.
>
> 4. This particular pair of devices does not have any service
> modules
>
> 5. There's no VPN taking place on this pair except for one group
> configured for emergency purposes. Obviously, that will go away if we
> go to multiple contexts.
>
> 6. There are currently 4 VLANs trunked to one interface(4
> sub-interfaces), 1 outside interface, 1 inside interface, 1 LAN
> failover interface and 1 State failover interface.
>
> 7. There are about 80 ACL lines, however, almost every single line
> references a Network Object Group and a Service Group. So, in truth,
> there are probably a few hundred ACL lines.
>
> 8. There are 130+ NAT Statements
>
> 9. These devices also run OSPF on the outside and inside
> interfaces (2 OSPF areas) with about 6 peers.
>
> 10. The current CPU utilization is 9% (approximately 15Mbps and 70
> connections/second). It's the summer and the majority of students
> aren't on campus and therefore, bandwidth utilization is down. I also
> don't have any history on the CPU utilization for high-traffic time.
>
> 11. The ARP-cache timeout is set to the default of 14400
>
> 12. There are currently 34 entries in the ARP table
>
>
>
> So, all of that being said, I welcome the thoughts of the members of
> this list with regard to adjusting the ARP-cache timeout.
>
>
>
> Thank you in advance for your help.
>
>
>
> J. Bart Casey
>
> Network Engineer
>
> Wofford College
>
> _______________________________________________
> cisco-nsp mailing list cisco-nsp at puck.nether.net
> https://puck.nether.net/mailman/listinfo/cisco-nsp
> archive at http://puck.nether.net/pipermail/cisco-nsp/
>
--
Alasdair Gow
Lumison
t: 0845 1199 900
d: 0131 514 4042