[c-nsp] ARP behavior with HSRP and static NAT

Fri Oct 5 06:25:54 EDT 2012

IOS SNAT is EOL and is in the process of being deprecated:

http://www.cisco.com/en/US/prod/collateral/iosswrel/ps6537/ps6586/ps6640/end_of_life_notice_c51-611706.html

Eventually, it'll only be available as a feature on firewalls.

-----Original Message-----
From: cisco-nsp-bounces at puck.nether.net
[mailto:cisco-nsp-bounces at puck.nether.net] On Behalf Of Anton Kapela
Sent: 05 October 2012 02:46
To: evan at kisbey.net
Cc: cisco-nsp at puck.nether.net
Subject: Re: [c-nsp] ARP behavior with HSRP and static NAT

On Thu, Oct 4, 2012 at 4:17 PM,  <evan at kisbey.net> wrote:

> The routers involved have HSRP on both the WAN and the LAN-side
> interfaces, and NAT across the pair with identical NAT statements on
> each. To force a full failover in case link is lost on a single
> interface, there's a track running for each HSRP interface on its
> opposite (LAN versus WAN) on the primary router, decrementing its
> priority and letting the standby router know that it's time to
> preempt when the primary goes

[snip]

I've tried as much as you describe, and never got it to work right.
Having de-sync'd NAT state tables is never any fun, ever -- for either
inbound- or outbound-originated sessions. I wouldn't recommend anyone
roll two autonomous hosts doing NAT in such a fashion.

I'd recommend checking out something else, IOS SNAT. We've used this
in active-active configs, with routing protocols, bfd, etc. cranked
up, and it was mostly great. Details here:

http://www.cisco.com/en/US/products/sw/iosswrel/ps1839/products_white_paper09186a0080118b04.shtml
http://www.cisco.com/en/US/docs/ios/12_3t/12_3t7/feature/guide/gtsnatay.html
http://www.cisco.com/en/US/docs/ios/12_4/12_4_mainline/snatsca.html

In my use case, we did HSRP floating addr facing 'ISP' side, and
originated 0/0 via various protocols towards 'inside' gear/links/etc
-- we did not use HSRP in any capacity facing the 'inside.' There's no
real issue in doing HSRP on inside + outside, but it's jankier than it
needs to be. If your ISP can do eBGP/private AS stuff, and let you
originate a given bit of address space, I'd strongly suggest that
instead of HSRP altogether.

One takeaway from our lab/test work is worth special mention: the SNAT
state sync traffic seems to have higher cpu priority than HSRP, but
not higher than BGP, OSPF, and BFD. That is, if one had to 'order' the
relative CPU priority, it looked like: bfd, ospf, bgp, snat, hsrp --
which is kinda 'eh.' We tested 15.0, 15.1M and T, and 15.2T on a broad
set of hardware (isr 2800/2900 g1's, g2's, npe-g1, and the 7201).

Net result -- when 'flow dense' traffic (e.g. icmp/tcp/etc scans of the
entire internet) or other abusive levels of state-inducing traffic were
sourced from test systems on the 'inside,' the SNAT replication
activity would consume appropriate CPU, but would block reception and
processing of HSRP hellos between the active/standby routers.

As all the IGP's would stay up, everything looked ok 'inside,' and so
the routing topology was stable.

Of course, bouncing HSRP active/standby events facing the 'ISP'
outside network had a pretty horrible result. During such abuse
tests, there would be rolling/cycling instability while both routers
'claimed' they were 'the active master' towards the ISP gear; this
caused the usual nonsense one might see with >1 host claiming ARP
responses for a given layer 3 address. YMMV, AMFYOY, etc.
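
To make the failure mode above concrete, here is a rough Python sketch
of strict-priority CPU scheduling with HSRP at the bottom. It is a toy
model, not how IOS schedules anything internally: the work units, the
per-second budget, and the run() helper are all made up for
illustration. Only the ordering (bfd, ospf, bgp, snat, hsrp) comes from
the lab observation, and the 3s/10s timers are the HSRP defaults.

```python
# Illustrative model only: strict-priority scheduling where HSRP hello
# processing sits below SNAT replication. All load numbers are invented.

# Lower number = served first; ordering taken from the lab observation.
PRIORITY = {"bfd": 0, "ospf": 1, "bgp": 2, "snat": 3, "hsrp": 4}

HELLO_INTERVAL = 3   # seconds, HSRP default hello time
HOLD_TIME = 10       # seconds, HSRP default hold time


def run(snat_load, budget=100, seconds=30):
    """Return the seconds at which the standby's hold timer expires.

    snat_load: SNAT replication work units arriving each second.
    budget:    work units the CPU can process each second (hypothetical).
    """
    backlog = {name: 0 for name in PRIORITY}
    last_hello_seen = 0
    expiries = []
    for now in range(1, seconds + 1):
        backlog["snat"] += snat_load
        for proto in ("bfd", "ospf", "bgp"):
            backlog[proto] += 1          # light, steady control traffic
        if now % HELLO_INTERVAL == 0:
            backlog["hsrp"] += 1         # one hello from the active peer

        # Serve queues strictly in priority order until the budget is gone.
        remaining = budget
        for proto in sorted(PRIORITY, key=PRIORITY.get):
            done = min(backlog[proto], remaining)
            backlog[proto] -= done
            remaining -= done
            if proto == "hsrp" and done:
                last_hello_seen = now

        if now - last_hello_seen >= HOLD_TIME:
            expiries.append(now)         # standby declares itself active
            last_hello_seen = now        # simplification: timer restarts
    return expiries


print(run(snat_load=50))    # replication fits in the budget
print(run(snat_load=200))   # replication saturates the budget
```

With the load under the budget, hellos always get serviced and the list
of expiries stays empty. Once SNAT work alone exceeds the budget, the
higher-priority protocols still get their share (so the IGPs look
fine), HSRP gets nothing, and the hold timer expires over and over.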

Some relief was found with stuff like the slightly misleading name of
"Rate Limiting NAT Translation" -- it's not a RATE at all, just a
simple state limit of 'max translations:' allowed for a given "inside"
or "outside" source IP:

http://www.cisco.com/en/US/docs/ios/12_3t/12_3t4/feature/guide/gt_natrl.html#wp1027129

In practice, a limit of a few tens of k flows per source IP kept
things reasonably stable under high-rate nat churn.
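
The difference between a state cap and a true rate limit can be
sketched in a few lines of Python. This is only an illustration of the
idea: the NatTable class, its method names, and the addresses are
hypothetical and say nothing about how IOS implements the feature (the
feature guide linked above has the real configuration syntax).

```python
from collections import defaultdict


class NatTable:
    """Toy NAT table with a per-source cap on concurrent translations."""

    def __init__(self, max_per_source):
        self.max_per_source = max_per_source
        self.entries = set()               # (src_ip, src_port, dst) flows
        self.count = defaultdict(int)      # live entries per source IP

    def translate(self, src_ip, src_port, dst):
        """Return True if the flow has (or gets) a translation entry."""
        flow = (src_ip, src_port, dst)
        if flow in self.entries:
            return True                    # existing state, nothing new
        if self.count[src_ip] >= self.max_per_source:
            return False                   # cap hit: refuse new state
        self.entries.add(flow)
        self.count[src_ip] += 1
        return True

    def expire(self, src_ip, src_port, dst):
        """Remove a flow's entry, freeing a slot for its source IP."""
        flow = (src_ip, src_port, dst)
        if flow in self.entries:
            self.entries.remove(flow)
            self.count[src_ip] -= 1


table = NatTable(max_per_source=3)
scanner = [table.translate("10.0.0.5", 1000 + i, "198.51.100.1:80")
           for i in range(5)]
print(scanner)                                              # scanner is capped
print(table.translate("10.0.0.6", 1000, "198.51.100.1:80")) # other host is not
```

Note that nothing here counts events per second: a source that stays
under its cap can still create and expire entries as fast as it likes,
which is why the cap only helps indirectly with high-rate churn.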

Perhaps if there were a "nat table miss packets per second per source
IP" knob (like a microflow exceptions policer in CoPP or a policy
map), we'd have had better luck under abusive workloads with SNAT, but
alas, we're not offering to pay for one, and it would seem nobody else
has yet.

All in all, for the typical case, SNAT is pretty great -- nothing
breaks when a link/box/route is down, and nobody has to know things
migrated between border devices as such.

Best,

-Tk