[c-nsp] help on NAT rate limiting

Church, Chuck cchurch at netcogov.com
Wed Dec 29 09:03:58 EST 2004


Ted,

	Sorry, the per host limiting is a 12.3T feature that was
discussed about a month ago:
https://puck.nether.net/pipermail/cisco-nsp/2004-November/014524.html
That's the one I was thinking about, and probably what you're looking
for.
	Limiting each host to, say, 50 or 100 connections should probably
suffice for most purposes.  Those 24-hour timeouts seem pretty high,
though.  Try the 5-20 minute range like you mentioned.
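
As a rough sketch of the combination suggested above (a per-host cap
plus shorter timers), something like the following might be a starting
point.  The all-host keyword assumes a 12.3T image with the NAT
rate-limiting feature, the values are only illustrative, and whether the
timers take effect for overloaded entries is exactly what is in question
in this thread, so verify on the router itself.

```
! Sketch only; values are illustrative and untested on this 1720.
! Cap every inside host at 100 translations (12.3T rate-limiting feature).
ip nat translation max-entries all-host 100
! TCP idle timeout in seconds (1200 = 20 minutes); the 12.3 default is 86400.
ip nat translation tcp-timeout 1200
! Entries for connections that have seen a FIN or RST age out after 60 seconds.
ip nat translation finrst-timeout 60
```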


Chuck Church
Lead Design Engineer
CCIE #8776, MCNE, MCSE
Netco Government Services - Design & Implementation Team
1210 N. Parker Rd.
Greenville, SC 29609
Home office: 864-335-9473
Cell: 703-819-3495
cchurch at netcogov.com
PGP key: http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x4371A48D 


-----Original Message-----
From: Ted Mittelstaedt [mailto:tedm at toybox.placo.com] 
Sent: Wednesday, December 29, 2004 1:09 AM
To: Church, Chuck; cisco-nsp at puck.nether.net
Subject: RE: [c-nsp] help on NAT rate limiting



> -----Original Message-----
> From: Church, Chuck [mailto:cchurch at netcogov.com]
> Sent: Tuesday, December 28, 2004 10:42 AM
> To: Ted Mittelstaedt; cisco-nsp at puck.nether.net
> Subject: RE: [c-nsp] help on NAT rate limiting
>
>
> Ted,
>
> 	I think the intention of NAT limiting is to limit each internal
> host to 'x' number of translations, so that infected hosts can't create
> thousands of entries and consume all the memory, etc.

That's not how the global command:

ip nat translation max-entries 50000

works.  All hosts share the same translation pool.

> I don't think
> it's supposed to be a single threshold for all internal devices.

Sorry but that is how it works.  Yes, you can get fancy with
access lists but the basic command is a global.
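
For comparison, a sketch of the forms this command can take, as
documented for the 12.3T NAT rate-limiting feature.  The host address,
the ACL name NAT-LIMIT, the subnet and all the numbers are hypothetical,
and which keywords are accepted depends on the image in use.

```
! Global ceiling: one pool shared by all inside hosts.
ip nat translation max-entries 50000
! The same limit applied individually to each inside host.
ip nat translation max-entries all-host 100
! Limit for one specific host (address is hypothetical).
ip nat translation max-entries host 192.168.1.50 50
! Limit tied to a named ACL (name and subnet are hypothetical).
ip access-list standard NAT-LIMIT
 permit 192.168.1.0 0.0.0.255
ip nat translation max-entries list NAT-LIMIT 200
```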

> Haven't played with it, so I'm not totally sure.  Regarding the
> timeouts, I've had good results with:
> ip nat translation timeout 150
         ^^^^^^^^^^^^^^^^^^^^^^^
The manual explicitly states that the "ip nat translation timeout"
command does NOT apply to overloading.  See the following:

http://www.cisco.com/univercd/cc/td/doc/product/software/ios123/123cgcr/ipras_r/ip1_i2g.htm#wp1080144

> ip nat translation tcp-timeout 300
> ip nat translation udp-timeout 120
> ip nat translation finrst-timeout 120
> ip nat translation syn-timeout 300
> ip nat translation dns-timeout 15
> ip nat translation icmp-timeout 10
>
> 	This is on a 1720, by the way.  Regarding the NAT timeout issue
> and overloading, it should work.

The manual isn't clear if these apply to overloading or not.

> What does 'sh ip nat tra ver' tell you
> as far as the countdown timers are concerned?

What it shows is thousands of tcp translation entries that are
persistent for up to 24 hours.  I never get this on any routers running
IOS versions under 12.3.  On routers running IOS 12.2 and lower in
similar situations there will be a handful of 24 hour tcp connections
but the rest are like yours, with minutes left.

That is why I would like to know what exactly Cisco did with these
translation timers in 12.3.  If in reality the IOS 12.2 and lower used
24 hours for their tcp translation timeouts, then what we have is
a big fat bug in IOS 12.3 - in short, the translator is losing or
missing most of the TCP close commands that are being passed through
the connections.

>
> Are you saying in 12.3 these timers are much longer?
>

The manual claims the following defaults in 12.3:

timeout: 86400 seconds (24 hours)
udp-timeout: 300 seconds (5 minutes)
dns-timeout: 60 seconds (1 minute)
tcp-timeout: 86400 seconds (24 hours)  24 hours!!!!
finrst-timeout: 60 seconds (1 minute)
icmp-timeout: 60 seconds (1 minute)
pptp-timeout: 86400 seconds (24 hours)
syn-timeout: 60 seconds (1 minute)
port-timeout: 0 (never)

I'll try a

ip nat translation tcp-timeout 1200

command and see if it applies to overloading and fixes the problem.
But this is just an absurd situation.  TCP connections, unless
initiated from a virus, have definite tcp connection closes that
are passed in virtually all situations.  The translator in 12.3
should be looking at these and when it sees a close or reset
come through it should tear down the tcp translation entry, not
leave it lying around for 24 hours!!!
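
One way to check whether the shorter timer really applies to overloaded
entries is to set it, flush the table, and watch the countdown on newly
created entries.  A sketch only, with the caveat that clearing the table
drops every active session through the router:

```
! Sketch only. In global config, set the shorter TCP timer:
ip nat translation tcp-timeout 1200
! Then from exec mode, flush the table (this drops all active sessions):
clear ip nat translation *
! Check the time remaining on new tcp entries; if the timer applies,
! it should start at or below 20 minutes rather than 24 hours:
show ip nat translations verbose
! Total entry count, to see whether the table still grows toward the limit:
show ip nat statistics
```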

24 hours is really pretty absurd anyhow.  Even more so because
very few tcp protocols that can sit idle for long stretches go
without keepalives.  About the only thing I can think of that this
might help is if someone is running some crusty old Telnet daemon
that doesn't issue keepalives.  All other tcp protocols I can think
of will open a connection and pass data back and forth almost the
whole time it is open - and if there is a possibility they won't,
they will use keepalives.

The only other thing I can think might possibly be the explanation
here is that there's a bug in IOS 12.3 where the translator is
missing some of the tcp connection close commands.  But, this is
IOS 12.3.12 and I've had this same problem on earlier 12.3 versions
on a 1600 series at a different customer - I switched back to 12.1
on that one.  I find it hard to believe such a glaring bug (which
basically makes nat useless in a default configuration) would not
have been found after twelve iterations of IOS!!!

Ted

>
>
> -----Original Message-----
> From: cisco-nsp-bounces at puck.nether.net
> [mailto:cisco-nsp-bounces at puck.nether.net] On Behalf Of Ted Mittelstaedt
> Sent: Tuesday, December 28, 2004 1:02 PM
> To: cisco-nsp at puck.nether.net
> Subject: [c-nsp] help on NAT rate limiting
>
> Hi All,
>
>   We have a customer that's a small office about 20 people
> behind a 1720.  The router is configured to overload on to
> a single IP address, and has a vpn to another 1720 coming
> in to it.
>
>   They wanted another ethernet interface in this so we put a
> wic-1enet card into the router - this required going to 12.3
> ios to support the hardware and that is when all hell broke loose.
>
>   Previous to 12.3 the ios had no way to rate limit nat -
> normally the translation table would run about a couple hundred
> entries.  Every once in a while they would get a virus and
> the table would balloon - which was simple to diagnose by
> showing the nat translation table and finding the offending inside
> ip address.  Once the virus was removed, the table would go back to
> normal.  They were running 12.1 on that 1700 for a year at
> least with no other problems.
>
>   Now with 12.3 there is a way to rate limit nat - but the
> people at Cisco that thought this was a good idea
> quite obviously figured they would -raise- all the timeouts
> in the translator.  So now, even without a virus, the router
> will sometimes run at around 20,000 translation entries.
>
>   Configuring rate limiting to whack off the table at 2-3 thousand
> entries creates a situation where the router simply runs up
> the translation table to the limit, then stops creating new
> entries.
>
>   We want to reset the timeouts in ios back to what they
> were rather than trying to whack the table off at its knees -
> but there is no info I can find on the Cisco website as to
> what the SENSIBLE timeouts were that were used in 12.1, 12.0,
> etc.  And furthermore the ios commands that are available for
> reducing the timeouts don't apply to overloads - which of course
> is what everything on this router is.
>
>   Going back to an old IOS is not possible because of the
> ethernet wic.
>
>   Whoever did this at Cisco obviously never heard of the
> axiom "if it ain't broke don't fix it".  A nat rate-limiting
> command is an impossibility - a virus will use all available
> ram in the router for translation entries no matter how high
> or how low the limit is set - and will just max out the translation
> slots with the rate-limit set, and the router stops working,
> so this command gains nothing.  And to put a command like this
> in and use it as a license to raise the timeouts which is what
> it seems they have done is absurd.
>
>   No doubt Cisco was besieged with idiots trying to press wussy-assed
> routers into service as translators for fortune 100 companies -
> they should have told those morons to go pound sand and buy a pix
> and left the translation code for the small routers alone, it was
> working fine before.  Changing the translator operation in 12.3
> has screwed it for everyone else I think.
>
>   Please someone, tell me the documentation is wrong and that the
> nat timeout commands do apply to overloads!
>
> Ted



