[c-nsp] help on NAT rate limiting
Ted Mittelstaedt
tedm at toybox.placo.com
Wed Dec 29 02:25:38 EST 2004
As I mentioned in the other mail, this problem with
thousands of TCP translation entries persisting in the
router for up to 24 hours has been present in several
versions of IOS 12.3 that I've tried, at several different
customers and on several different routers (1721 and 1601).
It is present in the most current IOS.
Here is the config used in the 1601 (4MB DRAM, 8MB flash). This works
perfectly under 12.1.25 and blows up under 12.3:
!
! Last configuration change at 14:11:18 PST Tue Dec 7 2004
! NVRAM config last updated at 14:11:27 PST Tue Dec 7 2004
!
version 12.1
no service single-slot-reload-enable
service timestamps debug uptime
service timestamps log datetime localtime show-timezone
no service password-encryption
!
hostname eatme
!
logging buffered 4096 debugging
no logging console
enable password eatme
!
!
!
!
!
clock timezone PST -8
clock summer-time PDT recurring
ip subnet-zero
ip name-server 1.1.1.1
ip name-server 2.2.2.2
ip dhcp excluded-address 192.168.1.1 192.168.1.99
ip dhcp excluded-address 192.168.1.201 192.168.1.254
!
ip dhcp pool 1
 network 192.168.1.0 255.255.255.0
 default-router 192.168.1.1
 dns-server 192.168.1.2 8.8.8.8
 domain-name eatme.local
 netbios-node-type m-node
 netbios-name-server 192.168.1.2
!
!
!
!
interface Ethernet0
 ip address 192.168.1.1 255.255.255.0
 ip nat inside
!
interface Serial0
 no ip address
 encapsulation frame-relay IETF
 no keepalive
!
interface Serial0.1 point-to-point
 ip address 18.17.4.158 255.255.255.252
 ip nat outside
 frame-relay interface-dlci 16
!
!
ip nat pool net-18 18.17.42.225 18.17.42.225 netmask 255.255.255.248
ip nat inside source list 1 pool net-18 overload
ip nat inside source static tcp 192.168.1.2 3389 18.17.42.226 3390 extendable
ip nat inside source static tcp 192.168.1.3 3389 18.17.42.226 3389 extendable
ip nat inside source static tcp 192.168.1.2 443 18.17.42.226 443 extendable
ip nat inside source static tcp 192.168.1.2 80 18.17.42.226 80 extendable
ip nat inside source static tcp 192.168.1.2 25 18.17.42.226 25 extendable
ip classless
ip route 0.0.0.0 0.0.0.0 18.17.4.157
no ip http server
!
access-list 1 permit 192.168.1.0 0.0.0.255
!
line con 0
line vty 0 4
 password eatme
 login
!
sntp server 6.12.8.10
end
Here is the config used on the 1721 (64MB DRAM, 32MB flash). There is one
difference from the prior setup: the serial interface is replaced with an
Ethernet interface. Under 12.2 and earlier with the serial interface it was
fine; under 12.3 with the Ethernet interface it blows up:
! No configuration change since last restart
!
version 12.3
service timestamps debug uptime
service timestamps log datetime localtime show-timezone
no service password-encryption
!
hostname eatme2
!
boot-start-marker
boot-end-marker
!
logging buffered 4096 debugging
no logging console
enable password eatme2
!
clock timezone PST -8
clock summer-time PDT recurring
mmi polling-interval 60
no mmi auto-configure
no mmi pvc
mmi snmp-timeout 180
no aaa new-model
ip subnet-zero
no ip source-route
!
!
ip domain name eatme2.com
ip name-server 1.1.1.1
ip name-server 2.2.2.2
ip dhcp excluded-address 10.0.0.1 10.0.0.99
!
ip dhcp pool 1
 network 10.0.0.0 255.255.255.0
 default-router 10.0.0.1
 netbios-node-type h-node
 netbios-name-server 10.0.0.3
 dns-server 10.0.0.4 3.3.3.3
!
ip cef
ip audit po max-events 100
no ftp-server write-enable
!
!
!
!
!
crypto isakmp policy 11
 hash md5
 authentication pre-share
crypto isakmp key exhibit-me address 6.16.4.162
!
!
crypto ipsec transform-set eatme2-or esp-des esp-md5-hmac
!
crypto map nolan 11 ipsec-isakmp
 set peer 6.16.4.162
 set transform-set eatme2-or
 match address 100
!
!
!
interface Ethernet0
 ip address 65.7.19.60 255.255.255.248 secondary
 ip address 65.7.19.61 255.255.255.248 secondary
 ip address 65.7.19.58 255.255.255.248
 ip nat outside
 full-duplex
 crypto map nolan
!
interface FastEthernet0
 ip address 10.0.0.1 255.255.255.0
 ip nat inside
 speed auto
!
ip nat inside source static tcp 10.0.0.3 25 interface Ethernet0 25
ip nat inside source static tcp 10.0.0.3 110 interface Ethernet0 110
ip nat inside source static tcp 10.0.0.3 3389 interface Ethernet0 3389
ip nat inside source static tcp 10.0.0.253 20 interface Ethernet0 20
ip nat inside source static tcp 10.0.0.253 21 interface Ethernet0 21
ip nat inside source route-map nonat interface Ethernet0 overload
ip nat inside source static tcp 10.0.0.3 80 65.7.19.61 80 extendable
ip nat inside source static tcp 10.0.0.4 80 65.7.19.60 80 extendable
ip classless
ip route 0.0.0.0 0.0.0.0 65.7.19.57
no ip http server
no ip http secure-server
!
!
access-list 1 permit 10.0.0.0 0.0.0.255
access-list 100 permit ip 10.0.0.0 0.0.0.255 192.168.0.0 0.0.0.255
access-list 110 deny ip 10.0.0.0 0.0.0.255 192.168.0.0 0.0.0.255
access-list 110 permit ip 10.0.0.0 0.0.0.255 any
!
route-map nonat permit 10
 match ip address 110
!
!
line con 0
line aux 0
line vty 0 4
 password eatme2
 login
!
sntp server 8.8.8.8
end
> and any time you change a timer you can find a scenario
> where it isn't the optimal value.
Let me point out that in the past, when Cisco changed the default
on IP directed broadcast from ON to OFF to shut down the fun of the
smurf crowd, there was at least one IOS rev where, on a blank config,
the statement
no ip directed-broadcast
automagically appeared on every interface in the router. Previously
the default was for directed broadcasts to be on, and when Cisco
decided to default them off, you made it so that the user would see
the change right there in the config.
This is how you should be doing it if you deviate on the NAT settings
from your prior defaults. If 12.3 extends the tcp settings to 24
hours from 5 hours, then it should show that as a statement that
automatically appears in the config. Don't just spring it on the
user unawares.
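For anyone who wants the timers visible in the config rather than implied,
here is a minimal sketch. It assumes the `ip nat translation` timeout
commands are honored for overload (PAT) entries on the release in question,
which is exactly the point in doubt in this thread. The values are
illustrative placeholders, not the pre-12.3 defaults, which are not
documented anywhere I can find.

```
! Illustrative values only; not confirmed to match any earlier IOS default
ip nat translation tcp-timeout 3600
ip nat translation udp-timeout 300
ip nat translation finrst-timeout 60
ip nat translation syn-timeout 60
```

After applying something like this, `show ip nat translations verbose`
should list the time left on each entry, which is a quick way to check
whether the shorter timers are actually taking effect on overload entries.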
And I might also add that while the URL referenced tells all about
the changes, it DOESN'T tell what the PREVIOUS defaults were in the
older IOS. NOR does it tell if you did NOT make changes in the
previous defaults.
I liked the old way that the translator worked. Sure, if a customer
got a worm then their router crapped out and knocked them off the
Internet. As a service provider WE LIKE THAT. The last thing I want
is some customer too stupid to run antivirus internally to saturate
his connection to me and impact all my other users with his bazillions
of virus attempts to infect the rest of the world.
The goal should definitely NOT be to keep the router alive
during the time that a customer is infected and trying to infect the
rest of the world. We WANT him to call us all pissed off about his
Internet connection going into the toilet so that we can force him to
clean up his network immediately to get his router running again.
All of the low-end translators on the market - like that Linksys product
you're selling now - work this way. If the customer gets infected with a
virus - bang, the router dies. It's a fantastic firestop to help limit
the rapid spread of trojans and viruses. As a manufacturer of a great
deal of equipment on the Internet you have a responsibility to the
rest of the world to NOT manufacture devices that ASSIST worms and
viruses to propagate, by keeping the Cisco router alive - even while
someone is pumping 50K infection attempts to the rest of the world.
And yeah, sure, I know there are probably some service providers that
are running NAT in their big-iron routers. What I say to them is:
hey, what gives you the right to support a field of customers that
are infected with viruses and trojans that are attacking -me-? If
you don't want your big-iron router running NAT to keep crashing,
then make your customers clean up their act. I make our customers
keep their noses clean, why don't you?
I'll close on this:
For years Sendmail's default was to permit promiscuous relaying. With each
new version, people who knew better screamed to get the default closed.
The sendmail maintainers didn't want to do it because they didn't want
to deal with the lazy asses bitching to them.
Finally, under overwhelming pressure, they capitulated. And sure enough,
hundreds if not thousands of lazy people who didn't want to go to
the trouble of running auth-smtp and didn't want to do some work to
organize their network to close down the holes screamed and moaned
about the new version (despite the fact that turning on promiscuous
relaying was simple - these are so-called administrators too lazy to
read the manual, mind). But in the long run, if this hadn't happened,
e-mail would be unusable today.
It does not pay to modify your WAN networking products to support
people who are too lazy to do things the right way. The right way to
fix a router that keeps crashing because its NAT table is overloaded
by worms is to clean the worms; it is not to introduce a crutch like
rate limiting that allows the router to keep the infected system
going. And if it's not worms doing it, but some so-called 'legitimate'
application, then the right way is to ban use of the application.
And in case nobody told you, no one that runs Kazaa on the Internet
is using it for legal activities.
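For reference, the rate limiting being argued about above looks roughly
like the following on 12.3(4)T and later. This is only a sketch, assuming
the `ip nat translation max-entries` command from the feature guide cited
in this thread; the limits and the host address are made-up examples, not
recommendations.

```
! Cap the whole translation table (example value)
ip nat translation max-entries 3000
! Cap a single inside host (hypothetical address and limit)
ip nat translation max-entries host 192.168.1.50 200
```

Either form only bounds the size of the table. Neither one shortens how
long an idle entry lingers, and neither does anything about the infected
machine itself.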
Ted
> -----Original Message-----
> From: Rodney Dunn [mailto:rodunn at cisco.com]
> Sent: Tuesday, December 28, 2004 1:55 PM
> To: Ted Mittelstaedt
> Cc: cisco-nsp at puck.nether.net
> Subject: Re: [c-nsp] help on NAT rate limiting
>
>
> Please provide more information so someone
> can help answer your question:
>
> Version of code (exactly)
> Configuration you are using
> etc..
>
> There were a lot of NAT changes that went
> into 12.3(4)T for major scalability problems.
> There were also different changes made to give
> users the ability to do various rate limiting.
> Here is a good page on it:
>
> http://www.cisco.com/en/US/products/sw/iosswrel/ps5207/products_feature_guide09186a00801d09f0.html
>
> Defaults are not changed just for the heck of it
> and any time you change a timer you can find a scenario
> where it isn't the optimal value.
>
> Rodney
>
>
>
>
> On Tue, Dec 28, 2004 at 10:02:06AM -0800, Ted Mittelstaedt wrote:
> > Hi All,
> >
> > We have a customer that's a small office about 20 people
> > behind a 1720. The router is configured to overload on to
> > a single IP address, and has a vpn to another 1720 coming
> > in to it.
> >
> > They wanted another ethernet interface in this so we put a
> > wic-1enet card into the router - this required going to 12.3
> > ios to support the hardware and that is when all hell broke loose.
> >
> > previous to 12.3 the ios had no way to rate limit nat -
> > normally the translation table would run about a couple hundred
> > entries. Every once in a while they would get a virus and
> > the table would balloon - which would be simple to see by
> > showing the nat translation table, finding the offending inside
> > ip address, and removing the virus, the table would go back to
> > normal. They were running 12.1 on that 1700 for a year at
> > least with no other problems.
> >
> > Now with 12.3 there is a way to rate limit nat - but the
> > people at Cisco that thought this was a good idea
> > quite obviously figured they would -raise- all the timeouts
> > in the translator. So now, even without a virus, the router
> > will run on average of 20,000 translation entries sometimes.
> >
> > configuring rate limiting to whack off the table at 2-3 thousand
> > entries creates a situation where the router simply runs up
> > the translation table to the limit, then stops creating new
> > entries.
> >
> > We want to reset the timeouts in ios back to what they
> > were rather than trying to whack the table off at its knees -
> > but there is no info I can find on the Cisco website as to
> > what the SENSIBLE timeouts were that were used in 12.1, 12.0,
> > etc. And furthermore the ios commands that are available for
> > reducing the timeouts don't apply to overloads - which of course
> > is what everything on this router is.
> >
> > Going back to an old IOS is not possible because of the
> > ethernet wic.
> >
> > Whoever did this at Cisco obviously never heard of the
> > axiom "if it ain't broke don't fix it". A nat rate-limiting
> > command is an impossibility - a virus will use all available
> > ram in the router for translation entries no matter how high
> > or how low the limit is set - and will just max out the translation
> > slots with the rate-limit set, and the router stops working,
> > so this command gains nothing. And to put a command like this
> > in and use it as a license to raise the timeouts which is what
> > it seems they have done is absurd.
> >
> > No doubt Cisco was besieged with idiots trying to press wussy-assed
> > routers into service as translators for fortune 100 companies -
> > they should have told those morons to go pound sand and buy a pix
> > and left the translation code for the small routers alone, it was
> > working fine before. Changing the translator operation in 12.3
> > has screwed it for everyone else I think.
> >
> > Please someone, tell me the documentation is wrong and that the
> > nat timeout commands do apply to overloads!
> >
> > Ted