[c-nsp] help on NAT rate limiting

Ted Mittelstaedt tedm at toybox.placo.com
Thu Dec 30 01:38:23 EST 2004


c1700-k9o3sy7-mz.123-12.bin

But the problem also happened with

c1600-y-mz.123-10.bin or c1600-y-mz.123-11.bin
or some such, last month.  At the time I didn't
suspect an IOS problem; I put it down to bad hardware.

They aren't running H.245 over this as far as I know.

> Do you have a sniffer trace or debug ip nat to
> show that a translation is there when the
> TCP RST/FIN was presented to the box such that
> the translation should have been torn down?

Well, it's only my opinion that a RST/FIN should
tear down a dynamic tcp translation entry, of course. :-)  I
would be interested in an explanation to the
contrary. :-) :-)

At this point, I don't want to muck further with this
customer - we are already on a bit of thin ice with them
as we basically railroaded them into buying the works,
IOS IPSec licenses for both routers, a new router for
the remote California office, Cisco service for both
routers, etc.  All to replace these 2 SonicWall
firewall/vpn devices that one of their hand-picked
buffoon-I-mean-consultants swore on a stack of Bibles
would be the best VPN solution this side of the
Mississippi.  Said consultant mucked with this for a
year and never got it to work at all - causing the
customer to have enough pain so they were willing to
not cheap it out the next time someone tried.

If I were going to deal with them further I'd run it
through CCO on their service contract.  Since the hack of
setting the TCP timeout to 20 minutes is working for now,
I'd rather not disturb them.
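
For anyone following along, the workaround amounts to something like
the fragment below.  Treat it as a sketch: the exact line on the
customer's router is not shown in this thread, and 1200 is simply
20 minutes expressed in seconds.

```
! Sketch only, not copied from the customer's router.
! Shorten the idle timeout for dynamic TCP translations.
! 1200 seconds = 20 minutes; the default discussed in this thread is 24 hours.
ip nat translation tcp-timeout 1200
!
! From exec mode, check afterwards whether the table stops growing:
!   show ip nat statistics
!   show ip nat translations
```

If the entry count in "show ip nat statistics" settles back toward a
few hundred after the change, the timeout is probably being honored
for the overloaded translations.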

However, I can test-bench this one on a 1601 for you
guys.  That's easy enough to do, and I can control
the misbehavior a lot better there, plus I can install
a sniffer box on it.  And I won't have
a problem with opening it up wide so you can get into
it and play around at your convenience.

I'll put the router up on the Internet tomorrow and
if I can duplicate the misbehavior, which I think I can,
I'll post a test set on it so everyone can see what
I'm dealing with.
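
On the bench box, the capture Rodney asked for could be gathered
roughly as follows.  These are standard IOS exec commands, but which
of them end up being used on the test 1601 is an assumption, and the
sequence is only a suggested outline.

```
! Exec-mode commands, sketch only (assumed bench procedure).
show version
show ip nat statistics
show ip nat translations verbose
debug ip nat detailed
! (open and close a TCP session through the router here)
show ip nat translations verbose
show logging
undebug all
```

The verbose form of the translation display should show how much time
is left on each entry, so comparing it before and after the FIN/RST
would indicate whether the entry was torn down or left to age out.
"show logging" is there because both configs in this thread log
debug output to the buffer rather than the console.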

Ted

> -----Original Message-----
> From: Rodney Dunn [mailto:rodunn at cisco.com]
> Sent: Wednesday, December 29, 2004 7:14 AM
> To: Ted Mittelstaedt
> Cc: Rodney Dunn; cisco-nsp at puck.nether.net
> Subject: Re: [c-nsp] help on NAT rate limiting
>
>
> 12.3 what.  I need the exact version so
> I can verify on some bugs that are listed.
>
> ie:
>
> CSCec08867
> Internally found severe defect: Verified (V)
> NAT translations for TCP sessions opened for H245 arent torndown
>
> fixed in 12.3(5.2) for example.
>
> I haven't heard of anything about the default
> timeout of 24 hours being changed.  I'll try and
> help solve the problem but first we have to
> determine exactly what the problem is.
> ie: did the default change (and what default) or
> is it a bug.
>
> Do you have a sniffer trace or debug ip nat to
> show that a translation is there when the
> TCP RST/FIN was presented to the box such that
> the translation should have been torn down?
>
> As for the other comments there are hundreds
> of customers (both SP and enterprise) that ask
> for rate limiting functionality for DOS protection.
>
> Rodney
>
>
>
> On Tue, Dec 28, 2004 at 11:25:38PM -0800, Ted Mittelstaedt wrote:
> >
> > As I mentioned on the other mail, this problem with the
> > thousands of TCP translation entries persisting in the
> > router for up to 24 hours has been present in several
> > versions of IOS 12.3 that I've tried, in several different
> > customers and several different routers.  (1721 and 1601)
> > It is present in the most current IOS.
> >
> > Here is the config used in the 1601 - 4MB dram, 8MB flash - this works
> > perfectly under 12.1.25 and blows up under 12.3
> >
> > !
> > ! Last configuration change at 14:11:18 PST Tue Dec 7 2004
> > ! NVRAM config last updated at 14:11:27 PST Tue Dec 7 2004
> > !
> > version 12.1
> > no service single-slot-reload-enable
> > service timestamps debug uptime
> > service timestamps log datetime localtime show-timezone
> > no service password-encryption
> > !
> > hostname eatme
> > !
> > logging buffered 4096 debugging
> > no logging console
> > enable password eatme
> > !
> > !
> > !
> > !
> > !
> > clock timezone PST -8
> > clock summer-time PDT recurring
> > ip subnet-zero
> > ip name-server 1.1.1.1
> > ip name-server 2.2.2.2
> > ip dhcp excluded-address 192.168.1.1 192.168.1.99
> > ip dhcp excluded-address 192.168.1.201 192.168.1.254
> > !
> > ip dhcp pool 1
> >    network 192.168.1.0 255.255.255.0
> >    default-router 192.168.1.1
> >    dns-server 192.168.1.2 8.8.8.8
> >    domain-name eatme.local
> >    netbios-node-type m-node
> >    netbios-name-server 192.168.1.2
> > !
> > !
> > !
> > !
> > interface Ethernet0
> >  ip address 192.168.1.1 255.255.255.0
> >  ip nat inside
> > !
> > interface Serial0
> >  no ip address
> >  encapsulation frame-relay IETF
> >  no keepalive
> > !
> > interface Serial0.1 point-to-point
> >  ip address 18.17.4.158 255.255.255.252
> >  ip nat outside
> >  frame-relay interface-dlci 16
> > !
> > !
> > ip nat pool net-18 18.17.42.225 18.17.42.225 netmask 255.255.255.248
> > ip nat inside source list 1 pool net-18 overload
> > ip nat inside source static tcp 192.168.1.2 3389 18.17.42.226 3390 extendable
> > ip nat inside source static tcp 192.168.1.3 3389 18.17.42.226 3389 extendable
> > ip nat inside source static tcp 192.168.1.2 443 18.17.42.226 443 extendable
> > ip nat inside source static tcp 192.168.1.2 80 18.17.42.226 80 extendable
> > ip nat inside source static tcp 192.168.1.2 25 18.17.42.226 25 extendable
> > ip classless
> > ip route 0.0.0.0 0.0.0.0 18.17.4.157
> > no ip http server
> > !
> > access-list 1 permit 192.168.1.0 0.0.0.255
> > !
> > line con 0
> > line vty 0 4
> >  password eatme
> >  login
> > !
> > sntp server 6.12.8.10
> > end
> >
> >
> > Here is the config used on the 1721 - 64MB dram, 32MB flash.  There is one
> > difference in it than prior - the serial interface is replaced with an enet
> > interface.  Under 12.2 and earlier plus the serial interface it was fine,
> > under 12.3 with the ethernet interface it blows up:
> >
> > ! No configuration change since last restart
> > !
> > version 12.3
> > service timestamps debug uptime
> > service timestamps log datetime localtime show-timezone
> > no service password-encryption
> > !
> > hostname eatme2
> > !
> > boot-start-marker
> > boot-end-marker
> > !
> > logging buffered 4096 debugging
> > no logging console
> > enable password eatme2
> > !
> > clock timezone PST -8
> > clock summer-time PDT recurring
> > mmi polling-interval 60
> > no mmi auto-configure
> > no mmi pvc
> > mmi snmp-timeout 180
> > no aaa new-model
> > ip subnet-zero
> > no ip source-route
> > !
> > !
> > ip domain name eatme2.com
> > ip name-server 1.1.1.1
> > ip name-server 2.2.2.2
> > ip dhcp excluded-address 10.0.0.1 10.0.0.99
> > !
> > ip dhcp pool 1
> >    network 10.0.0.0 255.255.255.0
> >    default-router 10.0.0.1
> >    netbios-node-type h-node
> >    netbios-name-server 10.0.0.3
> >    dns-server 10.0.0.4 3.3.3.3
> > !
> > ip cef
> > ip audit po max-events 100
> > no ftp-server write-enable
> > !
> > !
> > !
> > !
> > !
> > crypto isakmp policy 11
> >  hash md5
> >  authentication pre-share
> > crypto isakmp key exhibit-me address 6.16.4.162
> > !
> > !
> > crypto ipsec transform-set eatme2-or esp-des esp-md5-hmac
> > !
> > crypto map nolan 11 ipsec-isakmp
> >  set peer 6.16.4.162
> >  set transform-set eatme2-or
> >  match address 100
> > !
> > !
> > !
> > interface Ethernet0
> >  ip address 65.7.19.60 255.255.255.248 secondary
> >  ip address 65.7.19.61 255.255.255.248 secondary
> >  ip address 65.7.19.58 255.255.255.248
> >  ip nat outside
> >  full-duplex
> >  crypto map nolan
> > !
> > interface FastEthernet0
> >  ip address 10.0.0.1 255.255.255.0
> >  ip nat inside
> >  speed auto
> > !
> > ip nat inside source static tcp 10.0.0.3 25 interface Ethernet0 25
> > ip nat inside source static tcp 10.0.0.3 110 interface Ethernet0 110
> > ip nat inside source static tcp 10.0.0.3 3389 interface Ethernet0 3389
> > ip nat inside source static tcp 10.0.0.253 20 interface Ethernet0 20
> > ip nat inside source static tcp 10.0.0.253 21 interface Ethernet0 21
> > ip nat inside source route-map nonat interface Ethernet0 overload
> > ip nat inside source static tcp 10.0.0.3 80 65.7.19.61 80 extendable
> > ip nat inside source static tcp 10.0.0.4 80 65.7.19.60 80 extendable
> > ip classless
> > ip route 0.0.0.0 0.0.0.0 65.7.19.57
> > no ip http server
> > no ip http secure-server
> > !
> > !
> > access-list 1 permit 10.0.0.0 0.0.0.255
> > access-list 100 permit ip 10.0.0.0 0.0.0.255 192.168.0.0 0.0.0.255
> > access-list 110 deny   ip 10.0.0.0 0.0.0.255 192.168.0.0 0.0.0.255
> > access-list 110 permit ip 10.0.0.0 0.0.0.255 any
> > !
> > route-map nonat permit 10
> >  match ip address 110
> > !
> > !
> > line con 0
> > line aux 0
> > line vty 0 4
> >  password eatme2
> >  login
> > !
> > sntp server 8.8.8.8
> > end
> >
> > > and any time you change a timer you can find a scenario
> > > where it isn't the optimal value.
> >
> > Let me point out that in the past when Cisco changed the default
> > on IP directed broadcast from a default of ON to a default of OFF,
> > to shut down the fun of the smurf crowd,
> > that there was at least one IOS rev where on a blank config, the
> > statement
> >
> > no ip directed-broadcast
> >
> > automagically appeared on every interface in the router.  Previously
> > the default was for directed broadcasts to be on - and when Cisco
> > decided to default them off, you made it so that the user would see
> > the change right there in the config.
> >
> > This is how you should be doing it if you deviate on the NAT settings
> > from your prior defaults.  If 12.3 extends the tcp settings to 24
> > hours from 5 hours, then it should show that as a statement that
> > automatically appears in the config.  Don't just spring it on the
> > user unawares.
> >
> > And I might also add that the URL referenced, while it tells all about
> > the changes, DOESN'T tell what the PREVIOUS defaults were in the
> > older IOS.  NOR does it tell if you did NOT make changes in the
> > previous defaults.
> >
> > I liked the old way that the translator worked.  Sure, if a customer
> > got a worm then their router crapped out and knocked them off the
> > Internet.  As a service provider WE LIKE THAT.  The last thing I want
> > is some customer too stupid to run antivirus internally to saturate
> > his connection to me and impact all my other users with his bazillions
> > of virus attempts to infect the rest of the world.
> >
> > The goal should definitely NOT be a goal of keeping the router alive
> > during the time that a customer is infected and trying to infect the
> > rest of the world.  we WANT him to call us all pissed off about his
> > Internet connection going into the toilet so that we can force him to
> > clean up his network immediately to get his router running again.
> >
> > All of the low-end translators on the market - like that Linksys product
> > you're selling now - work this way.  If the customer gets infected with a
> > virus - bang, the router dies.  It's a fantastic firestop to help limit
> > the rapid spread of trojans and viruses.  As a manufacturer of a great
> > deal of equipment on the Internet you have a responsibility to the
> > rest of the world to NOT manufacture devices that ASSIST worms and
> > viruses to propagate, by keeping the Cisco router alive - even while
> > someone is pumping 50K infection attempts to the rest of
> > the world.
> >
> > And yeah, sure, I know there's probably some service providers that
> > are running NAT in their big-iron routers.  What I say to them is
> > hey, what gives you the right to support a field of customers that
> > are infected with viruses and trojans that are attacking -me-?  If
> > you don't like your big iron router running NAT to keep crashing
> > then make your customers clean up their act.  I make our customers
> > keep their noses clean, why don't you?
> >
> > I'll close on this:
> >
> > For years Sendmail's default was to permit promiscuous relaying.  Each
> > new version people who knew better screamed to get the default closed.
> > The sendmail maintainers didn't want to do it because they didn't want
> > to deal with the lazy asses bitching to them.
> >
> > Finally under overwhelming pressure they capitulated.  And sure enough
> > hundreds if not thousands of lazy people who didn't want to go to
> > the trouble of running auth-smtp and didn't want to do some work to
> > organize their network to close down the holes, screamed and moaned
> > about the new version.  (despite the fact that turning on promiscuous
> > relaying was simple - these are so-called administrators too lazy to
> > read the manual, mind) But in the long run if this hadn't happened,
> > e-mail would be unusable today.
> >
> > It does not pay to modify your WAN networking products to support
> > people who are too lazy to do things the right way.  The right way to
> > fix a router that keeps crashing because its nat table is overloaded
> > by worms, is to clean the worms, it is not to introduce a crutch like
> > rate limiting that allows the router to keep the infected system
> > going.  And if it's not worms doing it, but some so-called 'legitimate'
> > application, then the right way is to ban use of the application.
> > And in case nobody told you, no one that runs Kazaa on the Internet
> > is using it for legal activities.
> >
> > Ted
> >
> >
> > > -----Original Message-----
> > > From: Rodney Dunn [mailto:rodunn at cisco.com]
> > > Sent: Tuesday, December 28, 2004 1:55 PM
> > > To: Ted Mittelstaedt
> > > Cc: cisco-nsp at puck.nether.net
> > > Subject: Re: [c-nsp] help on NAT rate limiting
> > >
> > >
> > > Please provide more information so someone
> > > can help answer your question:
> > >
> > > Version of code (exactly)
> > > Configuration you are using
> > > etc..
> > >
> > > There were a lot of NAT changes that went
> > > in to 12.3(4)T for major scalability problems.
> > > There were also different changes made to give
> > > users the ability to do various rate limiting.
> > > Here is a good page on it:
> > >
> > > http://www.cisco.com/en/US/products/sw/iosswrel/ps5207/products_feature_guide09186a00801d09f0.html
> > >
> > > Defaults are not changed just for the heck of it
> > > and any time you change a timer you can find a scenario
> > > where it isn't the optimal value.
> > >
> > > Rodney
> > >
> > >
> > >
> > >
> > > On Tue, Dec 28, 2004 at 10:02:06AM -0800, Ted Mittelstaedt wrote:
> > > > Hi All,
> > > >
> > > >   We have a customer that's a small office about 20 people
> > > > behind a 1720.  The router is configured to overload on to
> > > > a single IP address, and has a vpn to another 1720 coming
> > > > in to it.
> > > >
> > > >   They wanted another ethernet interface in this so we put a
> > > > wic-1enet card into the router - this required going to 12.3
> > > > ios to support the hardware and that is when all hell broke loose.
> > > >
> > > >   previous to 12.3 the ios had no way to rate limit nat -
> > > > normally the translation table would run about a couple hundred
> > > > entries.  Every once in a while they would get a virus and
> > > > the table would balloon - which would be simple to see by
> > > > showing the nat translation table, finding the offending inside
> > > > ip address, and removing the virus, the table would go back to
> > > > normal.  They were running 12.1 on that 1700 for a year at
> > > > least with no other problems.
> > > >
> > > >   Now with 12.3 there is a way to rate limit nat - but the
> > > > people at Cisco that thought this was a good idea
> > > > quite obviously figured they would -raise- all the timeouts
> > > > in the translator.  So now, even without a virus, the router
> > > > will run on average of 20,000 translation entries sometimes.
> > > >
> > > >   configuring rate limiting to whack off the table at 2-3 thousand
> > > > entries creates a situation where the router simply runs up
> > > > the translation table to the limit, then stops creating new
> > > > entries.
> > > >
> > > >   We want to reset the timeouts in ios back to what they
> > > > were rather than trying to whack the table off at its knees -
> > > > but there is no info I can find on the Cisco website as to
> > > > what the SENSIBLE timeouts were that were used in 12.1, 12.0,
> > > > etc.  And furthermore the ios commands that are available for
> > > > reducing the timeouts don't apply to overloads - which of course
> > > > is what everything on this router is.
> > > >
> > > >   Going back to an old IOS is not possible because of the
> > > > ethernet wic.
> > > >
> > > >   Whoever did this at Cisco obviously never heard of the
> > > > axiom "if it ain't broke don't fix it".  A nat rate-limiting
> > > > command is an impossibility - a virus will use all available
> > > > ram in the router for translation entries no matter how high
> > > > or how low the limit is set - and will just max out the translation
> > > > slots with the rate-limit set, and the router stops working,
> > > > so this command gains nothing.  And to put a command like this
> > > > in and use it as a license to raise the timeouts which is what
> > > > it seems they have done is absurd.
> > > >
> > > >   No doubt Cisco was besieged with idiots trying to press wussy-assed
> > > > routers into service as translators for fortune 100 companies -
> > > > they should have told those morons to go pound sand and buy a pix
> > > > and left the translation code for the small routers alone, it was
> > > > working fine before.  Changing the translator operation in 12.3
> > > > has screwed it for everyone else I think.
> > > >
> > > >   Please someone, tell me the documentation is wrong and that the
> > > > nat timeout commands do apply to overloads!
> > > >
> > > > Ted
> > > > _______________________________________________
> > > > cisco-nsp mailing list  cisco-nsp at puck.nether.net
> > > > https://puck.nether.net/mailman/listinfo/cisco-nsp
> > > > archive at http://puck.nether.net/pipermail/cisco-nsp/
> > >
>


