[cisco-nas] frequent reboot of AS-5200's - update
Aaron Leonard
Aaron at Cisco.COM
Tue Oct 19 17:32:08 EDT 2004
Ooh, those nasty worms looking for those nasty Microsoft ports.
One thing to do, if you can't run CEF, to protect your route
cache against virus-induced bloat, is to crank your cache ager
timers way up. I modeled this several years ago and found out
that setting "ip cache-ager 20 3 3" was a big win - it drastically
reduced route cache memory consumption when confronted with
infected clients doing address sweeps. This proved to help out
some folks back when Code Red was making the rounds.
http://www.cisco.com/warp/public/63/ts_codred_worm.shtml#subfifthone
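
A rough way to see why a more aggressive ager helps is the toy model below. It is a sketch, not Cisco's implementation. It assumes the three arguments are the ager interval in seconds followed by the low-memory and normal divisors, and that each ager run invalidates about 1/(divisor+1) of the cache. The "60 ... 20" baseline and the rate of 50 new entries per second are likewise assumptions picked for illustration.

```python
def steady_state_entries(new_per_sec, interval_s, divisor):
    """Entries left right after an ager run once the cache size has settled.

    Toy model (assumption): every run removes 1/(divisor+1) of the cache,
    and new_per_sec fresh entries arrive at a constant rate.
    """
    added = new_per_sec * interval_s  # entries created between two runs
    # Settled size n satisfies n = (n + added) * divisor / (divisor + 1),
    # which solves to n = added * divisor.
    return added * divisor


def simulate(new_per_sec, interval_s, divisor, runs=500):
    """Iterate the same model from an empty cache to check the closed form."""
    n = 0.0
    for _ in range(runs):
        n += new_per_sec * interval_s   # sweep traffic adds entries
        n -= n / (divisor + 1)          # one ager run removes a fraction
    return n


if __name__ == "__main__":
    sweep_rate = 50  # hypothetical new /32 entries per second
    print("assumed baseline 60/20:", steady_state_entries(sweep_rate, 60, 20))
    print("tuned 20/3:            ", steady_state_entries(sweep_rate, 20, 3))
```

In this model the settled cache size scales with interval times divisor, so going from 60 and 20 to 20 and 3 shrinks it twentyfold, whatever the sweep rate.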
Cheers,
Aaron
---
> For those interested... an update
> We upgraded 11 boxes to 12.1(25) and 16 MB of memory. It slowed the rhythm of
> reboots, but did not stop it.
> However, we were able to catch one box in the process of going down. We saw that
> the free memory was rather stable, but that the largest free block was slowly
> decreasing in size. Also :
> - the 'Check heaps' process was working like crazy.
> - the content of the 'sh ip cache' was enormous (including many /32's)
> - more than 50% of all incoming packets on Async ports were dropped by
> this ACL:
> access-list 109 deny tcp any any eq 135
> access-list 109 deny tcp any any eq 445
> access-list 109 deny tcp any any eq 5000
> access-list 109 deny icmp any any
> access-list 109 permit ip x.x.x.0 0.0.0.255 any ! poor-man's rpf
> access-list 109 deny ip any any
> When the largest free block went below a certain level (something like 1 or 2 K),
> the box simply rebooted. It could occur many times a day.
> All the symptoms pointed to a worm trying to spread.
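
The pattern described above (stable free memory, shrinking largest free block) is what heap fragmentation looks like. The sketch below is a deliberately contrived toy heap, not a model of the IOS allocator: the `step` function and the 4096-unit heap are hypothetical and only show how the two figures can diverge.

```python
def step(free_blocks, pinned=1):
    """One churn step on a list of free-block sizes (hypothetical toy heap).

    A small long-lived allocation lands in the middle of the largest free
    block and splits it, while an equally small block is freed elsewhere
    with no free neighbour to merge with. Total free memory is unchanged.
    """
    blocks = sorted(free_blocks, reverse=True)
    largest = blocks.pop(0)
    remaining = largest - pinned
    left = remaining // 2
    right = remaining - left
    blocks += [left, right, pinned]  # two halves plus the isolated freed block
    return [b for b in blocks if b > 0]


def summarize(free_blocks):
    """Return (total free, largest free block)."""
    return sum(free_blocks), max(free_blocks)


if __name__ == "__main__":
    heap = [4096]  # one contiguous free region to start with
    for i in range(5):
        heap = step(heap)
        total, largest = summarize(heap)
        print(f"step {i + 1}: total free = {total}, largest block = {largest}")
```

Total free stays constant on every step while the largest block keeps dropping, so an allocation that needs a large contiguous block can fail even though plenty of memory is nominally free.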
> We changed the first line of the ACL this way :
> access-list 109 deny tcp any any range 135 139
> and we were now blocking more than 70% of incoming packets and stopped memory
> degradation. Boxes are now stable (we call 'rebooting only once a week' stable,
> in this case!). We will upgrade the more problematic boxes to 16MB, it should
> give them enough breathing room to last a few weeks... until the next worm...
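
To make the effect of the changed first line concrete, here is a minimal first-match ACL evaluator. It is a sketch only: it covers just the fields this ACL uses, the rule tuples and function names are hypothetical, and 192.0.2.0 is a documentation placeholder for the elided x.x.x.0 netblock.

```python
import ipaddress


def ip_matches(addr, base, wildcard):
    """Cisco-style wildcard match: bits set in the wildcard are ignored."""
    a = int(ipaddress.IPv4Address(addr))
    b = int(ipaddress.IPv4Address(base))
    w = int(ipaddress.IPv4Address(wildcard))
    return (a | w) == (b | w)


def make_acl(first_rule_ports):
    """ACL 109 as (action, protocol, source or None=any, dst port range or None)."""
    return [
        ("deny", "tcp", None, first_rule_ports),
        ("deny", "tcp", None, (445, 445)),
        ("deny", "tcp", None, (5000, 5000)),
        ("deny", "icmp", None, None),
        ("permit", "ip", ("192.0.2.0", "0.0.0.255"), None),  # placeholder netblock
        ("deny", "ip", None, None),
    ]


def evaluate(acl, proto, src, dport=None):
    """Return (action, rule index) of the first entry that matches."""
    for i, (action, rproto, rsrc, rports) in enumerate(acl):
        if rproto != "ip" and rproto != proto:
            continue
        if rsrc is not None and not ip_matches(src, *rsrc):
            continue
        if rports is not None and not (
            dport is not None and rports[0] <= dport <= rports[1]
        ):
            continue
        return action, i
    return "deny", None  # implicit deny at the end of every ACL


if __name__ == "__main__":
    old = make_acl((135, 135))   # "eq 135"
    new = make_acl((135, 139))   # "range 135 139"
    print(evaluate(old, "tcp", "192.0.2.10", 139))  # falls through to the permit
    print(evaluate(new, "tcp", "192.0.2.10", 139))  # caught by the first line
```

With "eq 135", a TCP packet to port 139 from a legitimate dial-up address reaches the permit line; with "range 135 139" it is dropped by the first entry, which is consistent with the jump in blocked packets reported above.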
> Anyway... this was to share the info and to thank everyone who offered help!
> -------------------------------------------------------------------
> Pierre Nepveu, CCNP tel: +1 514.380-4289
> Administrateur de reseau +1 888.INFOVTL x 4289
> Ingenierie / Acces Internet fax: +1 514 899-8452
> Videotron Telecom Ltee (VTL) - Montreal (Quebec), Canada
> -------------------------------------------------------------------
> On 2004-09-14 at 17:39, Pierre Nepveu wrote:
> PN> Mark,
> PN>
> PN> thanks. I'll try that and keep you posted.
> PN>
> PN> pn
> PN> cd /pub; more beer
> PN>
> PN>
> PN> On 2004-09-14 at 11:09, Mark Johnson wrote:
> PN>
> PN> MJ> At 01:51 PM 9/14/2004 -0400, Pierre Nepveu wrote:
> PN> MJ> >hi guys,
> PN> MJ> >
> PN> MJ> >you both asked for <sh stack> within 7 minutes of one another! We should
> PN> MJ> >have a new Olympic event: synchronized help desk, just for you!
> PN> MJ>
> PN> MJ> There are quite a few bugs that may relate to this sort of problem (the
> PN> MJ> cause is likely ISDN and memory utilization), one of which is CSCdp40742,
> PN> MJ> fixed in 12.1(9).
> PN> MJ>
> PN> MJ> Is it possible to upgrade at least one of your routers (say to the latest
> PN> MJ> 12.1), to see how that reacts? Seems like you should know soon if it made a
> PN> MJ> positive improvement.
> PN> MJ>
> PN> MJ> mark
> PN> MJ>
> PN> MJ>
> PN> MJ> >Here goes :
> PN> MJ> >(this one is not very fresh... about 1 1/2 days. Below is the freshest I've
> PN> MJ> >got, about 5 hours. Thanks for your help!)
> PN> MJ> >
> PN> MJ> >as01-91q-mtl#sh stack
> PN> MJ> >Minimum process stacks:
> PN> MJ> > Free/Size Name
> PN> MJ> > 1732/2000 Reset ipc queue
> PN> MJ> > 2420/4000 Init
> PN> MJ> > 1220/2000 Microcom DSP download Process
> PN> MJ> > 1456/2000 RADIUS INITCONFIG
> PN> MJ> > 1012/2000 MAI Action Process
> PN> MJ> > 2884/4000 Exec
> PN> MJ> > 1672/2000 Async tty Reset
> PN> MJ> > 2272/4000 Virtual Exec
> PN> MJ> >
> PN> MJ> >Interrupt level stacks:
> PN> MJ> >Level Called Unused/Size Name
> PN> MJ> > 1 17193915 2544/3000 Async (CL-CD2430) transmit interrupts
> PN> MJ> > 2 20615196 2200/3000 Async (CD2430/Mica) receive interrupts
> PN> MJ> > 3 4596 2908/3000 Serial interface state change interrupt
> PN> MJ> > 4 22397619 2364/3000 Network interfaces
> PN> MJ> > 5 21868 2896/3000 Console Uart
> PN> MJ> > 6 2 2852/3000 DSX1 interface
> PN> MJ> >
> PN> MJ> >System was restarted by error - software forced crash, PC 0x221CEAD2
> PN> MJ> >5200 Software (C5200-IS-L), Version 11.3(11b)T3, RELEASE SOFTWARE (fc1)
> PN> MJ> >TAC Support: http://www.cisco.com/tac
> PN> MJ> >Compiled Tue 22-Jul-03 18:16 by hqluong (current version)
> PN> MJ> >Image text-base: 0x22033FAC, data-base: 0x00005000
> PN> MJ> >
> PN> MJ> >
> PN> MJ> >Stack trace from system failure:
> PN> MJ> >FP: 0x2132D4, RA: 0x221D281C
> PN> MJ> >FP: 0x2132E0, RA: 0x221D2AE2
> PN> MJ> >FP: 0x213304, RA: 0x221C893E
> PN> MJ> >FP: 0x213314, RA: 0x221C8A0A
> PN> MJ> >FP: 0x21331C, RA: 0x2215C950
> PN> MJ> >FP: 0x21332C, RA: 0x2215CC60
> PN> MJ> >FP: 0x213398, RA: 0x2215C9DA
> PN> MJ> >FP: 0x2133B4, RA: 0x2204DF1E
> PN> MJ> >
> PN> MJ> >
> PN> MJ> >***************************************************
> PN> MJ> >******* Information of Last System Crash **********
> PN> MJ> >***************************************************
> PN> MJ> >
> PN> MJ> >
> PN> MJ> >as01-91q-mtl#
> PN> MJ> >
> PN> MJ> > -/-/-/-/-/-/
> PN> MJ> >
> PN> MJ> >as05-mtl#sh ver
> PN> MJ> >Cisco Internetwork Operating System Software
> PN> MJ> >IOS (tm) 5200 Software (C5200-IS-L), Version 11.3(11b)T3, RELEASE SOFTWARE (fc1)
> PN> MJ> >TAC Support: http://www.cisco.com/tac
> PN> MJ> >Copyright (c) 1986-2003 by cisco Systems, Inc.
> PN> MJ> >Compiled Tue 22-Jul-03 18:16 by hqluong
> PN> MJ> >Image text-base: 0x22033FAC, data-base: 0x00005000
> PN> MJ> >
> PN> MJ> >ROM: System Bootstrap, Version 11.1(474A) [jdisimon 104], INTERIM SOFTWARE
> PN> MJ> >BOOTFLASH: 5200 Software (AS5200-BOOT-L), Version 11.1(7)AA, EARLY DEPLOYMENT RELEASE SOFTWARE (fc2)
> PN> MJ> >
> PN> MJ> >as05-mtl uptime is 14 hours, 59 minutes
> PN> MJ> >System restarted by error - software forced crash, PC 0x221CEAD2 at 22:46:16 EDT Mon Sep 13 2004
> PN> MJ> >System image file is "flash:c5200-is-l.113-11b.T3.bin", booted via flash
> PN> MJ> >
> PN> MJ> >cisco AS5200 (68030) processor (revision B) with 8192K/4096K bytes of memory.
> PN> MJ> >Processor board ID 04277090
> PN> MJ> >Bridging software.
> PN> MJ> >X.25 software, Version 3.0.0.
> PN> MJ> >SuperLAT software copyright 1990 by Meridian Technology Corp).
> PN> MJ> >Primary Rate ISDN software, Version 1.1.
> PN> MJ> >Mother board with terminator card.
> PN> MJ> >1 Ethernet/IEEE 802.3 interface(s)
> PN> MJ> >50 Serial network interface(s)
> PN> MJ> >48 terminal line(s)
> PN> MJ> >2 Channelized T1/PRI port(s)
> PN> MJ> >128K bytes of non-volatile configuration memory.
> PN> MJ> >8192K bytes of processor board System flash (Read ONLY)
> PN> MJ> >4096K bytes of processor board Boot flash (Read/Write)
> PN> MJ> >
> PN> MJ> >Configuration register is 0x2102
> PN> MJ> >
> PN> MJ> >as05-mtl#sh stack
> PN> MJ> >Minimum process stacks:
> PN> MJ> > Free/Size Name
> PN> MJ> > 1732/2000 Reset ipc queue
> PN> MJ> > 2420/4000 Init
> PN> MJ> > 1220/2000 Microcom DSP download Process
> PN> MJ> > 1452/2000 RADIUS INITCONFIG
> PN> MJ> > 2272/4000 Virtual Exec
> PN> MJ> > 1012/2000 MAI Action Process
> PN> MJ> > 2596/4000 Exec
> PN> MJ> > 1672/2000 Async tty Reset
> PN> MJ> >
> PN> MJ> >Interrupt level stacks:
> PN> MJ> >Level Called Unused/Size Name
> PN> MJ> > 1 5263579 2544/3000 Async (CL-CD2430) transmit interrupts
> PN> MJ> > 2 3874824 2200/3000 Async (CD2430/Mica) receive interrupts
> PN> MJ> > 3 2007 2908/3000 Serial interface state change interrupt
> PN> MJ> > 4 7170726 2364/3000 Network interfaces
> PN> MJ> > 5 17261 2896/3000 Console Uart
> PN> MJ> > 6 2 2852/3000 DSX1 interface
> PN> MJ> >
> PN> MJ> >System was restarted by error - software forced crash, PC 0x221CEAD2
> PN> MJ> >5200 Software (C5200-IS-L), Version 11.3(11b)T3, RELEASE SOFTWARE (fc1)
> PN> MJ> >TAC Support: http://www.cisco.com/tac
> PN> MJ> >Compiled Tue 22-Jul-03 18:16 by hqluong (current version)
> PN> MJ> >Image text-base: 0x22033FAC, data-base: 0x00005000
> PN> MJ> >
> PN> MJ> >
> PN> MJ> >Stack trace from system failure:
> PN> MJ> >FP: 0x213DEC, RA: 0x221D281C
> PN> MJ> >FP: 0x213DF8, RA: 0x221D2AE2
> PN> MJ> >FP: 0x213E1C, RA: 0x221C893E
> PN> MJ> >FP: 0x213E2C, RA: 0x221C8A0A
> PN> MJ> >FP: 0x213E34, RA: 0x2215C950
> PN> MJ> >FP: 0x213E44, RA: 0x2215CC60
> PN> MJ> >FP: 0x213EB0, RA: 0x2215C9DA
> PN> MJ> >FP: 0x213ECC, RA: 0x2204DF1E
> PN> MJ> >
> PN> MJ> >
> PN> MJ> >***************************************************
> PN> MJ> >******* Information of Last System Crash **********
> PN> MJ> >***************************************************
> PN> MJ> >
> PN> MJ> >
> PN> MJ> >as05-mtl#
> PN> MJ> >
> PN> MJ> >
> PN> MJ> >pn
> PN> MJ> >cd /pub; more beer
> PN> MJ> >
> PN> MJ> >
> PN> MJ> >On 2004-09-14 at 10:31, Aaron Leonard wrote:
> PN> MJ> >
> PN> MJ> >AL> Pierre,
> PN> MJ> >AL>
> PN> MJ> >AL>
> PN> MJ> >AL> Please provide the output of "show version" and "show stack"
> PN> MJ> >AL> from one of these 5200s after it has crashed and rebooted.
> PN> MJ> >AL>
> PN> MJ> >AL> Aaron
> PN> MJ> >AL>
> PN> MJ> >AL> --
> PN> MJ> >AL>
> PN> MJ> >AL> > hello all,
> PN> MJ> >AL>
> PN> MJ> >AL> > as of late, we experience sudden and unexplained reboots with the
> PN> MJ> >AL> > following symptoms :
> PN> MJ> >AL>
> PN> MJ> >AL> > as07-xxx uptime is 15 hours, 17 minutes
> PN> MJ> >AL> > System restarted by error - software forced crash, PC 0x221CEAD2 at 14:57:30 EDT Sun Sep 5 2004
> PN> MJ> >AL>
> PN> MJ> >AL> > as12-xxx uptime is 15 hours, 50 minutes
> PN> MJ> >AL> > System restarted by error - software forced crash, PC 0x221CEAD2 at 14:25:31 EDT Sun Sep 5 2004
> PN> MJ> >AL>
> PN> MJ> >AL> > as16-xxx uptime is 15 hours, 32 minutes
> PN> MJ> >AL> > System restarted by error - software forced crash, PC 0x221CEAD2 at 14:43:11 EDT Sun Sep 5 2004
> PN> MJ> >AL>
> PN> MJ> >AL> > as02-yyy uptime is 18 hours, 42 minutes
> PN> MJ> >AL> > System restarted by error - software forced crash, PC 0x221CEAD2 at 11:35:00 EDT Sun Sep 5 2004
> PN> MJ> >AL>
> PN> MJ> >AL> > as03-yyy uptime is 17 hours, 39 minutes
> PN> MJ> >AL> > System restarted by error - software forced crash, PC 0x221CEAD2 at 12:38:28 EDT Sun Sep 5 2004
> PN> MJ> >AL>
> PN> MJ> >AL> > (xxx and yyy are 2 different POPs). As you can see, the reboots are synchronized
> PN> MJ> >AL> > in time. This leads me to think we have a recurrence of a problem we had a few
> PN> MJ> >AL> > months back: there is a virus or worm that overwhelms those poor boxes and
> PN> MJ> >AL> > forces a crash. The luser logs into a box, crashes it, hits redial, crashes
> PN> MJ> >AL> > another box, redials again, crashes yet a third box and then quits.
> PN> MJ> >AL>
> PN> MJ> >AL> > I see two possible solutions :
> PN> MJ> >AL> > 1. filter out the exact problem (if I can pinpoint it)
> PN> MJ> >AL> > 2. install an IOS that is immune to the problem (and will not introduce new
> PN> MJ> >AL> > ones, hopefully!)
> PN> MJ> >AL>
> PN> MJ> >AL> > --More info--
> PN> MJ> >AL> > Current IOS is : IOS (tm) 5200 Software (C5200-IS-L), Version 11.3(11b)T3
> PN> MJ> >AL> > System image file is "flash:c5200-is-l.113-11b.T3.bin"
> PN> MJ> >AL>
> PN> MJ> >AL> > Applied filter is :
> PN> MJ> >AL> > interface Group-Async1
> PN> MJ> >AL> > ip unnumbered Ethernet0
> PN> MJ> >AL> > ip access-group 109 in
> PN> MJ> >AL>
> PN> MJ> >AL> > access-list 109 deny tcp any any eq 135
> PN> MJ> >AL> > access-list 109 deny tcp any any eq 445
> PN> MJ> >AL> > access-list 109 deny tcp any any eq 5000
> PN> MJ> >AL> > access-list 109 deny icmp any any
> PN> MJ> >AL> > access-list 109 permit ip xx.yy.zz.0 0.0.0.255 any
> PN> MJ> >AL> > access-list 109 deny ip any any
> PN> MJ> >AL>
> PN> MJ> >AL> > (where xx.yy.zz is the netblock this NAS belongs to - this is the poor man's
> PN> MJ> >AL> > reverse-path verify)
> PN> MJ> >AL>
> PN> MJ> >AL> > Any suggestion of an improved filter is welcome. Any suggestion of an IOS that fits in
> PN> MJ> >AL> > cisco AS5200 (68030) processor (revision A) with 8192K/4096K bytes of memory.
> PN> MJ> >AL> > 8192K bytes of processor board System flash (Read ONLY)
> PN> MJ> >AL> > is also welcome. (IP/Plus image not necessary - this is just what we have now and
> PN> MJ> >AL> > it works, so - not broken, not fixed!)
> PN> MJ> >AL>
> PN> MJ> >AL> > Thanks !
> PN> MJ> >AL>
> PN> MJ> >AL> > -------------------------------------------------------------------
> PN> MJ> >AL> > Pierre Nepveu, CCNP tel: +1 514.380-4289
> PN> MJ> >AL> > Administrateur de reseau +1 888.INFOVTL x 4289
> PN> MJ> >AL> > Ingenierie / Acces Internet fax: +1 514 899-8452
> PN> MJ> >AL> > Videotron Telecom Ltee (VTL) - Montreal (Quebec), Canada
> PN> MJ> >AL> > -------------------------------------------------------------------
> PN> MJ> >AL>
> PN> MJ> >AL>
> PN> MJ> >AL>
> PN> MJ> >AL> > _______________________________________________
> PN> MJ> >AL> > cisco-nas mailing list
> PN> MJ> >AL> > cisco-nas at puck.nether.net
> PN> MJ> >AL> > https://puck.nether.net/mailman/listinfo/cisco-nas
> PN> MJ> >AL>
> PN> MJ>
> PN> MJ>
> PN>
> PN>
> PN>