[cisco-nas] frequent reboot of AS-5200's - update

Aaron Leonard Aaron at Cisco.COM
Tue Oct 19 17:32:08 EDT 2004


Ooh, those nasty worms looking for those nasty Microsoft ports.

One thing to do, if you can't run CEF, to protect your route
cache against virus-induced bloat, is to make your cache ager
much more aggressive.  I modeled this several years ago and found
that setting "ip cache-ager 20 3 3" was a big win - it drastically
reduced route cache memory consumption when confronted with
infected clients doing address sweeps.  This helped out
some folks back when Code Red was making the rounds.
http://www.cisco.com/warp/public/63/ts_codred_worm.shtml#subfifthone
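
A rough way to see why a more aggressive ager helps: the toy model below
is not IOS's actual ageing algorithm, just a sketch assuming a sweep adds
a fixed number of new cache entries per second and the ager discards a
fixed fraction of the cache on each run. The function name, the sweep
rate, and both parameter pairs are illustrative assumptions; in
particular, treating "20 3 3" as "run every 20 seconds, drop a quarter of
the cache per run" is an assumed reading, not something stated above.

```python
def peak_cache_entries(new_per_sec, interval_s, aged_fraction, duration_s=3600):
    """Toy model of a route cache under an address sweep.

    Each second the sweep adds new_per_sec entries; every interval_s
    seconds the ager drops aged_fraction of whatever is cached.
    Returns the largest cache size seen during the run.
    """
    entries = 0.0
    peak = 0.0
    for second in range(1, duration_s + 1):
        entries += new_per_sec
        peak = max(peak, entries)
        if second % interval_s == 0:
            entries -= entries * aged_fraction
    return peak


# Assumed values, for illustration only.
SWEEP_RATE = 50  # new destinations per second from infected clients
relaxed = peak_cache_entries(SWEEP_RATE, interval_s=60, aged_fraction=1 / 20)
aggressive = peak_cache_entries(SWEEP_RATE, interval_s=20, aged_fraction=1 / 4)

print(f"relaxed ager peak:    {relaxed:.0f} entries")
print(f"aggressive ager peak: {aggressive:.0f} entries")
```

In this model the cache levels off near new_per_sec * interval_s /
aged_fraction, so a shorter interval and a larger share aged per run both
shrink it.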

Cheers,

Aaron

---

> For those interested... an update

> We upgraded 11 boxes to 12.1(25) and 16 MB of memory. It slowed the rhythm of
> reboots, but did not stop it.

> However, we were able to catch one box in the process of going down. We saw that
> the free memory was rather stable, but that the largest free block was slowly
> decreasing in size. Also :
> - the 'Check heaps' process was working like crazy.
> - the content of the 'sh ip cache' was enormous (including many /32's)
> - more than 50% of all incoming packets on Async ports were dropped by
>   this ACL:
> access-list 109 deny   tcp any any eq 135
> access-list 109 deny   tcp any any eq 445
> access-list 109 deny   tcp any any eq 5000
> access-list 109 deny   icmp any any
> access-list 109 permit ip x.x.x.0 0.0.0.255 any ! poor-man's rpf
> access-list 109 deny   ip any any

> When the largest free block went below a certain level (something like 1 or 2 K),
> the box simply rebooted. This could happen many times a day.

> All the symptoms pointed to a worm trying to spread.

> We changed the first line of the ACL this way :
> access-list 109 deny   tcp any any range 135 139
> and we were now blocking more than 70% of incoming packets and stopped memory
> degradation. Boxes are now stable (we call 'rebooting only once a week' stable,
> in this case!). We will upgrade the more problematic boxes to 16MB, it should
> give them enough breathing room to last a few weeks... until the next worm...
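
To see what the one-line change does, here is a small sketch of the ACL's
first-match logic. It is an illustration only, not how IOS implements
ACLs; the function name acl_109, the first_rule parameter, and the
192.0.2.0/24 documentation prefix standing in for the elided x.x.x.0
netblock are all assumptions.

```python
import ipaddress

# Hypothetical stand-in for the elided x.x.x.0/24 client netblock.
CLIENT_NET = ipaddress.ip_network("192.0.2.0/24")


def acl_109(proto, src, dst_port=None, first_rule=(135, 135)):
    """Evaluate a packet against ACL 109, top to bottom, first match wins.

    first_rule is the TCP port range denied by the first line:
    (135, 135) for the original "eq 135", (135, 139) for the revised
    "range 135 139". TCP packets must supply dst_port.
    """
    low, high = first_rule
    if proto == "tcp" and low <= dst_port <= high:
        return "deny"
    if proto == "tcp" and dst_port in (445, 5000):
        return "deny"
    if proto == "icmp":
        return "deny"
    if ipaddress.ip_address(src) in CLIENT_NET:
        return "permit"  # poor-man's rpf: only our own addresses get through
    return "deny"  # explicit "deny ip any any"


# NetBIOS session traffic (tcp/139) from a legitimate dial-up address.
before = acl_109("tcp", "192.0.2.10", 139)
after = acl_109("tcp", "192.0.2.10", 139, first_rule=(135, 139))
print(before, after)  # permit deny
```

A TCP packet to port 139 from a legitimate dial-up address passed the
original list but is dropped by the revised one, which is consistent with
the blocked share rising from more than 50% to more than 70%.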

> Anyway... this was to share the info and to thank all that offered help !

> -------------------------------------------------------------------
> Pierre Nepveu, CCNP                    tel: +1 514.380-4289
> Network Administrator                       +1 888.INFOVTL x 4289
> Engineering / Internet Access          fax: +1 514 899-8452
> Videotron Telecom Ltee (VTL) - Montreal (Quebec), Canada
> -------------------------------------------------------------------


> On 2004-09-14 at 17:39, Pierre Nepveu wrote:

> PN> Mark,
> PN>
> PN> thanks. I'll try that and keep you posted.
> PN>
> PN> pn
> PN> cd /pub; more beer
> PN>
> PN>
> PN> On 2004-09-14 at 11:09, Mark Johnson wrote:
> PN>
> PN> MJ> At 01:51 PM 9/14/2004 -0400, Pierre Nepveu wrote:
> PN> MJ> >hi guys,
> PN> MJ> >
> PN> MJ> >you both asked for <sh stack> within 7 minutes of one another ! We should
> PN> MJ> >have a new Olympic event : synchronized help desk just for you !
> PN> MJ>
> PN> MJ> There are quite a few bugs that may relate to this sort of problem (the
> PN> MJ> cause is likely ISDN and memory utilization), one of which is CSCdp40742,
> PN> MJ> fixed in 12.1(9).
> PN> MJ>
> PN> MJ> Is it possible to upgrade at least one of your routers (say to the latest
> PN> MJ> 12.1), to see how that reacts?  Seems like you should know soon if it made
> PN> MJ> an improvement.
> PN> MJ>
> PN> MJ> mark
> PN> MJ>
> PN> MJ>
> PN> MJ> >Here goes :
> PN> MJ> >(this one is not very fresh... about 1 1/2 day. Below is the freshest I've
> PN> MJ> >got - about 5 hours. Thanks for your help!)
> PN> MJ> >
> PN> MJ> >as01-91q-mtl#sh stack
> PN> MJ> >Minimum process stacks:
> PN> MJ> >  Free/Size   Name
> PN> MJ> >  1732/2000   Reset ipc queue
> PN> MJ> >  2420/4000   Init
> PN> MJ> >  1220/2000   Microcom DSP download Process
> PN> MJ> >  1456/2000   RADIUS INITCONFIG
> PN> MJ> >  1012/2000   MAI Action Process
> PN> MJ> >  2884/4000   Exec
> PN> MJ> >  1672/2000   Async tty Reset
> PN> MJ> >  2272/4000   Virtual Exec
> PN> MJ> >
> PN> MJ> >Interrupt level stacks:
> PN> MJ> >Level    Called Unused/Size  Name
> PN> MJ> >   1    17193915   2544/3000  Async (CL-CD2430) transmit interrupts
> PN> MJ> >   2    20615196   2200/3000  Async (CD2430/Mica) receive interrupts
> PN> MJ> >   3        4596   2908/3000  Serial interface state change interrupt
> PN> MJ> >   4    22397619   2364/3000  Network interfaces
> PN> MJ> >   5       21868   2896/3000  Console Uart
> PN> MJ> >   6           2   2852/3000  DSX1 interface
> PN> MJ> >
> PN> MJ> >System was restarted by error - software forced crash, PC 0x221CEAD2
> PN> MJ> >5200 Software (C5200-IS-L), Version 11.3(11b)T3,  RELEASE SOFTWARE (fc1)
> PN> MJ> >TAC Support: http://www.cisco.com/tac
> PN> MJ> >Compiled Tue 22-Jul-03 18:16 by hqluong (current version)
> PN> MJ> >Image text-base: 0x22033FAC, data-base: 0x00005000
> PN> MJ> >
> PN> MJ> >
> PN> MJ> >Stack trace from system failure:
> PN> MJ> >FP: 0x2132D4, RA: 0x221D281C
> PN> MJ> >FP: 0x2132E0, RA: 0x221D2AE2
> PN> MJ> >FP: 0x213304, RA: 0x221C893E
> PN> MJ> >FP: 0x213314, RA: 0x221C8A0A
> PN> MJ> >FP: 0x21331C, RA: 0x2215C950
> PN> MJ> >FP: 0x21332C, RA: 0x2215CC60
> PN> MJ> >FP: 0x213398, RA: 0x2215C9DA
> PN> MJ> >FP: 0x2133B4, RA: 0x2204DF1E
> PN> MJ> >
> PN> MJ> >
> PN> MJ> >***************************************************
> PN> MJ> >******* Information of Last System Crash **********
> PN> MJ> >***************************************************
> PN> MJ> >
> PN> MJ> >
> PN> MJ> >as01-91q-mtl#
> PN> MJ> >
> PN> MJ> >  -/-/-/-/-/-/
> PN> MJ> >
> PN> MJ> >as05-mtl#sh ver
> PN> MJ> >Cisco Internetwork Operating System Software
> PN> MJ> >IOS (tm) 5200 Software (C5200-IS-L), Version 11.3(11b)T3,  RELEASE SOFTWARE (fc1)
> PN> MJ> >TAC Support: http://www.cisco.com/tac
> PN> MJ> >Copyright (c) 1986-2003 by cisco Systems, Inc.
> PN> MJ> >Compiled Tue 22-Jul-03 18:16 by hqluong
> PN> MJ> >Image text-base: 0x22033FAC, data-base: 0x00005000
> PN> MJ> >
> PN> MJ> >ROM: System Bootstrap, Version 11.1(474A) [jdisimon 104], INTERIM SOFTWARE
> PN> MJ> >BOOTFLASH: 5200 Software (AS5200-BOOT-L), Version 11.1(7)AA, EARLY DEPLOYMENT RELEASE SOFTWARE (fc2)
> PN> MJ> >
> PN> MJ> >as05-mtl uptime is 14 hours, 59 minutes
> PN> MJ> >System restarted by error - software forced crash, PC 0x221CEAD2 at 22:46:16 EDT Mon Sep 13 2004
> PN> MJ> >System image file is "flash:c5200-is-l.113-11b.T3.bin", booted via flash
> PN> MJ> >
> PN> MJ> >cisco AS5200 (68030) processor (revision B) with 8192K/4096K bytes of memory.
> PN> MJ> >Processor board ID 04277090
> PN> MJ> >Bridging software.
> PN> MJ> >X.25 software, Version 3.0.0.
> PN> MJ> >SuperLAT software copyright 1990 by Meridian Technology Corp).
> PN> MJ> >Primary Rate ISDN software, Version 1.1.
> PN> MJ> >Mother board with terminator card.
> PN> MJ> >1 Ethernet/IEEE 802.3 interface(s)
> PN> MJ> >50 Serial network interface(s)
> PN> MJ> >48 terminal line(s)
> PN> MJ> >2 Channelized T1/PRI port(s)
> PN> MJ> >128K bytes of non-volatile configuration memory.
> PN> MJ> >8192K bytes of processor board System flash (Read ONLY)
> PN> MJ> >4096K bytes of processor board Boot flash (Read/Write)
> PN> MJ> >
> PN> MJ> >Configuration register is 0x2102
> PN> MJ> >
> PN> MJ> >as05-mtl#sh stack
> PN> MJ> >Minimum process stacks:
> PN> MJ> >  Free/Size   Name
> PN> MJ> >  1732/2000   Reset ipc queue
> PN> MJ> >  2420/4000   Init
> PN> MJ> >  1220/2000   Microcom DSP download Process
> PN> MJ> >  1452/2000   RADIUS INITCONFIG
> PN> MJ> >  2272/4000   Virtual Exec
> PN> MJ> >  1012/2000   MAI Action Process
> PN> MJ> >  2596/4000   Exec
> PN> MJ> >  1672/2000   Async tty Reset
> PN> MJ> >
> PN> MJ> >Interrupt level stacks:
> PN> MJ> >Level    Called Unused/Size  Name
> PN> MJ> >   1     5263579   2544/3000  Async (CL-CD2430) transmit interrupts
> PN> MJ> >   2     3874824   2200/3000  Async (CD2430/Mica) receive interrupts
> PN> MJ> >   3        2007   2908/3000  Serial interface state change interrupt
> PN> MJ> >   4     7170726   2364/3000  Network interfaces
> PN> MJ> >   5       17261   2896/3000  Console Uart
> PN> MJ> >   6           2   2852/3000  DSX1 interface
> PN> MJ> >
> PN> MJ> >System was restarted by error - software forced crash, PC 0x221CEAD2
> PN> MJ> >5200 Software (C5200-IS-L), Version 11.3(11b)T3,  RELEASE SOFTWARE (fc1)
> PN> MJ> >TAC Support: http://www.cisco.com/tac
> PN> MJ> >Compiled Tue 22-Jul-03 18:16 by hqluong (current version)
> PN> MJ> >Image text-base: 0x22033FAC, data-base: 0x00005000
> PN> MJ> >
> PN> MJ> >
> PN> MJ> >Stack trace from system failure:
> PN> MJ> >FP: 0x213DEC, RA: 0x221D281C
> PN> MJ> >FP: 0x213DF8, RA: 0x221D2AE2
> PN> MJ> >FP: 0x213E1C, RA: 0x221C893E
> PN> MJ> >FP: 0x213E2C, RA: 0x221C8A0A
> PN> MJ> >FP: 0x213E34, RA: 0x2215C950
> PN> MJ> >FP: 0x213E44, RA: 0x2215CC60
> PN> MJ> >FP: 0x213EB0, RA: 0x2215C9DA
> PN> MJ> >FP: 0x213ECC, RA: 0x2204DF1E
> PN> MJ> >
> PN> MJ> >
> PN> MJ> >***************************************************
> PN> MJ> >******* Information of Last System Crash **********
> PN> MJ> >***************************************************
> PN> MJ> >
> PN> MJ> >
> PN> MJ> >as05-mtl#
> PN> MJ> >
> PN> MJ> >
> PN> MJ> >pn
> PN> MJ> >cd /pub; more beer
> PN> MJ> >
> PN> MJ> >
> PN> MJ> >On 2004-09-14 at 10:31, Aaron Leonard wrote:
> PN> MJ> >
> PN> MJ> >AL> Pierre,
> PN> MJ> >AL>
> PN> MJ> >AL>
> PN> MJ> >AL> Please provide the output of "show version" and "show stack"
> PN> MJ> >AL> from one of these 5200s after it has crashed and rebooted.
> PN> MJ> >AL>
> PN> MJ> >AL> Aaron
> PN> MJ> >AL>
> PN> MJ> >AL> --
> PN> MJ> >AL>
> PN> MJ> >AL> > hello all,
> PN> MJ> >AL>
> PN> MJ> >AL> > as of late, we experience sudden and unexplained reboots with the
> PN> MJ> >AL> > following symptoms :
> PN> MJ> >AL>
> PN> MJ> >AL> > as07-xxx uptime is 15 hours, 17 minutes
> PN> MJ> >AL> >     System restarted by error - software forced crash, PC 0x221CEAD2 at 14:57:30 EDT Sun Sep 5 2004
> PN> MJ> >AL>
> PN> MJ> >AL> > as12-xxx uptime is 15 hours, 50 minutes
> PN> MJ> >AL> >     System restarted by error - software forced crash, PC 0x221CEAD2 at 14:25:31 EDT Sun Sep 5 2004
> PN> MJ> >AL>
> PN> MJ> >AL> > as16-xxx uptime is 15 hours, 32 minutes
> PN> MJ> >AL> >     System restarted by error - software forced crash, PC 0x221CEAD2 at 14:43:11 EDT Sun Sep 5 2004
> PN> MJ> >AL>
> PN> MJ> >AL> > as02-yyy uptime is 18 hours, 42 minutes
> PN> MJ> >AL> >     System restarted by error - software forced crash, PC 0x221CEAD2 at 11:35:00 EDT Sun Sep 5 2004
> PN> MJ> >AL>
> PN> MJ> >AL> > as03-yyy uptime is 17 hours, 39 minutes
> PN> MJ> >AL> >     System restarted by error - software forced crash, PC 0x221CEAD2 at 12:38:28 EDT Sun Sep 5 2004
> PN> MJ> >AL>
> PN> MJ> >AL> > (xxx and yyy are 2 different POPs). As you can see, the reboots are synchronized
> PN> MJ> >AL> > in time. This leads me to think we have a recurrence of a problem we had a few
> PN> MJ> >AL> > months back : there is a virus or worm that overwhelms those poor boxes and
> PN> MJ> >AL> > forces a crash. The luser logs into a box, crashes it, hits redial, crashes
> PN> MJ> >AL> > another box, redials again, crashes yet a third box and then quits.
> PN> MJ> >AL>
> PN> MJ> >AL> > I see two possible solutions :
> PN> MJ> >AL> > 1. filter out the exact problem (if I can pinpoint it)
> PN> MJ> >AL> > 2. install an IOS that is immune to the problem (and will not introduce new
> PN> MJ> >AL> >    ones, hopefully!)
> PN> MJ> >AL>
> PN> MJ> >AL> > --More info--
> PN> MJ> >AL> > Current IOS is : IOS (tm) 5200 Software (C5200-IS-L), Version 11.3(11b)T3
> PN> MJ> >AL> >   System image file is "flash:c5200-is-l.113-11b.T3.bin"
> PN> MJ> >AL>
> PN> MJ> >AL> > Applied filter is :
> PN> MJ> >AL> > interface Group-Async1
> PN> MJ> >AL> >  ip unnumbered Ethernet0
> PN> MJ> >AL> >  ip access-group 109 in
> PN> MJ> >AL>
> PN> MJ> >AL> > access-list 109 deny   tcp any any eq 135
> PN> MJ> >AL> > access-list 109 deny   tcp any any eq 445
> PN> MJ> >AL> > access-list 109 deny   tcp any any eq 5000
> PN> MJ> >AL> > access-list 109 deny   icmp any any
> PN> MJ> >AL> > access-list 109 permit ip xx.yy.zz.0 0.0.0.255 any
> PN> MJ> >AL> > access-list 109 deny   ip any any
> PN> MJ> >AL>
> PN> MJ> >AL> > (where xx.yy.zz is the netblock this NAS belongs to - this is the poor man's
> PN> MJ> >AL> > reverse-path verify)
> PN> MJ> >AL>
> PN> MJ> >AL> > Any suggestion of an improved filter is welcome. Any suggestion of an IOS that fits
> PN> MJ> >AL> > in
> PN> MJ> >AL> >    cisco AS5200 (68030) processor (revision A) with 8192K/4096K bytes of memory.
> PN> MJ> >AL> >    8192K bytes of processor board System flash (Read ONLY)
> PN> MJ> >AL> > is also welcome. (IP/Plus image not necessary - this is just what we have now and
> PN> MJ> >AL> > it works, so - not broken, not fixed!)
> PN> MJ> >AL>
> PN> MJ> >AL> > Thanks !
> PN> MJ> >AL>