[cisco-nas] frequent reboot of AS-5200's - update

Pierre Nepveu pnepveu at videotron.net
Tue Oct 19 17:30:46 EDT 2004


For those interested... an update

We upgraded 11 boxes to 12.1(25) and 16 MB of memory. That slowed the rate of
reboots, but did not stop them.

However, we were able to catch one box in the process of going down. We saw that
the free memory was rather stable, but that the largest free block was slowly
decreasing in size. Also :
- the 'Check heaps' process was working like crazy. 
- the output of 'sh ip cache' was enormous (including many /32's)
- more than 50% of all incoming packets on Async ports were dropped by 
  this ACL:
access-list 109 deny   tcp any any eq 135
access-list 109 deny   tcp any any eq 445
access-list 109 deny   tcp any any eq 5000
access-list 109 deny   icmp any any
access-list 109 permit ip x.x.x.0 0.0.0.255 any ! poor-man's rpf
access-list 109 deny   ip any any

When the largest free block went below a certain level (something like 1 or 2 KB),
the box simply rebooted. This could happen several times a day.
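
The pattern above (free memory stable, largest free block shrinking) is
fragmentation rather than a plain leak. As a rough illustration only, here is a
small Python sketch of the check one could do by hand on two numbers read off
a box. The function name, the 2048-byte threshold and the sample figures are
all made up for this example; the inputs are assumed to come from something
like the Free and Largest columns of 'show memory'.

```python
def fragmentation_report(free_bytes, largest_free_block, min_block=2048):
    """Summarise heap fragmentation from two numbers read off the box.

    free_bytes and largest_free_block are assumed to be typed in by hand;
    min_block is a hypothetical alarm threshold, not a Cisco value.
    """
    if free_bytes <= 0:
        raise ValueError("free_bytes must be positive")
    # Share of the free pool that is NOT usable as one contiguous block.
    fragmentation = 1.0 - (largest_free_block / free_bytes)
    return {
        "fragmentation": round(fragmentation, 3),
        "at_risk": largest_free_block < min_block,
    }


# Made-up figures: plenty of free memory, but only in small pieces.
print(fragmentation_report(free_bytes=1_500_000, largest_free_block=1_800))
```

A box can show a comfortable free total and still be flagged here, which
matches what we saw just before each reboot.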

All the symptoms pointed to a worm trying to spread.

We changed the first line of the ACL this way :
access-list 109 deny   tcp any any range 135 139
With that change we were blocking more than 70% of incoming packets, and the
memory degradation stopped. Boxes are now stable (we call 'rebooting only once
a week' stable, in this case!). We will upgrade the more problematic boxes to
16 MB, which should give them enough breathing room to last a few weeks...
until the next worm...
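
For anyone who wants to see why that one-line change matters, here is a rough
Python sketch (not something we run on the boxes) of first-match evaluation of
access-list 109 with the old and the new first line. The 192.0.2.0/24 pool is a
documentation prefix standing in for x.x.x.0, and the names build_acl,
evaluate, OLD_ACL and NEW_ACL are invented for the illustration; only the rule
order is taken from the ACL itself.

```python
import ipaddress

# Stand-in for the x.x.x.0 0.0.0.255 dial-up block; 192.0.2.0/24 is a
# documentation prefix chosen for this sketch, not the real netblock.
POOL = ipaddress.ip_network("192.0.2.0/24")


def build_acl(first_rule):
    """Return ACL 109 as (action, proto, port_low, port_high, src_net) tuples."""
    return [
        first_rule,
        ("deny", "tcp", 445, 445, None),
        ("deny", "tcp", 5000, 5000, None),
        ("deny", "icmp", None, None, None),
        ("permit", "ip", None, None, POOL),  # poor-man's rpf
        ("deny", "ip", None, None, None),
    ]


OLD_ACL = build_acl(("deny", "tcp", 135, 135, None))  # eq 135
NEW_ACL = build_acl(("deny", "tcp", 135, 139, None))  # range 135 139


def evaluate(acl, src, proto, dst_port=None):
    """First matching entry wins, as in an IOS extended access list."""
    addr = ipaddress.ip_address(src)
    for action, rule_proto, low, high, net in acl:
        # "ip" entries match any protocol; others must match exactly.
        if rule_proto != "ip" and rule_proto != proto:
            continue
        # Entries with a port (or port range) need a destination port inside it.
        if low is not None and (dst_port is None or not low <= dst_port <= high):
            continue
        # Entries with a source network need the source address inside it.
        if net is not None and addr not in net:
            continue
        return action
    return "deny"  # implicit deny at the end of every access list


# A dial-up customer inside the pool sending to tcp/139:
print(evaluate(OLD_ACL, "192.0.2.10", "tcp", 139))
print(evaluate(NEW_ACL, "192.0.2.10", "tcp", 139))
```

With the old first line, a packet to tcp/139 from inside the pool falls through
to the 'permit ip' entry; with the range it is dropped by the first entry.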

Anyway... this was to share the info and to thank all who offered help!

-------------------------------------------------------------------
Pierre Nepveu, CCNP                    tel: +1 514.380-4289 
Administrateur de reseau                    +1 888.INFOVTL x 4289
Ingenierie / Acces Internet            fax: +1 514 899-8452
Videotron Telecom Ltee (VTL) - Montreal (Quebec), Canada
-------------------------------------------------------------------


On 2004-09-14 at 17:39, Pierre Nepveu wrote:

PN> Mark,
PN> 
PN> thanks. I'll try that and keep you posted.
PN> 
PN> pn
PN> cd /pub; more beer
PN> 
PN> 
PN> On 2004-09-14 at 11:09, Mark Johnson wrote:
PN> 
PN> MJ> At 01:51 PM 9/14/2004 -0400, Pierre Nepveu wrote:
PN> MJ> >hi guys,
PN> MJ> >
PN> MJ> >you both asked for <sh stack> within 7 minutes of one another! We should
PN> MJ> >have a new Olympic event: synchronized help desk, just for you!
PN> MJ> 
PN> MJ> There are quite a few bugs that may relate to this sort of problem (the
PN> MJ> cause is likely ISDN and memory utilization), one of which is CSCdp40742,
PN> MJ> fixed in 12.1(9).
PN> MJ> 
PN> MJ> Is it possible to upgrade at least one of your routers (say to the latest
PN> MJ> 12.1), to see how that reacts?  Seems like you should know soon if it made
PN> MJ> an improvement.
PN> MJ> 
PN> MJ> mark
PN> MJ> 
PN> MJ> 
PN> MJ> >Here goes :
PN> MJ> >(this one is not very fresh... about 1 1/2 days. Below is the freshest
PN> MJ> >I've got - about 5 hours. Thanks for your help!)
PN> MJ> >
PN> MJ> >as01-91q-mtl#sh stack
PN> MJ> >Minimum process stacks:
PN> MJ> >  Free/Size   Name
PN> MJ> >  1732/2000   Reset ipc queue
PN> MJ> >  2420/4000   Init
PN> MJ> >  1220/2000   Microcom DSP download Process
PN> MJ> >  1456/2000   RADIUS INITCONFIG
PN> MJ> >  1012/2000   MAI Action Process
PN> MJ> >  2884/4000   Exec
PN> MJ> >  1672/2000   Async tty Reset
PN> MJ> >  2272/4000   Virtual Exec
PN> MJ> >
PN> MJ> >Interrupt level stacks:
PN> MJ> >Level    Called Unused/Size  Name
PN> MJ> >   1    17193915   2544/3000  Async (CL-CD2430) transmit interrupts
PN> MJ> >   2    20615196   2200/3000  Async (CD2430/Mica) receive interrupts
PN> MJ> >   3        4596   2908/3000  Serial interface state change interrupt
PN> MJ> >   4    22397619   2364/3000  Network interfaces
PN> MJ> >   5       21868   2896/3000  Console Uart
PN> MJ> >   6           2   2852/3000  DSX1 interface
PN> MJ> >
PN> MJ> >System was restarted by error - software forced crash, PC 0x221CEAD2
PN> MJ> >5200 Software (C5200-IS-L), Version 11.3(11b)T3,  RELEASE SOFTWARE (fc1)
PN> MJ> >TAC Support: http://www.cisco.com/tac
PN> MJ> >Compiled Tue 22-Jul-03 18:16 by hqluong (current version)
PN> MJ> >Image text-base: 0x22033FAC, data-base: 0x00005000
PN> MJ> >
PN> MJ> >
PN> MJ> >Stack trace from system failure:
PN> MJ> >FP: 0x2132D4, RA: 0x221D281C
PN> MJ> >FP: 0x2132E0, RA: 0x221D2AE2
PN> MJ> >FP: 0x213304, RA: 0x221C893E
PN> MJ> >FP: 0x213314, RA: 0x221C8A0A
PN> MJ> >FP: 0x21331C, RA: 0x2215C950
PN> MJ> >FP: 0x21332C, RA: 0x2215CC60
PN> MJ> >FP: 0x213398, RA: 0x2215C9DA
PN> MJ> >FP: 0x2133B4, RA: 0x2204DF1E
PN> MJ> >
PN> MJ> >
PN> MJ> >***************************************************
PN> MJ> >******* Information of Last System Crash **********
PN> MJ> >***************************************************
PN> MJ> >
PN> MJ> >
PN> MJ> >as01-91q-mtl#
PN> MJ> >
PN> MJ> >  -/-/-/-/-/-/
PN> MJ> >
PN> MJ> >as05-mtl#sh ver
PN> MJ> >Cisco Internetwork Operating System Software
PN> MJ> >IOS (tm) 5200 Software (C5200-IS-L), Version 11.3(11b)T3,  RELEASE SOFTWARE (fc1)
PN> MJ> >TAC Support: http://www.cisco.com/tac
PN> MJ> >Copyright (c) 1986-2003 by cisco Systems, Inc.
PN> MJ> >Compiled Tue 22-Jul-03 18:16 by hqluong
PN> MJ> >Image text-base: 0x22033FAC, data-base: 0x00005000
PN> MJ> >
PN> MJ> >ROM: System Bootstrap, Version 11.1(474A) [jdisimon 104], INTERIM SOFTWARE
PN> MJ> >BOOTFLASH: 5200 Software (AS5200-BOOT-L), Version 11.1(7)AA, EARLY DEPLOYMENT RELEASE SOFTWARE (fc2)
PN> MJ> >
PN> MJ> >as05-mtl uptime is 14 hours, 59 minutes
PN> MJ> >System restarted by error - software forced crash, PC 0x221CEAD2 at 22:46:16 EDT Mon Sep 13 2004
PN> MJ> >System image file is "flash:c5200-is-l.113-11b.T3.bin", booted via flash
PN> MJ> >
PN> MJ> >cisco AS5200 (68030) processor (revision B) with 8192K/4096K bytes of memory.
PN> MJ> >Processor board ID 04277090
PN> MJ> >Bridging software.
PN> MJ> >X.25 software, Version 3.0.0.
PN> MJ> >SuperLAT software copyright 1990 by Meridian Technology Corp).
PN> MJ> >Primary Rate ISDN software, Version 1.1.
PN> MJ> >Mother board with terminator card.
PN> MJ> >1 Ethernet/IEEE 802.3 interface(s)
PN> MJ> >50 Serial network interface(s)
PN> MJ> >48 terminal line(s)
PN> MJ> >2 Channelized T1/PRI port(s)
PN> MJ> >128K bytes of non-volatile configuration memory.
PN> MJ> >8192K bytes of processor board System flash (Read ONLY)
PN> MJ> >4096K bytes of processor board Boot flash (Read/Write)
PN> MJ> >
PN> MJ> >Configuration register is 0x2102
PN> MJ> >
PN> MJ> >as05-mtl#sh stack
PN> MJ> >Minimum process stacks:
PN> MJ> >  Free/Size   Name
PN> MJ> >  1732/2000   Reset ipc queue
PN> MJ> >  2420/4000   Init
PN> MJ> >  1220/2000   Microcom DSP download Process
PN> MJ> >  1452/2000   RADIUS INITCONFIG
PN> MJ> >  2272/4000   Virtual Exec
PN> MJ> >  1012/2000   MAI Action Process
PN> MJ> >  2596/4000   Exec
PN> MJ> >  1672/2000   Async tty Reset
PN> MJ> >
PN> MJ> >Interrupt level stacks:
PN> MJ> >Level    Called Unused/Size  Name
PN> MJ> >   1     5263579   2544/3000  Async (CL-CD2430) transmit interrupts
PN> MJ> >   2     3874824   2200/3000  Async (CD2430/Mica) receive interrupts
PN> MJ> >   3        2007   2908/3000  Serial interface state change interrupt
PN> MJ> >   4     7170726   2364/3000  Network interfaces
PN> MJ> >   5       17261   2896/3000  Console Uart
PN> MJ> >   6           2   2852/3000  DSX1 interface
PN> MJ> >
PN> MJ> >System was restarted by error - software forced crash, PC 0x221CEAD2
PN> MJ> >5200 Software (C5200-IS-L), Version 11.3(11b)T3,  RELEASE SOFTWARE (fc1)
PN> MJ> >TAC Support: http://www.cisco.com/tac
PN> MJ> >Compiled Tue 22-Jul-03 18:16 by hqluong (current version)
PN> MJ> >Image text-base: 0x22033FAC, data-base: 0x00005000
PN> MJ> >
PN> MJ> >
PN> MJ> >Stack trace from system failure:
PN> MJ> >FP: 0x213DEC, RA: 0x221D281C
PN> MJ> >FP: 0x213DF8, RA: 0x221D2AE2
PN> MJ> >FP: 0x213E1C, RA: 0x221C893E
PN> MJ> >FP: 0x213E2C, RA: 0x221C8A0A
PN> MJ> >FP: 0x213E34, RA: 0x2215C950
PN> MJ> >FP: 0x213E44, RA: 0x2215CC60
PN> MJ> >FP: 0x213EB0, RA: 0x2215C9DA
PN> MJ> >FP: 0x213ECC, RA: 0x2204DF1E
PN> MJ> >
PN> MJ> >
PN> MJ> >***************************************************
PN> MJ> >******* Information of Last System Crash **********
PN> MJ> >***************************************************
PN> MJ> >
PN> MJ> >
PN> MJ> >as05-mtl#
PN> MJ> >
PN> MJ> >
PN> MJ> >pn
PN> MJ> >cd /pub; more beer
PN> MJ> >
PN> MJ> >
PN> MJ> >On 2004-09-14 at 10:31, Aaron Leonard wrote:
PN> MJ> >
PN> MJ> >AL> Pierre,
PN> MJ> >AL>
PN> MJ> >AL>
PN> MJ> >AL> Please provide the output of "show version" and "show stack"
PN> MJ> >AL> from one of these 5200s after it has crashed and rebooted.
PN> MJ> >AL>
PN> MJ> >AL> Aaron
PN> MJ> >AL>
PN> MJ> >AL> --
PN> MJ> >AL>
PN> MJ> >AL> > hello all,
PN> MJ> >AL>
PN> MJ> >AL> > as of late, we experience sudden and unexplained reboots with the
PN> MJ> >AL> > following symptoms :
PN> MJ> >AL>
PN> MJ> >AL> > as07-xxx uptime is 15 hours, 17 minutes
PN> MJ> >AL> >     System restarted by error - software forced crash, PC 0x221CEAD2 at 14:57:30 EDT Sun Sep 5 2004
PN> MJ> >AL>
PN> MJ> >AL> > as12-xxx uptime is 15 hours, 50 minutes
PN> MJ> >AL> >     System restarted by error - software forced crash, PC 0x221CEAD2 at 14:25:31 EDT Sun Sep 5 2004
PN> MJ> >AL>
PN> MJ> >AL> > as16-xxx uptime is 15 hours, 32 minutes
PN> MJ> >AL> >     System restarted by error - software forced crash, PC 0x221CEAD2 at 14:43:11 EDT Sun Sep 5 2004
PN> MJ> >AL>
PN> MJ> >AL> > as02-yyy uptime is 18 hours, 42 minutes
PN> MJ> >AL> >     System restarted by error - software forced crash, PC 0x221CEAD2 at 11:35:00 EDT Sun Sep 5 2004
PN> MJ> >AL>
PN> MJ> >AL> > as03-yyy uptime is 17 hours, 39 minutes
PN> MJ> >AL> >     System restarted by error - software forced crash, PC 0x221CEAD2 at 12:38:28 EDT Sun Sep 5 2004
PN> MJ> >AL>
PN> MJ> >AL> > (xxx and yyy are 2 different POPs). As you can see, the reboots are
PN> MJ> >AL> > synchronized in time. This leads me to think we have a recurrence of a
PN> MJ> >AL> > problem we had a few months back : there is a virus or worm that
PN> MJ> >AL> > overwhelms those poor boxes and forces a crash. The luser logs into a
PN> MJ> >AL> > box, crashes it, hits redial, crashes another box, redials again,
PN> MJ> >AL> > crashes yet a third box and then quits.
PN> MJ> >AL>
PN> MJ> >AL> > I see two possible solutions :
PN> MJ> >AL> > 1. filter out the exact problem (if I can pinpoint it)
PN> MJ> >AL> > 2. install an IOS that is immune to the problem (and will not introduce
PN> MJ> >AL> >    new ones, hopefully!)
PN> MJ> >AL>
PN> MJ> >AL> > --More info--
PN> MJ> >AL> > Current IOS is : IOS (tm) 5200 Software (C5200-IS-L), Version 11.3(11b)T3
PN> MJ> >AL> >   System image file is "flash:c5200-is-l.113-11b.T3.bin"
PN> MJ> >AL>
PN> MJ> >AL> > Applied filter is :
PN> MJ> >AL> > interface Group-Async1
PN> MJ> >AL> >  ip unnumbered Ethernet0
PN> MJ> >AL> >  ip access-group 109 in
PN> MJ> >AL>
PN> MJ> >AL> > access-list 109 deny   tcp any any eq 135
PN> MJ> >AL> > access-list 109 deny   tcp any any eq 445
PN> MJ> >AL> > access-list 109 deny   tcp any any eq 5000
PN> MJ> >AL> > access-list 109 deny   icmp any any
PN> MJ> >AL> > access-list 109 permit ip xx.yy.zz.0 0.0.0.255 any
PN> MJ> >AL> > access-list 109 deny   ip any any
PN> MJ> >AL>
PN> MJ> >AL> > (where xx.yy.zz is the netblock this NAS belongs to - this is the
PN> MJ> >AL> > poor man's reverse-path verify)
PN> MJ> >AL>
PN> MJ> >AL> > Any suggestion of an improved filter is welcome. Any suggestion of an
PN> MJ> >AL> > IOS that fits in
PN> MJ> >AL> >    cisco AS5200 (68030) processor (revision A) with 8192K/4096K bytes of memory.
PN> MJ> >AL> >    8192K bytes of processor board System flash (Read ONLY)
PN> MJ> >AL> > is also welcome. (IP/Plus image not necessary - this is just what we
PN> MJ> >AL> > have now and it works, so - not broken, not fixed!)
PN> MJ> >AL>
PN> MJ> >AL> > Thanks !
PN> MJ> >AL>
PN> MJ> >AL>
PN> MJ> >AL>
PN> MJ> >AL>
PN> MJ> >AL> > _______________________________________________
PN> MJ> >AL> > cisco-nas mailing list
PN> MJ> >AL> > cisco-nas at puck.nether.net
PN> MJ> >AL> > https://puck.nether.net/mailman/listinfo/cisco-nas
PN> MJ> >AL>
PN> MJ> 
PN> MJ> 
PN> 
PN> 
PN> 
PN> 
PN> 




