[cisco-nas] frequent reboot of AS-5200's - update
Pierre Nepveu
pnepveu at videotron.net
Tue Oct 19 17:30:46 EDT 2004
For those interested... an update
We upgraded 11 boxes to 12.1(25) and 16 MB of memory. That slowed the rate of
reboots, but did not stop them.
However, we were able to catch one box in the process of going down. We saw that
the free memory was rather stable, but that the largest free block was slowly
decreasing in size. Also :
- the 'Check heaps' process was working like crazy.
- the output of 'sh ip cache' was enormous (including many /32 entries)
- more than 50% of all incoming packets on Async ports were dropped by
this ACL:
access-list 109 deny tcp any any eq 135
access-list 109 deny tcp any any eq 445
access-list 109 deny tcp any any eq 5000
access-list 109 deny icmp any any
access-list 109 permit ip x.x.x.0 0.0.0.255 any ! poor-man's rpf
access-list 109 deny ip any any
When the largest free block went below a certain level (something like 1 or 2 K),
the box simply rebooted. This could happen several times a day.
All the symptoms pointed to a worm trying to spread.
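
The pattern described above (free memory roughly stable while the largest free block keeps shrinking) can be checked mechanically. The following is only a rough sketch: it assumes you already collect (free, largest) byte counts per box at regular intervals by some means of your own, and the function name, the 10% / 25% thresholds and the sample numbers are all made up for illustration. The 2048-byte floor loosely follows the "1 or 2 K" figure mentioned above.

```python
from typing import Dict, List, Tuple


def largest_block_trend(samples: List[Tuple[int, int]],
                        floor_bytes: int = 2048) -> Dict[str, object]:
    """Compare the first and last (free_bytes, largest_free_block_bytes) samples.

    samples must be in chronological order. The thresholds below are
    arbitrary illustration values, not figures from any Cisco document.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    first_free, first_largest = samples[0]
    last_free, last_largest = samples[-1]
    free_change = (last_free - first_free) / first_free
    largest_change = (last_largest - first_largest) / first_largest
    return {
        "free_change": free_change,
        "largest_change": largest_change,
        # Free pool roughly flat while the largest block shrinks a lot.
        "fragmenting": abs(free_change) < 0.10 and largest_change < -0.25,
        # Largest block under the level where the boxes were seen to reboot.
        "below_floor": last_largest < floor_bytes,
    }


if __name__ == "__main__":
    # Hypothetical readings, not taken from a real AS5200.
    readings = [(3_000_000, 1_200_000), (2_950_000, 600_000), (2_980_000, 1_500)]
    print(largest_block_trend(readings))
```

A box reporting "fragmenting" without yet being "below_floor" would be a candidate for a closer look before it reboots on its own.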
We changed the first line of the ACL this way :
access-list 109 deny tcp any any range 135 139
With that change we were blocking more than 70% of incoming packets, and the memory
degradation stopped. The boxes are now stable (we call 'rebooting only once a week'
stable, in this case!). We will upgrade the more problematic boxes to 16 MB; that
should give them enough breathing room to last a few weeks... until the next worm...
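
To see why widening the first line raises the share of dropped packets, here is a small first-match model of the two versions of access-list 109. It is a sketch for illustration only, not a reimplementation of IOS: the Rule type and the evaluate function are invented here, and 192.0.2.0/24 is a documentation prefix standing in for the real x.x.x.0 0.0.0.255 block. The one property it relies on is that an IOS access list stops at the first matching entry.

```python
import ipaddress
from typing import List, NamedTuple, Optional, Tuple


class Rule(NamedTuple):
    action: str                               # "permit" or "deny"
    proto: str                                # "tcp", "icmp", or "ip" (any protocol)
    src: str                                  # source network in CIDR form
    dports: Optional[Tuple[int, int]] = None  # inclusive TCP destination port range


ANY = "0.0.0.0/0"
POOL = "192.0.2.0/24"  # hypothetical stand-in for x.x.x.0 0.0.0.255


def build_acl(first_rule: Rule) -> List[Rule]:
    """access-list 109 with a replaceable first entry."""
    return [
        first_rule,
        Rule("deny", "tcp", ANY, (445, 445)),
        Rule("deny", "tcp", ANY, (5000, 5000)),
        Rule("deny", "icmp", ANY),
        Rule("permit", "ip", POOL),           # poor-man's rpf
        Rule("deny", "ip", ANY),
    ]


OLD_ACL = build_acl(Rule("deny", "tcp", ANY, (135, 135)))  # eq 135
NEW_ACL = build_acl(Rule("deny", "tcp", ANY, (135, 139)))  # range 135 139


def evaluate(acl: List[Rule], proto: str, src: str,
             dport: Optional[int] = None) -> Tuple[str, int]:
    """Return (action, 1-based index of the first matching entry)."""
    addr = ipaddress.ip_address(src)
    for index, rule in enumerate(acl, start=1):
        if rule.proto not in ("ip", proto):
            continue
        if addr not in ipaddress.ip_network(rule.src):
            continue
        if rule.dports is not None:
            low, high = rule.dports
            if dport is None or not (low <= dport <= high):
                continue
        return rule.action, index
    return "deny", 0  # implicit deny when nothing matches


if __name__ == "__main__":
    # A dial-up user inside the pool sending to TCP port 139.
    print(evaluate(OLD_ACL, "tcp", "192.0.2.10", 139))
    print(evaluate(NEW_ACL, "tcp", "192.0.2.10", 139))
```

Under the old list, TCP traffic to ports 136-139 from a legitimate pool address falls through to the permit entry; under the new list it is dropped by the first entry, which is consistent with the higher drop percentage reported above.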
Anyway... this was to share the info and to thank all who offered help !
-------------------------------------------------------------------
Pierre Nepveu, CCNP tel: +1 514.380-4289
Administrateur de reseau +1 888.INFOVTL x 4289
Ingenierie / Acces Internet fax: +1 514 899-8452
Videotron Telecom Ltee (VTL) - Montreal (Quebec), Canada
-------------------------------------------------------------------
On 2004-09-14 at 17:39, Pierre Nepveu wrote:
PN> Mark,
PN>
PN> thanks. I'll try that and keep you posted.
PN>
PN> pn
PN> cd /pub; more beer
PN>
PN>
PN> On 2004-09-14 at 11:09, Mark Johnson wrote:
PN>
PN> MJ> At 01:51 PM 9/14/2004 -0400, Pierre Nepveu wrote:
PN> MJ> >hi guys,
PN> MJ> >
PN> MJ> >you both asked for <sh stack> within 7 minutes of one another ! We should
PN> MJ> >have a new Olympic event : synchronized help desk just for you !
PN> MJ>
PN> MJ> There are quite a few bugs that may relate to this sort of problem (the
PN> MJ> cause is likely ISDN and memory utilization), one of which is CSCdp40742,
PN> MJ> fixed in 12.1(9).
PN> MJ>
PN> MJ> Is it possible to upgrade at least one of your routers (say to the latest
PN> MJ> 12.1), to see how that reacts? Seems like you should know soon if it made a
PN> MJ> positive improvement.
PN> MJ>
PN> MJ> mark
PN> MJ>
PN> MJ>
PN> MJ> >Here goes :
PN> MJ> >(this one is not very fresh... about 1 1/2 days. Below is the freshest I've
PN> MJ> >got - about 5 hours. Thanks for your help!)
PN> MJ> >
PN> MJ> >as01-91q-mtl#sh stack
PN> MJ> >Minimum process stacks:
PN> MJ> > Free/Size Name
PN> MJ> > 1732/2000 Reset ipc queue
PN> MJ> > 2420/4000 Init
PN> MJ> > 1220/2000 Microcom DSP download Process
PN> MJ> > 1456/2000 RADIUS INITCONFIG
PN> MJ> > 1012/2000 MAI Action Process
PN> MJ> > 2884/4000 Exec
PN> MJ> > 1672/2000 Async tty Reset
PN> MJ> > 2272/4000 Virtual Exec
PN> MJ> >
PN> MJ> >Interrupt level stacks:
PN> MJ> >Level Called Unused/Size Name
PN> MJ> > 1 17193915 2544/3000 Async (CL-CD2430) transmit interrupts
PN> MJ> > 2 20615196 2200/3000 Async (CD2430/Mica) receive interrupts
PN> MJ> > 3 4596 2908/3000 Serial interface state change interrupt
PN> MJ> > 4 22397619 2364/3000 Network interfaces
PN> MJ> > 5 21868 2896/3000 Console Uart
PN> MJ> > 6 2 2852/3000 DSX1 interface
PN> MJ> >
PN> MJ> >System was restarted by error - software forced crash, PC 0x221CEAD2
PN> MJ> >5200 Software (C5200-IS-L), Version 11.3(11b)T3, RELEASE SOFTWARE (fc1)
PN> MJ> >TAC Support: http://www.cisco.com/tac
PN> MJ> >Compiled Tue 22-Jul-03 18:16 by hqluong (current version)
PN> MJ> >Image text-base: 0x22033FAC, data-base: 0x00005000
PN> MJ> >
PN> MJ> >
PN> MJ> >Stack trace from system failure:
PN> MJ> >FP: 0x2132D4, RA: 0x221D281C
PN> MJ> >FP: 0x2132E0, RA: 0x221D2AE2
PN> MJ> >FP: 0x213304, RA: 0x221C893E
PN> MJ> >FP: 0x213314, RA: 0x221C8A0A
PN> MJ> >FP: 0x21331C, RA: 0x2215C950
PN> MJ> >FP: 0x21332C, RA: 0x2215CC60
PN> MJ> >FP: 0x213398, RA: 0x2215C9DA
PN> MJ> >FP: 0x2133B4, RA: 0x2204DF1E
PN> MJ> >
PN> MJ> >
PN> MJ> >***************************************************
PN> MJ> >******* Information of Last System Crash **********
PN> MJ> >***************************************************
PN> MJ> >
PN> MJ> >
PN> MJ> >as01-91q-mtl#
PN> MJ> >
PN> MJ> > -/-/-/-/-/-/
PN> MJ> >
PN> MJ> >as05-mtl#sh ver
PN> MJ> >Cisco Internetwork Operating System Software
PN> MJ> >IOS (tm) 5200 Software (C5200-IS-L), Version 11.3(11b)T3, RELEASE SOFTWARE
PN> MJ> >(fc1)
PN> MJ> >TAC Support: http://www.cisco.com/tac
PN> MJ> >Copyright (c) 1986-2003 by cisco Systems, Inc.
PN> MJ> >Compiled Tue 22-Jul-03 18:16 by hqluong
PN> MJ> >Image text-base: 0x22033FAC, data-base: 0x00005000
PN> MJ> >
PN> MJ> >ROM: System Bootstrap, Version 11.1(474A) [jdisimon 104], INTERIM SOFTWARE
PN> MJ> >BOOTFLASH: 5200 Software (AS5200-BOOT-L), Version 11.1(7)AA, EARLY DEPLOYMENT
PN> MJ> >RELEASE SOFTWARE (fc2)
PN> MJ> >
PN> MJ> >as05-mtl uptime is 14 hours, 59 minutes
PN> MJ> >System restarted by error - software forced crash, PC 0x221CEAD2 at
PN> MJ> >22:46:16 EDT
PN> MJ> >Mon Sep 13 2004
PN> MJ> >System image file is "flash:c5200-is-l.113-11b.T3.bin", booted via flash
PN> MJ> >
PN> MJ> >cisco AS5200 (68030) processor (revision B) with 8192K/4096K bytes of memory.
PN> MJ> >Processor board ID 04277090
PN> MJ> >Bridging software.
PN> MJ> >X.25 software, Version 3.0.0.
PN> MJ> >SuperLAT software copyright 1990 by Meridian Technology Corp).
PN> MJ> >Primary Rate ISDN software, Version 1.1.
PN> MJ> >Mother board with terminator card.
PN> MJ> >1 Ethernet/IEEE 802.3 interface(s)
PN> MJ> >50 Serial network interface(s)
PN> MJ> >48 terminal line(s)
PN> MJ> >2 Channelized T1/PRI port(s)
PN> MJ> >128K bytes of non-volatile configuration memory.
PN> MJ> >8192K bytes of processor board System flash (Read ONLY)
PN> MJ> >4096K bytes of processor board Boot flash (Read/Write)
PN> MJ> >
PN> MJ> >Configuration register is 0x2102
PN> MJ> >
PN> MJ> >as05-mtl#sh stack
PN> MJ> >Minimum process stacks:
PN> MJ> > Free/Size Name
PN> MJ> > 1732/2000 Reset ipc queue
PN> MJ> > 2420/4000 Init
PN> MJ> > 1220/2000 Microcom DSP download Process
PN> MJ> > 1452/2000 RADIUS INITCONFIG
PN> MJ> > 2272/4000 Virtual Exec
PN> MJ> > 1012/2000 MAI Action Process
PN> MJ> > 2596/4000 Exec
PN> MJ> > 1672/2000 Async tty Reset
PN> MJ> >
PN> MJ> >Interrupt level stacks:
PN> MJ> >Level Called Unused/Size Name
PN> MJ> > 1 5263579 2544/3000 Async (CL-CD2430) transmit interrupts
PN> MJ> > 2 3874824 2200/3000 Async (CD2430/Mica) receive interrupts
PN> MJ> > 3 2007 2908/3000 Serial interface state change interrupt
PN> MJ> > 4 7170726 2364/3000 Network interfaces
PN> MJ> > 5 17261 2896/3000 Console Uart
PN> MJ> > 6 2 2852/3000 DSX1 interface
PN> MJ> >
PN> MJ> >System was restarted by error - software forced crash, PC 0x221CEAD2
PN> MJ> >5200 Software (C5200-IS-L), Version 11.3(11b)T3, RELEASE SOFTWARE (fc1)
PN> MJ> >TAC Support: http://www.cisco.com/tac
PN> MJ> >Compiled Tue 22-Jul-03 18:16 by hqluong (current version)
PN> MJ> >Image text-base: 0x22033FAC, data-base: 0x00005000
PN> MJ> >
PN> MJ> >
PN> MJ> >Stack trace from system failure:
PN> MJ> >FP: 0x213DEC, RA: 0x221D281C
PN> MJ> >FP: 0x213DF8, RA: 0x221D2AE2
PN> MJ> >FP: 0x213E1C, RA: 0x221C893E
PN> MJ> >FP: 0x213E2C, RA: 0x221C8A0A
PN> MJ> >FP: 0x213E34, RA: 0x2215C950
PN> MJ> >FP: 0x213E44, RA: 0x2215CC60
PN> MJ> >FP: 0x213EB0, RA: 0x2215C9DA
PN> MJ> >FP: 0x213ECC, RA: 0x2204DF1E
PN> MJ> >
PN> MJ> >
PN> MJ> >***************************************************
PN> MJ> >******* Information of Last System Crash **********
PN> MJ> >***************************************************
PN> MJ> >
PN> MJ> >
PN> MJ> >as05-mtl#
PN> MJ> >
PN> MJ> >
PN> MJ> >pn
PN> MJ> >cd /pub; more beer
PN> MJ> >
PN> MJ> >
PN> MJ> >On 2004-09-14 at 10:31, Aaron Leonard wrote:
PN> MJ> >
PN> MJ> >AL> Pierre,
PN> MJ> >AL>
PN> MJ> >AL>
PN> MJ> >AL> Please provide the output of "show version" and "show stack"
PN> MJ> >AL> from one of these 5200s after it has crashed and rebooted.
PN> MJ> >AL>
PN> MJ> >AL> Aaron
PN> MJ> >AL>
PN> MJ> >AL> --
PN> MJ> >AL>
PN> MJ> >AL> > hello all,
PN> MJ> >AL>
PN> MJ> >AL> > as of late, we experience sudden and unexplained reboots with the
PN> MJ> >AL> > following symptoms :
PN> MJ> >AL>
PN> MJ> >AL> > as07-xxx uptime is 15 hours, 17 minutes
PN> MJ> >AL> > System restarted by error - software forced crash, PC 0x221CEAD2
PN> MJ> >AL> > at 14:57:30 EDT Sun Sep 5 2004
PN> MJ> >AL>
PN> MJ> >AL> > as12-xxx uptime is 15 hours, 50 minutes
PN> MJ> >AL> > System restarted by error - software forced crash, PC 0x221CEAD2
PN> MJ> >AL> > at 14:25:31 EDT Sun Sep 5 2004
PN> MJ> >AL>
PN> MJ> >AL> > as16-xxx uptime is 15 hours, 32 minutes
PN> MJ> >AL> > System restarted by error - software forced crash, PC 0x221CEAD2
PN> MJ> >AL> > at 14:43:11 EDT Sun Sep 5 2004
PN> MJ> >AL>
PN> MJ> >AL> > as02-yyy uptime is 18 hours, 42 minutes
PN> MJ> >AL> > System restarted by error - software forced crash, PC 0x221CEAD2
PN> MJ> >AL> > at 11:35:00 EDT Sun Sep 5 2004
PN> MJ> >AL>
PN> MJ> >AL> > as03-yyy uptime is 17 hours, 39 minutes
PN> MJ> >AL> > System restarted by error - software forced crash, PC 0x221CEAD2
PN> MJ> >AL> > at 12:38:28 EDT Sun Sep 5 2004
PN> MJ> >AL>
PN> MJ> >AL> > (xxx and yyy are 2 different POPs). As you can see, the reboots are
PN> MJ> >AL> > synchronized in time. This leads me to think we have a recurrence of a
PN> MJ> >AL> > problem we had a few months back : there is a virus or worm that
PN> MJ> >AL> > overwhelms those poor boxes and forces a crash. The luser logs into a
PN> MJ> >AL> > box, crashes it, hits redial, crashes another box, redials again,
PN> MJ> >AL> > crashes yet a third box and then quits.
PN> MJ> >AL>
PN> MJ> >AL> > I see two possible solutions :
PN> MJ> >AL> > 1. filter out the exact problem (if I can pinpoint it)
PN> MJ> >AL> > 2. install an IOS that is immune to the problem (and will not
PN> MJ> >AL> > introduce new ones, hopefully!)
PN> MJ> >AL>
PN> MJ> >AL> > --More info--
PN> MJ> >AL> > Current IOS is : IOS (tm) 5200 Software (C5200-IS-L), Version 11.3(11b)T3
PN> MJ> >AL> > System image file is "flash:c5200-is-l.113-11b.T3.bin"
PN> MJ> >AL>
PN> MJ> >AL> > Applied filter is :
PN> MJ> >AL> > interface Group-Async1
PN> MJ> >AL> > ip unnumbered Ethernet0
PN> MJ> >AL> > ip access-group 109 in
PN> MJ> >AL>
PN> MJ> >AL> > access-list 109 deny tcp any any eq 135
PN> MJ> >AL> > access-list 109 deny tcp any any eq 445
PN> MJ> >AL> > access-list 109 deny tcp any any eq 5000
PN> MJ> >AL> > access-list 109 deny icmp any any
PN> MJ> >AL> > access-list 109 permit ip xx.yy.zz.0 0.0.0.255 any
PN> MJ> >AL> > access-list 109 deny ip any any
PN> MJ> >AL>
PN> MJ> >AL> > (where xx.yy.zz is the netblock this NAS belongs to - this is the
PN> MJ> >AL> > poor man's reverse-path verify)
PN> MJ> >AL>
PN> MJ> >AL> > Any suggestion of an improved filter is welcome. Any suggestion of an
PN> MJ> >AL> > IOS that fits in
PN> MJ> >AL> > cisco AS5200 (68030) processor (revision A) with 8192K/4096K bytes of memory.
PN> MJ> >AL> > 8192K bytes of processor board System flash (Read ONLY)
PN> MJ> >AL> > is also welcome. (IP/Plus image not necessary - this is just what we
PN> MJ> >AL> > have now and it works, so - not broken, not fixed!)
PN> MJ> >AL>
PN> MJ> >AL> > Thanks !
PN> MJ> >AL>
PN> MJ> >AL> > -------------------------------------------------------------------
PN> MJ> >AL> > Pierre Nepveu, CCNP tel: +1 514.380-4289
PN> MJ> >AL> > Administrateur de reseau +1 888.INFOVTL x 4289
PN> MJ> >AL> > Ingenierie / Acces Internet fax: +1 514 899-8452
PN> MJ> >AL> > Videotron Telecom Ltee (VTL) - Montreal (Quebec), Canada
PN> MJ> >AL> > -------------------------------------------------------------------
PN> MJ> >AL>
PN> MJ> >AL>
PN> MJ> >AL>
PN> MJ> >AL> > _______________________________________________
PN> MJ> >AL> > cisco-nas mailing list
PN> MJ> >AL> > cisco-nas at puck.nether.net
PN> MJ> >AL> > https://puck.nether.net/mailman/listinfo/cisco-nas
PN> MJ> >AL>
PN> MJ>
PN> MJ>
PN>
PN>
PN>