[c-nsp] 'show isis database' delayed crash on 15.0(1)M1

Łukasz Bromirski lukasz at bromirski.net
Sun Feb 7 03:45:13 EST 2010


On 2010-02-07 02:55, Bryan Fields wrote:
> I was troubleshooting my network today and found a nasty little bug when
> someone runs 'show isis database' from exec mode on C181X Software
> (C181X-ADVIPSERVICESK9-M), Version 15.0(1)M1, IOS.
> After issuing the command you get its output, and some time within the next
> 30 seconds the router crashes.
> Example:
> LTRKAKHQR01-c1811w#sh isis database

Either this is hard to reproduce or something else is causing the crash.
I just tried it on my farm of 9 different 18xx routers and saw no crash at all:

c180x#sh ver | i IOS
Cisco IOS Software, C180X Software (C180X-ADVENTERPRISEK9-M), Version
15.0(1)M, RELEASE SOFTWARE (fc2)

c180x#sh isis database

IS-IS Level-2 Link State Database:
LSPID                 LSP Seq Num  LSP Checksum  LSP Holdtime      ATT/P/OL
c180x.00-00         * 0x00002DA1   0x2975        1142              0/0/0
tor-core.00-00        0x00002D98   0xCD09        1073              0/0/0
w-ts.00-00            0x00001019   0x899B        584               0/0/0
w-ts.01-00            0x00001015   0xAEB4        863               0/0/0
c180x#sh clock
09:40:26.110 CET Sun Feb 7 2010
c180x#sh clock
09:40:32.818 CET Sun Feb 7 2010
c180x#sh clock
09:40:41.810 CET Sun Feb 7 2010
c180x#sh clock
09:40:48.898 CET Sun Feb 7 2010
c180x#sh clock
09:40:56.338 CET Sun Feb 7 2010
c180x#sh clock
09:41:02.018 CET Sun Feb 7 2010
c180x#sh clock
09:41:07.971 CET Sun Feb 7 2010
c180x#sh clock
09:41:12.963 CET Sun Feb 7 2010

> from the log output:
> Feb  6 20:45:27 192.168.3.210 103: LTRKAKHQR01-c1811w: Feb  7 2010 01:45:20
> UTC: %SYS-3-CPUHOG: Task is running for (2000)msecs, more than (2000)msecs
> (0/0),process = Check heaps.
> UTC: %SYS-3-CPUHOG: Task is running for (4000)msecs, more than (2000)msecs
> (0/0),process = Check heaps.
> Feb  6 20:45:27 192.168.3.210 107: LTRKAKHQR01-c1811w: Feb  7 2010 01:45:24
> UTC: %SYS-3-BADMAGIC: Corrupt block at 86AC28DC (magic 813E0508),  -Traceback=
> 0x82052388z 0x82052770z 0x82055410z 0x820555CCz 0x8012086Cz 0x80124418z
> Feb  6 20:45:27 192.168.3.210 108: LTRKAKHQR01-c1811w: Feb  7 2010 01:45:24
> UTC: %SYS-6-MTRACE: mallocfree: addr, pc
> UTC: %SYS-6-BLKINFO: Corrupted magic value in in-use block blk 86AC28DC, words
> 6002, alloc 8012DAC4, InUse, dealloc FFFFFFFF, rfcnt 1,  -Traceback=
> 0x82010150z 0x82052618z 0x82052770z 0x82055410z 0x820555CCz 0x8012086Cz
> 0x80124418z
> Feb  6 20:45:28 192.168.3.210 118: LTRKAKHQR01-c1811w: Feb  7 2010 01:45:24
> UTC: %SYS-6-STACKLOW: Stack for process Virtual Exec running low, 12/12000
> Feb  6 20:45:40 192.168.3.250 1038: TAMQFLTART1: Feb  7 2010 01:45:39 UTC:
> %LINEPROTO-5-UPDOWN: Line protocol on Interface Tunnel3, changed state to down

Some process is behaving badly if Check heaps has trouble
validating the heap. It looks like something writes
garbage outside its own memory block, and then things
start to fall apart.
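
To illustrate the idea, here is a toy model of how a heap checker can
notice that kind of overrun. It is only a sketch under stated assumptions:
the header layout, the MAGIC value and the function names are made up for
the example and are not the actual IOS implementation.

```python
import struct

# Hypothetical marker value and header layout, for illustration only.
MAGIC = 0xAB1234CD
HEADER = struct.Struct(">II")  # (magic, payload length in bytes)


def build_heap(payloads):
    """Lay out blocks back to back: header followed by payload."""
    heap = bytearray()
    for payload in payloads:
        heap += HEADER.pack(MAGIC, len(payload)) + payload
    return heap


def check_heap(heap):
    """Walk the blocks and return offsets of headers with a bad magic."""
    bad = []
    offset = 0
    while offset + HEADER.size <= len(heap):
        magic, length = HEADER.unpack_from(heap, offset)
        if magic != MAGIC:
            bad.append(offset)
            break  # the length field cannot be trusted any more
        offset += HEADER.size + length
    return bad


heap = build_heap([b"A" * 8, b"B" * 8, b"C" * 8])
print(check_heap(heap))  # no corruption yet

# Simulate a writer that owns the first block (8-byte payload)
# but writes 12 bytes, running 4 bytes into the next block's header.
start = HEADER.size
heap[start:start + 12] = b"X" * 12
print(check_heap(heap))  # offset of the block whose magic got overwritten
```

Note that the checker flags the block that follows the misbehaving
writer, not the writer itself, which is why the corrupt block address
alone does not tell you which process is at fault.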

> %Software-forced reload
>  21:24:12 UTC Sat Feb 6 2010: Unexpected exception to CPU: vector 1500, PC =
> 0x8011E220, LR = 0x8011E1E4

> Has anyone else seen this, or know if it's a known bug?  I've searched the
> Cisco site and cannot find a reference to this issue.

Open a case. Get it reproduced and then nailed down to a
specific bug.
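
When putting the case together, a short summary of which system messages
showed up before the reload can be handy. A minimal sketch, assuming the
syslog is available as plain text; the regex and the summarize() helper
are my own illustration, not a Cisco tool:

```python
import re
from collections import Counter

# Matches IOS-style message codes: %FACILITY-SEVERITY-MNEMONIC
MSG_RE = re.compile(r"%([A-Z0-9_]+)-([0-7])-([A-Z0-9_]+)")


def summarize(log_text):
    """Count how many times each %FACILITY-SEVERITY-MNEMONIC code appears."""
    return Counter(m.group(0) for m in MSG_RE.finditer(log_text))


# A few lines taken from the log quoted above.
sample = (
    "UTC: %SYS-3-CPUHOG: Task is running for (2000)msecs, more than (2000)msecs\n"
    "UTC: %SYS-3-CPUHOG: Task is running for (4000)msecs, more than (2000)msecs\n"
    "UTC: %SYS-3-BADMAGIC: Corrupt block at 86AC28DC (magic 813E0508),  -Traceback=\n"
    "UTC: %SYS-6-STACKLOW: Stack for process Virtual Exec running low, 12/12000\n"
)

for code, count in summarize(sample).most_common():
    print(count, code)
```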

-- 
"Everything will be okay in the end. |                  Łukasz Bromirski
 If it's not okay, it's not the end. |       http://lukasz.bromirski.net

