[c-nsp] 6504-E crash after bringing up lots of BGP sessions
Andy B.
globichen at gmail.com
Thu Dec 3 18:32:12 EST 2009
On Thu, Dec 3, 2009 at 11:54 PM, Eninja <eninja at gmail.com> wrote:
> Andy,
>
> Your snipped 'sh ver' post is inadequate to understand the root cause of
> this problem.
>
> Unicast or broadcast a full 'sh ver' (prior to a reload), 'sh stack', and
> crashinfo files from both SP and RP if available.
>
> eninja
>
Unfortunately that's all the information I've got. No crashinfo was
generated, and while I was watching live on the console the router did
nothing but reload. The output was:
System Bootstrap, Version 12.2(17r)SX5, RELEASE SOFTWARE (fc1)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 2006 by cisco Systems, Inc.
System Bootstrap, Version 8.5(2)
Copyright (c) 1994-2007 by cisco Systems, Inc.
Cat6k-Sup720/SP processor with 1048576 Kbytes of main memory
There was really nothing before that, except lots of BGP "Up" messages
from the other routers inside the same AS.
#sh stacks
Minimum process stacks:
Free/Size Name
5704/6000 OIR IOS Process
5536/6000 IPC Zone Manager
5688/6000 ICC Retry Q
4480/6000 IPC delayed init
5704/6000 CDP BLOB
5648/6000 FM HA Sync
5656/6000 L3 Manager HA
5632/6000 Draco FIB process
4536/6000 Delayed Init Late Reg
3528/6000 eobc_init_process
5208/6000 ICC Slave Comp. Up
5584/6000 PM MP Process
2008/3000 EARL INFO CAPABILITY process
5568/6000 DHCPD Receive
5480/6000 C6K ENV RP init
5024/6000 SPAN Subsystem
5416/6000 PostOfficeNet
11464/12000 Router Init
10896/12000 CDP Protocol
11704/12000 cdp init process
8320/12000 Init
5112/6000 Draco DFS Port Registation Proc
4880/6000 IPC LC Port Opener
3864/6000 Update prst
5392/6000 RADIUS INITCONFIG
4856/6000 LCC Configure
4984/6000 SLB RF Active Proc
4688/6000 CEF Reloader
4144/6000 draco-oir-process:slot 1
4224/6000 draco-oir-process:slot 3
4808/6000 BGP Accepter
4272/6000 BGP Open
3992/6000 draco-oir-process:slot 4
2704/3000 Rom Random Update Process
4800/6000 TFTP Read Process
34824/36000 TCP Command
5552/6000 Link Status process
8528/12000 Virtual Exec
8432/12000 SSH Process
8016/12000 Exec
Interrupt level stacks:
Level Called Unused/Size Name
1 1289528 7632/9000 Inband Interrupt
2 379375 7592/9000 EOBC Interrupt
3 10555 8456/9000 Management Interrupt
4 1579543 8600/9000 Console Uart
5 0 9000/9000 Mistral Error Interrupt
7 2637841 8584/9000 NMI Interrupt Handler
***************************************************
******* Information of Last System Crash **********
***************************************************
Using bootflash:crashinfo.
%Error opening bootflash:crashinfo (File not found)
***************************************************
****** Information of Last System Crash - SP ******
***************************************************
The last crashinfo failed to be written.
Please verify the exception crashinfo configuration
the filesytem devices, and the free space on the
filesystem devices.
Using crashinfo_FAILED.
%Error opening crashinfo_FAILED (File not found)
#
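A 'sh stacks' listing like the one above can be scanned for processes whose free stack is small relative to the stack size. This is only a minimal sketch in Python, assuming the output has been pasted into a string. The function name low_stack_processes and the 60% threshold are illustrative choices, not anything defined by IOS.

```python
import re

# Matches process lines such as "3528/6000 eobc_init_process".
# Interrupt-level lines begin with level and call-count columns,
# so they do not match and are skipped.
PROCESS_LINE = re.compile(r"^\s*(\d+)/(\d+)\s+(.+?)\s*$")


def low_stack_processes(text, threshold=0.60):
    """Return (name, free, size) tuples whose free/size ratio is below
    the threshold, sorted from least headroom to most."""
    hits = []
    for line in text.splitlines():
        m = PROCESS_LINE.match(line)
        if not m:
            continue
        free, size, name = int(m.group(1)), int(m.group(2)), m.group(3)
        if size > 0 and free / size < threshold:
            hits.append((name, free, size))
    return sorted(hits, key=lambda h: h[1] / h[2])


# A few lines copied from the listing above (hypothetical variable name).
sample = """\
5704/6000 OIR IOS Process
3528/6000 eobc_init_process
4808/6000 BGP Accepter
4272/6000 BGP Open
1 1289528 7632/9000 Inband Interrupt
"""

for name, free, size in low_stack_processes(sample):
    print(f"{name}: {free}/{size} free ({free / size:.0%})")
```

On these sample lines only eobc_init_process falls under the threshold. Neither BGP process is close to it, so the listing by itself does not point at stack exhaustion in BGP.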
Weeks ago, when the same crash happened, I caught this error message
from the console:
*** System received a Software forced crash ***
signal= 0x17, code= 0x24, context= 0x42352a54
PC = 0x402d1e6c, Cause = 0x3020, Status Reg = 0x34008002
System Bootstrap, Version 8.5(2)
Copyright (c) 1994-2007 by cisco Systems, Inc.
Cat6k-Sup720/SP processor with 1048576 Kbytes of main memory
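For searching bug databases or comparing against later crashes, it may help to have the register values from that console message in a structured form. A minimal sketch in Python, assuming the "name = 0x..." layout shown above. The helper name parse_crash_fields is hypothetical, and the code only converts the values; it does not interpret what they mean.

```python
import re

# A field is a name (which may contain spaces, e.g. "Status Reg"),
# an equals sign, and a hexadecimal value.
FIELD = re.compile(r"(\w[\w ]*?)\s*=\s*(0x[0-9a-fA-F]+)")


def parse_crash_fields(text):
    """Return a dict mapping each field name to its integer value."""
    return {name.strip(): int(value, 16) for name, value in FIELD.findall(text)}


crash_text = """\
signal= 0x17, code= 0x24, context= 0x42352a54
PC = 0x402d1e6c, Cause = 0x3020, Status Reg = 0x34008002
"""

fields = parse_crash_fields(crash_text)
for name, value in fields.items():
    print(f"{name}: {value:#x} ({value})")
```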
I only saw it once; it did not appear on any of the other crashes. A
little research did not help, because the only match I could find for
this error was a password-reset issue, which does not make sense here:
nobody has physical access to this router but me.
I should mention that this router worked fine for more than 15 months.
We are constantly adding new peers and customers to it, so the
workload is growing. But as I said, this is not the busiest router in
my network.
As of now I really have no idea where to look or how I could at least
narrow down the problem.
Andy