[c-nsp] Cisco 6509 reboots on its own... again...

Youssef Bengelloun-Zahr youssef at 720.fr
Tue Jul 6 09:03:28 EDT 2010


Looks like some of you folks were already hit by this :

http://www.gossamer-threads.com/lists/cisco/nsp/108155

Hum....

Y.



2010/7/6 Youssef Bengelloun-Zahr <youssef at 720.fr>

> Hello,
>
> Small update on this one: I got the crashinfo file and found this:
>
>
> Cache error detected!
>   CP0_ECC     (reg 26/0): 0x000000BE
>   CP0_CACHERI (reg 27/0): 0xA0000000
>   CP0_CAUSE   (reg 13/0): 0x00000400
>
> Real cache error detected.  System will be halted.
>
> Error: Primary data cache, fields: data,
> Actual physical addr 0x00000000,
> virtual address is imprecise.
>
>  Imprecise Data Parity Error
>
>  Imprecise Data Parity Error
>
>  06:23:38 UTC Mon Jul 5 2010: Interrupt exception, CPU signal 20, PC =
> 0x40A94B8C
>
>
>
>
> --------------------------------------------------------------------
>    Possible software fault. Upon reccurence, please collect
>    crashinfo, "show tech" and contact Cisco Technical Support.
> --------------------------------------------------------------------
>
>
> -Traceback= 425F30C8
> $0 : 00000000, AT : 43B40000, v0 : 00000000, v1 : 5503EB98
> a0 : 58CDD7E0, a1 : 502B6C64, a2 : C133A166, a3 : 00000000
> t0 : 502B6C50, t1 : 502B6C24, t2 : 34008100, t3 : FFFF00FF
> t4 : 425EB928, t5 : 00000000, t6 : 00000000, t7 : 47969878
> s0 : FFFE0000, s1 : C1320000, s2 : 511DD9E0, s3 : C133A166
> s4 : FFFFFF00, s5 : 00000008, s6 : 00000011, s7 : 511D999C
> t8 : 502B6C3C, t9 : 00000000, k0 : 5536CD00, k1 : 411410B0
> gp : 43B42D30, sp : 502B6BB0, s8 : 00000000, ra : 40A94B64
> EPC  : 425F30C8, ErrorEPC : 40A94B8C, SREG     : 3400FF05
> MDLO : 00010C20, MDHI     : 00000000, BadVaddr : 00000000
> DATA_START : 0x43788D30
> Cause 00000000 (Code 0x0): Interrupt exception
>
>
> Any ideas ?
>
> Thanks.
>
> Y.
>
>
>
> 2010/7/6 krunal shah <krun.shah at gmail.com>
>
>> There should be two crashinfo files, one for the SP and one for the RP. You
>> need to collect both, plus show tech-support, when you contact tech support.
>>
>> TAC usually has decoders from their developers to decode the hex values in
>> the traceback.
>>
>>
>> -Traceback= 41183348 41180F04 40DADF40 40FFA1CC 40FFA4D8 40752F58 40752F44
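
Turning those addresses into function names needs the symbols for the exact image, which only TAC has. Pulling the addresses out of a pasted crashinfo so they can go into the case notes is easy enough, though. A minimal Python sketch, assuming the text was pasted through a mail client that added ">" markers and wrapped long lines (the names extract_tracebacks, QUOTE_PREFIX and TRACEBACK are made up for this example):

```python
import re

# Leading mail-quote markers such as ">>> " (an assumption about how the
# crashinfo was pasted; raw files taken from the box have none).
QUOTE_PREFIX = re.compile(r"^(?:>[ \t]?)+", re.MULTILINE)

# "-Traceback=" followed by 8-digit hex words, possibly wrapped onto the
# next line by the mail client.
TRACEBACK = re.compile(r"-Traceback=((?:\s+[0-9A-Fa-f]{8}\b)+)")


def extract_tracebacks(text):
    """Return one list of address strings per -Traceback= entry found."""
    unquoted = QUOTE_PREFIX.sub("", text)
    return [m.group(1).split() for m in TRACEBACK.finditer(unquoted)]


sample = """\
>>> -Traceback= 41183348 41180F04 40DADF40 40FFA1CC 40FFA4D8 40752F58
>>> 40752F44
>>> $0 : 00000000, AT : 1E020000, v0 : 43720000, v1 : 00000043
"""

for frames in extract_tracebacks(sample):
    print(len(frames), "frames, top PC =", frames[0])
```

The first address is the same as the PC reported on the "Breakpoint exception" line. One limitation: a wrapped traceback followed by a line that happens to start with an 8-digit hex word would be absorbed into the list.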
>>
>>
>> Krunal
>>
>>
>> On Mon, Jul 5, 2010 at 5:36 AM, Youssef Bengelloun-Zahr <youssef at 720.fr> wrote:
>>
>>> Hello,
>>>
>>> I have a c6509 with redundant SUP720-3BXL
>>> (s72033-advipservicesk9_wan-mz.122-33.SXH2a.bin) that rebooted on its own
>>> this morning. FYI, the same router also rebooted unexpectedly 3 weeks ago!
>>>
>>> Here is a truncated output of the crashinfo file:
>>>
>>> Jun 11 06:48:29.377: %PFREDUN-SP-6-ACTIVE: Standby initializing for SSO
>>> mode
>>> Jun 11 06:48:29.377: %SYS-SP-3-LOGGER_FLUSHING: System pausing to ensure
>>> console debugging output.
>>> Jun 11 06:48:29.377: %PFREDUN-SP-6-ACTIVE: Standby initializing for SSO
>>> mode
>>> Jun 11 06:48:29.569: %SYS-SP-3-LOGGER_FLUSHED: System was paused for
>>> 00:00:00 to ensure console debugging output.
>>> Jun 11 06:48:41.952: %PFINIT-SP-5-CONFIG_SYNC: Sync'ing the startup
>>> configuration to the standby Router.
>>> Jun 11 06:49:39.434: %FABRIC-SP-5-CLEAR_BLOCK: Clear block option is off
>>> for
>>> the fabric in slot 5.
>>> Jun 11 06:49:39.530: %FABRIC-SP-5-FABRIC_MODULE_BACKUP: The Switch Fabric
>>> Module in slot 5 became standby
>>> Jun 11 06:49:42.850: %DIAG-SP-6-RUN_COMPLETE: Module 5: Running Complete
>>> Diagnostics...
>>> Jun 11 06:49:44.819: %DIAG-SP-6-DIAG_OK: Module 5: Passed Online
>>> Diagnostics
>>> Jun 11 06:49:48.673: %OIR-SP-6-INSCARD: Card inserted in slot 5,
>>> interfaces
>>> are now online
>>> Jun 11 09:53:37.178: %PFINIT-SP-5-CONFIG_SYNC: Sync'ing the startup
>>> configuration to the standby Router.
>>> Jun 11 13:02:59.715: %PFINIT-SP-5-CONFIG_SYNC: Sync'ing the startup
>>> configuration to the standby Router.
>>> Jun 11 13:04:16.254: %PFINIT-SP-5-CONFIG_SYNC: Sync'ing the startup
>>> configuration to the standby Router.
>>> Jun 14 09:00:28.800: %PFINIT-SP-5-CONFIG_SYNC: Sync'ing the startup
>>> configuration to the standby Router.
>>> Jun 14 09:05:08.864: %PFINIT-SP-5-CONFIG_SYNC: Sync'ing the startup
>>> configuration to the standby Router.
>>> Jun 17 08:35:59.058: %PFINIT-SP-5-CONFIG_SYNC: Sync'ing the startup
>>> configuration to the standby Router.
>>> Jun 17 08:39:58.941: %PFINIT-SP-5-CONFIG_SYNC: Sync'ing the startup
>>> configuration to the standby Router.
>>> CMD: 'sh mls cef summary ' 11:31:24 UTC Thu Jun 17 2010
>>> CMD: 'exit' 11:31:25 UTC Thu Jun 17 2010
>>> CMD: 'sh mls cef statistics ' 11:32:01 UTC Thu Jun 17 2010
>>> CMD: 'sh mls cef maximum-routes ' 11:32:21 UTC Thu Jun 17 2010
>>> CMD: 'sh mls cef rpf ' 11:33:07 UTC Thu Jun 17 2010
>>> CMD: 'show mls acl inconsistency' 12:18:44 UTC Thu Jun 17 2010
>>> Jun 21 08:14:58.161: %PFINIT-SP-5-CONFIG_SYNC: Sync'ing the startup
>>> configuration to the standby Router.
>>> Jun 22 08:15:53.784: %PFINIT-SP-5-CONFIG_SYNC: Sync'ing the startup
>>> configuration to the standby Router.
>>> Jun 22 11:56:07.044: %PFINIT-SP-5-CONFIG_SYNC: Sync'ing the startup
>>> configuration to the standby Router.
>>> Jun 22 11:58:40.637: %PFINIT-SP-5-CONFIG_SYNC: Sync'ing the startup
>>> configuration to the standby Router.
>>> Jun 23 11:01:20.484: %PFINIT-SP-5-CONFIG_SYNC: Sync'ing the startup
>>> configuration to the standby Router.
>>> Jun 23 12:31:21.556: %PFINIT-SP-5-CONFIG_SYNC: Sync'ing the startup
>>> configuration to the standby Router.
>>> CMD: 'sh mls cef ' 21:30:10 UTC Sun Jun 27 2010
>>> CMD: 'sh mls cef tcam hit ' 21:31:52 UTC Sun Jun 27 2010
>>> Jun 29 11:51:04.876: %PFINIT-SP-5-CONFIG_SYNC: Sync'ing the startup
>>> configuration to the standby Router.
>>>
>>> %Software-forced reload
>>>
>>>
>>>  06:23:49 UTC Mon Jul 5 2010: Breakpoint exception, CPU signal 23, PC =
>>> 0x41183348
>>>
>>>
>>>
>>> --------------------------------------------------------------------
>>>   Possible software fault. Upon reccurence, please collect
>>>   crashinfo, "show tech" and contact Cisco Technical Support.
>>> --------------------------------------------------------------------
>>>
>>>
>>> -Traceback= 41183348 41180F04 40DADF40 40FFA1CC 40FFA4D8 40752F58
>>> 40752F44
>>> $0 : 00000000, AT : 1E020000, v0 : 43720000, v1 : 00000043
>>> a0 : 447135B0, a1 : 00000043, a2 : 00000009, a3 : 00000000
>>> t0 : 44C7494C, t1 : 44C74948, t2 : 44C74944, t3 : 44C74940
>>> t4 : 44C7493C, t5 : 44C74938, t6 : 44C74934, t7 : 44C74930
>>> s0 : 00000000, s1 : 41DF0000, s2 : 08FA84B0, s3 : 44C74AC0
>>> s4 : 44C74AB8, s5 : 00000000, s6 : 00000000, s7 : 00000000
>>> t8 : 44C7499C, t9 : 00000000, k0 : 470E1200, k1 : 40798CE0
>>> gp : 41E591E0, sp : 44C74A20, s8 : 00000000, ra : 41180F04
>>> EPC  : 41183348, ErrorEPC : 40947F88, SREG     : 3400FF03
>>> MDLO : 333333E8, MDHI     : 000002D3, BadVaddr : 00000000
>>> DATA_START : 0x41C420A0
>>> Cause 00000024 (Code 0x9): Breakpoint exception
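
The "Code 0x9" on that line can be cross-checked by hand. On MIPS the exception code sits in bits 6:2 of the Cause register and the pending-interrupt flags (IP0..IP7) in bits 15:8. A small Python sketch, assuming that standard MIPS layout applies to these CPUs (decode_cause and the EXC_CODES table are names invented here, and the table is only a subset):

```python
# Exception codes from the MIPS Cause register, ExcCode field (bits 6:2).
# Only a subset is listed here.
EXC_CODES = {
    0: "Int (interrupt)",
    4: "AdEL (address error, load/fetch)",
    5: "AdES (address error, store)",
    6: "IBE (bus error, instruction fetch)",
    7: "DBE (bus error, data access)",
    8: "Sys (syscall)",
    9: "Bp (breakpoint)",
    10: "RI (reserved instruction)",
}


def decode_cause(cause):
    """Split a 32-bit Cause value into ExcCode and pending-interrupt bits."""
    exc_code = (cause >> 2) & 0x1F
    pending = [n for n in range(8) if (cause >> (8 + n)) & 1]  # IP0..IP7
    return exc_code, EXC_CODES.get(exc_code, "not in this table"), pending


# Values taken from the two crashinfo dumps in this thread.
for raw in (0x00000024, 0x00000000, 0x00000400):
    code, name, pending = decode_cause(raw)
    print(f"Cause {raw:08X}: code 0x{code:X} = {name}, pending IP bits {pending}")
```

0x24 decodes to code 9 (breakpoint), matching the line above. The CP0_CAUSE value 0x00000400 from the cache error dump in the follow-up decodes to code 0 with IP2 pending, which agrees with the "Interrupt exception" reported there.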
>>>
>>>
>>> ========= Start of Crashinfo Collection (06:23:49 UTC Mon Jul 5 2010)
>>> ==========
>>> For image:
>>> Cisco IOS Software, s72033_sp Software (s72033_sp-ADVIPSERVICESK9_WAN-M),
>>> Version 12.2(33)SXH2a, RELEASE SOFTWARE (fc2)
>>> Technical Support: http://www.cisco.com/techsupport
>>> Copyright (c) 1986-2008 by Cisco Systems, Inc.
>>> Compiled Fri 25-Apr-08 08:20 by prod_rel_team
>>>
>>>
>>> ========= Show Alignment
>>> =======================================================
>>>
>>>
>>> No alignment data has been recorded.
>>>
>>> No spurious memory references have been recorded.
>>>
>>>
>>> ========= Additional Subsystem Crashinfo
>>> =======================================
>>>
>>> --------- show redundancy --------
>>>
>>> Switchovers this system has experienced          : 1
>>> Last switchover reason                           : Active crashed.
>>> Uptime since this supervisor switched to active  : 3 weeks, 2 days, 23
>>> hours, 39 minutes
>>> Total system uptime from reload                  : 13 weeks, 2 days, 19
>>> hours, 5 minutes
>>>
>>> Standby is ready to take over
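
The uptime figure can be turned back into a date to check that it lines up with the previous crash. A quick Python sketch, assuming the crashinfo collection time (06:23:49 UTC Mon Jul 5 2010) is the reference point; parse_uptime is a helper written for this example only:

```python
import re
from datetime import datetime, timedelta

# Matches pieces such as "3 weeks" or "1 minute".
UNIT = re.compile(r"(\d+)\s+(week|day|hour|minute)s?")


def parse_uptime(text):
    """Turn '3 weeks, 2 days, 23 hours, 39 minutes' into a timedelta."""
    parts = {unit + "s": int(n) for n, unit in UNIT.findall(text)}
    return timedelta(**parts)


crash_time = datetime(2010, 7, 5, 6, 23, 49)  # crashinfo collection start, UTC
active_for = parse_uptime("3 weeks, 2 days, 23 hours, 39 minutes")

became_active = crash_time - active_for
print(became_active)
```

This gives 06:44:49 on Jun 11 2010, a few minutes before the first Jun 11 06:48 log lines, so the "Active crashed" switchover counted here is the reboot from three weeks ago.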
>>>
>>>
>>> ========= Data Inconsistency Errors =========
>>>
>>> No data inconsistency errors have been recorded.
>>>
>>>
>>> --------- show eobc --------
>>>
>>> Interface information:
>>>    Interface EOBC0/0 (idb = 0x44A888B8)
>>>    Hardware is Mistral EOBC (revision 5)
>>>    Address is 0000.0600.0000 (bia 0000.0600.0000)
>>>    Encap size         = 14         hardware status  = 0x210840
>>>    IDB type           = 18         IDB state        = 4
>>>    Encap type         = 0x1        Span encap size  = 0
>>>    Error threshold    = 5000       Error count      = 0
>>>
>>> Counters:
>>>    rxring             = 0x921DD00  rx ring entries       = 512
>>>    rx_head            = 139        rx_tail               = 0
>>>    inputs             = 150953935  rx_cumbytes           = 14190294763
>>>    hw inputs          = 0          hw rx_cumbytes        = 0
>>>    rx rate (bits/sec) = 41000      rx rate (packets/sec) = 53
>>>    rx_buf_unavail     = 0          *rx input drops        = 4397*
>>>    input broadcast    = 150        input resource        = 6815119
>>>    input error        = 0          input giants          = 0
>>>    *input crc          = 4397*       rx illegal length     = 0
>>>    rxr eobc shadow    = 0x50C438F0 txr eobc shadow       = 0x44B94BCC
>>>
>>>    txring             = 0x921FD40  tx ring entries       = 0x200
>>>    tx_head            = 297        tx_tail               = 297
>>>    outputs            = 156727081  tx_cumbytes           = 26897233358
>>>    hw outputs         = 0          hw tx_cumbytes        = 0
>>>    tx rate (bits/sec) = 84000      tx rate (packets/sec) = 56
>>>    tx_retry_error     = 2          tx_retry_count        = 276218
>>>    tx_process_stopped = 0          tx total drops        = 0
>>>
>>> Mistral Registers
>>>    soft_reset_cfg     = 0x000000   dma_buffer_size_reg   = 0x000000
>>>    int_mask_hi        = 0x000076   int_mask_lo           = 0x7001AD8
>>>    rxdscp_cnt         = 425        txdscp_cnt            = 0
>>>    rxwork_dscp        = 0xEB20     txwork_dscp           = 0x688
>>>    mistral_eobc_ds    = 0x44A897C4 mistral_dma_register  = 0x30000000
>>>    mistral_glbl_reg   = 0x10020000
>>>
>>> Misc. Global Registers:
>>>    global_cfg         = 0x20       mis_init_sts          = 0xF
>>>    dimm_parm_cfg_hi   = 0x000003F6 dimm_parm_cfg_lo      = 0x42040F5A
>>>    tm_init_size_cfg   = 0x8000
>>>
>>>
>>> Here is the output of show version:
>>>
>>> Cisco IOS Software, s72033_rp Software (s72033_rp-ADVIPSERVICESK9_WAN-M),
>>> Version 12.2(33)SXH2a, RELEASE SOFTWARE (fc2)
>>> Technical Support: http://www.cisco.com/techsupport
>>> Copyright (c) 1986-2008 by Cisco Systems, Inc.
>>> Compiled Fri 25-Apr-08 08:07 by prod_rel_team
>>>
>>> ROM: System Bootstrap, Version 12.2(17r)SX5, RELEASE SOFTWARE (fc1)
>>>
>>>  BB1.IX1 uptime is 13 weeks, 2 days, 22 hours, 1 minute
>>> Uptime for this control processor is 3 weeks, 3 days, 2 hours, 26 minutes
>>> Time since BB1.IX1 switched to active is 2 hours, 49 minutes
>>> *System returned to ROM by Stateful Switchover at 07:42:25 UTC Wed May 20
>>> 2009 (SP by reload)
>>> *System restarted at 06:48:20 UTC Fri Jun 11 2010
>>> System image file is
>>> "bootdisk:s72033-advipservicesk9_wan-mz.122-33.SXH2a.bin"
>>>
>>>
>>> This product contains cryptographic features and is subject to United
>>> States and local country laws governing import, export, transfer and
>>> use. Delivery of Cisco cryptographic products does not imply
>>> third-party authority to import, export, distribute or use encryption.
>>> Importers, exporters, distributors and users are responsible for
>>> compliance with U.S. and local country laws. By using this product you
>>> agree to comply with applicable laws and regulations. If you are unable
>>> to comply with U.S. and local laws, return this product immediately.
>>>
>>> A summary of U.S. laws governing Cisco cryptographic products may be
>>> found
>>> at:
>>> http://www.cisco.com/wwl/export/crypto/tool/stqrg.html
>>>
>>> If you require further assistance please contact us by sending email to
>>> export at cisco.com.
>>>
>>> cisco WS-C6509 (R7000) processor (revision 2.0) with 983008K/65536K bytes
>>> of
>>> memory.
>>> Processor board ID SCA043001KB
>>> SR71000 CPU at 600Mhz, Implementation 0x504, Rev 1.2, 512KB L2 Cache
>>> Last reset from s/w reset
>>> 29 Virtual Ethernet interfaces
>>> 96 FastEthernet interfaces
>>> 124 Gigabit Ethernet interfaces
>>> 1917K bytes of non-volatile configuration memory.
>>> 8192K bytes of packet buffer memory.
>>>
>>> 65536K bytes of Flash internal SIMM (Sector size 512K).
>>> Configuration register is 0x2102
>>>
>>>
>>> I have noticed some EOBC input drops due to CRC errors. Could this be due to a
>>> chassis fault? It's been running fine for more than two years now.
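
To put the CRC count in proportion, it helps to set it against the total EOBC input count from the same output. A rough Python sketch, assuming the counters are pasted as "name = value" pairs, two per line, as in the show eobc section (parse_counters and PAIR are names made up for this example):

```python
import re

# "name = value" pairs; two of them can share one line in show eobc output.
PAIR = re.compile(r"([A-Za-z][A-Za-z_ ()/]*?)\s*=\s*(0x[0-9A-Fa-f]+|\d+)")


def to_int(value):
    """Convert a decimal or 0x-prefixed hex string to an int."""
    return int(value, 16) if value.startswith("0x") else int(value)


def parse_counters(text):
    """Collect counters from pasted 'show eobc' text into a dict of ints."""
    # Drop the *...* emphasis markers added by hand in the mail.
    cleaned = text.replace("*", "")
    return {name.strip(): to_int(value) for name, value in PAIR.findall(cleaned)}


sample = """\
   inputs             = 150953935  rx_cumbytes           = 14190294763
   rx_buf_unavail     = 0          *rx input drops        = 4397*
   *input crc          = 4397*       rx illegal length     = 0
"""

counters = parse_counters(sample)
crc_ratio = counters["input crc"] / counters["inputs"]
print(f"{counters['input crc']} CRC errors in {counters['inputs']} inputs "
      f"({crc_ratio:.6%})")
```

That works out to about 0.003% of EOBC inputs. The ratio alone does not say whether the chassis, a supervisor or a line card is at fault; it is only a number to hand to TAC along with the crashinfo.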
>>>
>>> I am running the IOS bug toolkit looking for a possible match with my
>>> case.
>>>
>>> Thanks.
>>>
>>> Cheers.
>>>
>>> Y.
>>>
>>> --
>>> Youssef BENGELLOUN-ZAHR ………………………………………………
>>> Ingénieur Réseaux et Télécoms
>>>
>>>
>>> Technopole de l'Aube  en Champagne - BP 601 - 10901 TROYES  Cedex 9
>>> Agence Paris : 6, rue Charles Floquet - 92120 MONTROUGE
>>> Tel                 +33 (0) 825 000 720
>>> Tel. direct      +33 (0) 1 77 35 59 14
>>> Tel. portable  +33 (0) 6 22 42 63 80
>>> Email            ybz at 720.fr
>>> ……………………………………………………………………………….....www.720.fr
>>> _______________________________________________
>>> cisco-nsp mailing list  cisco-nsp at puck.nether.net
>>> https://puck.nether.net/mailman/listinfo/cisco-nsp
>>> archive at http://puck.nether.net/pipermail/cisco-nsp/
>>>
>>
>>
>
>
>
>



