[c-nsp] Cisco 6509 reboots on its own... again...
Youssef Bengelloun-Zahr
youssef at 720.fr
Tue Jul 6 09:03:28 EDT 2010
Looks like some of you folks were already hit by this :
http://www.gossamer-threads.com/lists/cisco/nsp/108155
Hum....
Y.
2010/7/6 Youssef Bengelloun-Zahr <youssef at 720.fr>
> Hello,
>
> Small update on this one: I got the crashfile info and found this:
>
>
> Cache error detected!
> CPO_ECC (reg 26/0): 0x000000BE
> CPO_CACHERI (reg 27/0): 0xA0000000
> CP0_CAUSE (reg 13/0): 0x00000400
>
> Real cache error detected. System will be halted.
>
> Error: Primary data cache, fields: data,
> Actual physical addr 0x00000000,
> virtual address is imprecise.
>
> Imprecise Data Parity Error
>
> Imprecise Data Parity Error
>
> 06:23:38 UTC Mon Jul 5 2010: Interrupt exception, CPU signal 20, PC = 0x40A94B8C
>
>
>
>
> --------------------------------------------------------------------
> Possible software fault. Upon reccurence, please collect
> crashinfo, "show tech" and contact Cisco Technical Support.
> --------------------------------------------------------------------
>
>
> -Traceback= 425F30C8
> $0 : 00000000, AT : 43B40000, v0 : 00000000, v1 : 5503EB98
> a0 : 58CDD7E0, a1 : 502B6C64, a2 : C133A166, a3 : 00000000
> t0 : 502B6C50, t1 : 502B6C24, t2 : 34008100, t3 : FFFF00FF
> t4 : 425EB928, t5 : 00000000, t6 : 00000000, t7 : 47969878
> s0 : FFFE0000, s1 : C1320000, s2 : 511DD9E0, s3 : C133A166
> s4 : FFFFFF00, s5 : 00000008, s6 : 00000011, s7 : 511D999C
> t8 : 502B6C3C, t9 : 00000000, k0 : 5536CD00, k1 : 411410B0
> gp : 43B42D30, sp : 502B6BB0, s8 : 00000000, ra : 40A94B64
> EPC : 425F30C8, ErrorEPC : 40A94B8C, SREG : 3400FF05
> MDLO : 00010C20, MDHI : 00000000, BadVaddr : 00000000
> DATA_START : 0x43788D30
> Cause 00000000 (Code 0x0): Interrupt exception
>
>
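For what it is worth, the two exception records quoted in this thread can be lined up by timestamp. The sketch below is only an illustration: the helper names (`unquote`, `exceptions`) are made up for this example, the sample text is copied from the two dumps in this thread, and it assumes both timestamps come from comparable clocks. On that assumption, the cache-error interrupt (signal 20) was logged 11 seconds before the breakpoint exception (signal 23).

```python
import re
from datetime import datetime

# One exception header as printed in the crashinfo, e.g.
# "06:23:38 UTC Mon Jul 5 2010: Interrupt exception, CPU signal 20, PC = 0x40A94B8C"
EXC_RE = re.compile(
    r"(\d{2}:\d{2}:\d{2}) UTC (\w{3} \w{3} \d{1,2} \d{4}): "
    r"(.+?), CPU signal (\d+), PC = (0x[0-9A-Fa-f]+)"
)

def unquote(text):
    """Strip leading '>' quote markers and re-join mail-wrapped lines."""
    lines = (re.sub(r"^[>\s]+", "", line) for line in text.splitlines())
    return " ".join(line for line in lines if line)

def exceptions(text):
    """Return the exception records found in text, oldest first."""
    found = []
    for clock, date, kind, signal, pc in EXC_RE.findall(unquote(text)):
        when = datetime.strptime(f"{clock} {date}", "%H:%M:%S %a %b %d %Y")
        found.append({"when": when, "kind": kind,
                      "signal": int(signal), "pc": int(pc, 16)})
    return sorted(found, key=lambda e: e["when"])

# Sample copied from the two crash dumps quoted in this thread.
SAMPLE = """\
>>> 06:23:49 UTC Mon Jul 5 2010: Breakpoint exception, CPU signal 23, PC =
>>> 0x41183348
> 06:23:38 UTC Mon Jul 5 2010: Interrupt exception, CPU signal 20, PC =
> 0x40A94B8C
"""

events = exceptions(SAMPLE)
gap = events[-1]["when"] - events[0]["when"]
for e in events:
    print(e["when"], e["kind"], "signal", e["signal"], hex(e["pc"]))
print("seconds between first and last exception:", gap.total_seconds())
```

The ordering alone does not prove that one event caused the other; it only gives a timeline to hand to TAC.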
> Any ideas ?
>
> Thanks.
>
> Y.
>
>
>
> 2010/7/6 krunal shah <krun.shah at gmail.com>
>
>> There should be two crashinfo files, one for the SP and one for the RP.
>> Collect both, along with show tech-support, before you contact tech support.
>>
>> TAC usually has decoders from their developers to decode the hex values in
>> the traceback.
>>
>>
>> -Traceback= 41183348 41180F04 40DADF40 40FFA1CC 40FFA4D8 40752F58 40752F44
>>
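Decoding the addresses into function names needs Cisco's symbol data, so that part stays with TAC. What can be done locally is to pull the addresses out of the pasted crashinfo so they can be compared across crashes or pasted into a bug search. A minimal sketch, with hypothetical helper names (`unquote`, `traceback_addrs`) and a sample copied from the traceback quoted in this thread:

```python
import re

def unquote(text):
    """Strip leading '>' quote markers and re-join mail-wrapped lines."""
    lines = (re.sub(r"^[>\s]+", "", line) for line in text.splitlines())
    return " ".join(line for line in lines if line)

# "-Traceback=" followed by one or more 8-digit hex addresses.
TRACEBACK_RE = re.compile(r"-Traceback=((?:\s+[0-9A-Fa-f]{8}\b)+)")

def traceback_addrs(text):
    """Return the traceback addresses as integers (empty list if none)."""
    match = TRACEBACK_RE.search(unquote(text))
    return [int(tok, 16) for tok in match.group(1).split()] if match else []

# Sample copied from the crashinfo in this thread (wrapped by the mailer).
SAMPLE = """\
>>> -Traceback= 41183348 41180F04 40DADF40 40FFA1CC 40FFA4D8 40752F58
>>> 40752F44
>>> $0 : 00000000, AT : 1E020000, v0 : 43720000, v1 : 00000043
"""

addrs = traceback_addrs(SAMPLE)
print(" ".join(f"{a:08X}" for a in addrs))
```

In this dump the first address matches the reported PC (0x41183348) and the second matches the `ra` register (41180F04), which is a quick sanity check that the wrapped line was re-joined correctly.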
>>
>> Krunal
>>
>>
>> On Mon, Jul 5, 2010 at 5:36 AM, Youssef Bengelloun-Zahr <youssef at 720.fr> wrote:
>>
>>> Hello,
>>>
>>> I have a c6509 with redundant SUP720-3BXL
>>> (s72033-advipservicesk9_wan-mz.122-33.SXH2a.bin) that rebooted on its own
>>> this morning. FYI, the same router also rebooted unexpectedly 3 weeks ago!
>>>
>>> Here is a truncated output of the crashfile info:
>>>
>>> Jun 11 06:48:29.377: %PFREDUN-SP-6-ACTIVE: Standby initializing for SSO mode
>>> Jun 11 06:48:29.377: %SYS-SP-3-LOGGER_FLUSHING: System pausing to ensure console debugging output.
>>> Jun 11 06:48:29.377: %PFREDUN-SP-6-ACTIVE: Standby initializing for SSO mode
>>> Jun 11 06:48:29.569: %SYS-SP-3-LOGGER_FLUSHED: System was paused for 00:00:00 to ensure console debugging output.
>>> Jun 11 06:48:41.952: %PFINIT-SP-5-CONFIG_SYNC: Sync'ing the startup configuration to the standby Router.
>>> Jun 11 06:49:39.434: %FABRIC-SP-5-CLEAR_BLOCK: Clear block option is off for the fabric in slot 5.
>>> Jun 11 06:49:39.530: %FABRIC-SP-5-FABRIC_MODULE_BACKUP: The Switch Fabric Module in slot 5 became standby
>>> Jun 11 06:49:42.850: %DIAG-SP-6-RUN_COMPLETE: Module 5: Running Complete Diagnostics...
>>> Jun 11 06:49:44.819: %DIAG-SP-6-DIAG_OK: Module 5: Passed Online Diagnostics
>>> Jun 11 06:49:48.673: %OIR-SP-6-INSCARD: Card inserted in slot 5, interfaces are now online
>>> Jun 11 09:53:37.178: %PFINIT-SP-5-CONFIG_SYNC: Sync'ing the startup configuration to the standby Router.
>>> Jun 11 13:02:59.715: %PFINIT-SP-5-CONFIG_SYNC: Sync'ing the startup configuration to the standby Router.
>>> Jun 11 13:04:16.254: %PFINIT-SP-5-CONFIG_SYNC: Sync'ing the startup configuration to the standby Router.
>>> Jun 14 09:00:28.800: %PFINIT-SP-5-CONFIG_SYNC: Sync'ing the startup configuration to the standby Router.
>>> Jun 14 09:05:08.864: %PFINIT-SP-5-CONFIG_SYNC: Sync'ing the startup configuration to the standby Router.
>>> Jun 17 08:35:59.058: %PFINIT-SP-5-CONFIG_SYNC: Sync'ing the startup configuration to the standby Router.
>>> Jun 17 08:39:58.941: %PFINIT-SP-5-CONFIG_SYNC: Sync'ing the startup configuration to the standby Router.
>>> CMD: 'sh mls cef summary ' 11:31:24 UTC Thu Jun 17 2010
>>> CMD: 'exit' 11:31:25 UTC Thu Jun 17 2010
>>> CMD: 'sh mls cef statistics ' 11:32:01 UTC Thu Jun 17 2010
>>> CMD: 'sh mls cef maximum-routes ' 11:32:21 UTC Thu Jun 17 2010
>>> CMD: 'sh mls cef rpf ' 11:33:07 UTC Thu Jun 17 2010
>>> CMD: 'show mls acl inconsistency' 12:18:44 UTC Thu Jun 17 2010
>>> Jun 21 08:14:58.161: %PFINIT-SP-5-CONFIG_SYNC: Sync'ing the startup configuration to the standby Router.
>>> Jun 22 08:15:53.784: %PFINIT-SP-5-CONFIG_SYNC: Sync'ing the startup configuration to the standby Router.
>>> Jun 22 11:56:07.044: %PFINIT-SP-5-CONFIG_SYNC: Sync'ing the startup configuration to the standby Router.
>>> Jun 22 11:58:40.637: %PFINIT-SP-5-CONFIG_SYNC: Sync'ing the startup configuration to the standby Router.
>>> Jun 23 11:01:20.484: %PFINIT-SP-5-CONFIG_SYNC: Sync'ing the startup configuration to the standby Router.
>>> Jun 23 12:31:21.556: %PFINIT-SP-5-CONFIG_SYNC: Sync'ing the startup configuration to the standby Router.
>>> CMD: 'sh mls cef ' 21:30:10 UTC Sun Jun 27 2010
>>> CMD: 'sh mls cef tcam hit ' 21:31:52 UTC Sun Jun 27 2010
>>> Jun 29 11:51:04.876: %PFINIT-SP-5-CONFIG_SYNC: Sync'ing the startup configuration to the standby Router.
>>>
>>> %Software-forced reload
>>>
>>>
>>> 06:23:49 UTC Mon Jul 5 2010: Breakpoint exception, CPU signal 23, PC = 0x41183348
>>>
>>>
>>>
>>> --------------------------------------------------------------------
>>> Possible software fault. Upon reccurence, please collect
>>> crashinfo, "show tech" and contact Cisco Technical Support.
>>> --------------------------------------------------------------------
>>>
>>>
>>> -Traceback= 41183348 41180F04 40DADF40 40FFA1CC 40FFA4D8 40752F58 40752F44
>>> $0 : 00000000, AT : 1E020000, v0 : 43720000, v1 : 00000043
>>> a0 : 447135B0, a1 : 00000043, a2 : 00000009, a3 : 00000000
>>> t0 : 44C7494C, t1 : 44C74948, t2 : 44C74944, t3 : 44C74940
>>> t4 : 44C7493C, t5 : 44C74938, t6 : 44C74934, t7 : 44C74930
>>> s0 : 00000000, s1 : 41DF0000, s2 : 08FA84B0, s3 : 44C74AC0
>>> s4 : 44C74AB8, s5 : 00000000, s6 : 00000000, s7 : 00000000
>>> t8 : 44C7499C, t9 : 00000000, k0 : 470E1200, k1 : 40798CE0
>>> gp : 41E591E0, sp : 44C74A20, s8 : 00000000, ra : 41180F04
>>> EPC : 41183348, ErrorEPC : 40947F88, SREG : 3400FF03
>>> MDLO : 333333E8, MDHI : 000002D3, BadVaddr : 00000000
>>> DATA_START : 0x41C420A0
>>> Cause 00000024 (Code 0x9): Breakpoint exception
>>>
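The `Cause` values in these dumps can be read with the standard MIPS CP0 Cause register layout: the exception code sits in bits 6..2 and the interrupt-pending bits in bits 15..8. The sketch below assumes the SR71000 follows that standard layout, which is at least consistent with the dump printing "Code 0x9" next to 00000024. The function name `decode_cause` is made up for this example, and the mnemonic table covers only the common architectural codes.

```python
# Common MIPS exception-code mnemonics (Cause register, ExcCode field).
EXC_CODES = {
    0: "Int", 1: "Mod", 2: "TLBL", 3: "TLBS", 4: "AdEL", 5: "AdES",
    6: "IBE", 7: "DBE", 8: "Sys", 9: "Bp", 10: "RI", 11: "CpU",
    12: "Ov", 13: "Tr",
}

def decode_cause(cause):
    """Split a Cause value into (ExcCode, mnemonic, pending interrupt lines)."""
    code = (cause >> 2) & 0x1F                      # ExcCode: bits 6..2
    pending = [i for i in range(8) if (cause >> (8 + i)) & 1]  # IP0..IP7
    return code, EXC_CODES.get(code, "?"), pending

# Values copied from the dumps quoted in this thread.
for label, value in [("Cause (breakpoint dump)", 0x00000024),
                     ("Cause (cache error dump)", 0x00000000),
                     ("CP0_CAUSE (cache error dump)", 0x00000400)]:
    code, name, pending = decode_cause(value)
    print(f"{label}: 0x{value:08X} -> code {code} ({name}), pending IP {pending}")
```

Read this way, 0x24 is code 9 (breakpoint), and CP0_CAUSE 0x400 is code 0 (interrupt) with pending line IP2 set. Which hardware source is wired to that line on the supervisor is not something the dump tells us.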
>>>
>>> ========= Start of Crashinfo Collection (06:23:49 UTC Mon Jul 5 2010) ==========
>>> For image:
>>> Cisco IOS Software, s72033_sp Software (s72033_sp-ADVIPSERVICESK9_WAN-M),
>>> Version 12.2(33)SXH2a, RELEASE SOFTWARE (fc2)
>>> Technical Support: http://www.cisco.com/techsupport
>>> Copyright (c) 1986-2008 by Cisco Systems, Inc.
>>> Compiled Fri 25-Apr-08 08:20 by prod_rel_team
>>>
>>>
>>> ========= Show Alignment =======================================================
>>>
>>>
>>> No alignment data has been recorded.
>>>
>>> No spurious memory references have been recorded.
>>>
>>>
>>> ========= Additional Subsystem Crashinfo =======================================
>>>
>>> --------- show redundancy --------
>>>
>>> Switchovers this system has experienced : 1
>>> Last switchover reason : Active crashed.
>>> Uptime since this supervisor switched to active : 3 weeks, 2 days, 23 hours, 39 minutes
>>> Total system uptime from reload : 13 weeks, 2 days, 19 hours, 5 minutes
>>>
>>> Standby is ready to take over
>>>
>>>
>>> ========= Data Inconsistency Errors =========
>>>
>>> No data inconsistency errors have been recorded.
>>>
>>>
>>> --------- show eobc --------
>>>
>>> Interface information:
>>> Interface EOBC0/0 (idb = 0x44A888B8)
>>> Hardware is Mistral EOBC (revision 5)
>>> Address is 0000.0600.0000 (bia 0000.0600.0000)
>>> Encap size = 14 hardware status = 0x210840
>>> IDB type = 18 IDB state = 4
>>> Encap type = 0x1 Span encap size = 0
>>> Error threshold = 5000 Error count = 0
>>>
>>> Counters:
>>> rxring = 0x921DD00 rx ring entries = 512
>>> rx_head = 139 rx_tail = 0
>>> inputs = 150953935 rx_cumbytes = 14190294763
>>> hw inputs = 0 hw rx_cumbytes = 0
>>> rx rate (bits/sec) = 41000 rx rate (packets/sec) = 53
>>> rx_buf_unavail = 0 *rx input drops = 4397*
>>> input broadcast = 150 input resource = 6815119
>>> input error = 0 input giants = 0
>>> *input crc = 4397* rx illegal length = 0
>>> rxr eobc shadow = 0x50C438F0 txr eobc shadow = 0x44B94BCC
>>>
>>> txring = 0x921FD40 tx ring entries = 0x200
>>> tx_head = 297 tx_tail = 297
>>> outputs = 156727081 tx_cumbytes = 26897233358
>>> hw outputs = 0 hw tx_cumbytes = 0
>>> tx rate (bits/sec) = 84000 tx rate (packets/sec) = 56
>>> tx_retry_error = 2 tx_retry_count = 276218
>>> tx_process_stopped = 0 tx total drops = 0
>>>
>>> Mistral Registers
>>> soft_reset_cfg = 0x000000 dma_buffer_size_reg = 0x000000
>>> int_mask_hi = 0x000076 int_mask_lo = 0x7001AD8
>>> rxdscp_cnt = 425 txdscp_cnt = 0
>>> rxwork_dscp = 0xEB20 txwork_dscp = 0x688
>>> mistral_eobc_ds = 0x44A897C4 mistral_dma_register = 0x30000000
>>> mistral_glbl_reg = 0x10020000
>>>
>>> Misc. Global Registers:
>>> global_cfg = 0x20 mis_init_sts = 0xF
>>> dimm_parm_cfg_hi = 0x000003F6 dimm_parm_cfg_lo = 0x42040F5A
>>> tm_init_size_cfg = 0x8000
>>>
>>>
>>> Here is the output of a show version :
>>>
>>> Cisco IOS Software, s72033_rp Software (s72033_rp-ADVIPSERVICESK9_WAN-M),
>>> Version 12.2(33)SXH2a, RELEASE SOFTWARE (fc2)
>>> Technical Support: http://www.cisco.com/techsupport
>>> Copyright (c) 1986-2008 by Cisco Systems, Inc.
>>> Compiled Fri 25-Apr-08 08:07 by prod_rel_team
>>>
>>> ROM: System Bootstrap, Version 12.2(17r)SX5, RELEASE SOFTWARE (fc1)
>>>
>>> BB1.IX1 uptime is 13 weeks, 2 days, 22 hours, 1 minute
>>> Uptime for this control processor is 3 weeks, 3 days, 2 hours, 26 minutes
>>> Time since BB1.IX1 switched to active is 2 hours, 49 minutes
>>> *System returned to ROM by Stateful Switchover at 07:42:25 UTC Wed May 20 2009 (SP by reload)
>>> *System restarted at 06:48:20 UTC Fri Jun 11 2010
>>> System image file is "bootdisk:s72033-advipservicesk9_wan-mz.122-33.SXH2a.bin"
>>>
>>>
>>> This product contains cryptographic features and is subject to United
>>> States and local country laws governing import, export, transfer and
>>> use. Delivery of Cisco cryptographic products does not imply
>>> third-party authority to import, export, distribute or use encryption.
>>> Importers, exporters, distributors and users are responsible for
>>> compliance with U.S. and local country laws. By using this product you
>>> agree to comply with applicable laws and regulations. If you are unable
>>> to comply with U.S. and local laws, return this product immediately.
>>>
>>> A summary of U.S. laws governing Cisco cryptographic products may be found at:
>>> http://www.cisco.com/wwl/export/crypto/tool/stqrg.html
>>>
>>> If you require further assistance please contact us by sending email to
>>> export at cisco.com.
>>>
>>> cisco WS-C6509 (R7000) processor (revision 2.0) with 983008K/65536K bytes of memory.
>>> Processor board ID SCA043001KB
>>> SR71000 CPU at 600Mhz, Implementation 0x504, Rev 1.2, 512KB L2 Cache
>>> Last reset from s/w reset
>>> 29 Virtual Ethernet interfaces
>>> 96 FastEthernet interfaces
>>> 124 Gigabit Ethernet interfaces
>>> 1917K bytes of non-volatile configuration memory.
>>> 8192K bytes of packet buffer memory.
>>>
>>> 65536K bytes of Flash internal SIMM (Sector size 512K).
>>> Configuration register is 0x2102
>>>
>>>
>>> I have noticed some EOBC input drops due to CRC errors. Could this be due
>>> to a chassis fault? It's been running fine for more than two years now.
>>>
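To put the EOBC CRC counter in proportion, the counters pasted above can be parsed and the error ratio computed. This is a rough sketch only: `parse_counters` is a hypothetical helper, the sample lines are copied from the `show eobc` output in this thread, and the pattern only handles simple `name = value` pairs (it skips names containing parentheses such as the rate lines).

```python
import re

# Matches "name = value" pairs. Names may contain spaces ("input crc"),
# and the *emphasis* asterisks added in the post are tolerated.
PAIR_RE = re.compile(
    r"\*?([A-Za-z_][A-Za-z_ ]*?)\s*=\s*(0x[0-9A-Fa-f]+|\d+)\*?"
)

def parse_counters(text):
    """Return a dict of counter name -> integer value."""
    counters = {}
    for name, value in PAIR_RE.findall(text):
        counters[name] = int(value, 16) if value.startswith("0x") else int(value)
    return counters

# Sample copied from the EOBC0/0 counters quoted in this thread.
SAMPLE = """\
>>> inputs = 150953935 rx_cumbytes = 14190294763
>>> rx_buf_unavail = 0 *rx input drops = 4397*
>>> *input crc = 4397* rx illegal length = 0
"""

counters = parse_counters(SAMPLE)
crc_ratio = counters["input crc"] / counters["inputs"]
print(f"CRC errors: {counters['input crc']} of {counters['inputs']} "
      f"input frames ({crc_ratio:.4%})")
```

With these numbers every input drop is a CRC drop (4397 each), at roughly 0.003% of received frames. The ratio by itself does not say whether the errors are recent or related to the crash; comparing the counter again after some hours would show whether it is still increasing.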
>>> I am searching the IOS Bug Toolkit for a possible match with my case.
>>>
>>> Thanks.
>>>
>>> Cheers.
>>>
>>> Y.
>>>
>>> --
>>> Youssef BENGELLOUN-ZAHR ………………………………………………
>>> Ingénieur Réseaux et Télécoms
>>>
>>>
>>> Technopole de l'Aube en Champagne - BP 601 - 10901 TROYES Cedex 9
>>> Agence Paris : 6, rue Charles Floquet - 92120 MONTROUGE
>>> Tel +33 (0) 825 000 720
>>> Tel. direct +33 (0) 1 77 35 59 14
>>> Tel. portable +33 (0) 6 22 42 63 80
>>> Email ybz at 720.fr
>>> ……………………………………………………………………………….....www.720.fr
>>> _______________________________________________
>>> cisco-nsp mailing list cisco-nsp at puck.nether.net
>>> https://puck.nether.net/mailman/listinfo/cisco-nsp
>>> archive at http://puck.nether.net/pipermail/cisco-nsp/
>>>
>>
>>
>
>
>
>