[c-nsp] cat6k/sup720-3b/IOS 12.2(18)SXF8 crash

Mark Zipp mark.r.zipp at gmail.com
Sat Aug 4 20:10:57 EDT 2007


On 04/08/07, Dale Shaw <dale.shaw+cisco-nsp at gmail.com> wrote:
> Hi all,
>
> Had a cat6504 sup720-3b/IOS 12.2(18)SXF8 crash on me yesterday. It's
> one of four recently installed switches providing 10G connectivity in
> a MAN.
>
> The hardware config is:
>  slot1: SUP720-3B
>  slot2: WS-X6708-10G-3C
>  slot3: WS-X6148A-GE-TX (soon to be replaced with WS-X6724-SFP)
>
> IOS is 12.2(18)SXF8, IP Services
> (s72033-ipservicesk9_wan-mz.122-18.SXF8.bin)
>
> It apparently just fell over early yesterday morning. The initial
> suspect was a power failure, but it has redundant PSUs and its
> redundant buddy sitting right underneath it in the same rack (powered
> from the same PDUs) was a-OK.
>
> The on-site tech found it sitting in ROMMON mode and apparently just
> re-seated the line cards to get it to boot up again.
>

We had a similar issue with one of our 6504s about a year ago, with a
setup similar to yours: SUP720-3BXL, 12.2(18)SXF, 48-port GigE card.
I logged a TAC case about it, but unfortunately we never really got to
the bottom of the actual cause, and it hasn't happened again.

The likely cause seemed to be that the 6504, prior to us receiving it,
had been converted from CatOS/IOS to IOS/IOS, and that the procedure
hadn't been fully followed. That seemed to have left mismatched
configuration registers on the various components of the switch that
run IOS. When I looked at those values, using the "remote" command
(IIRC), one of them was still set to boot into ROMMON rather than
load IOS. I made sure they were all set to 0x2102, performed a couple
of test reboots to ensure it came up again, and that was it - it's
been in production and running ever since.
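
A minimal sketch of that check in Python, assuming the register values
have already been collected by hand from each component. The component
names and sample values are hypothetical, not taken from the switch
above. The bit meanings used (low four bits = boot field, 0x0100 =
break disabled, 0x0040 = ignore startup config) are the standard IOS
configuration register layout; other bits are deliberately ignored.

```python
EXPECTED = 0x2102  # value all components were set to in the fix described above


def decode_confreg(value):
    """Decode a few well-known bits of an IOS configuration register."""
    boot_field = value & 0x000F  # low four bits select the boot behaviour
    if boot_field == 0:
        boot = "rommon"       # stay at the ROM monitor
    elif boot_field == 1:
        boot = "first-image"  # boot the first image found in bootflash
    else:
        boot = "boot-system"  # follow the configured boot statements
    return {
        "boot": boot,
        "break_disabled": bool(value & 0x0100),
        "ignore_startup_config": bool(value & 0x0040),
    }


def mismatched(registers, expected=EXPECTED):
    """Return the components whose register differs from the expected value."""
    return {name: hex(v) for name, v in registers.items() if v != expected}


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    registers = {"RP": 0x2102, "SP": 0x0}
    for name, value in registers.items():
        print(name, hex(value), decode_confreg(value))
    print("mismatched:", mismatched(registers))
```

With these sample values only the SP entry is reported as mismatched.
A register of 0x0 decodes to "break enabled, boot to the ROM monitor",
which is consistent with the ROMMON "confreg" summary in the quoted
output below, although that summary alone does not prove the exact
register value.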


HTH,
Mark.

> I don't have easy physical access to the switch, but this is the
> information that's been supplied. I tried looking in Bug Navigator but
> it appears to be down at the moment - "Error occurred while fetching
> bug summary from database. Please try later."
>
> Does it ring any bells for anyone on the list?
>
> cheers,
> Dale
>
> 8<========8<========8<========8<========
> prevents returning to ROMMON when break is issued.
>
> *Aug  3 06:35:08: %C6KPWR-SP-4-PSOK: power supply 1 turned on.
> *Aug  3 06:35:08: %C6KPWR-SP-4-PSOK: power supply 2 turned on.
> *Aug  3 06:35:08: %C6KPWR-SP-4-PSREDUNDANTBOTHSUPPLY: in
> power-redundancy mode, system is operating on both power supplies.
> *Aug  3 06:35:11: %C6KENV-SP-4-FANHPMODE: Fan-tray 1 is operating in
> high power mode
>
> %Software-forced reload
>
>
> Breakpoint exception, CPU signal 23, PC = 0x41D61704
>
>
> -Traceback= 41D61704 41D5F690 41B12800 41B1282C 419E6D64 41A27BC4
> 41A1DF94 41A1DFEC 40753778 40754460 418F595C 418F5778 41AB4148
> 41AB106C 41AB1288 41D53DFC
> $0 : 00000000, AT : 430B0000, v0 : 44AA0000, v1 : 43600000
> a0 : 51E1AFB4, a1 : 0000F100, a2 : 00000000, a3 : 42DE0000
> t0 : 41D54418, t1 : 3400F101, t2 : 41D54428, t3 : FFFF00FF
> t4 : 41D54418, t5 : 00000000, t6 : 00000000, t7 : 00000000
> s0 : 00000000, s1 : 43060000, s2 : 5032FF10, s3 : 087406C8
> s4 : 44F7F770, s5 : 51CEE4D8, s6 : 00000040, s7 : 43840000
> t8 : 44ACBD3C, t9 : 00000009, k0 : 00000000, k1 : 00000000
> gp : 430B6A00, sp : 44ACBDE8, s8 : 00000000, ra : 41D5F690
> EPC  : 41D61704, ErrorEPC : 41AB6F88, SREG     : 3400F103
> MDLO : 00000000, MDHI     : 00000000, BadVaddr : 00000000
> Cause 00000024 (Code 0x9): Breakpoint exception
>
> Writing crashinfo to bootflash:crashinfo_20070803-063540
>
> === Flushing messages (16:35:40 AEST Fri Aug 3 2007) ===
>
> Buffered messages: (last 4096 bytes only)
> rrently running ROMMON from F1 region
> *Aug  3 16:35:03: %SYS-6-CLOCKUPDATE: System clock has been updated
> from 06:35:03 UTC Fri Aug 3 2007 to 16:35:03 AEST Fri Aug 3 2007,
> configured from console by console.
> *Aug  3 16:35:03: %SYS-6-CLOCKUPDATE: System clock has been updated
> from 16:35:03 AEST Fri Aug 3 2007 to 16:35:03 AEST Fri Aug 3 2007,
> configured from console by console.
> *Aug  3 16:35:03: %SPANTREE-5-EXTENDED_SYSID: Extended SysId enabled
> for type vlan
> *Aug  3 16:35:03: %LINK-3-UPDOWN: Interface TenGigabitEthernet2/8,
> changed state to down
> *Aug  3 16:35:03: %LINEPROTO-5-UPDOWN: Line protocol on Interface
> TenGigabitEthernet2/8, changed state to down
> *Aug  3 16:35:03: %LINK-5-CHANGED: Interface GigabitEthernet3/1,
> changed state to administratively down
> *Aug  3 16:35:03: %LINEPROTO-5-UPDOWN: Line protocol on Interface
> GigabitEthernet3/1, changed state to down
> *Aug  3 16:35:03: %LINK-3-UPDOWN: Interface GigabitEthernet3/2,
> changed state to down
> *Aug  3 16:35:03: %LINEPROTO-5-UPDOWN: Line protocol on Interface
> GigabitEthernet3/2, changed state to down
> *Aug  3 16:35:03: %LINK-3-UPDOWN: Interface GigabitEthernet3/13,
> changed state to down
> *Aug  3 16:35:03: %LINEPROTO-5-UPDOWN: Line protocol on Interface
> GigabitEthernet3/13, changed state to down
> *Aug  3 16:35:05: %SYS-5-CONFIG_I: Configured from memory by console
> *Aug  3 16:35:05: %LINK-3-UPDOWN: Interface Vlan10, changed state to down
> *Aug  3 16:35:05: %LINK-3-UPDOWN: Interface Vlan3000, changed state to down
> *Aug  3 16:35:06: %LINK-3-UPDOWN: Interface Vlan3032, changed state to down
> *Aug  3 16:35:06: %LINK-3-UPDOWN: Interface Vlan3040, changed state to down
> *Aug  3 16:35:06: %LINK-3-UPDOWN: Interface Vlan3062, changed state to down
> *Aug  3 16:35:07: %SYS-5-RESTART: System restarted --
> Cisco Internetwork Operating System Software
> IOS (tm) s72033_rp Software (s72033_rp-ADVIPSERVICESK9_WAN-M), Version
> 12.2(18)SXF8, RELEASE SOFTWARE (fc2)
> Technical Support 3 16:35:42: %SYS-SP-3-LOGGER_FLUSHING: System
> pausing to ensure console debugging output.
>
> *Aug  3 16:35:42: %SYS-SP-3-LOGGER_FLUSHED: System was paused for
> 00:00:00 to ensure console debugging output.
>
> *Aug  3 16:35:42: %SYS-SP-2-INTSCHED: 'sleep for' at level 7
> -Process= "SCP Online Process", ipl= 7, pid= 250
> -Traceback= 402D65C0 402BEDAC 4044A25C 407F7E84 407F7E3C 40506D24
> 40509130 402BB3F0 402DD754 402DCBA4 402D0368 402CE0E4
> *Aug  3 16:35:42: %SYS-SP-2-INTSCHED: 'sleep for' at level 7
> -Process= "SCP Online Process", ipl= 7, pid= 250
> -Traceback= 402D65C0 402BEDAC 4044A2B8 407F7E84 407F7E3C 40506D24
> 40509130 402BB3F0 402DD754 402DCBA4 402D0368 402CE0E4
> *Aug  3 16:35:42: %OIR-SP-6-CONSOLE: Changing console ownership to
> switch processor
>
>
>
> *** System received a Software forced crash ***
> signal= 0x17, code= 0x24, context= 0x42321d34
> PC = 0x402cfaac, Cause = 0x1020, Status Reg = 0x34008002
> rommon 2 > set
> PS1=rommon ! >
> LOG_PREFIX_VERSION=1
> SLOTCACHE=cards;
> ?=0
> BOOT=bootdisk:s72033-ipservicesk9_wan-mz.122-18.SXF8.bin,12;
> ACL_DENY=0
> PF_REDUN_CRASH_COUNT=2
> BSI=0
> CRASHINFO=bootflash:crashinfo_20070803-063540
> RET_2_RTS=16:35:42 AEST Fri Aug 3 2007
> RET_2_RCALTS=1186122943
> rommon 3 > confreg
>
>
>     Configuration Summary
> enabled are:
> break/abort has effect
> console baud: 9600
> boot: the ROM Monitor
>
> do you wish to change the configuration? y/n  [n]:  n
>
>
> rommon 4 >
> 8<========8<========8<========8<========
> _______________________________________________
> cisco-nsp mailing list  cisco-nsp at puck.nether.net
> https://puck.nether.net/mailman/listinfo/cisco-nsp
> archive at http://puck.nether.net/pipermail/cisco-nsp/
>

