[c-nsp] 6500 hybrid - native conversion crashes

Church, Charles cchurch at multimax.com
Sat Apr 14 10:39:02 EDT 2007


Anyone,
 
    I'm having a lot of trouble getting some sup2/msfc2 6500s converted
from hybrid to native.  It's in a lab environment, but I'm trying to get
the procedure right, as we've got a project to do this on 200+ devices
in the field soon.  Not feeling at all comfortable.  I'm getting the same
kind of crash on three different 6500s, all of which have been running
hybrid CatOS 7.6/IOS 12.1E for a long time with no issues.  So I don't
think it's bad hardware; all three being bad would be highly unlikely.
I'm following the documented Cisco procedure to a 'T', but seeing this:
 
RP - set config register to 0x0 in IOS, exit back to SP
SP - set the config register to 0x0, reset it
SP - Blank-out the CONFIG_FILE variable, sync, reset
SP - 'boot bootflash:c6sup22-ps-mz.121-27b.E1.bin'
IOS boots on the SP, and hands off control to the RP ROMMON.  From here,
if I just let it sit there for a few minutes, it will sometimes crash.
If I attempt to boot the RP using the same file,
'boot sup-bootflash:c6sup22-ps-mz.121-27b.E1.bin', it also crashes.
Typical crashes look like this:
 
************************************************************************
00:00:03: %PFREDUN-6-ACTIVE: Initializing as ACTIVE processor
00:00:04: %OIR-SP-6-CONSOLE: Changing console ownership to route processor
 
System Bootstrap, Version 12.1(11r)E1, RELEASE SOFTWARE (fc1)
TAC Support: http://www.cisco.com/tac
Copyright (c) 2002 by cisco Systems, Inc.
Cat6k-MSFC2 platform with 524288 Kbytes of main memory
 
rommon 1 > set
PS1=rommon ! >
EARL_VER=6
RET_2_RUTC=1118784646
SLOTCACHE=cards 2=42(6E at 3,-);
BSI=0
BOOT=
BOOTLDR=
CRASHINFO=bootflash:crashinfo_20070412-182358
RET_2_RTS=18:23:58 utc Thu Apr 12 2007
RET_2_RCALTS=1176402241
?=0
rommon 2 >
rommon 2 > boot sup-bootflash:c6sup22-ps-mz.121-27b.E1.bin
 
Self decompressing the image :
#################################################
######################### [OK]
 
RP: Currently running ROMMON from S (Gold) region
Attempt to download 'sup-bootflash:c6sup22-ps-mz.121-27b.E1.bin' ... okay
S
 
 TLB Modification exception, CPU signal 10, PC = 0x40E9F1DC
 

-Traceback= 40E9F1DC 4032CA60 4032CCA0 40263EC4 40263EB0
$0 : 00000000, AT : 419B0000, v0 : 00007F45, v1 : 000002CC
a0 : 40A00000, a1 : 080E4B7C, a2 : 0000059A, a3 : 40A00000
t0 : 00000028, t1 : 3401FD01, t2 : 34018100, t3 : FFFF00FF
t4 : 402975D8, t5 : 004C0000, t6 : 004A0000, t7 : 21250000
s0 : 080E4B70, s1 : 080E4B60, s2 : 000000FF, s3 : 00000004
s4 : 00000002, s5 : 00000003, s6 : 00000151, s7 : 00000000
t8 : 000095D6, t9 : 00000000, k0 : 3041C001, k1 : 30410000
gp : 419B5320, sp : 42974D50, s8 : 00000000, ra : 4032CA60
EPC  : 40E9F1DC, ErrorEPC : BFC27CD4, SREG     : 3401FD03
MDLO : 73C8870F, MDHI     : F9E09330, BadVaddr : 40A00000
CacheErr : F0800000, DErrAddr0 : 0C0E4B7A, DErrAddr1 : 01D238F8
Cause 00000004 (Code 0x1): TLB Modification exception
 
Writing crashinfo to bootflash:crashinfo_20070412-184353
 
=== Flushing messages (18:43:53 utc Thu Apr 12 2007) ===
 
*** System received a Bus Error exception ***
signal= 0xa, code= 0x41ba0000, context= 0x41c1a054
PC = 0x402982d8, Cause Reg = 0x2820, Status Reg = 0x34018002
rommon 3 >
00:03:05: %SYS-SP-3-LOGGER_FLUSHED: System was paused for 00:00:00 to ensure console debugging output.
 
00:03:06: %OIR-SP-6-CONSOLE: Changing console ownership to switch processor
 
 
*** System received a Software forced crash ***
signal= 0x17, code= 0x42230000, context= 0x42268424
PC = 0x40130b4c, Cause = 0x1820, Status Reg = 0x34018002
rommon 2 >

************************************************************************
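
For anyone reading the register dump above: the line "Cause 00000004
(Code 0x1)" is consistent with the standard MIPS Cause register layout,
where the exception code sits in bits 6..2.  Below is a minimal Python
sketch of that decode.  The function name decode_exc_code and the table
are illustrative only and do not come from any Cisco tool.  It is an
assumption that the RP CPU follows the standard MIPS layout, and the
"Cause Reg" values that ROMMON prints after the crash are deliberately
not decoded here, since their layout may differ.

```python
from typing import Tuple

# Standard MIPS exception codes (subset of the architecture-defined table).
MIPS_EXC_CODES = {
    0: "Interrupt",
    1: "TLB Modification",
    2: "TLB miss (load/fetch)",
    3: "TLB miss (store)",
    4: "Address error (load/fetch)",
    5: "Address error (store)",
    6: "Bus error (instruction fetch)",
    7: "Bus error (data)",
    8: "Syscall",
    9: "Breakpoint",
    10: "Reserved instruction",
    11: "Coprocessor unusable",
    12: "Arithmetic overflow",
}


def decode_exc_code(cause: int) -> Tuple[int, str]:
    """Return (ExcCode, description) from a MIPS Cause register value."""
    code = (cause >> 2) & 0x1F  # ExcCode field occupies bits 6..2
    return code, MIPS_EXC_CODES.get(code, "unknown")


# Value taken from the "Cause 00000004 (Code 0x1)" line in the dump above.
print(decode_exc_code(0x00000004))
```

Code 1 is a write to a page whose TLB entry is not marked writable, which
fits the BadVaddr value (40A00000) matching register a0 in the dump.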

Sometimes I only get this crash:

************************************************************************

00:03:05: %SYS-SP-3-LOGGER_FLUSHED: System was paused for 00:00:00 to ensure console debugging output.

00:03:06: %OIR-SP-6-CONSOLE: Changing console ownership to switch processor


*** System received a Software forced crash ***
signal= 0x17, code= 0x42230000, context= 0x42268424
PC = 0x40130b4c, Cause = 0x1820, Status Reg = 0x34018002
rommon 2 >
 
************************************************************************

	At first I thought it might be memory or a bad sup, but it
happens on all three 6500s I've got.  One of them has been upgraded to
512 MB RAM on both sup and MSFC, and we were trying to run 12.2(18)SXF6
on it, using a 64 MB card.  The others are 128 MB/512 MB sup/MSFC, and
I'm only trying 12.1(27b)E1 on them.  I eventually got the 12.2SX one to
work by 'unsetting' the CONFIG_FILE variable and booting it on 12.1E off
the sup-bootflash.  Then I was able to tell it to boot off disk0: using
the 12.2 image, and autoboot works.  So I know my file isn't corrupted,
there's no bad flash, or anything like that.  BUT, if I manually try to
boot it as if I were doing a hybrid-to-native conversion, it still
crashes like the others.  Both SP and RP ROMMONs have been upgraded to
current versions, with the same results.
	One other thing that strikes me as odd is that the 64 MB ATA flash
card, which was formatted under CatOS, was readable/writable under IOS
when I eventually got native mode to boot on that one.  I didn't think
that was supposed to work.  I did eventually format it under IOS and
re-copied the 12.2SX image, but it still crashes if I try to boot
natively from ROMMON.  Anyone see issues like this?  I've got a TAC case
open, and the guy says he's seen a few issues like this, some of which
were resolved by completely re-formatting the sup-bootflash or disk0:.
That hasn't worked on a couple of mine.  Just wondering if there was a
bad batch of sups, or some weird environment variable that got set
somehow that doesn't like booting via ROMMON.  Figured I'd bounce this
issue off a few more people.  TAC hasn't given me any official reason
yet.  Sorry for the really long email...
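
One way to chase the "weird environment variable" theory would be to
capture the ROMMON `set` output from each switch and compare them
mechanically.  The Python below is only a rough sketch of that idea: the
helper names parse_rommon_set and diff_env are made up for illustration,
the sample text is the RP `set` output pasted earlier in this message,
and treating RET_2_RCALTS as Unix epoch seconds is an assumption (it
does land within a few seconds of the RET_2_RTS string).

```python
from datetime import datetime, timezone

# RP ROMMON `set` output, copied from the console capture in this post.
ROMMON_SET = """\
PS1=rommon ! >
EARL_VER=6
RET_2_RUTC=1118784646
SLOTCACHE=cards 2=42(6E at 3,-);
BSI=0
BOOT=
BOOTLDR=
CRASHINFO=bootflash:crashinfo_20070412-182358
RET_2_RTS=18:23:58 utc Thu Apr 12 2007
RET_2_RCALTS=1176402241
?=0
"""

# Variables expected to differ between boxes; skipped when diffing.
VOLATILE = ("RET_2_RTS", "RET_2_RCALTS", "RET_2_RUTC", "CRASHINFO")


def parse_rommon_set(text):
    """Parse NAME=value lines into a dict, skipping 'rommon N >' prompts."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if "=" not in line or line.startswith("rommon"):
            continue
        name, _, value = line.partition("=")  # split on the first '=' only
        env[name] = value
    return env


def diff_env(a, b, ignore=VOLATILE):
    """Return {name: (value_in_a, value_in_b)} for variables that differ."""
    names = (set(a) | set(b)) - set(ignore)
    return {n: (a.get(n), b.get(n)) for n in sorted(names) if a.get(n) != b.get(n)}


env = parse_rommon_set(ROMMON_SET)
print("empty variables:", sorted(n for n, v in env.items() if v == ""))

# Assumption: RET_2_RCALTS is seconds since the Unix epoch.
print(datetime.fromtimestamp(int(env["RET_2_RCALTS"]), tz=timezone.utc).isoformat())
```

With captures from two switches, diff_env(parse_rommon_set(a),
parse_rommon_set(b)) would list only the variables that differ, which
makes a stray setting on one box easier to spot.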

Thanks,

Chuck Church
Multimax Network Engineer, CCIE #8776
EDS Contractor, Multimax - Navy Marine Corps Intranet (NMCI)
1210 N. Parker Rd. | Greenville, SC 29609 
Office: 864-335-9473 | Cell: 864-266-3978
cchurch at multimax.com


