[c-nsp] Cisco VIP2-50+PA-2FE-TX second ethernet port bug

Gert Doering gert at greenie.muc.de
Fri Apr 8 10:11:16 EDT 2005


Hi,

On Fri, Apr 08, 2005 at 09:10:16AM -0400, Rodney Dunn wrote:
> I think I asked about this before and nobody responded.

Sorry, it's still sitting in my TODO list.  Unfortunately, I've lost the
crash dumps, so I couldn't send them to you...

> What is happening?
> Vague comments don't help me.

What we see is:

  - 7507, RSP4
  - VIP2-50 in VIP slot 0  (2 different VIP2's tried, different VIP slots tried)
  - PA-2FE-TX  in PA bay 0 (2 different PA-2FEs tried, bay 1 is empty)

FastE0/0/0 runs perfectly smooth.

I can enable FastE0/0/1, but as soon as there is noticeable traffic,
like a full BGP session coming up, or a "ping / sweep range of sizes",
the VIP2-50 crashes.

IOS is 12.0(27)S2 (but I *think* the crashes happened with 12.0(25)S
or 12.0(26)S).


When the crash happens, what ends up in syslog (*this* I have saved :) )
is the following:

------------------- snip ------------------
Jan 26 18:17:14 c7500 94: %SYS-5-CONFIG_I: Configured from console by gert on console
Jan 26 18:17:48 c7500 100: %VIP2-3-MSG: slot0 VIP-3-PCI_BUS0_SYSERROR: PCI bus 0 system error.
Jan 26 18:17:48 c7500 101: %VIP2-1-MSG: slot0 PMA error register = 0082381800000000
Jan 26 18:17:48 c7500 102: %VIP2-1-MSG: slot0     PCI master address = 0823818
Jan 26 18:17:48 c7500 103: %VIP2-1-MSG: slot0 PA Bay 0 Upstream PCI-PCI Bridge, Handle=0
Jan 26 18:17:48 c7500 104: %VIP2-1-MSG: slot0 DEC21050 bridge chip, config=0x0
Jan 26 18:17:48 c7500 105: %VIP2-1-MSG: slot0 (0x00):dev, vendor id       = 0x00011011
Jan 26 18:17:48 c7500 106: %VIP2-1-MSG: slot0 (0x04):status, command      = 0x42800147
Jan 26 18:17:48 c7500 107: %VIP2-1-MSG: slot0          Signaled System Error  on primary bus
Jan 26 18:17:48 c7500 108: %VIP2-1-MSG: slot0 (0x08):class code, revid    = 0x06040002
Jan 26 18:17:48 c7500 109: %VIP2-1-MSG: slot0 (0x0C):hdr, lat timer, cls  = 0x00010000
Jan 26 18:17:48 c7500 110: %VIP2-1-MSG: slot0 (0x18):sec lat,cls & bus no = 0x00010100
Jan 26 18:17:48 c7500 111: %VIP2-1-MSG: slot0 (0x1C):sec status, io base  = 0x82807020
Jan 26 18:17:48 c7500 112: %VIP2-1-MSG: slot0          Detected Parity Error  on secondary bus
Jan 26 18:17:48 c7500 113: %VIP2-1-MSG: slot0 (0x20):mem base & limit     = 0x01F00000
Jan 26 18:17:48 c7500 114: %VIP2-1-MSG: slot0 (0x24):prefetch membase/lim = 0x0000FE00
Jan 26 18:17:48 c7500 115: %VIP2-1-MSG: slot0 (0x3C):bridge ctrl          = 0x00030000
Jan 26 18:17:48 c7500 116: %VIP2-1-MSG: slot0 (0x40):arb/serr, chip ctrl  = 0x00100000
Jan 26 18:17:48 c7500 117: %VIP2-1-MSG: slot0 (0x44):pri/sec trgt wait t. = 0x00000000
Jan 26 18:17:49 c7500 118: %VIP2-1-MSG: slot0 (0x48):sec write attmp ctr  = 0x00FFFFFF
Jan 26 18:17:49 c7500 119: %VIP2-1-MSG: slot0 (0x4C):pri write attmp ctr  = 0x00FFFFFF
Jan 26 18:18:02 c7500 120: %VIP2-3-MSG: slot0 VIP-3-SVIP_RELOAD: SVIP Reload is called. 
Jan 26 18:18:02 c7500 121: %VIP2-3-MSG: slot0 VIP-3-SYSTEM_EXCEPTION: VIP System Exception occurred sig=22, code=0x0, context=0x60A95688
Jan 26 18:18:02 c7500 122:  
Jan 26 18:18:03 c7500 123: %DBUS-3-DBUSINTERRSWSET: Slot 0, Internal Error due to VIP crash
Jan 26 18:18:31 c7500 124: %SYS-3-CPUHOG: Task ran for 9420 msec (59/14), process = OIR Handler, PC = 4043FA88.
Jan 26 18:18:31 c7500 125: -Traceback= 4043FA90
------------------- snip ------------------

It certainly looks like a "defective PA or VIP2", but as I said, we've
already re-seated the PA, then swapped VIP2 and PA, and the new PA-2FE was
tested with a large "sweep range" ping in a 7200 (both ports, with no ill
effects).
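
For anyone who wants to cross-check the bridge dump by hand: the two
annotations in the log ("Signaled System Error on primary bus" and
"Detected Parity Error on secondary bus") line up with the status bits of
a standard PCI-to-PCI bridge config header, where the status word is the
upper 16 bits of the dword at 0x04 (primary) and at 0x1C (secondary).
Below is a minimal sketch of that decoding. The function and table names
are my own, and it assumes the dump prints the raw config dwords unchanged;
it is not how IOS itself produces those lines.

```python
# Error bits in a PCI status word (standard PCI config header layout).
# Only the error-reporting bits are listed; DEVSEL timing and capability
# bits are ignored.
PCI_STATUS_ERROR_BITS = {
    15: "Detected Parity Error",
    14: "Signaled System Error",
    13: "Received Master Abort",
    12: "Received Target Abort",
    11: "Signaled Target Abort",
    8: "Master Data Parity Error",
}


def decode_status(dword, secondary=False):
    """Return the error flags set in the upper 16 bits of a config dword.

    dword     -- 32-bit value as printed in the dump (offset 0x04 or 0x1C)
    secondary -- True for the secondary status register at 0x1C, where
                 bit 14 means "Received System Error" on a bridge
    """
    status = (dword >> 16) & 0xFFFF
    flags = []
    for bit in sorted(PCI_STATUS_ERROR_BITS, reverse=True):
        if status & (1 << bit):
            name = PCI_STATUS_ERROR_BITS[bit]
            if secondary and bit == 14:
                name = "Received System Error"
            flags.append(name)
    return flags


# Values taken from the syslog dump above.
primary_flags = decode_status(0x42800147)
secondary_flags = decode_status(0x82807020, secondary=True)
print("primary bus:  ", primary_flags)
print("secondary bus:", secondary_flags)
```

So the bridge saw a parity error on its secondary side (the PA side) and
raised SERR# towards the VIP, which fits the "PCI bus 0 system error"
message at the top of the dump.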

> The only thing I know about is:
> 
> CSCsa50332
> Externally found moderate defect: Assigned (A)
> VIP4-80 with PA-2FE-TX may crash with parity error

I'm not sure what the exact difference between the VIP2-50 and VIP4-80
is (the "performance PDF" claims same performance numbers, so maybe
those aren't *that* different), but it looks like the same problem...

> I've seen this with one customer and we've tried
> very hard to recreate this in the lab and have
> not been able to do it yet.

I had reported my problems on this list (cisco-nsp), and some other
writers told me "we have not seen the problem, our PA-2FE works fine
in a VIP2-50".  So it's working for some, and not for others :(

My hardware details:

------------------- snip ----------------------
Slot 0:
	Physical slot 0, ~physical slot 0xF, logical slot 0, CBus 0
	Microcode Status 0x4
	Master Enable, LED, WCS Loaded
	Board is analyzed 
	Pending I/O Status: None
	EEPROM format version 1
	VIP2 R5K controller, FRU: VIP2-50=, HW rev 2.03, board revision A0
	Serial number: 18952807  Part number: 73-2167-06
	Test history: 0x00        RMA number: 00-00-00
	Flags: cisco 7000 board; 7500 compatible

	EEPROM contents (hex):
	  0x20: 01 1E 02 03 01 21 32 67 49 08 77 06 00 00 00 00
	  0x30: 50 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00

	Slot database information:
	Flags: 0x4	Insertion time: 0x3EC8 (34w2d ago)

	Controller Memory Size: 128 MBytes DRAM, 4096 KBytes SRAM

	PA Bay 0 Information:
		Dual Port Fast Ethernet (RJ45), 2 ports, FRU: PA-2FE-TX=
                EEPROM format version 4
		HW rev 1.00, Board revision B0
		Serial number: JAE064006VR  Part number: 73-5419-06 

------------------- snip ----------------------

PA Bay 1 is empty.

gert
-- 
USENET is *not* the non-clickable part of WWW!
                                                           //www.muc.de/~gert/
Gert Doering - Munich, Germany                             gert at greenie.muc.de
fax: +49-89-35655025                        gert at net.informatik.tu-muenchen.de

