[c-nsp] ouch 7204vxr reloaded
Tony
td_miles at yahoo.com
Thu Apr 29 00:22:36 EDT 2010
What's really strange is that we had a Sup720 fail over to the redundant Sup in a 7609 recently. I opened a TAC case, and the reason Cisco gave for it was:
=========
the device experienced a CPU parity error
...
These occur when an energy level within the chip (for example, a one or a zero) changes - most often the result of cosmic radiation.
=========
The "resolution" was to monitor it for 48 hours; if it didn't happen again, it was a one-off caused by cosmic radiation and there was nothing they could do about it. The error didn't recur, so the case was closed.
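For anyone who wants to automate that "monitor and see if it recurs" step, here is a rough sketch in Python. It assumes you already collect 'show version' output as text by some means of your own, and that the "System returned to ROM by ..." line arrives unwrapped on a single line, as in Mike's output quoted below. The names parse_reload and recurred are made up for illustration, and the 48-hour default window is simply the figure TAC gave me, not a Cisco-documented threshold.

```python
import re
from datetime import datetime, timedelta

# Matches the reload-reason line printed by 'show version'.
# The PC/address part and the timestamp part are both optional.
RELOAD_RE = re.compile(
    r"System returned to ROM by (?P<reason>.+?)"
    r"(?: at PC (?P<pc>0x[0-9A-Fa-f]+), address (?P<addr>0x[0-9A-Fa-f]+))?"
    r"(?: at (?P<time>\d{2}:\d{2}:\d{2}) \S+ (?P<date>\w{3} \w{3} \d{1,2} \d{4}))?"
    r"\s*$",
    re.MULTILINE,
)


def parse_reload(show_version):
    """Return a dict describing the last reload, or None if no reason line is found."""
    m = RELOAD_RE.search(show_version)
    if m is None:
        return None
    when = None
    if m.group("time"):
        # The timezone token (PST, PDT, ...) is skipped; times are treated as naive.
        when = datetime.strptime(
            m.group("time") + " " + m.group("date"), "%H:%M:%S %a %b %d %Y"
        )
    reason = m.group("reason")
    return {
        "reason": reason,
        "parity": "parity error" in reason.lower(),
        "pc": m.group("pc"),
        "when": when,
    }


def recurred(events, window=timedelta(hours=48)):
    """True if any two parity-error reloads fall within `window` of each other."""
    times = sorted(e["when"] for e in events if e["parity"] and e["when"])
    return any(later - earlier <= window for earlier, later in zip(times, times[1:]))
```

Feed parse_reload() the text from each device after every unexpected reload, keep the results, and call recurred() on the list. A single parity event returns False, which matches the "transient, no action" outcome; a second one inside the window is the cue to go back to TAC or swap hardware.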
regards,
Tony.
--- On Wed, 28/4/10, eNinja <eninja at gmail.com> wrote:
> From: eNinja <eninja at gmail.com>
> Subject: Re: [c-nsp] ouch 7204vxr reloaded
> To: "Mike" <mike-cisconsplist at tiedyenetworks.com>
> Cc: "Cisco-nsp" <cisco-nsp at puck.nether.net>
> Received: Wednesday, 28 April, 2010, 11:53 PM
> Mike,
>
> This is a processor memory parity error (PMPE), and as such the
> tracebacks and their decodes are invalid.
>
> There is _no_ software fix to prevent PMPEs.
>
> Most PMPEs are caused by cosmic radiation and sometimes
> (albeit rarely) by built-up ESD from improper handling of
> components by personnel. Since lightning doesn't strike the
> same spot twice, chances are there won't be a recurrence on
> the same processor.
>
> Monitor, and replace the NPE along with its memory if this
> recurs within the next 6 months; otherwise it was a transient
> issue and no further action is required.
>
> In the meantime, ensure safe ESD compliance when handling components.
>
> Eninja
>
>
>
> On Apr 27, 2010, at 11:53 PM, Mike <mike-cisconsplist at tiedyenetworks.com>
> wrote:
>
> > Howdy,
> >
> > Well that was fun, I discovered that my trusty 7204vxr reloaded
> > unexpectedly and I find myself without a good explanation. Show
> > version gives me 'processor memory parity error':
> >
> > System returned to ROM by processor memory parity error at PC 0x60640F70, address 0x0 at 03:09:00 PST Tue Apr 27 2010
> > System restarted at 04:10:28 PDT Tue Apr 27 2010
> > System image file is "disk0:c7200-is-mz.123-26.bin"
> >
> > and digging thru the 'show tech' gave me:
> >
> > Pid 3: Process "OSPF Hello" stack 0x63D733F0 savedsp 0x63D75660
> > Flags: analyze on_old_queue
> > Status 0x00000000 Orig_ra 0x00000000 Routine 0x00000000 Signal 0
> > Caller_pc 0x60CDAE84 Callee_pc 0x60806190 Dbg_events 0x00000000 State 1
> > Totmalloc 548640 Totfree 441612 Totgetbuf 15876 Totretbuf 0
> > Edisms 0x60CD71A4 Eparm 0x64A91E7C
> > Elapsed 0xC6E634 Ncalls 0xE09B9AA Ngiveups 0x491E Priority_q 3
> > Ticks_5s 2 Cpu_5sec 81 Cpu_1min 40 Cpu_5min 8
> > Stacksize 0x2328 Lowstack 0x2328 Ttyptr 0x63D55FA8
> > Mem_holding 0x0 Thrash_count 0
> > Wakeup_reasons 0x0FFFFFFF Default_wakeup_reasons 0x0FFFFFFF
> > Direct_wakeup_major 0x00000000 Direct_wakeup_minor 0x00000000
> >
> >
> > So my inexperienced glancing would say it was something to do with
> > OSPF. My questions, though, are: #1, how do I really debug a problem
> > like this, and #2, what minimum Cisco contract would be required to
> > make sure I have access to the CCO/bug advisor and possibly updated
> > IOS for this device? It's been a tank with absolutely zero issues in
> > this environment for more than a year, but this event underscores
> > the fact that we have no real support route and probably should get
> > on some program even for our little operation.
> >
> >
> > Thanks.
> >
> > Mike-