[c-nsp] ouch 7204vxr reloaded

Tony td_miles at yahoo.com
Thu Apr 29 00:22:36 EDT 2010


What's really strange is that we had a Sup720 fail over to the redundant Sup in a 7609 recently. I opened a TAC case, and the reason Cisco gave me for it was:

=========
the device experienced a CPU parity error
...
These occur when an energy level within the chip (for example, a one or a zero) changes - most often the result of cosmic radiation.  
=========

The "resolution" was to monitor it for 48 hours; if it didn't happen again, it was a one-off caused by cosmic radiation and there was nothing they could do about it. The error didn't recur, so the case was closed.
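
For anyone wondering why one flipped bit is enough to trigger this, here is a rough sketch of the idea in Python. It is my own illustration, not anything from TAC: it assumes a single even-parity bit stored alongside each word, and the function names and the sample value are made up. Real hardware does this in the memory controller and may use a different scheme.

```python
# Illustrative only: a single even-parity bit per word (an assumption,
# not a description of the actual Sup720/NPE memory hardware).

def parity_bit(word: int) -> int:
    """Return 1 if the word has an odd number of set bits, else 0."""
    return bin(word).count("1") % 2

def parity_ok(word: int, stored_parity: int) -> bool:
    """Compare the parity recomputed on read with the parity stored on write."""
    return parity_bit(word) == stored_parity

def flip_bit(word: int, position: int) -> int:
    """Simulate a single bit upset at the given bit position."""
    return word ^ (1 << position)

word = 0xDEADBEEF            # arbitrary sample data, hypothetical
stored = parity_bit(word)    # parity recorded when the word was written

one_flip = flip_bit(word, 5)
two_flips = flip_bit(one_flip, 9)

print("clean read passes:      ", parity_ok(word, stored))
print("single flip passes:     ", parity_ok(one_flip, stored))
print("double flip passes:     ", parity_ok(two_flips, stored))
```

The point is that parity can only say "this word is wrong", not which bit changed, so under this scheme there is nothing to correct and the only safe reaction is to stop. A second flip in the same word would cancel out and go unnoticed, which is the known limit of a single parity bit.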



regards,
Tony.


--- On Wed, 28/4/10, eNinja <eninja at gmail.com> wrote:

> From: eNinja <eninja at gmail.com>
> Subject: Re: [c-nsp] ouch 7204vxr reloaded
> To: "Mike" <mike-cisconsplist at tiedyenetworks.com>
> Cc: "Cisco-nsp" <cisco-nsp at puck.nether.net>
> Received: Wednesday, 28 April, 2010, 11:53 PM
> Mike,
> 
> This is a processor memory parity error (PMPE), and as such
> the tracebacks and the like are invalid, and so are the decodes.
> 
> There is _no_ software fix to prevent PMPEs.
> 
> Most PMPEs are caused by cosmic radiation and sometimes
> (albeit rarely) from built up ESD due to improper personnel
> handling of components. Since lightning doesn't strike the
> same spot twice, chances are there won't be a recurrence on
> the same processor.
> 
> Monitor and replace the NPE with mem if this recurs within
> the next 6 months otherwise it was a transient issue and no
> further action is required.
> 
> In the meantime, follow ESD-safe practices when handling components.
> 
> Eninja
> 
> 
> 
> On Apr 27, 2010, at 11:53 PM, Mike <mike-cisconsplist at tiedyenetworks.com>
> wrote:
> 
> > Howdy,
> > 
> >   Well, that was fun. I discovered that my trusty 7204vxr reloaded
> > unexpectedly, and I find myself without a good explanation. Show
> > version gives me 'processor memory parity error':
> > 
> > System returned to ROM by processor memory parity error at PC 0x60640F70, address 0x0 at 03:09:00 PST Tue Apr 27 2010
> > System restarted at 04:10:28 PDT Tue Apr 27 2010
> > System image file is "disk0:c7200-is-mz.123-26.bin"
> > 
> >   and digging through the 'show tech' gave me:
> > 
> > Pid 3: Process "OSPF Hello" stack 0x63D733F0 savedsp 0x63D75660
> > Flags: analyze on_old_queue
> > Status     0x00000000 Orig_ra   0x00000000 Routine    0x00000000 Signal 0
> > Caller_pc  0x60CDAE84 Callee_pc 0x60806190 Dbg_events 0x00000000 State  1
> > Totmalloc  548640     Totfree   441612     Totgetbuf  15876      Totretbuf  0          Edisms    0x60CD71A4 Eparm 0x64A91E7C
> > Elapsed    0xC6E634   Ncalls    0xE09B9AA  Ngiveups   0x491E     Priority_q 3          Ticks_5s  2          Cpu_5sec   81         Cpu_1min 40
> > Cpu_5min   8          Stacksize 0x2328     Lowstack   0x2328     Ttyptr     0x63D55FA8 Mem_holding 0x0      Thrash_count 0
> > Wakeup_reasons      0x0FFFFFFF  Default_wakeup_reasons 0x0FFFFFFF
> > Direct_wakeup_major 0x00000000  Direct_wakeup_minor 0x00000000
> > 
> > 
> >   So my inexperienced glance would say it had something to do with
> > OSPF. My questions, though, are: #1, how do I really debug a problem
> > like this, and #2, what is the minimum Cisco contract required to
> > make sure I have access to CCO / the bug advisor and possibly updated
> > IOS for this device? It's been a tank with absolutely zero issues in
> > this environment for more than a year, but this event underscores the
> > fact that we have no real support route and probably should get on
> > some program, even for our little operation.
> > 
> > 
> > Thanks.
> > 
> > Mike-