[c-nsp] 7600 sup7203bxl error msg
Dale W. Carder
dwcarder at doit.wisc.edu
Thu Nov 9 10:26:59 EST 2006
We too got this error on a SXF4 router last night. Guess you're
not alone :-)
Nov 9 05:25:34: %SYSTEM_CONTROLLER-3-ERROR: Error condition detected: TM_DATA_PARITY_ERROR
Nov 9 05:25:34: %SYSTEM_CONTROLLER-3-EXCESSIVE_RESET: System Controller is getting reset so frequently
In our case, it looks like the problem came from the RP instead
of the SP. "sh ibc" shows 2 IBC resets, and every other error
counter is 0. "sh stack" says the Mistral Error Interrupt process
has only been called once.
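
For anyone wanting to sort these messages by where they came from, here is a minimal Python sketch. It assumes, based only on the two log excerpts in this thread, that a "-SP-" tag after the facility name means the switch processor logged the message and that its absence means the route processor did. The names PATTERN and classify are made up for illustration.

```python
import re
from collections import Counter

# Matches %SYSTEM_CONTROLLER-3-ERROR as well as %SYSTEM_CONTROLLER-SP-3-ERROR.
# Assumption: the optional "-SP" tag marks a switch processor message;
# anything without it is attributed to the route processor.
PATTERN = re.compile(r"%SYSTEM_CONTROLLER-(?:(SP)-)?(\d)-([A-Z_]+):")

def classify(lines):
    """Count SYSTEM_CONTROLLER messages per (source, mnemonic)."""
    counts = Counter()
    for line in lines:
        m = PATTERN.search(line)
        if m:
            source = m.group(1) or "RP"
            counts[(source, m.group(3))] += 1
    return counts

sample = [
    "Nov 9 05:25:34: %SYSTEM_CONTROLLER-3-ERROR: Error condition detected: TM_DATA_PARITY_ERROR",
    "Nov 9 05:25:34: %SYSTEM_CONTROLLER-3-EXCESSIVE_RESET: System Controller is getting reset so frequently",
    "Nov 7 11:24:46.535 AEST: %SYSTEM_CONTROLLER-SP-3-ERROR: Error condition detected: TM_DATA_PARITY_ERROR",
]
result = classify(sample)
for key, n in sorted(result.items()):
    print(key, n)
```

Run against a saved syslog file, this gives a quick view of whether the parity errors keep coming from the same processor.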
Dale
----------------------------------
Dale W. Carder - Network Engineer
University of Wisconsin at Madison
http://net.doit.wisc.edu/~dwcarder
On Nov 8, 2006, at 6:07 PM, Sukumar Subburayan wrote:
> Every time we get a parity error in the system controller, we try to soft
> reset the IBC and recover from the condition. Things should be fine
> if this was a transient one-off case.
>
> However, if the issue is persistent, we will be constantly resetting the
> IBC and hence dropping packets. The second syslog message is to warn you
> that you are seeing excessive IBC resets.
>
> According to your syslog, your SP's system controller is what is reporting
> the parity error.
>
> Is the output of 'show ibc' below from the SP-side?
>
> If not, can you get us 'remote command switch show ibc'?
>
> sukumar
>
>
>
> On Wed, 8 Nov 2006, matt carter wrote:
>
>>
>> hey all,
>>
>> hoping someone may have an insight into some errors i have not seen before
>>
>> Nov 7 11:24:46.535 AEST: %SYSTEM_CONTROLLER-SP-3-ERROR: Error condition detected: TM_DATA_PARITY_ERROR
>>
>> Nov 7 11:24:46.535 AEST: %SYSTEM_CONTROLLER-SP-3-EXCESSIVE_RESET: System Controller is getting reset so frequently
>>
>> first one is fair enough
>>
>> Explanation The most common errors from the Mistral ASIC on the
>> supervisor engine are TM_DATA_PARITY_ERROR and TM_NPP_PARITY_ERROR.
>> Possible causes of these parity errors are random static discharge or
>> other external factors.
>> Recommended Action If the error message is only seen once (or rarely),
>> the recommendation is to monitor the switch syslog to confirm the error
>> message was an isolated incident. If these error messages are reoccurring,
>> open a case with the Technical Assistance Center
>>
>> but the wording of the second error seems to suggest this is not an
>> "isolated incident", since my system controller is being reset "so
>> frequently", which makes the two log messages somewhat contradictory.
>> when i go looking at the SP stack i can only see the mistral error
>> interrupt called once.
>>
>> from show stack
>> Interrupt level stacks:
>> Level Called Unused/Size Name
>> 5 1 7168/9000 Mistral Error Interrupt
>>
>> when i check the ibc stats i can see the mistral hardware was definitely
>> reset at the time of the incident, and has been reset 3 times, but there
>> are 0 errors.
>>
>> anyone have ideas on how to proceed in terms of catching this if it
>> happens again?
>>
>> from show ibc
>> Interface information:
>> Interface IBC0/0(idb 0x43090238)
>> Hardware is Mistral IBC (revision 5)
>> 5 minute rx rate 5000 bits/sec, 11 packets/sec
>> 5 minute tx rate 12000 bits/sec, 22 packets/sec
>> 4900820 packets input, 317526199 bytes
>> 4788796 broadcasts received
>> 10253179 packets output, 730435109 bytes
>> 68213 broadcasts sent
>> 0 Packets CEF Switched, 0 Packets Fast Switched
>> 0 Packets SLB Switched, 0 Packets CWAN Switched
>> IBC resets = 3; last at 11:24:46.535 AEST Fri Nov 7 2006
>> MISTRAL ERROR COUNTERS
>> System address timeouts = 0 BUS errors = 0
>> IBC Address timeouts = 0 (addr 0x0)
>> Page CRC errors = 0 IBL CRC errors = 0
>> ECC Correctable errors = 0
>> Packets with padding removed (0/0/0) = 0
>> Packets expanded (0/0) = 0
>> Packets attempted tail end expansion > 1 page and were dropped = 0
>> IP packets dropped with frag offset of 1 = 0
>> 0 packets (aggregate) dropped on throttled interfaces
>> Hazard Illegal packet length = 0 Illegal Offset = 0
>> Hazard Packet underflow = 0 Packet Overflow = 0
>> IBL fill hang count = 0 Unencapsed packets = 0
>> LBIC RXQ Drop pkt count = 0 LBIC drop pkt count = 0
>> LBIC Drop pkt stick = 0
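
As a rough way to watch for a recurrence, the quoted 'show ibc' output can be reduced to its reset count and any nonzero error counters. The Python sketch below assumes the counters appear as simple "name = integer" pairs, as they do in the output quoted in this thread. Counters whose names contain punctuation (for example the "(0/0/0)" ones) are not handled properly. The names PAIR and parse_ibc_counters are hypothetical.

```python
import re

# One "name = integer" pair; several pairs can share a line in the output.
PAIR = re.compile(r"([A-Za-z][A-Za-z ]*?)\s*=\s*(\d+)")

def parse_ibc_counters(text):
    """Return a dict of counter name -> integer value."""
    counters = {}
    for line in text.splitlines():
        # Drop mail quote markers such as ">> " before matching.
        line = line.lstrip("> ")
        for name, value in PAIR.findall(line):
            counters[name] = int(value)
    return counters

sample = """\
>> IBC resets = 3; last at 11:24:46.535 AEST Fri Nov 7 2006
>> MISTRAL ERROR COUNTERS
>> System address timeouts = 0 BUS errors = 0
>> Page CRC errors = 0 IBL CRC errors = 0
>> ECC Correctable errors = 0
"""

counters = parse_ibc_counters(sample)
resets = counters.pop("IBC resets")
nonzero = {name: n for name, n in counters.items() if n}
print(resets, nonzero)
```

Comparing the reset count between two captures taken some time apart would show whether the resets are still happening even when the error counters stay at 0.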