[j-nsp] M20 SSB slot 0 failures

Jonas Frey (Probe Networks) jf at probe-networks.de
Tue Jan 18 05:15:41 EST 2011


Hi Chris,

I haven't seen an error like this, where the same SSB works fine in slot 1
but not in slot 0.

But my guess is that slot 0 reports the true status of the card and
the test report from slot 1 is inaccurate.

We have seen memory failures on SSB-E(-16) boards a couple of times
while running in production. It appears the memory on the boards wears
out over time and then starts throwing errors. The board keeps working
for a while since it's ECC memory, but all things come to an end.
Just grab new memory and try again. It's easy to replace, and
replacement memory (though unofficial) is pretty cheap.
See 
http://juniper.cluepon.net/Unofficial_hardware_upgrades
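
To get a feel for how far off the failing word is, here is a rough sketch
in Python that pulls the Expected/Observed pair out of the "Data Error"
line in the quoted slot 0 output below and counts the differing bits. The
helper name diff_bits is made up for illustration and is not part of any
Juniper tool; the log line is copied from the quoted output.

```python
import re

# Log line copied verbatim from the quoted slot 0 diagnostic output.
LOG_LINE = ("[Jan  5 21:34:17.356 LOG: Err] Data Error: Bank 0 "
            "(global cell 0x3e52): Expected 0x5280001f, Observed 0x200200")


def diff_bits(line):
    """Return (xor_mask, differing_bit_count) for an Expected/Observed pair.

    Hypothetical helper for illustration only.
    """
    m = re.search(r"Expected (0x[0-9a-fA-F]+), Observed (0x[0-9a-fA-F]+)", line)
    if m is None:
        raise ValueError("no Expected/Observed pair found")
    expected = int(m.group(1), 16)
    observed = int(m.group(2), 16)
    mask = expected ^ observed          # 1-bits mark positions that differ
    return mask, bin(mask).count("1")   # count the set bits in the mask


mask, count = diff_bits(LOG_LINE)
print(f"differing bits: {count} (mask 0x{mask:08x})")
# prints: differing bits: 11 (mask 0x52a0021f)
```

Eleven wrong bits in a single word is a lot. Assuming the ECC on these
boards only corrects single-bit errors per word (I have not verified the
exact scheme), that is well past what it could hide.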

Regards,
Jonas

> 
> Hi,
> 
> I have four M20 chassis with continuous slot 0 SSB failures. 
> 
> These are from two completely different vendors.
> 
> I would think, oh, a bad chassis, but I am getting this same result with a variety of chassis and SSB cards.  I do have chassis that don't display this failure, with the same SSB cards.  This is what leads me to believe that I am hitting a rash of bad crap.
> 
> The failure is as follows.  Any SSB tests out fine in slot 1.  But in slot 0, the same SSBs fail.  Slot 0 often "Fails over" to slot 1 in operation if both SSBs are populated in these chassis.
> 
> Is this some kind of known problem?  Or am I just the most unlucky person in the Juniper M20 world?
> 
> Success in slot 1
> -----------------
> 
> SSB1( vty)# bringup chassis slot-state 1 diag
> Slot 1 state changed from 'on-line' to 'diagnostics'
> 
> SSB1( vty)# diagnostic set mode manufacturing
> 
> SSB1( vty)# diag clear log
> 
> SSB1( vty)# diag bchip 1 sdram
> [Waiting for completion, a:abort, p:pause]
> B SDRAM (Slot 1) test
> phase 1, pass 1, B SDRAM (Slot 1) test: Address Test
> phase 2, pass 1, B SDRAM (Slot 1) test: Pattern Test
> phase 3, pass 1, B SDRAM (Slot 1) test: Walking 0 Test
> phase 4, pass 1, B SDRAM (Slot 1) test: Walking 1 Test
> phase 5, pass 1, B SDRAM (Slot 1) test: Mem Clear Test
> B SDRAM (Slot 1) test completed, 1 pass,  0 errors
> 
> 
> SSB1( vty)# diag bchip 1 sdram
> [Waiting for completion, a:abort, p:pause]
> B SDRAM (Slot 1) test
> phase 1, pass 1, B SDRAM (Slot 1) test: Address Test
> phase 2, pass 1, B SDRAM (Slot 1) test: Pattern Test
> phase 3, pass 1, B SDRAM (Slot 1) test: Walking 0 Test
> phase 4, pass 1, B SDRAM (Slot 1) test: Walking 1 Test
> phase 5, pass 1, B SDRAM (Slot 1) test: Mem Clear Test
> B SDRAM (Slot 1) test completed, 1 pass,  0 errors
> 
> 
> Fail in slot 0
> --------------
> 
> SSB0( vty)# bringup chassis slot-state 0 diag
> Slot 0 state changed from 'diagnostics' to 'diagnostics'
> 
> SSB0( vty)# diagnostic set mode manufacturing
> 
> SSB0( vty)# diag clear log
> 
> SSB0( vty)# diag bchip 0 sdram 
> [Waiting for completion, a:abort, p:pause]
> B SDRAM (Slot 0) test
> phase 1, pass 1, B SDRAM (Slot 0) test: Address Test
> 
> *** Fatal error during B SDRAM (Slot 0) test, pass 1,
> Data did not compare, Slot 0 (NIC0 B chip SDRAM banks ref. des. U?)
> 
> 
> B SDRAM (Slot 0) test completed, 1 pass,  1 error
> 
> [Jan  5 21:34:17.356 LOG: Err] Data Error: Bank 0 (global cell 0x3e52): Expected 0x5280001f, Observed 0x200200
> 