[j-nsp] diagnosing the kablooys

Richard A Steenbergen ras at e-gerbil.net
Tue Dec 7 01:16:56 EST 2004


Aside from the usual "call JTAC" answer, does anyone have any idea what 
specifically went bad here:

Dec  6 23:54:05  router sfm1 NH(nh_ucast_poll_stats): Failed to fetch stats for nh (321, Unicast) (timeout) 
Dec  6 23:54:05  router sfm1 NH(nh_ucast_poll_stats): Failed to fetch stats for nh (323, Unicast) (generic failure) 
Dec  6 23:54:05  router sfm1 NH(nh_ucast_poll_stats): Failed to fetch stats for nh (328, Unicast) (timeout) 
Dec  6 23:54:05  router fpc1 DXO: Plane 1, link CRC error (0x0f) 
Dec  6 23:54:05  router fpc1 DXO: Plane 3, link CRC error (0x0f) 
Dec  6 23:54:06  router sfm1 NH(nh_ucast_poll_stats): Failed to fetch stats for nh (334, Unicast) (timeout) 
Dec  6 23:54:06  router sfm1 NH(nh_ucast_poll_stats): Failed to fetch stats for nh (335, Unicast) (timeout) 
Dec  6 23:54:06  router sfm1 NH(nh_ucast_poll_stats): Failed to fetch stats for nh (341, Unicast) (timeout) 
Dec  6 23:54:06  router sfm1 NH(nh_ucast_poll_stats): Failed to fetch stats for nh (346, Unicast) (timeout) 
Dec  6 23:54:06  router sfm1 NH(nh_ucast_poll_stats): Failed to fetch stats for nh (351, Unicast) (timeout) 
Dec  6 23:54:06  router fpc2 GE(2/0): Kchip 0 crc/fifo errors 
Dec  6 23:54:06  router fpc2 GE(2/1): Kchip 0 crc/fifo errors 
Dec  6 23:54:06  router fpc2 GE(2/2): Kchip 0 crc/fifo errors 
Dec  6 23:54:06  router fpc2 GE(2/3): Kchip 0 crc/fifo errors 
Dec  6 23:54:06  router fpc2 DXO: Plane 1, link CRC error (0x0f) 
Dec  6 23:54:06  router fpc3 DCHIP(3/1): BD link(0) CRC error 
Dec  6 23:54:06  router fpc3 DCHIP(3/1): BD link(1) CRC error 
Dec  6 23:54:06  router fpc3 DCHIP(3/1): BD link(2) CRC error 
Dec  6 23:54:06  router fpc3 DCHIP(3/1): BD link(3) CRC error 
Dec  6 23:54:06  router fpc3 DCHIP(3/2): BD link(0) CRC error 
Dec  6 23:54:06  router sfm1 NH(nh_ucast_poll_stats): Failed to fetch stats for nh (363, Unicast) (timeout) 
Dec  6 23:54:06  router fpc2 DXO: Plane 3, link CRC error (0x0f) 
Dec  6 23:54:06  router fpc3 DCHIP(3/2): BD link(1) CRC error 
Dec  6 23:54:06  router fpc3 DCHIP(3/2): BD link(2) CRC error 
Dec  6 23:54:06  router sfm1 NH(nh_ucast_poll_stats): Failed to fetch stats for nh (409, Unicast) (timeout) 
Dec  6 23:54:06  router fpc3 DCHIP(3/2): BD link(3) CRC error 
Dec  6 23:54:06  router sfm1 NH(nh_ucast_poll_stats): Failed to fetch stats for nh (452, Unicast) (timeout) 
Dec  6 23:54:06  router fpc3 DXO: Plane 1, link CRC error (0x0f) 
Dec  6 23:54:06  router fpc3 DXO: Plane 3, link CRC error (0x0f) 
Dec  6 23:54:06  router fpc1 DXO: Plane 3, link CRC error (0x0f) 
Dec  6 23:54:06  router sfm1 NH(nh_ucast_poll_stats): Failed to fetch stats for nh (455, Unicast) (timeout) 
Dec  6 23:54:06  router sfm1 NH(nh_ucast_poll_stats): Failed to fetch stats for nh (456, Unicast) (timeout) 
Dec  6 23:54:06  router sfm1 NH(nh_ucast_poll_stats): Failed to fetch stats for nh (462, Unicast) (timeout) 
Dec  6 23:54:07  router sfm1 NH(nh_ucast_poll_stats): Failed to fetch stats for nh (464, Unicast) (timeout) 


This was followed by a long, continuous string of:

Dec  7 00:00:13  router fpc2 DXI: PIC 1, link CRC error (0x01) 
Dec  7 00:00:13  router fpc3 DCHIP(3/2): BD link(1) CRC error 
Dec  7 00:00:13  router fpc2 DXI: PIC 2, link CRC error (0x01) 
Dec  7 00:00:13  router fpc3 DCHIP(3/2): BD link(2) CRC error 
Dec  7 00:00:13  router fpc2 DXI: PIC 3, link CRC error (0x01) 
Dec  7 00:00:13  router fpc3 DCHIP(3/2): BD link(3) CRC error 
Dec  7 00:00:13  router fpc3 DXI: PIC 1, link CRC error (0x0f) 
Dec  7 00:00:13  router fpc3 DXI: PIC 2, link CRC error (0x0f) 
Dec  7 00:00:14  router fpc2 GE(2/0): Kchip 0 crc/fifo errors 
Dec  7 00:00:14  router fpc2 GE(2/1): Kchip 0 crc/fifo errors 
Dec  7 00:00:14  router fpc2 GE(2/2): Kchip 0 crc/fifo errors 
Dec  7 00:00:14  router fpc2 GE(2/3): Kchip 0 crc/fifo errors 
Dec  7 00:00:14  router fpc2 DXI: PIC 0, link CRC error (0x01) 
Dec  7 00:00:14  router fpc3 DCHIP(3/1): BD link(0) CRC error 
Dec  7 00:00:14  router fpc3 DCHIP(3/1): BD link(1) CRC error 
Dec  7 00:00:14  router fpc3 DCHIP(3/1): BD link(2) CRC error 
Dec  7 00:00:14  router fpc3 DCHIP(3/1): BD link(3) CRC error 
Dec  7 00:00:14  router fpc3 DCHIP(3/2): BD link(0) CRC error 
Dec  7 00:00:14  router fpc2 DXI: PIC 1, link CRC error (0x01) 
Dec  7 00:00:14  router fpc3 DCHIP(3/2): BD link(1) CRC error 
Dec  7 00:00:14  router fpc2 DXI: PIC 2, link CRC error (0x01) 
Dec  7 00:00:14  router fpc3 DCHIP(3/2): BD link(2) CRC error 
Dec  7 00:00:14  router fpc2 DXI: PIC 3, link CRC error (0x01) 
Dec  7 00:00:14  router fpc3 DCHIP(3/2): BD link(3) CRC error 
Dec  7 00:00:14  router fpc3 DXI: PIC 1, link CRC error (0x0f) 
Dec  7 00:00:14  router fpc3 DXI: PIC 2, link CRC error (0x0f) 
Dec  7 00:00:15  router fpc2 GE(2/0): Kchip 0 crc/fifo errors 
Dec  7 00:00:15  router fpc2 GE(2/1): Kchip 0 crc/fifo errors 
Dec  7 00:00:15  router fpc2 GE(2/2): Kchip 0 crc/fifo errors 
Dec  7 00:00:15  router fpc2 GE(2/3): Kchip 0 crc/fifo errors 
Dec  7 00:00:15  router fpc2 DXI: PIC 0, link CRC error (0x01) 
Dec  7 00:00:15  router fpc3 DCHIP(3/1): BD link(0) CRC error 
Dec  7 00:00:15  router fpc3 DCHIP(3/1): BD link(1) CRC error 
Dec  7 00:00:15  router fpc3 DCHIP(3/1): BD link(2) CRC error 
Dec  7 00:00:15  router fpc3 DCHIP(3/1): BD link(3) CRC error 
Dec  7 00:00:15  router fpc3 DCHIP(3/2): BD link(0) CRC error 
Dec  7 00:00:15  router fpc2 DXI: PIC 1, link CRC error (0x01) 
Dec  7 00:00:15  router fpc3 DCHIP(3/2): BD link(1) CRC error 
Dec  7 00:00:15  router fpc2 DXI: PIC 2, link CRC error (0x01) 
Dec  7 00:00:15  router fpc3 DCHIP(3/2): BD link(2) CRC error 
Dec  7 00:00:15  router fpc2 DXI: PIC 3, link CRC error (0x01) 
Dec  7 00:00:15  router fpc3 DCHIP(3/2): BD link(3) CRC error 
Dec  7 00:00:15  router fpc3 DXI: PIC 1, link CRC error (0x0f) 
Dec  7 00:00:15  router fpc3 DXI: PIC 2, link CRC error (0x0f) 
Dec  7 00:00:16  router fpc2 GE(2/0): Kchip 0 crc/fifo errors 
Dec  7 00:00:16  router fpc2 GE(2/1): Kchip 0 crc/fifo errors 
Dec  7 00:00:16  router fpc2 GE(2/2): Kchip 0 crc/fifo errors 
Dec  7 00:00:16  router fpc2 GE(2/3): Kchip 0 crc/fifo errors 
Dec  7 00:00:16  router fpc2 DXI: PIC 0, link CRC error (0x01) 

until the power cycle. Physical interfaces and links stayed up, but the router 
stopped responding externally and didn't return keepalives. I'm going to go 
with the obvious and guess that sfm1 went wonky and started corrupting data 
coming from the FPCs, but it's always nice to get a second opinion from 
Juniper folks whenever the "guess the cause of the failure of the ASIC 
named after a letter" game starts. :)
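
One rough way to sanity-check a guess like this is to tally the messages per 
reporting component and per subsystem, and note when each pair first shows 
up. The sketch below is only an illustration: the function name summarize() 
and the regular expression are hypothetical, and the line layout is assumed 
from the excerpts above (timestamp, hostname, component, then a subsystem 
tag ending in a colon). The counts and first-seen times only show which 
parts complained and in what order; they do not identify the root cause.

```python
import re
from collections import Counter

# Assumed layout, taken from the log excerpts in this post:
#   "Dec  6 23:54:05  router fpc1 DXO: Plane 1, link CRC error (0x0f)"
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+"   # e.g. "Dec  6 23:54:05"
    r"(?P<host>\S+)\s+(?P<component>\S+)\s+"         # e.g. "router", "fpc1"
    r"(?P<subsystem>[A-Za-z]+)[^:]*:\s*(?P<text>.*)$"  # e.g. "DXO", "GE(2/0)"
)

def summarize(lines):
    """Count messages per (component, subsystem) and record first-seen stamps.

    Hypothetical helper: lines that do not match the assumed layout are
    skipped silently.
    """
    counts = Counter()
    first_seen = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue
        key = (m["component"], m["subsystem"])
        counts[key] += 1
        # Keep only the earliest timestamp, assuming input is in log order.
        first_seen.setdefault(key, m["stamp"])
    return counts, first_seen

if __name__ == "__main__":
    import sys
    counts, first_seen = summarize(sys.stdin)
    for (component, subsystem), n in counts.most_common():
        print(f"{component:6} {subsystem:6} {n:6}  first: "
              f"{first_seen[(component, subsystem)]}")
```

Fed the messages file on stdin, this prints one row per component/subsystem 
pair, busiest first, which makes it easier to see whether the FPC errors 
started before or after the first sfm1 complaints.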

-- 
Richard A Steenbergen <ras at e-gerbil.net>       http://www.e-gerbil.net/ras
GPG Key ID: 0xF8B12CBC (7535 7F59 8204 ED1F CC1C 53AF 4C41 5ECA F8B1 2CBC)

