[j-nsp] problem with a mx480, appears a FPC bounced, can anyone take a look at these logs?

Matt Yaklin myaklin at g4.net
Thu May 17 09:53:38 EDT 2012


Hi list,

Last night it appears an FPC went offline and then came back.

This is an MX480 running 11.2R1.10. Both routing engines look OK; the
master is still the master, and neither rebooted.

The box is doing very light work (BGP/OSPF) and was only installed several
months ago. We are only using a handful of interfaces at the moment and
all of them reside on fpc0. We decided to use the code it shipped with as
we did not plan to do anything fancy right away. This MX480 replaced an
M10.

We only have 2 of the 4 power supplies hooked up at the moment, but power
does not seem to be an issue. Each zone has a power supply connected, with
plenty of capacity to spare.

I wanted to email the list and get anyone's thoughts on what might have
happened or other places I can look for a clue before I open a ticket with
Juniper.

Log snippets below from messages and chassisd.

Thank you for any help,

matt

> show log messages
May 16 23:42:53  re0 fpc0 BULKGET: Master socket closed
May 16 23:42:53  re0 fpc0 BULKGET disconnected: BULKGET socket closed abruptly
May 16 23:42:53  re0 fpc0 PPMAN disconnected; Remote side closed
May 16 23:42:53  re0 fpc0 Bulkget manager reconnection succeeded after 1 tries
May 16 23:42:53  re0 fpc0 BULKGET master RE reconnection made
May 16 23:42:54  re0 fpc0 CMLC: Master closed connection
May 16 23:42:54  re0 fpc0 CMLC: Going disconnected; Routing engine chassis socket closed abruptly
May 16 23:42:54  re0 chassisd[1398]: CHASSISD_IPC_CONNECTION_DROPPED: Dropped IPC connection for FPC 0
May 16 23:42:54  re0 chassisd[1398]: CHASSISD_IFDEV_DETACH_FPC: ifdev_detach_fpc(0)

> show log chassisd
May 16 23:42:54  rcv: ch_ipc_dispatch() null ipc read for args 0x8d4c400 pipe 0x8d49f00, fru FPC 0
May 16 23:42:54  fpc_disconnect_generic: fpc 0 state Online cargs 0x8d4c400 clean_shutdown 0, offline_reason=None
May 16 23:42:54  -- FPC 0, last request 132, state Online
May 16 23:42:54  -- Temperature 33 degrees C / 91 degrees F
May 16 23:42:54  -- Scratch (0x00)           0x55
May 16 23:42:54  -- Version (0x01)           0x46
May 16 23:42:54  -- Master Status (0x02)     0x07
May 16 23:42:54  -- Mastership Timeout (0x03) 0x3f
May 16 23:42:54  -- Master Force (0x04)      0x00
May 16 23:42:54  -- Interface 0 (0x05)       0x10
May 16 23:42:54  -- Interface 1 (0x06)       0x10
May 16 23:42:54  -- Soft Reset (0x07)        0x00
May 16 23:42:54  -- Error/Interrupt Status (0x10) 0x0b
May 16 23:42:54  -- Interrupt Enable (0x11)  0x00
May 16 23:42:54  -- FRU LED Control (0x12)   0x05
May 16 23:42:54  -- Misc IO Status (0x13)    0x00
May 16 23:42:54  -- Button Status (0x14)     0x03
May 16 23:42:54  -- Button Interrupt Enable (0x15) 0x00
May 16 23:42:54  -- GPIO Output Enable (0x16) 0x00
May 16 23:42:54  -- GPIO Output Value (0x17) 0x00
May 16 23:42:54  -- GPIO Input Value (0x18)  0x7f
May 16 23:42:54  -- Power Control (0x20)     0x19
May 16 23:42:54  -- Power Up Status (0x21)   0x68
May 16 23:42:54  -- Power Disable Status (0x22) 0x00
May 16 23:42:54  -- Power Disable Cause (0x23) 0x00
May 16 23:42:54  -- Power Volt Fail Status (0x24) 0x00
May 16 23:42:54  -- Power Volt Fail Cause (0x25) 0x00
May 16 23:42:54  -- Power Volt Fail Status (0x2A) 0x00
May 16 23:42:54  -- Power Volt Fail Cause (0x2B) 0x00
May 16 23:42:54  -- Misc DPC Status (0x2D)   0x00
May 16 23:42:54  -- Power Volt Fail Status (0x2E) 0x00
May 16 23:42:54  -- Power Volt Fail Cause (0x2F) 0x00
May 16 23:42:54  -- Sonet Status (0x34)      0x07
May 16 23:42:54  -- Sonet Status Enable (0x35) 0x00
May 16 23:42:54  -- Power Volt Fail Check Disable (0x3f) 0x00
May 16 23:42:54 CHASSISD_IPC_CONNECTION_DROPPED: Dropped IPC connection for FPC 0
May 16 23:42:54 CHASSISD_IFDEV_DETACH_FPC: ifdev_detach_fpc(0)
May 16 23:42:54  ifd lc-0/0/0 marked as gone
...
... logs cut for brevity
...
May 16 23:42:54  ifd xe-0/3/1 marked as gone
May 16 23:42:54  FR: FPC 0 disconnected while no event in progress
May 16 23:42:54  FR: New event set to current event
May 16 23:42:54  FR: Dispatching current event
May 16 23:42:54  FR: Dispatching AFPC offline event for slot 0
May 16 23:42:54  FR: Moving to stage AFPC offline FM of event AFPC offline
May 16 23:42:54  FR: Waiting for ack(s) in AFPC offline FM stage of AFPC offline event
May 16 23:42:54  FM: No PFE boards to send plane control messages to
May 16 23:42:54  FM: Moving to stage Plane control of event PFE board offline
May 16 23:42:54  FM: Dispatching PFE board 0 offline, PFE mask 0x3
May 16 23:42:54  FM: Did Not send PB slot 0 OFFLINEplane control message
May 16 23:42:54  FM: Moving to stage Link stop of event PFE board offline
May 16 23:42:54  FM: Sending link train message to planes, type 326(stop)
May 16 23:42:54  FM: Waiting for ack(s) in Link stop stage of PFE board offline event
May 16 23:42:54  FM: Received link stop message for planes
May 16 23:42:54  FM: link train ...
May 16 23:42:54  FM: msg->pb=0
May 16 23:42:54  FM:  1 1 0 0
...
... logs cut for brevity
...
May 16 23:42:54  FM: Completing PFE board offline fabric event for slot 0
May 16 23:42:54  FM: End of PFE board offline fabric event for slot 0
May 16 23:42:54  FM: Send spare plane msg to all FPCs: mask: 0xf0
May 16 23:42:54  fpc_atlas_announce_offline
May 16 23:42:54  ch_neo_count[0] slot 0
May 16 23:42:54  rtsock_inform_fru_offline: inform peer type 17, index 0 offline, error 0
May 16 23:42:54  fpc_offline_now - slot 0, reason: None, error Chassis connection dropped
May 16 23:42:54  I2CS write cmd to FPC#0 [0x12], reg 0x12, cmd 0x0
May 16 23:42:54  I2CS write cmd to FPM#0 [0x8], reg 0x49, cmd 0x0
May 16 23:42:54  I2CS write cmd to FPM#0 [0x8], reg 0x4d, cmd 0x0
May 16 23:42:54  I2CS write cmd to FPM#0 [0x8], reg 0x49, cmd 0x0
May 16 23:42:54  I2CS write cmd to FPM#0 [0x8], reg 0x4d, cmd 0x0
May 16 23:42:54  I2CS write cmd to FPM#0 [0x8], reg 0x4b, cmd 0x0
May 16 23:42:54  I2CS write cmd to FPM#0 [0x8], reg 0x4f, cmd 0x0
May 16 23:42:54  hwdb: entry for fpc 2460 at slot 0 deleted
May 16 23:42:54  I2CS write cmd to CB#0 [0x10], reg 0x12, cmd 0x5
May 16 23:42:54  atlas_i2cs_fabric_active_led_on: CB#0 - active(blue) LED turned on
May 16 23:42:54  I2CS write cmd to CB#1 [0x11], reg 0x12, cmd 0x4
May 16 23:42:54  atlas_i2cs_fabric_active_led_off: CB#1 - active(blue) LED turned off
May 16 23:42:54  FR: End of AFPC offline event for slot 0.
May 16 23:42:54  FR: No more events to process
May 16 23:42:54  ipc pipe 0x8d49f00 created
May 16 23:42:54  ch_signal_proc: Sent signal 1 to tnp.sntpd, pid=1405
May 16 23:42:54  fpc 0 ready, pipe 0x0x8d49f00
May 16 23:42:54  FPC 0 ready dropped (msg-reconnect yes, reconnect-in-progress no)
May 16 23:42:54  I2CS write cmd to FPC#0 [0x12], reg 0x20, cmd 0x0
May 16 23:42:54  FPC#0 - power off [addr 0x12] reason: Error
May 16 23:42:54  power disable verified, FPC#0
...
... logs cut for brevity
...
May 16 23:49:04  All(4) PEMs are enhanced DC PEMs
May 16 23:49:04  FPC 0 is requesting power (consumption) 348W, total remaining pwr 1960
...
... etc...
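
For anyone who wants to sift the register dump in the chassisd log above, here is a minimal Python sketch that pulls the "-- Name (0xNN) 0xVV" lines into a table and lists the non-zero registers. The regular expression and the parse_registers helper are my own assumptions based only on the line layout shown here, not a Juniper tool, and the sketch makes no attempt to interpret what the individual bits mean.

```python
import re

# Assumed layout, taken from the chassisd lines in this post:
#   May 16 23:42:54  -- Power Control (0x20)     0x19
REG_LINE = re.compile(
    r"--\s+(?P<name>.+?)\s+\((?P<addr>0x[0-9a-fA-F]+)\)\s+(?P<value>0x[0-9a-fA-F]+)\s*$"
)

def parse_registers(lines):
    """Return {register_address: (name, value)} for each dump line found.

    Keyed by address because some names (e.g. "Power Volt Fail Status")
    appear at more than one address in the dump.
    """
    regs = {}
    for line in lines:
        m = REG_LINE.search(line)
        if m:
            addr = int(m.group("addr"), 16)
            regs[addr] = (m.group("name"), int(m.group("value"), 16))
    return regs

# A few lines copied from the log; the temperature line has no
# register address and is skipped by the pattern.
sample = [
    "May 16 23:42:54  -- Temperature 33 degrees C / 91 degrees F",
    "May 16 23:42:54  -- Error/Interrupt Status (0x10) 0x0b",
    "May 16 23:42:54  -- Power Control (0x20)     0x19",
    "May 16 23:42:54  -- Power Volt Fail Status (0x24) 0x00",
]

regs = parse_registers(sample)
# Keep only registers whose value is non-zero.
nonzero = {hex(a): (n, hex(v)) for a, (n, v) in regs.items() if v}
for addr, (name, value) in nonzero.items():
    print(addr, name, value)
```

Feeding it the full "show log chassisd" output instead of the sample list should give the same kind of table for every register in the dump, which makes it easier to compare against a later dump if the FPC drops again.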

