[j-nsp] m10i Nastiness Friday night
Nilesh Khambal
nkhambal at juniper.net
Mon Aug 17 11:56:57 EDT 2009
It looks like CFEB dumped core and restarted. Please open a JTAC case
and let them figure out what went wrong with CFEB. Please gather all
logs around the time of the problem. Usually the following logs are a
good start.
- show log messages[.(0-9).gz] (From RE)
- show syslog messages (from CFEB)
- show nvram (from CFEB).
- CFEB coredump file generated under "/var/tmp"
- Any other surrounding information such as temperature, memory, and CPU
information about RE and CFEB around the time of the problem.
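
If it helps with collecting "logs around the time of the problem", here is a
rough sketch (not a Juniper tool) of how one might trim a copy of the messages
file down to a window around the crash. It assumes classic syslog lines that
begin with "Mon DD HH:MM:SS" and carry no year, so the year is passed in by
hand; the function name lines_near and its parameters are made up for
illustration.

```python
from datetime import datetime, timedelta


def lines_near(lines, when, window_minutes=5, year=2009):
    """Return syslog lines stamped within +/- window_minutes of `when`.

    Assumption: each log line starts with a 15-character "Mon DD HH:MM:SS"
    timestamp that has no year, so `year` must be supplied by the caller.
    """
    window = timedelta(minutes=window_minutes)
    picked = []
    for line in lines:
        try:
            stamp = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            # Wrapped continuation lines or anything without a leading timestamp.
            continue
        if abs(stamp - when) <= window:
            picked.append(line)
    return picked


# Sample lines taken from the log quoted in this thread.
sample = [
    "Aug 14 23:38:51 JuniperM10i-HMNDLAMA cfeb mpc106 error ack count = 0",
    "Aug 14 23:44:37 JuniperM10i-HMNDLAMA chassisd[2997]: CHASSISD_BLOWERS_SPEED: "
    "Fans and impellers are now running at normal speed",
    "error on the Processor Bus",
]
crash_time = datetime(2009, 8, 14, 23, 38, 51)
for entry in lines_near(sample, crash_time, window_minutes=2):
    print(entry)
```

A wider window is probably the safer choice when attaching output to the case,
since JTAC may want to see what happened before the first CFEB message.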
Given the old version of code you are running on the box, this may be a
known issue fixed in a later release such as 8.5, which you are running on
the other box. Let JTAC analyze that.
Thanks,
Nilesh.
Clue Store wrote:
> Hi All,
>
> Last Friday we had some nastiness on one of our m10i's. As I am not a
> Juniper expert, I was wondering if someone could decipher the log messages
> and determine if it is possibly a CFEB issue, or just a fluke Junos issue, and
> whether I should do anything or let it be and see if it does it again. I
> have another m10i running 8.5, so I am thinking of just upgrading this box
> to the same as my other, but I'd like to hear what some of you on the list
> think.
>
> TIA,
> Clue
>
> Hostname: JuniperM10i-HMNDLAMA
> Model: m10i
> JUNOS Base OS boot [8.0R2.8]
> JUNOS Base OS Software Suite [8.0R2.8]
> JUNOS Kernel Software Suite [8.0R2.8]
> JUNOS Packet Forwarding Engine Support (M7i/M10i) [8.0R2.8]
> JUNOS Routing Software Suite [8.0R2.8]
> JUNOS Online Documentation [8.0R2.8]
>
>
> Aug 14 23:38:51 JuniperM10i-HMNDLAMA cfeb mpc106 machine check caused by
> error on the Processor Bus
> Aug 14 23:38:51 JuniperM10i-HMNDLAMA cfeb mpc106 PCI status register:
> 0x0020, error detect register 1: 0x00, 2: 0x08
> Aug 14 23:38:51 JuniperM10i-HMNDLAMA cfeb mpc106 error ack count = 0
> Aug 14 23:38:51 JuniperM10i-HMNDLAMA cfeb mpc106 error address: 0x0f3827f8
> Aug 14 23:38:51 JuniperM10i-HMNDLAMA cfeb mpc106 Processor bus error status
> register: 0x52
> Aug 14 23:38:51 JuniperM10i-HMNDLAMA cfeb transfer type 0b01010, transfer
> size 2
> Aug 14 23:38:51 JuniperM10i-HMNDLAMA cfeb mpc106 error detection reg2: ECC
> multibit
> Aug 14 23:38:52 JuniperM10i-HMNDLAMA cfeb ^B
> Aug 14 23:38:52 JuniperM10i-HMNDLAMA cfeb last message repeated 6 times
> Aug 14 23:38:52 JuniperM10i-HMNDLAMA cfeb Context: Interrupt Level (0)
> Aug 14 23:38:52 JuniperM10i-HMNDLAMA cfeb Registers:
> Aug 14 23:38:52 JuniperM10i-HMNDLAMA cfeb R00: 0x00000446 R01: 0x00799450
> R02: 0x00000000 R03: 0x4f3827fc
> Aug 14 23:38:52 JuniperM10i-HMNDLAMA cfeb R04: 0x00000552 R05: 0x00000000
> R06: 0x007994a0 R07: 0x00000004
> Aug 14 23:38:52 JuniperM10i-HMNDLAMA cfeb R08: 0x00000548 R09: 0x0017f48b
> R10: 0x00000002 R11: 0xb0c7d8ec
> Aug 14 23:38:52 JuniperM10i-HMNDLAMA cfeb R12: 0x28002044 R13: 0x02420020
> R14: 0xf1ae2100 R15: 0x82600020
> Aug 14 23:38:52 JuniperM10i-HMNDLAMA cfeb R16: 0x442104c2 R17: 0x2248000b
> R18: 0x00670000 R19: 0x00670000
> Aug 14 23:38:52 JuniperM10i-HMNDLAMA cfeb R20: 0x00670000 R21: 0x006ce5a0
> R22: 0x007902d0 R23: 0x00670000
> Aug 14 23:38:52 JuniperM10i-HMNDLAMA cfeb R24: 0x00000002 R25: 0x00000004
> R26: 0x0080bd40 R27: 0x0000ffff
> Aug 14 23:38:52 JuniperM10i-HMNDLAMA cfeb R28: 0x00000001 R29: 0x00000001
> R30: 0x4f38271c R31: 0x4f382714
> Aug 14 23:38:52 JuniperM10i-HMNDLAMA cfeb MSR: 0x00089030 CTR: 0x00000239
> Link:0x002e34c8 SP: 0x00799450
> Aug 14 23:38:52 JuniperM10i-HMNDLAMA cfeb CCR: 0x48002028 XER: 0x20000000
> PC: 0x00460320
> Aug 14 23:38:52 JuniperM10i-HMNDLAMA cfeb DSISR: 0x00000000 DAR: 0x00000000
> K_MSR: 0x00000030
> Aug 14 23:38:52 JuniperM10i-HMNDLAMA cfeb Stack Traceback:
> Aug 14 23:38:52 JuniperM10i-HMNDLAMA cfeb Frame 01: sp = 0x00799450, pc =
> 0x0000c001
> Aug 14 23:38:52 JuniperM10i-HMNDLAMA cfeb Frame 02: sp = 0x00799468, pc =
> 0x002e4d74
> Aug 14 23:38:52 JuniperM10i-HMNDLAMA cfeb Frame 03: sp = 0x00799498, pc =
> 0x002e35e0
> Aug 14 23:38:52 JuniperM10i-HMNDLAMA cfeb Frame 04: sp = 0x007994b8, pc =
> 0x002e3bb0
> Aug 14 23:38:52 JuniperM10i-HMNDLAMA cfeb Frame 05: sp = 0x007994c0, pc =
> 0x00058818
> Aug 14 23:38:52 JuniperM10i-HMNDLAMA cfeb Frame 06: sp = 0x007994d8, pc =
> 0x0003df34
> Aug 14 23:38:52 JuniperM10i-HMNDLAMA cfeb Frame 07: sp = 0x00799500, pc =
> 0x003b4488
> Aug 14 23:38:52 JuniperM10i-HMNDLAMA cfeb Frame 08: sp = 0x00799530, pc =
> 0x003b4660
> Aug 14 23:38:52 JuniperM10i-HMNDLAMA cfeb Frame 09: sp = 0x00799548, pc =
> 0x003b3ed0
> Aug 14 23:38:52 JuniperM10i-HMNDLAMA cfeb Frame 10: sp = 0x007995c8, pc =
> 0x003b3d30
> Aug 14 23:38:52 JuniperM10i-HMNDLAMA cfeb Frame 11: sp = 0x007995e8, pc =
> 0x000b9f6c
> Aug 14 23:38:52 JuniperM10i-HMNDLAMA cfeb Frame 12: sp = 0x00799610, pc =
> 0x000b8928
> Aug 14 23:38:52 JuniperM10i-HMNDLAMA cfeb Frame 13: sp = 0x00799628, pc =
> 0x00448518
> Aug 14 23:38:52 JuniperM10i-HMNDLAMA cfeb Frame 14: sp = 0x00799678, pc =
> 0x00442d00
> Aug 14 23:38:52 JuniperM10i-HMNDLAMA cfeb Frame 15: sp = 0x00799698, pc =
> 0x0003a500
> Aug 14 23:38:52 JuniperM10i-HMNDLAMA cfeb Frame 16: sp = 0x007996b0, pc =
> 0x0003b268
> Aug 14 23:38:56 JuniperM10i-HMNDLAMA /kernel: rdp keepalive expired,
> connection dropped - src 1:1021 dest 2:15360
> Aug 14 23:38:56 JuniperM10i-HMNDLAMA craftd[2999]: Major alarm set, CFEB
> not online, the box is not forwarding
> Aug 14 23:38:56 JuniperM10i-HMNDLAMA alarmd[2998]: Alarm set: CFEB
> color=RED, class=CHASSIS, reason=CFEB not online, the box is not forwarding
> Aug 14 23:38:56 JuniperM10i-HMNDLAMA craftd[2999]: forwarding display
> request to chassisd: type = 4, subtype = 43
> Aug 14 23:38:56 JuniperM10i-HMNDLAMA chassisd[2997]:
> CHASSISD_SHUTDOWN_NOTICE: Shutdown reason: CFEB connection lost
> Aug 14 23:38:56 JuniperM10i-HMNDLAMA chassisd[2997]:
> CHASSISD_IFDEV_DETACH_FPC: ifdev_detach(0)
> Aug 14 23:38:56 JuniperM10i-HMNDLAMA mib2d[3111]: SNMP_TRAP_LINK_DOWN:
> ifIndex 77, ifAdminStatus up(1), ifOperStatus down(2), ifName ge-0/0/0
>
> (Lots of BGP notifications due to interface down issues)
>
> Aug 14 23:38:56 JuniperM10i-HMNDLAMA snmpd[3132]: SNMPD_SEND_FAILURE:
> trap_io_send_trap_now: send to (207.29.223.55) failure: Network is down
> Aug 14 23:38:56 JuniperM10i-HMNDLAMA alarmd[2998]: shutting down chassisd
> connection: chassisd ipc pipe read error
> Aug 14 23:38:56 JuniperM10i-HMNDLAMA craftd[2999]:
> craftd_user_conn_shutdown: socket 5, errno = 0
> Aug 14 23:38:56 JuniperM10i-HMNDLAMA craftd[2999]: chassisd connection
> succeeded after 0 retries
> Aug 14 23:38:56 JuniperM10i-HMNDLAMA alarmd[2998]: chassisd connection
> succeeded after 0 retries
> Aug 14 23:38:56 JuniperM10i-HMNDLAMA mib2d[3111]: SNMP_TRAP_LINK_DOWN:
> ifIndex 80, ifAdminStatus down(2), ifOperStatus down(2), ifName ge-1/0/0.462
> Aug 14 23:38:56 JuniperM10i-HMNDLAMA alarmd[2998]: resending alarm state
> Aug 14 23:38:56 JuniperM10i-HMNDLAMA craftd[2999]: forwarding display
> request to chassisd: type = 4, subtype = 43
> Aug 14 23:38:56 JuniperM10i-HMNDLAMA alarmd[2998]: Alarm set: CFEB
> color=RED, class=CHASSIS, reason=CFEB not online, the box is not forwarding
> Aug 14 23:38:56 JuniperM10i-HMNDLAMA alarmd[2998]: Alarm set: RE color=RED,
> class=CHASSIS, reason=Host 0 fxp0: Ethernet Link Down
> Aug 14 23:38:56 JuniperM10i-HMNDLAMA craftd[2999]: forwarding display
> request to chassisd: type = 4, subtype = 43
> Aug 14 23:38:56 JuniperM10i-HMNDLAMA alarmd[2998]: Alarm set: RE color=RED,
> class=CHASSIS, reason=Host 1 fxp0: Ethernet Link Down
> Aug 14 23:38:56 JuniperM10i-HMNDLAMA craftd[2999]: forwarding display
> request to chassisd: type = 4, subtype = 43
> Aug 14 23:38:56 JuniperM10i-HMNDLAMA /kernel: rdp keepalive expired,
> connection dropped - src 1:1020 dest 2:15361
> Aug 14 23:38:56 JuniperM10i-HMNDLAMA /kernel: pfe_listener_disconnect: conn
> dropped: listener idx=0, tnpaddr=0x2, reason: socket error
> Aug 14 23:39:41 JuniperM10i-HMNDLAMA chassisd[2997]:
> CHASSISD_BLOWERS_SPEED_FULL: Fans and impellers being set to full speed
> [system warm]
> Aug 14 23:40:09 JuniperM10i-HMNDLAMA chassisd[2997]: CHASSISD_SNMP_TRAP10:
> SNMP trap generated: FRU power on (jnxFruContentsIndex 6, jnxFruL1Index 1,
> jnxFruL2Index 0, jnxFruL3Index 0, jnxFruName CFEB 0, jnxFruType 4,
> jnxFruSlot 1, jnxFruOfflineReason 2, jnxFruLastPowerOff 0, jnxFruLastPowerOn
> 0)
> Aug 14 23:40:09 JuniperM10i-HMNDLAMA chassisd[2997]:
> CHASSISD_IFDEV_DETACH_FPC: ifdev_detach(0)
> Aug 14 23:40:09 JuniperM10i-HMNDLAMA chassisd[2997]:
> CHASSISD_IFDEV_DETACH_FPC: ifdev_detach(1)
> Aug 14 23:40:09 JuniperM10i-HMNDLAMA chassisd[2997]:
> CHASSISD_IFDEV_DETACH_ALL_PSEUDO: ifdev_detach(pseudo devices: all)
> Aug 14 23:40:09 JuniperM10i-HMNDLAMA craftd[2999]: Major alarm cleared,
> Host 0 fxp0: Ethernet Link Down
> Aug 14 23:40:09 JuniperM10i-HMNDLAMA alarmd[2998]: Alarm cleared: RE
> color=RED, class=CHASSIS, reason=Host 0 fxp0: Ethernet Link Down
> Aug 14 23:40:09 JuniperM10i-HMNDLAMA craftd[2999]: forwarding display
> request to chassisd: type = 4, subtype = 44
> Aug 14 23:40:09 JuniperM10i-HMNDLAMA cfeb CM: ALARM SET: (Major) Slot 0:
> CFEB not online, the box is not forwarding
> Aug 14 23:40:09 JuniperM10i-HMNDLAMA cfeb CM: ALARM SET: (Major) Slot 0:
> Host 0 fxp0: Ethernet Link Down
> Aug 14 23:40:09 JuniperM10i-HMNDLAMA cfeb CM: ALARM SET: (Major) Slot 1:
> Host 1 fxp0: Ethernet Link Down
> Aug 14 23:40:10 JuniperM10i-HMNDLAMA chassisd[2997]: CHASSISD_FRU_EVENT:
> fpc_m40_recv_restart: restarted FPC 0
> Aug 14 23:40:10 JuniperM10i-HMNDLAMA chassisd[2997]: CHASSISD_FRU_EVENT:
> fpc_m40_recv_restart: restarted FPC 1
> Aug 14 23:40:12 JuniperM10i-HMNDLAMA craftd[2999]: Major alarm set, Host 0
> fxp0: Ethernet Link Down
> Aug 14 23:40:12 JuniperM10i-HMNDLAMA alarmd[2998]: Alarm set: RE color=RED,
> class=CHASSIS, reason=Host 0 fxp0: Ethernet Link Down
> Aug 14 23:40:12 JuniperM10i-HMNDLAMA craftd[2999]: forwarding display
> request to chassisd: type = 4, subtype = 43
> Aug 14 23:40:12 JuniperM10i-HMNDLAMA cfeb CM: ALARM CLEAR: Slot 0: Host 0
> fxp0: Ethernet Link Down
> Aug 14 23:40:17 JuniperM10i-HMNDLAMA craftd[2999]: Major alarm cleared,
> CFEB not online, the box is not forwarding
> Aug 14 23:40:17 JuniperM10i-HMNDLAMA alarmd[2998]: Alarm cleared: CFEB
> color=RED, class=CHASSIS, reason=CFEB not online, the box is not forwarding
> Aug 14 23:40:17 JuniperM10i-HMNDLAMA craftd[2999]: forwarding display
> request to chassisd: type = 4, subtype = 44
> Aug 14 23:40:17 JuniperM10i-HMNDLAMA cfeb CM: ALARM SET: (Major) Slot 0:
> Host 0 fxp0: Ethernet Link Down
> Aug 14 23:40:32 JuniperM10i-HMNDLAMA chassisd[2997]:
> CHASSISD_BLOWERS_SPEED: Fans and impellers are now running at normal speed
> Aug 14 23:40:33 JuniperM10i-HMNDLAMA chassisd[2997]: CHASSISD_FRU_EVENT:
> scb_recv_slot_attach: attached FPC 0
> Aug 14 23:40:55 JuniperM10i-HMNDLAMA chassisd[2997]: CHASSISD_FRU_EVENT:
> scb_recv_slot_attach: attached FPC 1
> Aug 14 23:40:57 JuniperM10i-HMNDLAMA chassisd[2997]: CHASSISD_SNMP_TRAP10:
> SNMP trap generated: FRU power on (jnxFruContentsIndex 8, jnxFruL1Index 1,
> jnxFruL2Index 1, jnxFruL3Index 0, jnxFruName PIC: 1x G/E, 1000 BASE-SX @
> 0/0/*, jnxFruType 11, jnxFruSlot 1, jnxFruOfflineReason 2,
> jnxFruLastPowerOff 0, jnxFruLastPowerOn 0)
> Aug 14 23:40:57 JuniperM10i-HMNDLAMA chassisd[2997]: CHASSISD_SNMP_TRAP10:
> SNMP trap generated: FRU power on (jnxFruContentsIndex 8, jnxFruL1Index 2,
> jnxFruL2Index 1, jnxFruL3Index 0, jnxFruName PIC: 1x G/E, 1000 BASE-SX @
> 1/0/*, jnxFruType 11, jnxFruSlot 2, jnxFruOfflineReason 2,
> jnxFruLastPowerOff 0, jnxFruLastPowerOn 0)
> Aug 14 23:40:57 JuniperM10i-HMNDLAMA chassisd[2997]:
> CHASSISD_IFDEV_CREATE_NOTICE: create_pics: created interface device for
> ge-0/0/0
> Aug 14 23:40:58 JuniperM10i-HMNDLAMA chassisd[2997]:
> CHASSISD_IFDEV_CREATE_NOTICE: create_pics: created interface device for
> ge-1/0/0
> Aug 14 23:40:58 JuniperM10i-HMNDLAMA chassisd[2997]: CHASSISD_SNMP_TRAP10:
> SNMP trap generated: FRU power on (jnxFruContentsIndex 7, jnxFruL1Index 1,
> jnxFruL2Index 0, jnxFruL3Index 0, jnxFruName FPC: @ 0/*/*, jnxFruType 3,
> jnxFruSlot 1, jnxFruOfflineReason 2, jnxFruLastPowerOff 0, jnxFruLastPowerOn
> 0)
>
> (BGP notifications that peers are responding)
>
>
> Aug 14 23:42:22 JuniperM10i-HMNDLAMA chassisd[2997]:
> CHASSISD_BLOWERS_SPEED_FULL: Fans and impellers being set to full speed
> [system warm]
> Aug 14 23:43:22 JuniperM10i-HMNDLAMA chassisd[2997]:
> CHASSISD_BLOWERS_SPEED: Fans and impellers are now running at normal speed
> Aug 14 23:44:02 JuniperM10i-HMNDLAMA chassisd[2997]:
> CHASSISD_BLOWERS_SPEED_FULL: Fans and impellers being set to full speed
> [system warm]
> Aug 14 23:44:37 JuniperM10i-HMNDLAMA chassisd[2997]:
> CHASSISD_BLOWERS_SPEED: Fans and impellers are now running at normal speed