[j-nsp] m10i Nastiness Friday night

Clue Store cluestore at gmail.com
Mon Aug 17 16:02:53 EDT 2009


Thanks all for the replies. I'll get with JTAC and get it sorted out. As Dan
mentioned, the ECC multibit error kinda scares me, as I do not wish to have
to drive 200+ miles and change out the memory. So let's hope for a Junos fix
:)

Thanks,
Clue
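
For anyone who wants to check a saved copy of the messages file for the
same signature, here is a minimal Python sketch. It assumes the log lines
have been rejoined onto single lines (the mail client wrapped them in the
quoted text below). The two patterns are taken only from the messages in
this thread, not from any Juniper documentation, and the function name
find_hw_errors is made up for illustration.

```python
import re

# Patterns copied from the messages quoted in this thread (an assumption:
# other hardware faults may log different text).
HW_PATTERNS = [
    re.compile(r"ECC multibit"),
    re.compile(r"machine check"),
]


def find_hw_errors(lines):
    """Return cfeb syslog lines that match one of the hardware-error patterns."""
    hits = []
    for line in lines:
        # Only look at messages tagged with the lowercase "cfeb" source.
        if " cfeb " not in line:
            continue
        if any(p.search(line) for p in HW_PATTERNS):
            hits.append(line.strip())
    return hits


sample = [
    "Aug 14 23:38:51  JuniperM10i-HMNDLAMA cfeb mpc106 machine check caused by error on the Processor Bus",
    "Aug 14 23:38:51  JuniperM10i-HMNDLAMA cfeb mpc106 error ack count = 0",
    "Aug 14 23:38:51  JuniperM10i-HMNDLAMA cfeb mpc106 error detection reg2: ECC multibit",
    "Aug 14 23:38:56  JuniperM10i-HMNDLAMA craftd[2999]:  Major alarm set, CFEB not online, the box is not forwarding",
]

for hit in find_hw_errors(sample):
    print(hit)
```

A match only says the message was logged; whether the cause is bad memory
or a software bug is still for JTAC to determine.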

On Mon, Aug 17, 2009 at 12:19 PM, Dan Rautio <drautio at juniper.net> wrote:

> This message stands out:
>
> > Aug 14 23:38:51  JuniperM10i-HMNDLAMA cfeb mpc106 error detection reg2: ECC multibit
>
> > -----Original Message-----
> > From: juniper-nsp-bounces at puck.nether.net
> > [mailto:juniper-nsp-bounces at puck.nether.net] On Behalf Of Nilesh Khambal
> > Sent: Monday, August 17, 2009 10:57 AM
> > To: Clue Store
> > Cc: juniper-nsp at puck.nether.net
> > Subject: Re: [j-nsp] m10i Nastiness Friday night
> >
> > It looks like CFEB dumped core and restarted. Please open a JTAC case
> > and let them figure out what went wrong with CFEB. Please gather all
> > logs from around the time of the problem. Usually the following logs
> > are a good start.
> >
> > - show log messages[.(0-9).gz] (From RE)
> > - show syslog messages (from CFEB)
> > - show nvram (from CFEB).
> > - CFEB coredump file generated under "/var/tmp"
> > - Any other surrounding information, such as temperature, memory, and
> > CPU information about the RE and CFEB around the time of the problem.
> >
> > Given the old version of code you are running on the box, this may be a
> > known issue fixed in a later release such as 8.5, which you are running
> > on the other box. Let JTAC analyze that.
> >
> > Thanks,
> > Nilesh.
> >
> > Clue Store wrote:
> > > Hi All,
> > >
> > > Last Friday we had some nastiness on one of our m10i's. As I am not a
> > > Juniper expert, I was wondering if someone could decipher the log
> > > messages and determine whether it is possibly a CFEB issue or just a
> > > fluke Junos issue, and whether I should do anything or let it be and
> > > see if it does it again. I have another m10i running 8.5, so I am
> > > thinking of just upgrading this box to the same as my other, but I'd
> > > like to hear what some of you on the list think.
> > >
> > > TIA,
> > > Clue
> > >
> > > Hostname: JuniperM10i-HMNDLAMA
> > > Model: m10i
> > > JUNOS Base OS boot [8.0R2.8]
> > > JUNOS Base OS Software Suite [8.0R2.8]
> > > JUNOS Kernel Software Suite [8.0R2.8]
> > > JUNOS Packet Forwarding Engine Support (M7i/M10i) [8.0R2.8]
> > > JUNOS Routing Software Suite [8.0R2.8]
> > > JUNOS Online Documentation [8.0R2.8]
> > >
> > >
> > > Aug 14 23:38:51  JuniperM10i-HMNDLAMA cfeb mpc106 machine check caused by error on the Processor Bus
> > > Aug 14 23:38:51  JuniperM10i-HMNDLAMA cfeb mpc106 PCI status register: 0x0020, error detect register 1: 0x00, 2: 0x08
> > > Aug 14 23:38:51  JuniperM10i-HMNDLAMA cfeb mpc106 error ack count = 0
> > > Aug 14 23:38:51  JuniperM10i-HMNDLAMA cfeb mpc106 error address: 0x0f3827f8
> > > Aug 14 23:38:51  JuniperM10i-HMNDLAMA cfeb mpc106 Processor bus error status register: 0x52
> > > Aug 14 23:38:51  JuniperM10i-HMNDLAMA cfeb transfer type 0b01010, transfer size 2
> > > Aug 14 23:38:51  JuniperM10i-HMNDLAMA cfeb mpc106 error detection reg2: ECC multibit
> > > Aug 14 23:38:52  JuniperM10i-HMNDLAMA cfeb ^B
> > > Aug 14 23:38:52  JuniperM10i-HMNDLAMA cfeb last message repeated 6 times
> > > Aug 14 23:38:52  JuniperM10i-HMNDLAMA cfeb Context: Interrupt Level (0)
> > > Aug 14 23:38:52  JuniperM10i-HMNDLAMA cfeb Registers:
> > > Aug 14 23:38:52  JuniperM10i-HMNDLAMA cfeb R00: 0x00000446 R01: 0x00799450 R02: 0x00000000 R03: 0x4f3827fc
> > > Aug 14 23:38:52  JuniperM10i-HMNDLAMA cfeb R04: 0x00000552 R05: 0x00000000 R06: 0x007994a0 R07: 0x00000004
> > > Aug 14 23:38:52  JuniperM10i-HMNDLAMA cfeb R08: 0x00000548 R09: 0x0017f48b R10: 0x00000002 R11: 0xb0c7d8ec
> > > Aug 14 23:38:52  JuniperM10i-HMNDLAMA cfeb R12: 0x28002044 R13: 0x02420020 R14: 0xf1ae2100 R15: 0x82600020
> > > Aug 14 23:38:52  JuniperM10i-HMNDLAMA cfeb R16: 0x442104c2 R17: 0x2248000b R18: 0x00670000 R19: 0x00670000
> > > Aug 14 23:38:52  JuniperM10i-HMNDLAMA cfeb R20: 0x00670000 R21: 0x006ce5a0 R22: 0x007902d0 R23: 0x00670000
> > > Aug 14 23:38:52  JuniperM10i-HMNDLAMA cfeb R24: 0x00000002 R25: 0x00000004 R26: 0x0080bd40 R27: 0x0000ffff
> > > Aug 14 23:38:52  JuniperM10i-HMNDLAMA cfeb R28: 0x00000001 R29: 0x00000001 R30: 0x4f38271c R31: 0x4f382714
> > > Aug 14 23:38:52  JuniperM10i-HMNDLAMA cfeb MSR: 0x00089030 CTR: 0x00000239 Link:0x002e34c8 SP:  0x00799450
> > > Aug 14 23:38:52  JuniperM10i-HMNDLAMA cfeb CCR: 0x48002028 XER: 0x20000000 PC:  0x00460320
> > > Aug 14 23:38:52  JuniperM10i-HMNDLAMA cfeb DSISR: 0x00000000 DAR: 0x00000000 K_MSR: 0x00000030
> > > Aug 14 23:38:52  JuniperM10i-HMNDLAMA cfeb Stack Traceback:
> > > Aug 14 23:38:52  JuniperM10i-HMNDLAMA cfeb Frame 01: sp = 0x00799450, pc = 0x0000c001
> > > Aug 14 23:38:52  JuniperM10i-HMNDLAMA cfeb Frame 02: sp = 0x00799468, pc = 0x002e4d74
> > > Aug 14 23:38:52  JuniperM10i-HMNDLAMA cfeb Frame 03: sp = 0x00799498, pc = 0x002e35e0
> > > Aug 14 23:38:52  JuniperM10i-HMNDLAMA cfeb Frame 04: sp = 0x007994b8, pc = 0x002e3bb0
> > > Aug 14 23:38:52  JuniperM10i-HMNDLAMA cfeb Frame 05: sp = 0x007994c0, pc = 0x00058818
> > > Aug 14 23:38:52  JuniperM10i-HMNDLAMA cfeb Frame 06: sp = 0x007994d8, pc = 0x0003df34
> > > Aug 14 23:38:52  JuniperM10i-HMNDLAMA cfeb Frame 07: sp = 0x00799500, pc = 0x003b4488
> > > Aug 14 23:38:52  JuniperM10i-HMNDLAMA cfeb Frame 08: sp = 0x00799530, pc = 0x003b4660
> > > Aug 14 23:38:52  JuniperM10i-HMNDLAMA cfeb Frame 09: sp = 0x00799548, pc = 0x003b3ed0
> > > Aug 14 23:38:52  JuniperM10i-HMNDLAMA cfeb Frame 10: sp = 0x007995c8, pc = 0x003b3d30
> > > Aug 14 23:38:52  JuniperM10i-HMNDLAMA cfeb Frame 11: sp = 0x007995e8, pc = 0x000b9f6c
> > > Aug 14 23:38:52  JuniperM10i-HMNDLAMA cfeb Frame 12: sp = 0x00799610, pc = 0x000b8928
> > > Aug 14 23:38:52  JuniperM10i-HMNDLAMA cfeb Frame 13: sp = 0x00799628, pc = 0x00448518
> > > Aug 14 23:38:52  JuniperM10i-HMNDLAMA cfeb Frame 14: sp = 0x00799678, pc = 0x00442d00
> > > Aug 14 23:38:52  JuniperM10i-HMNDLAMA cfeb Frame 15: sp = 0x00799698, pc = 0x0003a500
> > > Aug 14 23:38:52  JuniperM10i-HMNDLAMA cfeb Frame 16: sp = 0x007996b0, pc = 0x0003b268
> > > Aug 14 23:38:56  JuniperM10i-HMNDLAMA /kernel: rdp keepalive expired, connection dropped - src 1:1021 dest 2:15360
> > > Aug 14 23:38:56  JuniperM10i-HMNDLAMA craftd[2999]:  Major alarm set, CFEB not online, the box is not forwarding
> > > Aug 14 23:38:56  JuniperM10i-HMNDLAMA alarmd[2998]: Alarm set: CFEB color=RED, class=CHASSIS, reason=CFEB not online, the box is not forwarding
> > > Aug 14 23:38:56  JuniperM10i-HMNDLAMA craftd[2999]: forwarding display request to chassisd: type = 4, subtype = 43
> > > Aug 14 23:38:56  JuniperM10i-HMNDLAMA chassisd[2997]: CHASSISD_SHUTDOWN_NOTICE: Shutdown reason: CFEB connection lost
> > > Aug 14 23:38:56  JuniperM10i-HMNDLAMA chassisd[2997]: CHASSISD_IFDEV_DETACH_FPC: ifdev_detach(0)
> > > Aug 14 23:38:56  JuniperM10i-HMNDLAMA mib2d[3111]: SNMP_TRAP_LINK_DOWN: ifIndex 77, ifAdminStatus up(1), ifOperStatus down(2), ifName ge-0/0/0
> > >
> > > (Lots of BGP notifications due to interface down issues)
> > >
> > > Aug 14 23:38:56  JuniperM10i-HMNDLAMA snmpd[3132]: SNMPD_SEND_FAILURE: trap_io_send_trap_now: send to (207.29.223.55) failure: Network is down
> > > Aug 14 23:38:56  JuniperM10i-HMNDLAMA alarmd[2998]: shutting down chassisd connection: chassisd ipc pipe read error
> > > Aug 14 23:38:56  JuniperM10i-HMNDLAMA craftd[2999]: craftd_user_conn_shutdown: socket 5, errno = 0
> > > Aug 14 23:38:56  JuniperM10i-HMNDLAMA craftd[2999]: chassisd connection succeeded after 0 retries
> > > Aug 14 23:38:56  JuniperM10i-HMNDLAMA alarmd[2998]: chassisd connection succeeded after 0 retries
> > > Aug 14 23:38:56  JuniperM10i-HMNDLAMA mib2d[3111]: SNMP_TRAP_LINK_DOWN: ifIndex 80, ifAdminStatus down(2), ifOperStatus down(2), ifName ge-1/0/0.462
> > > Aug 14 23:38:56  JuniperM10i-HMNDLAMA alarmd[2998]: resending alarm state
> > > Aug 14 23:38:56  JuniperM10i-HMNDLAMA craftd[2999]: forwarding display request to chassisd: type = 4, subtype = 43
> > > Aug 14 23:38:56  JuniperM10i-HMNDLAMA alarmd[2998]: Alarm set: CFEB color=RED, class=CHASSIS, reason=CFEB not online, the box is not forwarding
> > > Aug 14 23:38:56  JuniperM10i-HMNDLAMA alarmd[2998]: Alarm set: RE color=RED, class=CHASSIS, reason=Host 0 fxp0: Ethernet Link Down
> > > Aug 14 23:38:56  JuniperM10i-HMNDLAMA craftd[2999]: forwarding display request to chassisd: type = 4, subtype = 43
> > > Aug 14 23:38:56  JuniperM10i-HMNDLAMA alarmd[2998]: Alarm set: RE color=RED, class=CHASSIS, reason=Host 1 fxp0: Ethernet Link Down
> > > Aug 14 23:38:56  JuniperM10i-HMNDLAMA craftd[2999]: forwarding display request to chassisd: type = 4, subtype = 43
> > > Aug 14 23:38:56  JuniperM10i-HMNDLAMA /kernel: rdp keepalive expired, connection dropped - src 1:1020 dest 2:15361
> > > Aug 14 23:38:56  JuniperM10i-HMNDLAMA /kernel: pfe_listener_disconnect: conn dropped: listener idx=0, tnpaddr=0x2, reason: socket error
> > > Aug 14 23:39:41  JuniperM10i-HMNDLAMA chassisd[2997]: CHASSISD_BLOWERS_SPEED_FULL: Fans and impellers being set to full speed [system warm]
> > > Aug 14 23:40:09  JuniperM10i-HMNDLAMA chassisd[2997]: CHASSISD_SNMP_TRAP10: SNMP trap generated: FRU power on (jnxFruContentsIndex 6, jnxFruL1Index 1, jnxFruL2Index 0, jnxFruL3Index 0, jnxFruName CFEB 0, jnxFruType 4, jnxFruSlot 1, jnxFruOfflineReason 2, jnxFruLastPowerOff 0, jnxFruLastPowerOn 0)
> > > Aug 14 23:40:09  JuniperM10i-HMNDLAMA chassisd[2997]: CHASSISD_IFDEV_DETACH_FPC: ifdev_detach(0)
> > > Aug 14 23:40:09  JuniperM10i-HMNDLAMA chassisd[2997]: CHASSISD_IFDEV_DETACH_FPC: ifdev_detach(1)
> > > Aug 14 23:40:09  JuniperM10i-HMNDLAMA chassisd[2997]: CHASSISD_IFDEV_DETACH_ALL_PSEUDO: ifdev_detach(pseudo devices: all)
> > > Aug 14 23:40:09  JuniperM10i-HMNDLAMA craftd[2999]: Major alarm cleared, Host 0 fxp0: Ethernet Link Down
> > > Aug 14 23:40:09  JuniperM10i-HMNDLAMA alarmd[2998]: Alarm cleared: RE color=RED, class=CHASSIS, reason=Host 0 fxp0: Ethernet Link Down
> > > Aug 14 23:40:09  JuniperM10i-HMNDLAMA craftd[2999]: forwarding display request to chassisd: type = 4, subtype = 44
> > > Aug 14 23:40:09  JuniperM10i-HMNDLAMA cfeb CM: ALARM SET: (Major) Slot 0: CFEB not online, the box is not forwarding
> > > Aug 14 23:40:09  JuniperM10i-HMNDLAMA cfeb CM: ALARM SET: (Major) Slot 0: Host 0 fxp0: Ethernet Link Down
> > > Aug 14 23:40:09  JuniperM10i-HMNDLAMA cfeb CM: ALARM SET: (Major) Slot 1: Host 1 fxp0: Ethernet Link Down
> > > Aug 14 23:40:10  JuniperM10i-HMNDLAMA chassisd[2997]: CHASSISD_FRU_EVENT: fpc_m40_recv_restart: restarted FPC 0
> > > Aug 14 23:40:10  JuniperM10i-HMNDLAMA chassisd[2997]: CHASSISD_FRU_EVENT: fpc_m40_recv_restart: restarted FPC 1
> > > Aug 14 23:40:12  JuniperM10i-HMNDLAMA craftd[2999]:  Major alarm set, Host 0 fxp0: Ethernet Link Down
> > > Aug 14 23:40:12  JuniperM10i-HMNDLAMA alarmd[2998]: Alarm set: RE color=RED, class=CHASSIS, reason=Host 0 fxp0: Ethernet Link Down
> > > Aug 14 23:40:12  JuniperM10i-HMNDLAMA craftd[2999]: forwarding display request to chassisd: type = 4, subtype = 43
> > > Aug 14 23:40:12  JuniperM10i-HMNDLAMA cfeb CM: ALARM CLEAR: Slot 0: Host 0 fxp0: Ethernet Link Down
> > > Aug 14 23:40:17  JuniperM10i-HMNDLAMA craftd[2999]: Major alarm cleared, CFEB not online, the box is not forwarding
> > > Aug 14 23:40:17  JuniperM10i-HMNDLAMA alarmd[2998]: Alarm cleared: CFEB color=RED, class=CHASSIS, reason=CFEB not online, the box is not forwarding
> > > Aug 14 23:40:17  JuniperM10i-HMNDLAMA craftd[2999]: forwarding display request to chassisd: type = 4, subtype = 44
> > > Aug 14 23:40:17  JuniperM10i-HMNDLAMA cfeb CM: ALARM SET: (Major) Slot 0: Host 0 fxp0: Ethernet Link Down
> > > Aug 14 23:40:32  JuniperM10i-HMNDLAMA chassisd[2997]: CHASSISD_BLOWERS_SPEED: Fans and impellers are now running at normal speed
> > > Aug 14 23:40:33  JuniperM10i-HMNDLAMA chassisd[2997]: CHASSISD_FRU_EVENT: scb_recv_slot_attach: attached FPC 0
> > > Aug 14 23:40:55  JuniperM10i-HMNDLAMA chassisd[2997]: CHASSISD_FRU_EVENT: scb_recv_slot_attach: attached FPC 1
> > > Aug 14 23:40:57  JuniperM10i-HMNDLAMA chassisd[2997]: CHASSISD_SNMP_TRAP10: SNMP trap generated: FRU power on (jnxFruContentsIndex 8, jnxFruL1Index 1, jnxFruL2Index 1, jnxFruL3Index 0, jnxFruName PIC: 1x G/E, 1000 BASE-SX @ 0/0/*, jnxFruType 11, jnxFruSlot 1, jnxFruOfflineReason 2, jnxFruLastPowerOff 0, jnxFruLastPowerOn 0)
> > > Aug 14 23:40:57  JuniperM10i-HMNDLAMA chassisd[2997]: CHASSISD_SNMP_TRAP10: SNMP trap generated: FRU power on (jnxFruContentsIndex 8, jnxFruL1Index 2, jnxFruL2Index 1, jnxFruL3Index 0, jnxFruName PIC: 1x G/E, 1000 BASE-SX @ 1/0/*, jnxFruType 11, jnxFruSlot 2, jnxFruOfflineReason 2, jnxFruLastPowerOff 0, jnxFruLastPowerOn 0)
> > > Aug 14 23:40:57  JuniperM10i-HMNDLAMA chassisd[2997]: CHASSISD_IFDEV_CREATE_NOTICE: create_pics: created interface device for ge-0/0/0
> > > Aug 14 23:40:58  JuniperM10i-HMNDLAMA chassisd[2997]: CHASSISD_IFDEV_CREATE_NOTICE: create_pics: created interface device for ge-1/0/0
> > > Aug 14 23:40:58  JuniperM10i-HMNDLAMA chassisd[2997]: CHASSISD_SNMP_TRAP10: SNMP trap generated: FRU power on (jnxFruContentsIndex 7, jnxFruL1Index 1, jnxFruL2Index 0, jnxFruL3Index 0, jnxFruName FPC:  @ 0/*/*, jnxFruType 3, jnxFruSlot 1, jnxFruOfflineReason 2, jnxFruLastPowerOff 0, jnxFruLastPowerOn 0)
> > >
> > > (BGP notifications that peers are responding)
> > >
> > >
> > > Aug 14 23:42:22  JuniperM10i-HMNDLAMA chassisd[2997]: CHASSISD_BLOWERS_SPEED_FULL: Fans and impellers being set to full speed [system warm]
> > > Aug 14 23:43:22  JuniperM10i-HMNDLAMA chassisd[2997]: CHASSISD_BLOWERS_SPEED: Fans and impellers are now running at normal speed
> > > Aug 14 23:44:02  JuniperM10i-HMNDLAMA chassisd[2997]: CHASSISD_BLOWERS_SPEED_FULL: Fans and impellers being set to full speed [system warm]
> > > Aug 14 23:44:37  JuniperM10i-HMNDLAMA chassisd[2997]: CHASSISD_BLOWERS_SPEED: Fans and impellers are now running at normal speed
> > > _______________________________________________
> > > juniper-nsp mailing list juniper-nsp at puck.nether.net
> > > https://puck.nether.net/mailman/listinfo/juniper-nsp
>
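
To put a number on how long the box was not forwarding, the alarmd
"Alarm set: CFEB" and "Alarm cleared: CFEB" messages in the quoted log can
be compared. The following is a rough Python sketch, assuming the wrapped
log lines have been rejoined onto single lines and that the year is 2009
(syslog timestamps omit it; the year is taken from the date of this
thread). The helper names parse_ts and cfeb_outage_seconds are made up
for illustration.

```python
from datetime import datetime


def parse_ts(line, year=2009):
    """Parse the leading 'Mon DD HH:MM:SS' syslog timestamp; year is assumed."""
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def cfeb_outage_seconds(lines):
    """Seconds between the first CFEB alarm set and the last CFEB alarm clear."""
    start = end = None
    for line in lines:
        if "alarmd" not in line:
            continue
        if "Alarm set: CFEB" in line and start is None:
            start = parse_ts(line)
        elif "Alarm cleared: CFEB" in line:
            end = parse_ts(line)
    if start is None or end is None:
        return None  # Incomplete log: no set/clear pair found.
    return (end - start).total_seconds()


sample = [
    "Aug 14 23:38:56  JuniperM10i-HMNDLAMA alarmd[2998]: Alarm set: CFEB color=RED, class=CHASSIS, reason=CFEB not online, the box is not forwarding",
    "Aug 14 23:40:09  JuniperM10i-HMNDLAMA alarmd[2998]: Alarm cleared: RE color=RED, class=CHASSIS, reason=Host 0 fxp0: Ethernet Link Down",
    "Aug 14 23:40:17  JuniperM10i-HMNDLAMA alarmd[2998]: Alarm cleared: CFEB color=RED, class=CHASSIS, reason=CFEB not online, the box is not forwarding",
]

print(cfeb_outage_seconds(sample))  # 81.0
```

This measures only the alarm window (23:38:56 to 23:40:17). The log shows
the FPCs and interfaces reattaching after the alarm cleared, so the time
until traffic and BGP sessions recovered was likely longer.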

