[j-nsp] SSB/RE Problem

Elian Scrosoppi escrosoppi at ifxcorp.com
Thu Jun 15 14:10:40 EDT 2006

Hi guys,

Yesterday we had some problems with the SSB/RE of our M20 router. I have extracted the following logs and information. Can anyone help me determine the problem?

Model: m20
2 RE
JUNOS Base OS boot [7.0R2.7]
JUNOS Base OS Software Suite [7.0R2.7]
JUNOS Kernel Software Suite [7.0R2.7]
JUNOS Packet Forwarding Engine Support (M20/M40) [7.0R2.7]
JUNOS Routing Software Suite [7.0R2.7]
JUNOS Online Documentation [7.0R2.7]
JUNOS Crypto Software Suite [7.0R2.7]

Contents of /var/log/mastership:

Mar 16 14:30:17 event = E_NO_IPC, state = backup, param = 0x0x0
Mar 16 14:30:17 No response from the other routing engine for the last 2 seconds.

Mar 16 14:30:17 Currentstate backup NextState backup reason_code 0
Mar 16 14:30:17 new state = backup
Mar 16 14:30:17 Keepalive timeout of 2 seconds expired.  Assuming RE mastership.

Mar 16 14:30:17 event = E_CMD_F, state = backup, param = 0x0x0
Mar 16 14:30:20 The local RE becomes the master, retry = 0.
Mar 16 14:30:20 Currentstate backup NextState master reason_code 2
Mar 16 14:30:20 timestamp: Thu Mar 16 14:30:20 2006
Mar 16 14:30:20 new state = master

(many lines like these)
Mar 16 14:30:26 failed to send RE info/keepalive: errno=0, total=2 in the last 20 sec
Mar 16 14:30:26 failed to send RE info/keepalive: errno=65, total=2 in the last 20 sec
Mar 16 14:30:40 failed to receive keepalives from other RE for the last 20 sec


Mar 16 14:35:37 received version 1, "claim mastership" request
Mar 16 14:35:37 event = E_REQ_C, state = master, param = 0x0x0
Mar 16 14:35:37 send "claim mastership" negative acknowledgement
Mar 16 14:35:37 Currentstate master NextState master reason_code 1
Mar 16 14:35:37 new state = master
Mar 16 14:36:02 event = E_ORE_B, state = master, param = 0x0x835cca8
Mar 16 14:36:02 Currentstate master NextState master reason_code 1
Mar 16 14:36:02 new state = master
Jun 14 15:43:07 event = E_ORE_M, state = master, param = 0x0x835cca8
Jun 14 15:43:07 Duplicate Master Routing Engine
Jun 14 15:43:07 mcontrol_disabled_exit
Jun 14 15:43:07 mcontrol_shutdown
Jun 14 15:43:07 mcontrol_notmaster
Jun 14 15:43:10 *** mcontrol init V01 ***
Jun 14 15:43:10 soft-restart: is not a master
Jun 14 15:43:10 Socket = 0x00000011
Jun 14 15:43:10 event = E_CFG_B, state = init, param = 0x0x0
Jun 14 15:43:10 Currentstate init NextState backup reason_code 0
Jun 14 15:43:10 new state = backup
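
For anyone skimming the mastership log above, a minimal Python sketch like the following could pull out only the lines where the state actually changes. It assumes the "Currentstate ... NextState ... reason_code N" layout seen in this excerpt; the names TRANSITION, state_changes and SAMPLE are made up for illustration, and the meaning of the reason codes is not interpreted.

```python
import re

# Assumed line layout, taken only from the excerpt above:
#   "Mar 16 14:30:20 Currentstate backup NextState master reason_code 2"
TRANSITION = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+"
    r"Currentstate\s+(?P<cur>\w+)\s+NextState\s+(?P<nxt>\w+)\s+"
    r"reason_code\s+(?P<reason>\d+)"
)

def state_changes(lines):
    """Return (timestamp, current, next, reason_code) tuples for
    lines whose current and next states differ."""
    changes = []
    for line in lines:
        m = TRANSITION.match(line.strip())
        if m and m.group("cur") != m.group("nxt"):
            changes.append(
                (m.group("ts"), m.group("cur"), m.group("nxt"), int(m.group("reason")))
            )
    return changes

# A few lines copied from the log above (hypothetical variable name).
SAMPLE = [
    "Mar 16 14:30:17 Currentstate backup NextState backup reason_code 0",
    "Mar 16 14:30:20 Currentstate backup NextState master reason_code 2",
    "Mar 16 14:35:37 Currentstate master NextState master reason_code 1",
    "Jun 14 15:43:10 Currentstate init NextState backup reason_code 0",
]

for change in state_changes(SAMPLE):
    print(change)
```

On the sample lines this keeps only the backup-to-master change on Mar 16 and the init-to-backup change on Jun 14, which are the two points where this RE changed role.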

Contents of /var/log/messages:

Jun 14 15:42:15  JM20 init: mib-process (PID 13778) terminated by signal number 15!
Jun 14 15:42:15  JM20 init: ntp (PID 13776) exited with status=0 Normal Exit
Jun 14 15:42:15  JM20 init: chassis-control (PID 2578) exited with status=6
Jun 14 15:42:15  JM20 init: chassis-control (PID 13816) started
Jun 14 15:42:15  JM20 init: failure target for routing set to target 1
Jun 14 15:42:15  JM20 init: routing (PID 13779) SIGTERM sent
Jun 14 15:42:15  JM20 init: failure target for routing set to target 1
Jun 14 15:42:15  JM20 init: routing (PID 13779) SIGTERM sent
Jun 14 15:42:15  JM20 chassisd[13816]: CHASSISD_IFDEV_DETACH_FPC: ifdev_detach(0)
Jun 14 15:42:15  JM20 chassisd[13816]: CHASSISD_IFDEV_DETACH_FPC: ifdev_detach(1)
Jun 14 15:42:15  JM20 rpd[13779]: RPD_SIGNAL_TERMINATE: second termination signal received
Jun 14 15:42:15  JM20 chassisd[13816]: CHASSISD_IFDEV_DETACH_FPC: ifdev_detach(2)
Jun 14 15:42:15  JM20 chassisd[13816]: CHASSISD_IFDEV_DETACH_FPC: ifdev_detach(3)
Jun 14 15:42:15  JM20 chassisd[13816]: CHASSISD_IFDEV_DETACH_ALL_PSEUDO: ifdev_detach(pseudo devices: all)
Jun 14 15:42:15  JM20 rpd[13779]: RPD_EXIT: Exit rpd[13779] version 7.0R2.7 built by builder on 2005-01-06 06:58:43 UTC, caller 80b14c3
Jun 14 15:42:16  JM20 alarmd[2579]: chassisd connection succeeded after 1 retries
Jun 14 15:42:16  JM20 craftd[2580]: chassisd connection succeeded after 1 retries
Jun 14 15:42:16  JM20 alarmd[2579]: resending alarm state
Jun 14 15:42:16  JM20 init: routing (PID 13779) exited with status=0 Normal Exit
Jun 14 15:42:17  JM20 syslogd: sendto: No route to host
Jun 14 15:42:17  JM20 craftd[2580]: attempt to delete alarm not in list
Jun 14 15:42:17  JM20 craftd[2580]: forwarding display request to chassisd: type = 4, subtype = 44
Jun 14 15:42:23  JM20 /kernel: mastership: routing engine 1 becoming master
Jun 14 15:42:23  JM20 /kernel: mastership: routing engine 1 becoming master
Jun 14 15:42:23  JM20 rshd[13894]: root@re0 as root: cmd='rcp -T -f /var/db/dcd.snmp_ix'
Jun 14 15:42:24  JM20 syslogd: sendto: No route to host
Jun 14 15:42:24  JM20 chassisd[13816]: CHASSISD_SNMP_TRAP10: SNMP trap generated: redundancy switchover (jnxRedundancyContentsIndex 6, jnxRedundancyL1Index 1, jnxRedundancyL2Index 0, jnxRedundancyL3Index 0, jnxRedundancyDescr SSB 0, jnxRedundancyConfig 2, jnxRedundancyState 2, jnxRedundancySwitchoverCount 1, jnxRedundancySwitchoverTime 777978144, jnxRedundancySwitchoverReason 2)
Jun 14 15:42:24  JM20 chassisd[13816]: CHASSISD_SNMP_TRAP10: SNMP trap generated: redundancy switchover (jnxRedundancyContentsIndex 6, jnxRedundancyL1Index 2, jnxRedundancyL2Index 0, jnxRedundancyL3Index 0, jnxRedundancyDescr SSB 1, jnxRedundancyConfig 3, jnxRedundancyState 3, jnxRedundancySwitchoverCount 1, jnxRedundancySwitchoverTime 777978144, jnxRedundancySwitchoverReason 2)
Jun 14 15:42:24  JM20 init: failure target for routing set to target 1
Jun 14 15:42:24  JM20 init: interface-control (PID 13814) terminate signal sent
Jun 14 15:42:24  JM20 init: ntp (PID 13897) started
Jun 14 15:42:24  JM20 init: snmp (PID 13898) started
Jun 14 15:42:24  JM20 init: mib-process (PID 13899) started
Jun 14 15:42:24  JM20 init: routing (PID 13900) started
Jun 14 15:42:24  JM20 init: sonet-aps (PID 13901) started
Jun 14 15:42:24  JM20 init: vrrp (PID 13902) started
Jun 14 15:42:24  JM20 init: sntpsync (PID 13810) SIGTERM sent
Jun 14 15:42:24  JM20 init: pfe (PID 13813) terminate signal sent
Jun 14 15:42:24  JM20 init: sampling (PID 13903) started
Jun 14 15:42:24  JM20 init: ilmi (PID 13904) started
Jun 14 15:42:24  JM20 init: remote-operations (PID 13905) started
Jun 14 15:42:24  JM20 init: class-of-service (PID 13906) started
Jun 14 15:42:24  JM20 init: network-access (PID 13907) started
Jun 14 15:42:24  JM20 init: ipsec-key-management (PID 13908) started
Jun 14 15:42:24  JM20 init: helper (PID 13909) started
Jun 14 15:42:24  JM20 init: remote-hello (PID 13910) started
Jun 14 15:42:24  JM20 init: link-management (PID 13911) started
Jun 14 15:42:24  JM20 init: kernel-replication (PID 13811) SIGTERM sent
Jun 14 15:42:24  JM20 init: firewall (PID 13815) terminate signal sent
Jun 14 15:42:24  JM20 init: internal-routing-service (PID 13912) started
Jun 14 15:42:24  JM20 init: routing-socket-proxy (PID 13913) started
Jun 14 15:42:24  JM20 init: pic-services-logging (PID 13914) started
Jun 14 15:42:24  JM20 init: adaptive-services (PID 13915) started
Jun 14 15:42:24  JM20 init: pgm (PID 13916) started
Jun 14 15:42:24  JM20 init: neighbor-liveness (PID 13917) started
Jun 14 15:42:24  JM20 init: service-deployment (PID 13918) started
Jun 14 15:42:24  JM20 init: failure target for routing set to target 1
Jun 14 15:42:24  JM20 init: interface-control (PID 13814) terminate signal sent
Jun 14 15:42:24  JM20 init: sntpsync (PID 13810) SIGTERM sent
Jun 14 15:42:24  JM20 init: pfe (PID 13813) terminate signal sent
Jun 14 15:42:24  JM20 init: kernel-replication (PID 13811) SIGTERM sent
Jun 14 15:42:24  JM20 init: firewall (PID 13815) terminate signal sent
Jun 14 15:42:24  JM20 init: firewall (PID 13815) exited with status=0 Normal Exit
Jun 14 15:42:24  JM20 init: firewall (PID 13920) started
Jun 14 15:42:24  JM20 init: pfe (PID 13813) terminated by signal number 15!
Jun 14 15:42:24  JM20 init: pfe (PID 13921) started
Jun 14 15:42:24  JM20 init: kernel-replication (PID 13811) exited with status=0 Normal Exit
Jun 14 15:42:24  JM20 init: sntpsync (PID 13810) terminated by signal number 15!
Jun 14 15:42:24  JM20 chassisd[13816]: snmp_ipc_try_connect: connect to master (unix sock) failed: Connection refused, retry in 1

The uptime of the SSB was 1 minute, and the RE uptime was not affected. About one hour later, the problem repeated.

If more information is needed, please tell me.

Thanks in advance,

Elian Scrosoppi
escrosoppi at ifxcorp.com
