[j-nsp] CFEB restart due to incorrect temperature reading ?
Alexandre Snarskii
snar at snar.spb.ru
Wed Jul 14 11:18:16 EDT 2010
Hi!
A few days ago the CFEB on one of our M10i routers restarted with
strange diagnostics: in the same second that the BFD sessions went down,
the following line appeared in the chassisd log:
Jul 2 14:03:08 CFEB 0 temperature is -60 degrees C, which is outside operating range
(I am not sure whether it appeared before or after the BFD failures,
since these lines come from different log files. The BFD messages are
timestamped 14:03:08.695, and nothing was logged to the remote syslog
server after 14:01, so it is quite possible that the CFEB failed first.)
After that there are many messages about the CFEB not running, FPC
restarts and so on, ending with a CFEB restart, after which everything
stabilised:
Jul 2 14:03:12.663 2010 alarmd[1278]: Alarm set: CFEB color=RED, class=CHASSIS, reason=CFEB not online, the box is not forwarding
Jul 2 14:03:12.672 2010 /kernel: rdp keepalive expired, connection dropped - src 0x00000001:1021 dest 0x00000002:2048
Jul 2 14:03:12.663 2010 chassisd[1277]: CHASSISD_SHUTDOWN_NOTICE: Shutdown reason: CFEB connection lost
Jul 2 14:03:12.669 2010 craftd[1279]: Major alarm set, CFEB not online, the box is not forwarding
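For anyone trying to work out the order of events across these files,
here is a rough sketch in Python (not Juniper tooling). It parses the two
timestamp shapes seen above and sorts the lines, flagging entries that
only have one-second resolution, since those could fall anywhere within
that second. The function names and the default year are assumptions
made for illustration.

```python
import re
from datetime import datetime

# Two shapes appear in the logs above:
#   "Jul 2 14:03:12.663 2010 alarmd[1278]: ..."  (milliseconds and year)
#   "Jul 2 14:03:08 CFEB 0 temperature ..."      (seconds only, no year)
# Caveat: a message that itself starts with a 4-digit word would be
# mistaken for a year by this pattern.
LINE_RE = re.compile(
    r"^(?P<mon>[A-Z][a-z]{2})\s+(?P<day>\d{1,2})\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2})(?:\.(?P<frac>\d{1,6}))?"
    r"(?:\s+(?P<year>\d{4})(?=\s))?\s+(?P<msg>.*)$"
)


def parse_line(line, default_year=2010):
    """Return (timestamp, has_subsecond, message), or None if unparsable.

    default_year is an assumption used when the line carries no year.
    Month names are parsed with %b, which expects an English locale.
    """
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    year = int(m.group("year") or default_year)
    ts = datetime.strptime(
        f"{year} {m.group('mon')} {m.group('day')} {m.group('time')}",
        "%Y %b %d %H:%M:%S",
    )
    frac = m.group("frac")
    if frac:
        # Pad "663" to "663000" so it is interpreted as microseconds.
        ts = ts.replace(microsecond=int(frac.ljust(6, "0")))
    return ts, frac is not None, m.group("msg")


def merge_logs(lines):
    """Sort parsable lines by timestamp; ties keep their input order."""
    parsed = [p for p in (parse_line(l) for l in lines) if p is not None]
    return sorted(parsed, key=lambda p: p[0])


if __name__ == "__main__":
    sample = [
        "Jul 2 14:03:12.663 2010 alarmd[1278]: Alarm set: CFEB color=RED",
        "Jul 2 14:03:08 CFEB 0 temperature is -60 degrees C",
    ]
    for ts, precise, msg in merge_logs(sample):
        note = "" if precise else "  (1 s resolution)"
        print(ts.isoformat(), msg + note)
```

A line without sub-second precision is sorted as if it happened at the
start of its second, so its position relative to entries within the same
second (such as the BFD messages at 14:03:08.695) remains undetermined.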
There are no core dumps and no log entries giving the reason for the
restart, just a power-on notice in the internal CFEB log:
[0+00:00:00.255 LOG: Info] CSBR: Reset reason (0x4): Power On
[0+00:00:00.255 LOG: Info] On-board NVRAM contains diagnostic information.
and the last log entry in NVRAM is just
CSBR: Reset reason (0x4): Power On
The routing engine was not reset during this failure, and the
temperature readings show a normal datacenter temperature of 22C/71F...
Any ideas about what happened?