[j-nsp] MX960 transient errors on high capacity AC power supplies

Karl Gerhard karl_gerh at gmx.at
Wed Sep 5 09:09:11 EDT 2018


Hello,

we're using 17.3R2 with 2x MPC 3D 16x 10GE.
PR1312336 looks very interesting, thank you.

Regards
Karl



------------------------------------------------------------------------
*From:* zh73 [mailto:zh73 at 163.com]
*Sent:* Wed, Sep 5, 2018 2:22 AM CEST
*To:* Karl Gerhard
*Cc:* juniper-nsp at puck.nether.net
*Subject:* [j-nsp] MX960 transient errors on high capacity AC power supplies

> What type of MPCs are you using? Which Junos release?
> Better to upgrade to 17.3R3, which includes the fixes for PR1312336, PR1325271 and PR1349179.
> Or open a case with JTAC.
>
> At 2018-09-04 15:38:33, "Karl Gerhard" <karl_gerh at gmx.at> wrote:
> >Hello,
> >
> >we have bought two Juniper MX960s, and the power supplies keep raising alarms that clear again a few seconds later. The hardware is:
> >2x RE-S-X6-64G
> >3x SCBE-2-MX
> >MX960 Premium 3 chassis
> >4x High Capacity AC PEMs
> >
> >$ show log messages | match alarmd
> >Aug  30 08:09:02  router1 alarmd[12567]: %DAEMON-4: Alarm set: Pwr supply color=RED, class=CHASSIS, reason=PEM 1 Not OK
> >Aug  30 08:09:12  router1 alarmd[12567]: %DAEMON-4: Alarm cleared: Pwr supply color=RED, class=CHASSIS, reason=PEM 1 Not OK
> >Aug  30 08:12:30  router1 alarmd[12567]: %DAEMON-4: Alarm set: Pwr supply color=RED, class=CHASSIS, reason=PEM 1 Not OK
> >Aug  30 08:12:35  router1 alarmd[12567]: %DAEMON-4: Alarm cleared: Pwr supply color=RED, class=CHASSIS, reason=PEM 1 Not OK
> >Aug  30 08:14:53  router1 alarmd[12567]: %DAEMON-4: Alarm set: Pwr supply color=RED, class=CHASSIS, reason=PEM 1 Not OK
> >Aug  30 08:14:58  router1 alarmd[12567]: %DAEMON-4: Alarm cleared: Pwr supply color=RED, class=CHASSIS, reason=PEM 1 Not OK
> >
> >$ show log messages
> >Aug  31 06:12:33  router1 kernel: %KERN-3: PCF8584(WR): (i2c_s1=0x08, group=0x3, device=0x51)
> >Aug  31 06:13:29  router1 kernel: %KERN-3: PCF8584(RD): target ack timeout
> >Aug  31 06:13:29  router1 kernel: %KERN-3: PCF8584(RD): (i2c_s1=0x08, group=0x3, device=0x51)
> >Aug  31 06:13:29  router1 kernel: %KERN-3: PCF8584(WR): target ack failure on byte 0
> >Aug  31 06:13:29  router1 kernel: %KERN-3: PCF8584(WR): (i2c_s1=0x08, group=0x3, device=0x51)
> >Aug  31 06:13:50  router1 alarmd[12567]: %DAEMON-4: Alarm set: Pwr supply color=RED, class=CHASSIS, reason=PEM 1 Not OK
> >Aug  31 06:13:50  router1 craftd[12162]: %DAEMON-4:  Major alarm set, PEM 1 Not OK
> >Aug  31 06:13:50  router1 kernel: %KERN-3: PCF8584(WR): target ack failure on byte 0
> >Aug  31 06:13:50  router1 kernel: %KERN-3: PCF8584(WR): (i2c_s1=0x08, group=0x3, device=0x51)
> >Aug  31 06:13:50  router1 kernel: %KERN-3: PCF8584(WR): target ack failure on byte 1
> >Aug  31 06:13:50  router1 kernel: %KERN-3: PCF8584(WR): (i2c_s1=0x08, group=0x3, device=0x51)
> >Aug  31 06:13:50  router1 chassisd[12159]: %DAEMON-4-CHASSISD_PEM_INPUT_BAD: status failure for power supply 1 (status bits: 0x0); check circuit breaker
> >Aug  31 06:13:55  router1 alarmd[12567]: %DAEMON-4: Alarm cleared: Pwr supply color=RED, class=CHASSIS, reason=PEM 1 Not OK
> >Aug  31 06:13:55  router1 craftd[12162]: %DAEMON-4: Major alarm cleared, PEM 1 Not OK
> >
> >Oddly enough, the errors only show up every few weeks: the power supplies run for weeks without a hitch, throw alerts for a day or a few days, and then work flawlessly again for a few weeks.
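
To put numbers on the pattern described above (how long each alarm stays raised, and on which days the flapping occurs), the alarmd lines can be paired up with a small script. This is only a minimal sketch, not anything from Juniper: the function name `alarm_durations`, the `year` parameter (the syslog timestamps carry no year, so 2018 is assumed) and the regular expression are all illustrative choices based on the log format shown in this thread.

```python
import re
from datetime import datetime

# Matches alarmd lines in the format quoted in this thread (assumed, not official).
ALARM_RE = re.compile(
    r"^(?P<mon>[A-Z][a-z]{2})\s+(?P<day>\d{1,2})\s+(?P<time>\d{2}:\d{2}:\d{2})\s+"
    r"\S+\s+alarmd\[\d+\]:.*?Alarm (?P<state>set|cleared):.*?reason=(?P<reason>.+?)\s*$"
)

def alarm_durations(lines, year=2018):
    """Pair each 'Alarm set' with the next 'Alarm cleared' for the same reason.

    Returns a list of (reason, start_time, seconds_raised) tuples.
    The year is an assumption because the log lines do not include one.
    """
    open_alarms = {}
    flaps = []
    for raw in lines:
        # Drop any mail quoting prefix such as "> >" before matching.
        m = ALARM_RE.match(raw.lstrip("> "))
        if not m:
            continue
        ts = datetime.strptime(
            f"{year} {m['mon']} {m['day']} {m['time']}", "%Y %b %d %H:%M:%S"
        )
        reason = m["reason"]
        if m["state"] == "set":
            open_alarms[reason] = ts
        elif reason in open_alarms:
            start = open_alarms.pop(reason)
            flaps.append((reason, start, (ts - start).total_seconds()))
    return flaps

# Lines copied from the alarmd output quoted in this thread.
SAMPLE = [
    "Aug  30 08:09:02  router1 alarmd[12567]: %DAEMON-4: Alarm set: Pwr supply color=RED, class=CHASSIS, reason=PEM 1 Not OK",
    "Aug  30 08:09:12  router1 alarmd[12567]: %DAEMON-4: Alarm cleared: Pwr supply color=RED, class=CHASSIS, reason=PEM 1 Not OK",
    "Aug  30 08:12:30  router1 alarmd[12567]: %DAEMON-4: Alarm set: Pwr supply color=RED, class=CHASSIS, reason=PEM 1 Not OK",
    "Aug  30 08:12:35  router1 alarmd[12567]: %DAEMON-4: Alarm cleared: Pwr supply color=RED, class=CHASSIS, reason=PEM 1 Not OK",
    "Aug  30 08:14:53  router1 alarmd[12567]: %DAEMON-4: Alarm set: Pwr supply color=RED, class=CHASSIS, reason=PEM 1 Not OK",
    "Aug  30 08:14:58  router1 alarmd[12567]: %DAEMON-4: Alarm cleared: Pwr supply color=RED, class=CHASSIS, reason=PEM 1 Not OK",
]

flaps = alarm_durations(SAMPLE)
for reason, start, seconds in flaps:
    print(f"{start:%b %d %H:%M:%S}  {reason}  raised for {seconds:.0f}s")
```

For the three set/clear pairs from Aug 30, the alarm is raised for 10, 5 and 5 seconds. Run against a complete messages file, the same list would show whether the flaps cluster on particular days, which may be useful detail for a JTAC case.
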
> >
> >We've checked and swapped everything. It's not the cables, not the connectors, not the power source.
> >Then we started sending power supplies back to our supplier, but the errors keep showing up even with brand-new replacement power supplies.
> >We've found PR1299284, which seems to be related to non-high-capacity power supplies.
> >
> >Could those errors be related to a software problem which affects RE-S-X6-64G/SCBE-2-MX in combination with High Capacity AC PEMs?
> >Has anyone else experienced errors like that?
> >
> >Regards
> >Karl
> >
> >_______________________________________________
> >juniper-nsp mailing list juniper-nsp at puck.nether.net
> >https://puck.nether.net/mailman/listinfo/juniper-nsp