[j-nsp] Alarm for non-existant PEM
James S. Smith
JSmith at WindMobile.ca
Mon Jun 20 22:11:50 EDT 2011
Interesting, I'd have to agree. I do find it a bit odd that it lasted well over an hour before it cared to start shutting things down. I would have expected it to shut down immediately after it lost power to the first PEM. Live and learn... another strike against the particular reseller that sold us this SRX. Since I've been dealing with Juniper directly and doing my own research I haven't had any gotchas.
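For anyone who wants to check the timing in the chassisd log quoted below, here is a minimal Python sketch that pairs each "red alarm set" line with the matching "red alarm clear" line and reports how long the alarm stayed up. The names ALARM_RE, alarm_durations and SAMPLE are made up for illustration, the year is an assumption (the log lines do not carry one), and the sample lines are copied from the quoted log.

```python
import re
from datetime import datetime

# Matches alarm lines as they appear in the quoted chassisd log, e.g.
# "Jun 19 00:02:59 send: red alarm set, device PEM 0, reason PEM 0 Not OK"
ALARM_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) send: red alarm "
    r"(?P<action>set|clear), device (?P<device>PEM \d+), reason (?P<reason>.+)$"
)


def alarm_durations(lines, year=2011):
    """Pair each 'set' with the next 'clear' for the same device and reason.

    Returns (durations, open_alarms): durations is a list of
    (device, start_time, seconds); open_alarms holds alarms never cleared.
    The year is assumed, since syslog-style timestamps omit it.
    """
    open_alarms = {}
    durations = []
    for line in lines:
        m = ALARM_RE.match(line.strip())
        if not m:
            continue  # not an alarm line
        ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        key = (m["device"], m["reason"])
        if m["action"] == "set":
            open_alarms[key] = ts
        elif key in open_alarms:
            start = open_alarms.pop(key)
            durations.append((m["device"], start, (ts - start).total_seconds()))
    return durations, open_alarms


# Lines copied from the quoted log (quote markers removed).
SAMPLE = [
    "Jun 19 00:02:59 send: red alarm set, device PEM 0, reason PEM 0 Not OK",
    "Jun 19 00:03:05 send: red alarm clear, device PEM 0, reason PEM 0 Not OK",
    "Jun 19 01:31:00 send: red alarm set, device PEM 0, reason PEM 0 Not OK",
    "Jun 19 01:31:00 send: red alarm set, device PEM 3, reason Too many PEMs missing",
    "Jun 19 01:31:00 WARNING: not enough AC power supply (1), required 2",
    "Jun 19 01:31:05 send: red alarm clear, device PEM 0, reason PEM 0 Not OK",
]

if __name__ == "__main__":
    done, still_open = alarm_durations(SAMPLE)
    for device, start, seconds in done:
        print(f"{device}: alarm at {start:%H:%M:%S} lasted {seconds:.0f}s")
    for (device, reason), start in still_open.items():
        print(f"{device}: '{reason}' set at {start:%H:%M:%S}, no clear seen")
```

Run against a saved copy of "show log chassisd", this only looks at the "send: red alarm" lines and skips everything else, so it says nothing about why an alarm was raised.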
James S. Smith, Network and Security Architect, Juniper Networks Certified Associate
WIND Mobile 207 Queen's Quay West, Suite 710 Toronto, ON M5J 1A7
Email: JSmith at WindMobile.ca
Direct: 416-640-9792
-----Original Message-----
From: juniper-nsp-bounces at puck.nether.net [mailto:juniper-nsp-bounces at puck.nether.net] On Behalf Of Martin T
Sent: June 20, 2011 8:12 PM
To: Stacy W. Smith
Cc: juniper-nsp at puck.nether.net
Subject: Re: [j-nsp] Alarm for non-existant PEM
I'm rather sure that Stacy is correct. In addition, the M10i, for
example, which also has four PSUs, requires at least two of them to be
operational at a time.
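
As a rough illustration of the "at least two" rule, the Python sketch below counts the PEMs reported as OK in the PEM rows of "show chassis environment" and compares that count with a required minimum. The names PEM_RE, pem_status and spare_pems are hypothetical, and the default of 2 is an assumption taken from the "required 2" log message quoted below; check the hardware guide for your platform before relying on it.

```python
import re

# One PEM row of "show chassis environment", e.g. "PEM 0 OK" or "PEM 1 Absent".
# [ \t]+ keeps the match on a single line.
PEM_RE = re.compile(r"PEM (?P<slot>\d+)[ \t]+(?P<status>\w+)")


def pem_status(env_output):
    """Map PEM slot number to its reported status string."""
    return {int(m["slot"]): m["status"] for m in PEM_RE.finditer(env_output)}


def spare_pems(env_output, required=2):
    """Number of OK PEMs beyond the assumed minimum.

    0 means the chassis has no redundancy left; a negative value means
    it is already below the assumed minimum.
    """
    ok = [slot for slot, status in pem_status(env_output).items() if status == "OK"]
    return len(ok) - required


# PEM rows as posted in the original message (quote markers removed).
SAMPLE_ENV = """\
Temp PEM 0 OK
PEM 1 Absent
PEM 2 OK
PEM 3 Absent
"""

if __name__ == "__main__":
    print(pem_status(SAMPLE_ENV))
    print("spare PEMs:", spare_pems(SAMPLE_ENV))
```

With the two-PEM layout from the original post, this reports zero spare PEMs under that assumption, which matches the chassis shedding load as soon as one feed dropped.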
regards,
martin
2011/6/21 Stacy W. Smith <stacy at acm.org>:
> I think this is expected.
>
> http://www.juniper.net/techpubs/en_US/release-independent/junos/topics/concept/ac-power-supply-srx3600-overview.html
>
> While the wording is poor, I interpret that documentation to say that you must have at least two functional AC power supplies to power the chassis. When you removed power to PEM 0, PEM 2 by itself was insufficient to power the chassis.
>
>> Jun 19 01:31:00 send: red alarm set, device PEM 3, reason Too many PEMs missing
>
> I think the "PEM 3" part of this message is somewhat misleading. The important part is the "Too many PEMs missing".
>
>> Jun 19 01:31:00 WARNING: not enough AC power supply (1), required 2
>
> This log message seems to support my interpretation of the documentation. At least two AC power supplies are required to power the chassis.
>
> --Stacy
>
> On Jun 20, 2011, at 2:40 PM, James S. Smith wrote:
>
>> Just wondering if anyone has ever seen this. We have an SRX3600 with two PEMs. The output of show chassis environment normally looks like this:
>>
>> node0:
>> --------------------------------------------------------------------------
>> Class Item                           Status     Measurement
>> Temp  PEM 0                          OK
>>       PEM 1                          Absent
>>       PEM 2                          OK
>>       PEM 3                          Absent
>>
>>
>> We had some maintenance on the weekend and PEM 0 was going to lose power. No issue, since PEM 2 is there and has never had a problem. The first part of the power outage went fine. Then we had to have another power blip. During the second power blip the chassis started complaining that PEM 3 also had a problem and that there weren't enough PEMs to keep everything powered. We've never had a PEM 3. At this point it started shutting down power to the FPCs.
>>
>>> show log chassisd | match "Jun 19"
>> Jun 19 00:02:59 send: red alarm set, device PEM 0, reason PEM 0 Not OK
>> Jun 19 00:02:59 CHASSISD_PEM_INPUT_BAD: status failure for power supply 0 (status bits: 0x4); check circuit breaker
>> Jun 19 00:03:05 send: red alarm clear, device PEM 0, reason PEM 0 Not OK
>> Jun 19 01:31:00 send: red alarm set, device PEM 0, reason PEM 0 Not OK
>> Jun 19 01:31:00 send: red alarm set, device PEM 3, reason Too many PEMs missing
>> Jun 19 01:31:00 I2CS write cmd to FPC#0 [0x0], reg 0x7e, cmd 0x25
>> Jun 19 01:31:00 I2CS write cmd to FPC#0 [0x0], reg 0x12, cmd 0x5
>> Jun 19 01:31:00 I2CS write cmd to CB#0 [0x16], reg 0x13, cmd 0x9
>> Jun 19 01:31:00 a10_i2cs_assert_hard_reset: CB#0 - hard reset
>> Jun 19 01:31:00 recb_power_down_cpps: CPP in slot 0 is powered down
>> Jun 19 01:31:00 WARNING: not enough AC power supply (1), required 2
>> Jun 19 01:31:01 Setting scc context
>> Jun 19 01:31:01 lcc_a40_update_ch_info_on_fpc_detach: On SCC: recvd fpc detach for fpc 0 belonging to ch_id 1, marking pics empty in ch_info
>> Jun 19 01:31:01 ch_info_pic_state_blob_set: ch_id 1, addflag 0x0 key 0x1000006 num_spu 1 num_npc 1 app_mode 0x0 ioc_npc_map 0xaa ch_info_version 0x1
>> Jun 19 01:31:01 PIC[1][0] => ioc PIC Up
>> Jun 19 01:31:01 PIC[7][0] => cp-flow combo PIC Up
>> Jun 19 01:31:01 PIC[10][0] => npc PIC Up
>> Jun 19 01:31:01 ch_info_update: UPDATE ch 1, fpc_slot 0, pic_slot 0, state 0x0 SUCCEEDED
>> Jun 19 01:31:01 ch_info_update: fpc_slot 0, pic_slot 0, ioc_npc_map 0xa
>> Jun 19 01:31:01 CHASSISD_IFDEV_DETACH_FPC: ifdev_detach(13)
>> Jun 19 01:31:01 ifd ge-13/0/7 marked as gone
>> Jun 19 01:31:01 ifd ge-13/0/8 marked as gone
>> Jun 19 01:31:01 ifd ge-13/0/9 marked as gone
>> Jun 19 01:31:01 ifd ge-13/0/10 marked as gone
>> Jun 19 01:31:01 ifd ge-13/0/11 marked as gone
>> Jun 19 01:31:01 ifd ge-13/0/0 marked as gone
>> Jun 19 01:31:01 ifd ge-13/0/1 marked as gone
>> Jun 19 01:31:01 ifd ge-13/0/2 marked as gone
>> Jun 19 01:31:01 ifd ge-13/0/3 marked as gone
>> Jun 19 01:31:01 ifd ge-13/0/4 marked as gone
>> Jun 19 01:31:01 ifd ge-13/0/5 marked as gone
>> Jun 19 01:31:01 ifd ge-13/0/6 marked as gone
>> Jun 19 01:31:01 fpc detach 13
>> Jun 19 01:31:01 Clearing scc context
>> Jun 19 01:31:01 Setting scc context
>> Jun 19 01:31:01 PIC (fpc 13 pic 0) message operation: change. ifd count 0, flags 0 in mesg
>> Jun 19 01:31:01 Time to clean up PIC FPC 13, PIC 0
>> Jun 19 01:31:01 Clearing scc context
>> Jun 19 01:31:01 ignoring PIC message on LCC
>> Jun 19 01:31:01 Setting scc context
>> Jun 19 01:31:01 lcc_a40_update_ch_info_on_fpc_detach: On SCC: recvd fpc detach for fpc 1 belonging to ch_id 1, marking pics empty in ch_info
>> Jun 19 01:31:01 ch_info_pic_state_blob_set: ch_id 1, addflag 0x0 key 0x1000006 num_spu 1 num_npc 1 app_mode 0x0 ioc_npc_map 0xaa ch_info_version 0x1
>> Jun 19 01:31:01 PIC[7][0] => cp-flow combo PIC Up
>> Jun 19 01:31:01 PIC[10][0] => npc PIC Up
>> Jun 19 01:31:01 ch_info_update: UPDATE ch 1, fpc_slot 1, pic_slot 0, state 0x0 SUCCEEDED
>> Jun 19 01:31:01 ch_info_update: fpc_slot 1, pic_slot 0, ioc_npc_map 0xa
>> Jun 19 01:31:01 CHASSISD_IFDEV_DETACH_FPC: ifdev_detach(14)
>> Jun 19 01:31:01 ifd xe-14/0/0 marked as gone
>> Jun 19 01:31:01 ifd xe-14/0/1 marked as gone
>> Jun 19 01:31:01 fpc detach 14
>> Jun 19 01:31:01 Clearing scc context
>> Jun 19 01:31:01 Setting scc context
>> Jun 19 01:31:01 PIC (fpc 13 pic 0) message operation: delete. ifd count 0, flags 0 in mesg
>> Jun 19 01:31:01 pic_handle_message_idl: PIC fpc 13 pic 0 got deleted
>> Jun 19 01:31:01 Clearing scc context
>> Jun 19 01:31:01 ignoring PIC message on LCC
>> Jun 19 01:31:01 Setting scc context
>> Jun 19 01:31:01 lcc_a40_update_ch_info_on_fpc_detach: On SCC: recvd fpc detach for fpc 7 belonging to ch_id 1, marking pics empty in ch_info
>> Jun 19 01:31:01 ch_info_pic_state_blob_set: ch_id 1, addflag 0x0 key 0x1000006 num_spu 1 num_npc 1 app_mode 0x0 ioc_npc_map 0xaa ch_info_version 0x1
>> Jun 19 01:31:01 PIC[10][0] => npc PIC Up
>> Jun 19 01:31:01 ch_info_update: UPDATE ch 1, fpc_slot 7, pic_slot 0, state 0x0 SUCCEEDED
>> Jun 19 01:31:01 ch_info_update: fpc_slot 7, pic_slot 0, ioc_npc_map 0x0
>> Jun 19 01:31:01 CHASSISD_IFDEV_DETACH_FPC: ifdev_detach(20)
>> Jun 19 01:31:01 fpc detach 20
>> Jun 19 01:31:01 Clearing scc context
>> Jun 19 01:31:01 Setting scc context
>> Jun 19 01:31:01 PIC (fpc 14 pic 0) message operation: change. ifd count 2, flags 0x2 in mesg
>> Jun 19 01:31:01 Clearing scc context
>> Jun 19 01:31:01 Setting scc context
>> Jun 19 01:31:01 lcc_a40_update_ch_info_on_fpc_detach: On SCC: recvd fpc detach for fpc 10 belonging to ch_id 1, marking pics empty in ch_info
>> Jun 19 01:31:01 ch_info_pic_state_blob_set: ch_id 1, addflag 0x0 key 0x1000006 num_spu 1 num_npc 1 app_mode 0x0 ioc_npc_map 0xaa ch_info_version 0x1
>> Jun 19 01:31:01 ch_info_update: UPDATE ch 1, fpc_slot 10, pic_slot 0, state 0x0 SUCCEEDED
>> Jun 19 01:31:01 CHASSISD_IFDEV_DETACH_FPC: ifdev_detach(23)
>> Jun 19 01:31:02 fpc detach 23
>> Jun 19 01:31:02 Clearing scc context
>> Jun 19 01:31:02 Setting scc context
>> Jun 19 01:31:02 PIC (fpc 14 pic 0) message operation: change. ifd count 1, flags 0x2 in mesg
>> Jun 19 01:31:02 Clearing scc context
>> Jun 19 01:31:02 ignoring PIC message on LCC
>> Jun 19 01:31:02 Setting scc context
>> Jun 19 01:31:02 PIC (fpc 14 pic 0) message operation: change. ifd count 0, flags 0 in mesg
>> Jun 19 01:31:02 Time to clean up PIC FPC 14, PIC 0
>> Jun 19 01:31:02 Clearing scc context
>> Jun 19 01:31:02 ignoring PIC message on LCC
>> Jun 19 01:31:02 Setting scc context
>> Jun 19 01:31:02 PIC (fpc 14 pic 0) message operation: delete. ifd count 0, flags 0 in mesg
>> Jun 19 01:31:02 pic_handle_message_idl: PIC fpc 14 pic 0 got deleted
>> Jun 19 01:31:02 Clearing scc context
>> Jun 19 01:31:05 send: red alarm clear, device PEM 0, reason PEM 0 Not OK
>>
>>
>>
>> James S. Smith, Network and Security Architect, Juniper Networks Certified Associate
>> WIND Mobile 207 Queen's Quay West, Suite 710 Toronto, ON M5J 1A7
>>
>> Email: JSmith at WindMobile.ca
>> Direct: 416-640-9792
>>
>>
>> _______________________________________________
>> juniper-nsp mailing list juniper-nsp at puck.nether.net
>> https://puck.nether.net/mailman/listinfo/juniper-nsp
>
>