[c-nsp] ME3600 15.4(3)S5: env/power supply state machine broken

Lukas Tribus luky-37 at hotmail.com
Fri Mar 25 08:18:30 EDT 2016


Hey guys,


tl;dr: an env/power state machine bug



Has anyone seen this before?

- input power on PS1 of an ME3600 running 15.4(3)S5 drops, and the box
  asserts the proper alarms (logging, SNMP, show env power)

- power comes back, but only "show env power" shows OK status - logging
  does not contain a "restored" event and SNMP still shows critical
  status for PS1


ME3600#show log | inc PLATFORM
00:00:01: %PLATFORM_ENV-1-FRU_PS_SIGNAL_FAULTY:  Input signal on power supply 1 is faulty
00:00:01: %PLATFORM_ENV-1-FRU_PS_SIGNAL_FAULTY:  Output signal on power supply 1 is faulty
11:33:22: %PLATFORM_ENV-1-FRU_PS_SIGNAL_OK: Output signal on power supply 1 is restored
11:33:22: %PLATFORM_ENV-1-FRU_PS_SIGNAL_OK: Input signal on power supply 1 is restored
11:33:22: %PLATFORM_ENV-1-FRU_PS_OK: power supply 1 is good
21:04:54: %PLATFORM_ENV-1-FRU_PS_SIGNAL_FAULTY:  Input signal on power supply 2 is faulty
21:05:11: %PLATFORM_ENV-1-FRU_PS_SIGNAL_OK: Input signal on power supply 2 is restored
21:05:11: %PLATFORM_ENV-1-FRU_PS_OK: power supply 2 is good
23:10:47: %PLATFORM_ENV-1-FRU_PS_SIGNAL_FAULTY:  Input signal on power supply 1 is faulty
ME3600#show env power
POWER SUPPLY 1 is DC OK
   DC Input  : OK
   Output    : OK
   Fan       : OK
POWER SUPPLY 2 is DC OK
   DC Input  : OK
   Output    : OK
   Fan       : OK

ME3600#
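
To make the mismatch above easier to spot across many boxes, here is a rough
sketch that replays the PLATFORM_ENV messages and reports which power supply
signals were last logged as faulty without a later "restored" event. The
function names and the regular expression are my own and only cover the
message formats quoted in this mail; other releases may log differently.

```python
import re

# Matches only the three PLATFORM_ENV message types quoted in this mail
# (assumption: other releases or platforms may word these differently).
PS_EVENT = re.compile(
    r"%PLATFORM_ENV-\d-(FRU_PS_SIGNAL_FAULTY|FRU_PS_SIGNAL_OK|FRU_PS_OK):"
    r"\s+(?:(Input|Output) signal on )?power supply (\d+)"
)

SAMPLE_LOG = """\
00:00:01: %PLATFORM_ENV-1-FRU_PS_SIGNAL_FAULTY:  Input signal on power supply 1 is faulty
00:00:01: %PLATFORM_ENV-1-FRU_PS_SIGNAL_FAULTY:  Output signal on power supply 1 is faulty
11:33:22: %PLATFORM_ENV-1-FRU_PS_SIGNAL_OK: Output signal on power supply 1 is restored
11:33:22: %PLATFORM_ENV-1-FRU_PS_SIGNAL_OK: Input signal on power supply 1 is restored
11:33:22: %PLATFORM_ENV-1-FRU_PS_OK: power supply 1 is good
23:10:47: %PLATFORM_ENV-1-FRU_PS_SIGNAL_FAULTY:  Input signal on power supply 1 is faulty
"""


def last_logged_state(log_text):
    """Return {(ps_number, signal): 'faulty' or 'ok'} from the last event per signal."""
    state = {}
    for line in log_text.splitlines():
        match = PS_EVENT.search(line)
        if not match:
            continue
        event, signal, ps = match.group(1), match.group(2), int(match.group(3))
        if signal is None:
            # FRU_PS_OK summary line: names no specific signal, so skip it.
            continue
        state[(ps, signal)] = "faulty" if event.endswith("FAULTY") else "ok"
    return state


def stuck_faults(log_text):
    """Signals whose most recent log entry is still 'faulty'."""
    return sorted(key for key, value in last_logged_state(log_text).items()
                  if value == "faulty")


if __name__ == "__main__":
    print(stuck_faults(SAMPLE_LOG))
```

If "show env power" reports OK for a signal that this still lists as faulty,
the box is in the state described here.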


~$ snmpwalk -v 2c -c <secret> ME3600 1.3.6.1.4.1.9.9.13.1.5
iso.3.6.1.4.1.9.9.13.1.5.1.2.1037 = STRING: "Switch#1, PowerSupply 1"
iso.3.6.1.4.1.9.9.13.1.5.1.2.1040 = STRING: "Switch#1, PowerSupply 2"
iso.3.6.1.4.1.9.9.13.1.5.1.3.1037 = INTEGER: 3 <---- PS1: critical
iso.3.6.1.4.1.9.9.13.1.5.1.3.1040 = INTEGER: 1
iso.3.6.1.4.1.9.9.13.1.5.1.4.1037 = INTEGER: 3
iso.3.6.1.4.1.9.9.13.1.5.1.4.1040 = INTEGER: 3
~$

http://snmp.cloudapps.cisco.com/Support/SNMP/do/BrowseOID.do?local=en&translate=Translate&objectInput=iso.3.6.1.4.1.9.9.13.1.5.1.3
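
For monitoring, the state column in the walk above can be decoded with
something like the sketch below. The enumeration values are how I read
CiscoEnvMonState in CISCO-ENVMON-MIB (please verify against the MIB), and the
parser assumes the net-snmp output format shown above; the helper name is
hypothetical.

```python
import re

# CiscoEnvMonState values as I read them in CISCO-ENVMON-MIB
# (assumption: check the MIB before relying on this mapping).
SUPPLY_STATE = {
    1: "normal",
    2: "warning",
    3: "critical",
    4: "shutdown",
    5: "notPresent",
    6: "notFunctioning",
}

# Column OID of the supply state object queried in the walk above.
STATE_COLUMN = "1.3.6.1.4.1.9.9.13.1.5.1.3."

# Accepts "iso.3.6...", "1.3.6..." or ".1.3.6..." followed by an INTEGER value.
WALK_LINE = re.compile(r"^(?:iso|\.?1)\.(3\.6\.[\d.]+) = INTEGER: (\d+)")


def supply_states(walk_text):
    """Map each supply index to its decoded state from snmpwalk text output."""
    states = {}
    for line in walk_text.splitlines():
        match = WALK_LINE.match(line.strip())
        if not match:
            continue  # STRING rows, prompts, blank lines
        oid = "1." + match.group(1)
        if oid.startswith(STATE_COLUMN):
            index = int(oid[len(STATE_COLUMN):])
            states[index] = SUPPLY_STATE.get(int(match.group(2)), "unknown")
    return states


SAMPLE_WALK = """\
iso.3.6.1.4.1.9.9.13.1.5.1.2.1037 = STRING: "Switch#1, PowerSupply 1"
iso.3.6.1.4.1.9.9.13.1.5.1.3.1037 = INTEGER: 3 <---- PS1: critical
iso.3.6.1.4.1.9.9.13.1.5.1.3.1040 = INTEGER: 1
iso.3.6.1.4.1.9.9.13.1.5.1.4.1037 = INTEGER: 3
"""

if __name__ == "__main__":
    print(supply_states(SAMPLE_WALK))
```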


This has now happened twice on 15.4(3)S5 (different boxes, different
sites); I never saw this behavior in older releases.


Toggling the input power on the affected PS restores the proper states.



Anyone seen this before? I'm opening a TAC case.



cheers,

lukas



