[c-nsp] The mechanics of SSO

Jared Mauch jared at puck.nether.net
Wed May 6 16:39:40 EDT 2009


I would recommend trying to get the devices onto SXF16 or SXI1 if
possible.  You may need to send a break and interrupt the boot process
on one of the supervisors (hope you have good OOB access and know how
to do this).

This also reinforces the reason some people do not run dual
processor systems.  They sometimes fail in really bad ways.

- Jared

On May 6, 2009, at 4:29 PM, Charles Wyble wrote:

> Ouch..... nasty race condition from the looks of it. Those little  
> corner cases that are oh so very sharp.
>
>
>
> Ross Vandegrift wrote:
>> Hey guys,
>> Today, due to what appears to be a major problem in SXF13, we
>> experienced two sequential crashes, taking out both SUPs in a 6500
>> within the time it takes to boot.  TAC case is going.
>> According to the crashinfo droppings left along the way, we
>> experienced three crashes:
>> 1) module 6 is active SUP, IOS crashes at 13:43
>> 2) module 5 takes over, IOS crashes at 13:52
>> 3) module 6 is still booting, IOS crashes at 13:52
>> The third crash is the perplexing one.  The RP crashinfo logs:
>> 	00:07:25: %CPU_MONITOR-STDBY-3-PEER_EXCEPTION: CPU_MONITOR peer has failed due to exception , reset by [6/0]
>> 	%Software-forced reload
>> The SP crashinfo says:
>> 	00:00:04: %PFREDUN-6-STANDBY: Initializing as STANDBY processor
>> 	[snip usual bootup messages]
>> 	00:01:39: SP-STDBY: SP: Currently running ROMMON from F1 region
>> 	00:01:42: %DIAG-SP-STDBY-6-RUN_MINIMUM: Module 6: Running Minimal Diagnostics...
>> 	00:02:03: %DIAG-SP-STDBY-6-DIAG_OK: Module 6: Passed Online Diagnostics
>> 	00:07:24: %PFREDUN-SP-STDBY-6-STANDBY: Failure of ACTIVE detected, STANDBY not ready and reset
>> 	%Software-forced reload
>> I guess this means there is a point in the bootup process where a
>> supervisor that is booting as a STANDBY cannot become ACTIVE without
>> restarting?
>> My guess is that this period is during the time the config is being
>> loaded from the ACTIVE module.  Can anyone confirm?  Are there things
>> that can make this potential window smaller? (compressed configs,
>> maybe)
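
For a rough sense of how long module 6 had been booting when the ACTIVE
failed, the uptime stamps in the SP crashinfo quoted above can be
subtracted.  What follows is a minimal Python sketch, assuming the stamps
are HH:MM:SS since boot.  The names parse_events and seconds_between are
made up for illustration, and the result only bounds the time spent
initializing; it does not confirm which bootup phase the STANDBY was in.

```python
import re

# Lines copied from the SP crashinfo quoted above (uptime-stamped).
SP_LOG = """\
00:00:04: %PFREDUN-6-STANDBY: Initializing as STANDBY processor
00:01:39: SP-STDBY: SP: Currently running ROMMON from F1 region
00:01:42: %DIAG-SP-STDBY-6-RUN_MINIMUM: Module 6: Running Minimal Diagnostics...
00:02:03: %DIAG-SP-STDBY-6-DIAG_OK: Module 6: Passed Online Diagnostics
00:07:24: %PFREDUN-SP-STDBY-6-STANDBY: Failure of ACTIVE detected, STANDBY not ready and reset
"""

# Assumption: every stamped line starts with "HH:MM:SS: " (uptime since boot).
STAMP = re.compile(r"^(\d+):(\d{2}):(\d{2}): (.*)$")


def parse_events(log):
    """Return a list of (uptime_seconds, message) for each stamped line."""
    events = []
    for line in log.splitlines():
        m = STAMP.match(line.strip())
        if m:
            hours, minutes, seconds, msg = m.groups()
            uptime = int(hours) * 3600 + int(minutes) * 60 + int(seconds)
            events.append((uptime, msg))
    return events


def seconds_between(events, start_text, end_text):
    """Seconds between the first messages containing start_text and end_text."""
    start = next(t for t, msg in events if start_text in msg)
    end = next(t for t, msg in events if end_text in msg)
    return end - start


events = parse_events(SP_LOG)
window = seconds_between(events, "Initializing as STANDBY",
                         "Failure of ACTIVE detected")
print(window)  # 440
```

On these five lines the result is 440 seconds: module 6 had been
initializing as STANDBY for about 7m20s, and had passed online
diagnostics more than five minutes earlier, yet was still not ready to
take over.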