[c-nsp] Cisco 3020 blade switches hung, HLFM errors, network meltdown?

Wed Mar 19 12:48:31 EDT 2008

I have an HP blade system with two WS-CBS3020-HPQ switches.  The console
logged the following errors, during which the entire network was
unreachable:

(6444)msecs, more than (2000)msecs (719/326),process = HLFM address learning process.
-Traceback= 4794B0 479A4C 4799B0 2E9E64 4F788C 32D6C4 11B980 11BEF0 11D684 326BC4 322F90 323240 A86D34 A7D2FC
18w0d: %SYS-3-CPUHOG: Task is running for (2152)msecs, more than (2000)msecs (143/1),process = HLFM address learning process.

With their uplinks to the network disabled, the switches were still 
unreachable/unusable, even through Fa0/0.  I had to reboot each before I 
could telnet back in.

Disconnecting them from the network brought the network back; 
reconnecting them melted it again.

Felt like a broadcast storm or even a spanning-tree loop, but I'd be 
surprised if it was the latter and the upstream switches, two 6500s, 
didn't know how to deal with that (heck, they deal with HP 2510s that 
default to not running spanning-tree).

From some of the log entries I could glean from the console buffer, it 
looks like the native vlan on one of the port channel members was 
inadvertently changed and was marked as incompatible with the other 
bundle member.  Still, I'm somewhat surprised that that hung the blade 
switches to the extent that everything else became unusable.

Any insights?

[I'm stuck in this place where the WS-CBS3020-HPQs aren't registered 
with CCO and my reseller says I have to talk to HP for support...]