[c-nsp] protecting cisco switches

Mark Messier mark at messier.com
Mon Oct 8 01:09:11 EDT 2007


Revisiting this Aug 31 topic...   my simple set-up just
blew up again.  I've got this:

   [c4948-L]----[c4948-R]  both with 12.2(25)EWA7
     \           /
      \         /
       \       /
        \     /
         \   /
        [hp2848]   all with I.10.43

There are many similar hp2848s acting as top-of-rack (access) switches.
VLANs span multiple access switches.   The trunk between the two
c4948s is pruned to only the vlans of interest.

Timing is consistent among all devices
(Hello Time 2 sec, Max Age 20 sec, Forward Delay 15 sec).
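
For anyone checking the numbers: as far as I know, IEEE 802.1D asks
that the three timers satisfy
2 * (Forward Delay - 1) >= Max Age >= 2 * (Hello Time + 1).
A minimal Python sketch of that check (the function name is made up
for illustration) with the values above:

```python
def stp_timers_consistent(hello: float, max_age: float, forward_delay: float) -> bool:
    """Check the IEEE 802.1D timer relationship (all values in seconds):

        2 * (forward_delay - 1) >= max_age >= 2 * (hello + 1)
    """
    return 2 * (forward_delay - 1) >= max_age >= 2 * (hello + 1)


# Values from this network: Hello 2, Max Age 20, Forward Delay 15.
# 2 * (15 - 1) = 28 >= 20 >= 2 * (2 + 1) = 6
print(stp_timers_consistent(2, 20, 15))  # prints True
```

This only says the values are mutually sane; it says nothing about
whether every device actually runs with them.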

Following the last discussion, I have this configured:

   udld aggressive
   spanning-tree mode rapid-pvst
   spanning-tree loopguard default
   spanning-tree portfast bpduguard default
   spanning-tree vlan XY priority 0 (4096 on other switch)

   On the trunk between cisco switches:

    spanning-tree port-priority 0

   On ports down to HP-switches:

    spanning-tree port-priority 16 (32 on other switch)

   HSRP primary matches STP root.

   All HP switches have "bpdu guard" on all ports, except for
   the two uplink ports, which are equal cost and priority.

   All inactive cisco ports are shut.

   All hp2848 run the same new firmware (upgraded after last failure)
   and are configured identically.

I enable the redundant set-up by unshutting the ports on c4948-L.
I verify on each HP that the correct uplink is Forwarding and the
other is Blocked.   I wait and watch...  it's stable.

15 hours later I get this (HP-80 is in vlan15):

09:28:22 [HP-80] port 47 is Blocked by STP
09:28:23 [HP-80] port 48 is now on-line
09:28:23 [HP-80] RSTP Root changed from 15:0019e8-ee5240 to 4111:0019e8-e8ea00
09:28:24 c4948-R %SPANTREE-2-LOOPGUARD_BLOCK: Loop guard blocking port GigabitEthernet1/44 on VLAN0015.
09:28:52 [HP-82] port 45 is now on-line
09:28:52 [HP-85] port 47 is now on-line
09:28:52 [HP-86] port 47 is now on-line
09:28:52 [HP-87] port 47 is now on-line
09:28:52 [HP-80] port 47 is now on-line
09:28:53 [HP-84] port 47 is now on-line
09:28:53 [HP-85] port 47-Excessive Broadcasts. See help.
09:28:53 [HP-85] port 48-Excessive Broadcasts. See help.
09:28:53 [HP-88] port 45 is now on-line
09:28:53 c4948-R %C4K_EBM-4-HOSTFLAPPING: Host 00:19:E8:EE:52:7F in vlan 6 is flapping between port Gi1/14 and port Gi1/44
09:28:53 c4948-R %C4K_EBM-4-HOSTFLAPPING: Host 00:14:22:11:62:FB in vlan 6 is flapping between port Gi1/44 and port Gi1/14
09:28:53 c4948-R %C4K_EBM-4-HOSTFLAPPING: Host 00:14:22:7C:5F:2C in vlan 6 is flapping between port Gi1/44 and port Gi1/14
09:28:53 c4948-R %C4K_EBM-4-HOSTFLAPPING: Host 00:14:22:7C:5F:2C in vlan 6 is flapping between port Gi1/14 and port Gi1/44
  <network continues to melt>
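
A note on reading the two root IDs in the "RSTP Root changed" line.
Assuming the HP prints the whole 16-bit bridge priority field in
decimal before the colon (my assumption, not something I have
confirmed in HP documentation), the field splits into a 4-bit
priority in steps of 4096 plus a 12-bit system ID extension, which
with per-VLAN spanning tree on the cisco side carries the VLAN
number. A small Python sketch (the helper name is hypothetical):

```python
from typing import Tuple


def split_bridge_priority(field: int) -> Tuple[int, int]:
    """Split a 16-bit bridge priority field into (priority, system ID extension)."""
    if not 0 <= field <= 0xFFFF:
        raise ValueError("bridge priority field must fit in 16 bits")
    # Top 4 bits: configured priority (multiple of 4096).
    # Low 12 bits: system ID extension (assumed here to be the VLAN number).
    return field & 0xF000, field & 0x0FFF


for label, field in (("old root", 15), ("new root", 4111)):
    priority, vlan = split_bridge_priority(field)
    print(f"{label}: priority {priority}, VLAN {vlan}")
# old root: priority 0, VLAN 15
# new root: priority 4096, VLAN 15
```

Read that way, HP-80 moved from the priority 0 root to the priority
4096 bridge for vlan15, which matches the priorities configured above.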

Now, GigabitEthernet1/44 is the trunk between the cisco switches... so
that loopguard block triggers all the HP switches to bring their
backup ports online.  And things go bad from there with, I think,
so much traffic that BPDUs cannot flow normally.

c4948-L was not able to syslog anything, but its console shows
C4K_EBM-4-HOSTFLAPPING messages.  I shut down all the downlink ports
to the HP switches, so the network was being carried only by c4948-R.
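
To see which port pairs the loop is running through, the HOSTFLAPPING
messages can be tallied. This is a rough Python sketch that assumes
one message per line in exactly the format shown in the log above;
the regex and the function name are mine, not anything from cisco.

```python
import re
from collections import Counter

# Matches the C4K_EBM-4-HOSTFLAPPING text as it appears in the log above.
FLAP_RE = re.compile(
    r"%C4K_EBM-4-HOSTFLAPPING:\s+Host\s+(?P<mac>[0-9A-Fa-f:]+)\s+in\s+"
    r"vlan\s+(?P<vlan>\d+)\s+is\s+flapping\s+between\s+"
    r"port\s+(?P<a>\S+)\s+and\s+port\s+(?P<b>\S+)"
)


def tally_flaps(lines):
    """Count HOSTFLAPPING events per (vlan, port pair), ignoring direction."""
    counts = Counter()
    for line in lines:
        m = FLAP_RE.search(line)
        if m:
            # Sort the two ports so A<->B and B<->A land in the same bucket.
            pair = tuple(sorted((m["a"], m["b"])))
            counts[(int(m["vlan"]), pair)] += 1
    return counts


sample = [
    "09:28:53 c4948-R %C4K_EBM-4-HOSTFLAPPING: Host 00:19:E8:EE:52:7F in vlan 6 "
    "is flapping between port Gi1/14 and port Gi1/44",
    "09:28:53 c4948-R %C4K_EBM-4-HOSTFLAPPING: Host 00:14:22:11:62:FB in vlan 6 "
    "is flapping between port Gi1/44 and port Gi1/14",
]
for (vlan, ports), n in tally_flaps(sample).items():
    print(vlan, ports, n)
```

Lines that are not HOSTFLAPPING messages are simply skipped, so the
whole syslog file can be fed in as-is.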

Nevertheless, c4948-L was screwed.  It was in a state where nothing
useful would work.  The relevant ports were up, including one ptp
ethernet, and I could ping internal addresses but nothing external,
not even the far side of the ptp ethernet.  No external ARP entries
were present.

The box had free memory, logged nothing else and
otherwise looked normal.  I let it sit for hours.  I shut/no-shut
ports.  No change.  Finally, I rebooted and it came back to life.

Meanwhile, elsewhere, I've got this exact same physical set-up working
for over a year.  No glitches.  HP firmware all over the map, no cisco
protection enabled, only the STP root nailed down.  This is
what I had in the scenario above... before the previous failure that
prompted me to be more pro-active about STP.

Clearly the HP triggered the event when it shut down the root port.
OK, so it's a pos.   But I should be able to protect the cisco layer...
shouldn't I?

I mean, it's hard enough to conclude that STP doesn't do what it is
supposed to do and I have to nail down as much as possible to give it
a chance.  But, this is pretty simple, and the 4948 is a good engine.
It can handle this...  I'm just one tweak away.. right?

Thanks,
-mark


