[j-nsp] Network trouble after customer created a loop *inside* a VM host
Jeff Meyers
Jeff.Meyers at gmx.net
Fri Nov 7 09:18:35 EST 2014
Hello everybody,
I'm writing to this list because I can't find the cause of something
we have now seen twice. Here is the setup:
Juniper MX480 no RSTP
||
ae0
||
Juniper EX4550 VC RSTP bridge id 0
||
ae0
||
Juniper EX4200 VC RSTP bridge id 16k
|
ProCurve 2824 RSTP bridge id 32k
|
Windows Host
So the router itself is not part of the spanning tree; everything below
it is. On the Windows host, the customer is running ESXi with just one
uplink towards the HP ProCurve switch, so there is not even a real danger
of a physical loop. Now: two VMs are running on the host. Each of them
has a virtual NIC which is bridged to the physical NIC of the host.
By mistake, the customer also bridged his two VMs together, which
caused a loop inside the host. So far, so good.
The trouble begins at this point, because we immediately saw partial
network outages along with router messages like this:
Nov 7 14:30:47 cr0 l2ald[2545]: L2ALD_MAC_MOVE_NOTIFICATION: MAC Moves detected in the system
This message repeated over and over, and the ARP counter decreased
continuously. Hosts flapped and vanished for seconds or minutes, and
our internal smokeping measured a lot of loss.
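To illustrate what I understand the MAC move message to mean, here is a minimal Python sketch of a learning table that counts how often a source MAC is relearned on a different port. The class name, the port names and the MAC address are made up for illustration; this is not how l2ald works internally, just the general idea.

```python
from collections import defaultdict


class MacTable:
    """Toy MAC learning table that counts moves between ports."""

    def __init__(self):
        self.port_of = {}              # MAC address -> port it was last learned on
        self.moves = defaultdict(int)  # MAC address -> number of observed moves

    def learn(self, src_mac, port):
        """Learn a frame's source MAC on its ingress port; return True on a move."""
        previous = self.port_of.get(src_mac)
        self.port_of[src_mac] = port
        if previous is not None and previous != port:
            self.moves[src_mac] += 1
            return True
        return False


table = MacTable()
# Hypothetical frames: the same source MAC alternates between two ports,
# which is what looped broadcast frames can look like to a bridge.
frames = [
    ("00:11:22:33:44:55", "port-A"),
    ("00:11:22:33:44:55", "port-B"),
    ("00:11:22:33:44:55", "port-A"),
]
for mac, port in frames:
    if table.learn(mac, port):
        print(f"MAC move: {mac} now on {port}")
```

If frames with the same source address keep arriving from different directions, the table flips back and forth like this, which would fit the hosts flapping and vanishing.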
The HP ProCurve logged only excessive broadcasts on the customer port
and that's it. Spanning tree didn't recognize anything. The same applies
to the EX4200 VC and the EX4550 VC: nothing was detected by the loop
prevention protocol, and it was only luck that we knew where to
look, because the customer called us and told us what he had done.
The question is: how can that be and what can I do?
On the EX-series switches, each downlink port is configured with
set protocols rstp interface ge-0/0/0 no-root-port
Storm control is enabled on all ports at 85% (but no storm was detected).
There is no special configuration on the ProCurve besides the general
RSTP activation (which is set to RSTP and not STP).
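For a sense of scale, here is a rough Python calculation of what an 85% storm-control threshold would correspond to in frames per second. It assumes a 1 Gbit/s port, minimum-size 64-byte frames, and that the 85% is a share of link bandwidth; those are assumptions for illustration, not measurements from our network.

```python
def max_frames_per_second(link_bps, frame_bytes):
    """Theoretical line rate in frames/s for a given frame size."""
    # On the wire each frame also costs 8 bytes of preamble/SFD
    # plus a 12-byte inter-frame gap.
    wire_bits = (frame_bytes + 20) * 8
    return link_bps / wire_bits


LINK_BPS = 1_000_000_000  # assumption: 1 Gbit/s port
FRAME_BYTES = 64          # assumption: minimum-size Ethernet frames
THRESHOLD = 0.85          # the 85% storm-control setting mentioned above

line_rate = max_frames_per_second(LINK_BPS, FRAME_BYTES)
trigger_rate = THRESHOLD * line_rate

print(f"line rate: {line_rate:,.0f} frames/s")
print(f"85% threshold: {trigger_rate:,.0f} frames/s")
```

Under these assumptions, a loop would have to push the port close to line rate before storm control reacted, so a threshold this high might simply never have been reached.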
So can anybody help with that? I am really stuck here.. :(
Thanks in advance,
Jeff