[c-nsp] VMware teaming Nic's and multiple switches

Fri Dec 14 08:40:39 EST 2012

On Thu, Sep 20, 2012 at 11:56 AM, Peter Rathlev <peter at rathlev.dk> wrote:
> On Thu, 2012-09-20 at 10:20 +0200, Gert Doering wrote:
>> It's easy if one of the physical links goes down ("do not use that!"),
>> but I'm thinking more about the uplink network getting partitioned, or
>> one of the uplink switches failing in interesting ways (link still up,
>> but no packets get forwarded anymore).  In Linux bonding, I can do that
>> by having the bonding driver send out ARP requests & monitor incoming
>> responses...
>
> VMware has something similar, although a little inferior, called
> "beaconing". With beaconing enabled, every link sends out probes to every
> other link 10 times per second (IIRC) and every link expects to see
> these probes from all other links. If a link stops seeing the probes it
> is considered bad and is pulled from the pool of active links. As one can
> quickly see, this only works reliably with three or more links.
>
> It does not work as well as ARP probing since it doesn't actually test
> reachability towards the gateway, only reachability between the physical
> links. That means it cannot detect uplink failure in a scenario like
> this:
>
>     +-----------+       +-----------+
>     | L3 agg #1 |-------| L3 agg #2 |
>     +-----------+       +-----------+
>           |                    |
>           |                    |    <---- uplinks
>           |                    |
>     +-----------+       +-----------+
>     | Switch #1 |-------| Switch #2 |
>     +-----------+       +-----------+
>              \             /
>               \           /
>                \         /
>              +-------------+
>              | VMware host |
>              +-------------+
>
> ... since the connection between switches #1 and #2 forwards the probe
> frames fine. We use link state tracking to catch simple uplink failure
> to somewhat mitigate this.
>
> Beware that according to documentation it precludes the use of
> Etherchannels, though I don't know why. We don't use Etherchannels.
>
> On the positive side it should put a lot less load on the gateway(s)
> compared to ARP probes, since the RP no longer has to process anything.
>
> VMware ESXi 5 is relatively new to us, so I'm not sure all of this is
> still correct. But it should be easy to test with a SPAN session.
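
To make the two limitations described in the quoted text concrete, here is a
rough Python sketch. It is a toy model and not VMware's actual algorithm: a
beacon is assumed to be "heard" whenever any layer-2 path exists between two
NICs, and the node names (nic1, sw1, agg1, ...) are made up to match the
diagram.

```python
from collections import defaultdict

def reachable(start, links):
    """Return the set of nodes reachable from start over undirected links."""
    adj = defaultdict(set)
    for a, b in links:
        adj[a].add(b)
        adj[b].add(a)
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt in adj[node] - seen:
            seen.add(nxt)
            stack.append(nxt)
    return seen

def heard_beacons(nics, links):
    """For each NIC, list the other NICs whose beacons would reach it."""
    return {n: sorted(m for m in nics if m != n and m in reachable(n, links))
            for n in nics}

def gateway_reachable(nics, links, gateway):
    """For each NIC, report whether a path to the gateway exists."""
    return {n: gateway in reachable(n, links) for n in nics}

# Case 1: topology from the diagram with both uplinks removed.
# The inter-switch link is still there, so beacons keep flowing.
UPLINKS_DOWN = [("agg1", "agg2"), ("sw1", "sw2"),
                ("nic1", "sw1"), ("nic2", "sw2")]

# Case 2: hypothetical variant with no inter-switch link and the
# uplink of sw1 failed, so nic1 is cut off from everything else.
PARTITIONED = [("agg1", "agg2"), ("agg2", "sw2"),
               ("nic1", "sw1"), ("nic2", "sw2")]
# Same failure, but with a third NIC attached to sw2.
PARTITIONED_3 = PARTITIONED + [("nic3", "sw2")]

case1_beacons = heard_beacons(["nic1", "nic2"], UPLINKS_DOWN)
case1_gateway = gateway_reachable(["nic1", "nic2"], UPLINKS_DOWN, "agg1")
case2_two = heard_beacons(["nic1", "nic2"], PARTITIONED)
case2_three = heard_beacons(["nic1", "nic2", "nic3"], PARTITIONED_3)

print(case1_beacons, case1_gateway)
print(case2_two)
print(case2_three)
```

In case 1 both NICs still hear each other although neither can reach the
gateway, which is the blind spot compared with ARP probing. In case 2 with two
NICs both sides go silent, so there is no way to tell which one is bad; with
the third NIC, nic2 and nic3 still hear each other and only nic1 stands out.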

We are using ESXi 5 with "link status only" connected to a stack of
EX4200 switches. So far it has worked fine during switch upgrades when
rebooting them one by one - we are doing this in the middle of the day
and users don't notice anything. I think the hit is under a second or
so, since it's not noticeable.

With more switches in a distribution/aggregation scenario, you might
do uplink tracking and shut down interfaces if the ones designated as
uplink fail.
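
The decision logic behind uplink tracking is small enough to sketch in Python.
This is only an illustration of the idea and not any vendor's implementation:
it assumes the host-facing ports are disabled only once every tracked uplink
is down, and the interface names are made up.

```python
def ports_to_disable(uplink_up, host_ports):
    """Return the host-facing ports to shut down.

    uplink_up maps each tracked uplink to True (up) or False (down).
    Assumption: nothing is disabled while at least one uplink is up.
    """
    if any(uplink_up.values()):
        return []
    return sorted(host_ports)

# Hypothetical interface names, for illustration only.
host_ports = ["ge-0/0/1", "ge-0/0/2"]

one_uplink_left = ports_to_disable(
    {"xe-0/1/0": False, "xe-0/1/1": True}, host_ports)
no_uplink_left = ports_to_disable(
    {"xe-0/1/0": False, "xe-0/1/1": False}, host_ports)

print(one_uplink_left)
print(no_uplink_left)
```

Once the host-facing ports go down, the ESXi host sees a plain link failure,
which "link status only" failover detection does handle.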

Eugeniu