[c-nsp] Unicast traffic being sent to every port? Aging issue?

Mon Mar 22 23:34:04 EDT 2010

On 3/22/2010 11:14 PM, Ray Van Dolson wrote:
> On Mon, Mar 22, 2010 at 08:04:10PM -0700, Jay Hennigan wrote:
>    
>> On 3/22/10 7:03 PM, Ray Van Dolson wrote:
>>      
>>> We have two Dell PowerConnect M6220 switches (A1 and B1).  They are not
>>> cross-connected, but both have uplinks to the same subnet:
>>>
>>>                        zfs1
>>>                       /
>>>                     +----+
>>>                     | A1 |---------|
>>>                     +----+     +-------+
>>>                                | Cisco |------- linux1
>>>                     +----+     +-------+
>>>                     | B1 |---------|
>>>                     +----+
>>>                      / \
>>>                    esx1 esx2
>>>
>>> There's a host hanging off of A1 (zfs1) and several ESX hosts hanging
>>> off of B1 (esx1, esx2, etc).  There's a host linux1 hanging off the
>>> Cisco as well (actually many hosts, but for the sake of description
>>>
>>> What's happening is, esx1/2 begin talking to zfs1.  All is well for a
>>> while... but at some point, zfs1's MAC address expires from the CAM on
>>> the switch (I guess that is what is happening).
>>>
>>> At that point, the Cisco begins forwarding the unicast packets to all
>>> its ports.  The result -- linux1, and all other hosts see the packets.
>>> Occasionally, when we're dealing with a lot of traffic, this seriously
>>> impacts performance.
>>>        
>> Is the Cisco a router or a layer 2 switch?  All hosts in the same IP
>> subnet?  Subnet masks all match?  Nothing doing proxy-arp?
>>
>>      
>>> My question here is.. what is the _right_ way to deal with this?  This
>>> "flooding" can continue for many minutes at a time.. it isn't until an
>>> ARP reply emanates from zfs1 that the CAM table is populated again and
>>> the broadcasting stops.
>>>        
>> If these are layer 2 switches, ARP won't have anything to do with it.
>>
>> If zfs1's MAC expires from the MAC address table on the cisco, it will
>> flood the next packet for that MAC.  A1 will forward it to zfs1 or flood
>> if it too has expired the MAC.
>>
>> When zfs1 replies, A1 forwards the reply to the cisco.  At that point,
>> the cisco should re-install the MAC into its address table and the
>> flooding cease.
>>
>> This should happen with a single packet.
>>
>> Does this happen with any other hosts behind A1?  Any interface errors
>> on any of the devices?
>>
>>      
>>> I wonder if zfs1 would send back an ARP response quicker were it not
>>> behind an additional switch (the PowerConnect)...
>>>        
>> If layer 2 switches, ARP doesn't have anything to do with it.
>>      
> I'll have to find out how the Ciscos are configured.  I wouldn't be
> surprised if they're doing some Layer 3 though as I know some VLAN
> routing is going on...
>
> The Dell switches both seem to have "Routing Mode" enabled as well (but
> proxy arp disabled).
>
> There currently aren't any other hosts behind A1, but that would be a
> good test.  No interface errors currently.
>
> Firmware is old on A1, so at this point I'm a little suspicious it's to
> blame.
>
> Just wanted to try and wrap my head around this first.
>
> Thanks,
> Ray
> _______________________________________________
> cisco-nsp mailing list  cisco-nsp at puck.nether.net
> https://puck.nether.net/mailman/listinfo/cisco-nsp
> archive at http://puck.nether.net/pipermail/cisco-nsp/
>
>
>    
In other multivendor LAN setups, we've noticed similar behavior and
had some success by syncing the ARP timers. That's worth a look if
you haven't already followed that line of investigation.
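
As a rough illustration of why the timers can matter, here is a toy model
in Python. It is only a sketch under stated assumptions, not a description
of how any of the switches in this thread actually behave. It assumes the
sender transmits one frame per second, that the switch only hears the
target when the target answers an ARP request, and that the sender re-ARPs
only when its own ARP entry expires. The function name and the timer
values are made up for the example.

```python
def flooded_seconds(mac_aging, arp_timeout, duration):
    """Count seconds in which the switch floods frames sent to the target.

    Toy model (assumptions, not taken from the thread): one frame per
    second from the sender, the switch relearns the target's MAC only
    when the target answers an ARP request, and the sender re-ARPs only
    when its ARP entry expires.
    """
    last_learned = 0  # time the switch last saw a frame sourced from the target
    last_arp = 0      # time the sender last resolved the target via ARP
    flooded = 0
    for t in range(1, duration + 1):
        if t - last_arp >= arp_timeout:
            # ARP entry expired: sender re-ARPs, target replies, switch relearns.
            last_arp = t
            last_learned = t
        if t - last_learned >= mac_aging:
            # MAC entry has aged out: unknown unicast, flooded to all ports.
            flooded += 1
    return flooded


if __name__ == "__main__":
    # Illustrative values: MAC aging of 300 s against a 4 hour ARP timeout.
    print(flooded_seconds(mac_aging=300, arp_timeout=14400, duration=14400))
    # Same aging time, but with the ARP timeout brought down to match it.
    print(flooded_seconds(mac_aging=300, arp_timeout=300, duration=14400))
```

In this model the first call reports 14100 flooded seconds and the second
reports 0, because the ARP refresh lands before the MAC entry ages out.
Real traffic is rarely this one-sided, so treat it as a way to reason
about the timers rather than a prediction.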