Re: [nsp] Weird behaviour of Cat4000

From: Gert Doering (gert@greenie.muc.de)
Date: Mon Apr 22 2002 - 17:02:32 EDT

Next message: Ian Cox: "Re: [nsp] TOS/DSCP marking on 6509 SUP"
Previous message: Siva Valliappan: "Re: [nsp] Goodie download?"
In reply to: Jan-Ahrent Czmok: "[nsp] Weird behaviour of Cat4000"
Next in thread: Jan-Ahrent Czmok: "[nsp] Weird behaviour of Cat4000"

Hi,

On Mon, Apr 22, 2002 at 08:46:16PM +0200, Jan-Ahrent Czmok wrote:
> I have a strange behavior on my Cat4000's :
>
> connected a sniffer on one port (housing) and i can see traffic from the other customers of ours.
> Normally i should only see my traffic (no i am not talking about broadcast or multicast).

Depending on the setup (if you have multiple switches, multiple *routers*,
and highly asymmetric traffic) it is possible that one of the switches
doesn't know the proper port for a given MAC address, and has to resort to
flooding.

Imagine:

     -- RA ---- RB --
         |       |
        S1 ---- S2 -- Host

Packets come in via Router A (from "the left" uplink), go to switch S1,
because the network containing "Host" is "connected".

Switch S1 sends packet to Switch S2 (*), Switch S2 sends packet to Host.

Host replies, and has (HSRP striking here) Router RB as default gateway
(which will forward the packet to RA via its direct link RA-RB). The switch
S2 does know the MAC address of RB, so packets from "Host" are NOT flooded.

--> Switch S1 never sees a packet *from* "Host", and thus cannot know
that "Host" is connected to S2. So what S1 *has* to do is to flood all
packets from RA->Host.
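
The effect can be reproduced with a toy model of two learning switches.
This is only a sketch: the port names, the "customer" port standing in for
the sniffer, and the single hello frame from RB are assumptions made for
illustration, and the direct RA-RB link is simply left out because it does
not pass through either switch.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


class Switch:
    """Toy learning bridge: learn source MACs, flood unknown destinations."""

    def __init__(self, name, ports):
        self.name = name
        self.ports = list(ports)
        self.cam = {}  # MAC address -> port it was last seen on

    def receive(self, src, dst, in_port):
        """Learn src, then return the list of ports the frame is sent out of."""
        self.cam[src] = in_port
        out = None if dst == BROADCAST else self.cam.get(dst)
        if out is None:
            # Broadcast or unknown unicast: flood everywhere except the ingress port.
            return [p for p in self.ports if p != in_port]
        return [] if out == in_port else [out]


def send(src, dst, switch, in_port, peer):
    """Inject a frame and follow it across the single S1<->S2 trunk.

    Returns {switch name: egress ports} for every switch the frame reached.
    """
    path = {}
    while switch is not None:
        out = switch.receive(src, dst, in_port)
        path[switch.name] = out
        # A frame received on the trunk is never sent back out of it,
        # so this loop visits at most two switches.
        if "trunk" in out:
            switch, in_port = peer[switch.name], "trunk"
        else:
            switch = None
    return path


def demo():
    s1 = Switch("S1", ["ra", "customer", "trunk"])  # "customer" = sniffer port
    s2 = Switch("S2", ["rb", "host", "trunk"])
    peer = {"S1": s2, "S2": s1}

    send("RB", BROADCAST, s2, "rb", peer)         # a hello from RB: both switches learn RB
    reply = send("HOST", "RB", s2, "host", peer)  # Host -> RB never leaves S2
    before = send("RA", "HOST", s1, "ra", peer)   # S1 has never seen HOST: flood
    send("HOST", BROADCAST, s2, "host", peer)     # one broadcast from Host ...
    after = send("RA", "HOST", s1, "ra", peer)    # ... and S1 uses the trunk only
    return reply, before, after


reply, before, after = demo()
print(reply)   # {'S2': ['rb']}
print(before)  # {'S1': ['customer', 'trunk'], 'S2': ['host']}
print(after)   # {'S1': ['trunk'], 'S2': ['host']}
```

In the "before" case the frame for Host also leaves S1 on the customer
port, which is what the sniffer picks up.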

If Host ARPs for something, or sends out any kind of broadcast packet,
S1 will learn. In the case of a well-behaved Unix host with a lengthy
ARP cache timeout and few Ethernet adjacencies, it *will* happen that the
CAM table entry on S1 times out sooner than the ARP cache on Host, and
is NOT refreshed (because Host sends directly to RB).
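
A rough piece of arithmetic shows how much of the time S1 ends up flooding.
The timer values below are assumptions chosen for illustration (a 300 second
CAM aging time and a 1200 second ARP cache lifetime), not measurements from
the setup described here.

```python
def flooding_fraction(cam_aging_s, refresh_interval_s):
    """Fraction of time S1 has no CAM entry for Host.

    Assumes the only frames from Host that ever reach S1 are broadcasts sent
    once every refresh_interval_s seconds, and that S1 drops the entry
    cam_aging_s seconds after the last one.
    """
    if cam_aging_s <= 0 or refresh_interval_s <= 0:
        raise ValueError("both timers must be positive")
    uncovered = max(0, refresh_interval_s - cam_aging_s)
    return uncovered / refresh_interval_s


CAM_AGING_S = 300     # assumed switch aging time
ARP_TIMEOUT_S = 1200  # assumed host ARP cache lifetime

print(flooding_fraction(CAM_AGING_S, ARP_TIMEOUT_S))  # 0.75
print(flooding_fraction(CAM_AGING_S, 30))             # 0.0
```

With these numbers S1 floods for 900 out of every 1200 seconds, while any
broadcast interval shorter than the aging time keeps the entry alive.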

We had this scenario, and it is not easily solvable. Our solution was to
run "rwhod" on all Unix machines. One broadcast every 30 seconds, CAM
table refresh, and no flooding :-)
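
Where rwhod is not available, the same effect could presumably be had from
any small periodic broadcaster. The following is a sketch, not what we ran:
the function names, the payload, and the choice of UDP port 9 (discard) are
placeholders. All that matters to the switches is an Ethernet broadcast
carrying the host's source MAC.

```python
import socket
import time


def make_broadcast_socket():
    """Return a UDP socket that is allowed to send to a broadcast address."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    return sock


def run_keepalive(count, interval_s=30.0, addr="255.255.255.255",
                  port=9, payload=b"cam-refresh"):
    """Send `count` broadcasts, one every interval_s seconds.

    addr, port and payload are placeholders; pick an interval shorter
    than the CAM aging time of the switches in the path.
    """
    sock = make_broadcast_socket()
    try:
        for i in range(count):
            sock.sendto(payload, (addr, port))
            if i + 1 < count:
                time.sleep(interval_s)
    finally:
        sock.close()
```

Calling run_keepalive(3) would send three broadcasts over one minute; a
real deployment would loop indefinitely under whatever supervisor the host
already uses.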

Every way of changing the topology means either "give up redundancy" or
"just move the problem around until it manifests for some other host in
some other traffic pattern".

There is no "proper" solution with the way switched Ethernet works today -
it's intrinsically broken. My favourite idea would be to have a kind of
"layer 2 SPF" protocol spoken on inter-switch links that will properly
flood links and MAC addresses, and will (as a side effect) do away with
"blocked" links due to STP (and use all available links efficiently).
It's tricky, but I'm convinced it's doable (loop avoidance and duplicate
prevention being done by MAC RPF, as in multicast flooding), but
unfortunately there is no implementation yet - and I doubt there will be
one. Switches are tricky business, and this will cost real money to
develop.

gert

-- 
USENET is *not* the non-clickable part of WWW!
                                                           //www.muc.de/~gert/
Gert Doering - Munich, Germany                             gert@greenie.muc.de
fax: +49-89-35655025                        gert.doering@physik.tu-muenchen.de

This archive was generated by hypermail 2b29 : Sun Aug 04 2002 - 04:13:12 EDT