[f-nsp] [FastIron] RIP routes does not get into CAM

Youssef Ghorbal youssef.ghorbal at gmail.com
Sat Oct 2 21:26:23 EDT 2010


On Sat, Oct 2, 2010 at 11:51 PM, Heath Jones <hj1980 at gmail.com> wrote:
>> >  I have a FastIron 800, learning routes from a RIP neighbor and having
>> > some directly attached networks.
>> >  I suspect that the network is facing an IP scan of all the IPs in my
>> > prefixes, and that seems to be disturbing the FastIron a little:
>> >  - Directly attached networks have a PING RTT around 1ms which is normal.
>> >  - Networks learned from RIP have a PING RTT around 30ms which is not
>> > normal. The RIP neighbor is directly attached to the FastIron and the
>> > link is tested and good (direct PING between neighbors is <1ms)
>
> Hi Youssef,

Hi Heath,

> I'm not familiar with this architecture specifically to that level, but
> I'll throw in what I do know generically.
> Apologies in advance if you already know this..
>
> CAM is used for MAC address matching, TCAM is used for IP prefix matching.
> The distinction is important, given the scenario you think is occurring.
>
> If you have IP packets coming in from a single path, the MAC address
> should not change.
> Also, I would imagine this device already knows the MAC addresses for
> all of your local neighboring devices. Therefore there should be no
> issue with the CAM.

That's what I think too.

> The TCAM is based on routing information, it doesn't learn in the same
> way. It is built from all of your RIP routes for instance and will not
> dynamically change depending on source / destination traffic.

From the experiments I ran today, the TCAM gets filled with routes as
soon as packets start flowing.
step 1 : TCAM empty
$> show cam ip 3/12
Slot Index      IP_Address            MAC        Age    VLAN    Out Port
step 2 : a ping from a machine connected behind port 3/12 to an
address in some prefix (let's say 192.168.9.3)
step 3 : the TCAM gets filled with a route to 192.168.9.0/24
$> show cam ip 3/12 192.168.9.0 255.255.255.255
Slot Index      IP_Address            MAC        Age    VLAN    Out Port
  3  10348      192.168.9.0/24  001b.ed24. 5     0     128        mgmt

Of course 192.168.9.0/24 is a prefix learnt from RIP, but when the
entry ages out it disappears from the TCAM.
The first packet of a flow requires more processing (lookup, TCAM
update), but the rest of the flow requires no such processing.
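
To make the behaviour I am describing concrete, here is a rough Python
sketch of a demand-populated forwarding cache with aging. It is only a toy
model of what I observe, not how the FastIron actually implements its TCAM.
The class name, the max_age value and the "next hop" labels are all made up
for illustration, and it assumes the prefixes in the routing table do not
overlap.

```python
import ipaddress


class RouteCache:
    """Toy model: entries are installed on first use and removed when idle."""

    def __init__(self, rib, max_age=60):
        # rib stands in for the RIP-learned routing table: prefix -> next hop.
        self.rib = {ipaddress.ip_network(p): nh for p, nh in rib.items()}
        self.max_age = max_age      # hypothetical aging time, in seconds
        self.cache = {}             # network -> (next_hop, last_used)
        self.slow_path_hits = 0     # lookups that needed the routing table

    def expire(self, now):
        """Drop cache entries that have been idle for longer than max_age."""
        stale = [n for n, (_, used) in self.cache.items()
                 if now - used > self.max_age]
        for net in stale:
            del self.cache[net]

    def forward(self, dst, now):
        """Return the next hop for dst, installing a cache entry if needed."""
        addr = ipaddress.ip_address(dst)
        self.expire(now)

        # Fast path: an entry covering dst is already installed.
        for net, (next_hop, _) in self.cache.items():
            if addr in net:
                self.cache[net] = (next_hop, now)   # refresh the age
                return next_hop

        # Slow path: longest-prefix match in the routing table, then install.
        self.slow_path_hits += 1
        matches = [n for n in self.rib if addr in n]
        if not matches:
            return None
        best = max(matches, key=lambda n: n.prefixlen)
        self.cache[best] = (self.rib[best], now)
        return self.rib[best]


cache = RouteCache({"192.168.9.0/24": "B"}, max_age=60)
cache.forward("192.168.9.3", now=0)     # first packet: slow path, entry installed
cache.forward("192.168.9.4", now=1)     # same prefix: served from the cache
cache.forward("192.168.9.3", now=100)   # entry aged out: slow path again
```

In this model only the first packet after an idle period pays for the
routing table lookup, which matches what I see with "show cam ip".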

> To understand the problem a bit better, what is the topology?
> You have this device (A) connected to another device (B), that is
> advertising RIP prefixes to (A).
> Pinging A-B is <1ms, pinging anything beyond B from A results in 30ms?
> How utilised is the link from B to beyond? Is it a single ethernet
> with dot1q going out to an edge?
> Running through my mind is perhaps CPU is being chewed on that/those
> edge devices, with ICMP replies, or ICMP unreachables etc...

The topology is quite simple: two devices (A and B), both running Layer
2 and Layer 3 and exchanging routing information using RIP. The two
are connected with a single Ethernet link.
Network prefixes are distributed across both devices in a naive way
(192.168.0.1/24 is a ve on device A, 192.168.1.1/24 is a ve on device
B... etc) with a future VRRP setup in mind.
At Layer 2, A and B act as core switches to which the access switches connect.
At Layer 3, A and B act as default gateways for the prefixes.
A is the core switch of Building A and B is the core switch of
Building B, and both are connected with a fiber.

I succeeded in reproducing the ping RTT issue in this scenario:
M1 : 192.168.1.3/24 is in building A (with a default gateway on B)
M2 : 192.168.2.3/24 is in building A (with a default gateway on B)
M3 : 192.168.3.4/24 is in building A (with a default gateway on A)
M4 : 192.168.3.3/24 is in building A

t=0 : M3 pings M2 and RTT is around 1ms.
t=1 : M1 starts a big FTP transfer (at a rate of 50Mb/s) to M4
t=2 : M3 pings M2 and RTT is around 30ms !
t=3 : M1 stops the FTP transfer
t=4 : M3 pings M2 and RTT is back to 1ms.
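
For anyone who wants to repeat this timeline and compare the phases, a
small Python helper like the one below could summarize the RTTs from saved
ping output. It is only a sketch: it assumes the Linux-style "time=1.02 ms"
format in each reply line, the function name is made up, and the sample
lines are invented to show the format, they are not my real measurements.

```python
import re
import statistics

# Matches the RTT field of a Linux-style ping reply, e.g. "time=1.02 ms".
RTT_RE = re.compile(r"time[=<]([\d.]+)\s*ms")


def summarize_rtts(ping_output):
    """Return count/min/avg/max of the RTTs found in ping_output, or None."""
    rtts = [float(value) for value in RTT_RE.findall(ping_output)]
    if not rtts:
        return None
    return {
        "count": len(rtts),
        "min": min(rtts),
        "avg": statistics.mean(rtts),
        "max": max(rtts),
    }


# Invented sample lines, only to illustrate the expected input format.
sample = (
    "64 bytes from 192.168.2.3: icmp_seq=1 ttl=63 time=1.0 ms\n"
    "64 bytes from 192.168.2.3: icmp_seq=2 ttl=63 time=3.0 ms\n"
)
summary = summarize_rtts(sample)
```

Running it once on the output captured before the transfer and once on the
output captured during the transfer gives two summaries that can be
compared directly.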

In a nutshell, under heavy load, the flow :
machine 1 -> A(L3) -> B(L3) -> A(L2) -> machine 2 is very very slow.
machine 1 -> A(L3) -> machine 2 is fast
machine 1 -> A(L3) -> B(L3) -> machine 2 is fast
machine 1 -> B(L3) -> machine is fast
A -> B is fast
A(L3) -> B(L3) -> machine is fast (a ping from the switch itself)
B(L3) -> A(L3) -> machine is fast
A(L3) -> B(L3) -> A(L2) -> machine is fast when the source interface of
the ping is the one between A and B, otherwise it's very slow
B(L3) -> A(L3) -> B(L2) -> machine is fast whatever source IP is used
for the ping

I don't know what other tests I can do to isolate the problem :(

Youssef Ghorbal

PS : A is a FastIron 800 and B is an MLX-16



