[c-nsp] ARP strangeness

Rodney Dunn rodunn at cisco.com
Wed Jan 5 08:35:28 EST 2011



On 1/5/11 2:01 AM, Frank Bulk - iName.com wrote:
> Rodney:
>
> I can't recall seeing that ARP refresh documented anywhere, but it's
> obviously happening.

Yeah. ARP is a lot more complicated than folks think. ;)


>
> I agree with you about the race condition.  Yesterday I set the FTTH MAC
> timeout to be 2 times the ARP refresh interval plus 60 seconds (to give me
> some room if the ARP unicast refresh doesn't happen exactly after 7
> minutes).

Right. Any time you have two timers that pop on the same interval (say 60
sec) it gets dangerous due to timer jitter: either one could fire after
the other. That's why historically we never recommended ARP timers less
than 120 seconds, as the CEF adj timer and ARP timer ran on 60 second
intervals.
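
To put numbers on the race, here is a rough Python sketch using only the
values quoted in this thread (480 second ARP timeout, refresh 60 seconds
before expiry, 7 minute FTTH MAC timeout). The function names and the simple
"arrival time versus MAC timeout" model are assumptions made for
illustration; they are not taken from IOS or from the FTTH vendor.

```python
# Values quoted in this thread; all function names are hypothetical.
ARP_TIMEOUT = 480    # seconds, ARP timeout configured on the 7609
REFRESH_LEAD = 60    # seconds before expiry that the unicast refresh goes out


def refresh_interval(arp_timeout=ARP_TIMEOUT, lead=REFRESH_LEAD):
    """Seconds between the last ARP reply and the next unicast refresh."""
    return arp_timeout - lead


def downstream_mac_timeout(interval, margin=60):
    """Frank's rule of thumb: two refresh intervals plus a safety margin."""
    return 2 * interval + margin


def refresh_is_forwarded(interval, mac_timeout, jitter=0.0, missed=0):
    """Assumed model: the downstream MAC entry was last refreshed at t=0 by
    the CPE's previous ARP reply, and a refresh is only forwarded if it
    arrives before that entry ages out. `missed` counts lost refreshes."""
    arrival = (missed + 1) * interval + jitter
    return arrival < mac_timeout


interval = refresh_interval()                 # 420 s, i.e. 7 minutes
safe_timeout = downstream_mac_timeout(interval)  # 900 s, i.e. 15 minutes

# Equal timers: any positive jitter loses the race.
print(refresh_is_forwarded(interval, 420, jitter=1.0))
# 15 minute MAC timeout: survives one missed refresh plus 30 s of jitter.
print(refresh_is_forwarded(interval, safe_timeout, jitter=30.0, missed=1))
```

With a 7 minute MAC timeout downstream, the refresh and the MAC aging land on
the same second, so jitter decides the outcome. With 15 minutes there is room
for one lost refresh.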


>
> Broadcast ARPs, due to the FTTH's L2 security protection mechanisms, are not
> passed on.  Once the MAC entry ages out of the FTTH, it has to relearn it
> from the CPE (I think via ARP) before it will pass thru certain types of
> traffic.

Whoaa....ok. That means you could have a broken spoke if the main
traffic is out towards the CPE and there is no CPE-originated traffic.
Unlikely I'm sure but not impossible.

>
> The rest of what you're describing about ARP expiration makes sense.
>

Ok. There were some arp code optimizations that I'd have to dig back up 
and see if it changed any of the legacy behavior but I don't think it did.

What is really needed here is a SPAN of the egress traffic on the 7600
port, watching for those unicast refreshes. If they aren't sent, it's a
bug on the 7600. If they are, then it's likely an issue downstream.
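
As a rough aid for reading such a capture, the sketch below computes when the
two refreshes would be expected (60 seconds before the ARP timer expires, and
again at expiry) and reports which ones are absent from a list of observed
unicast ARP request timestamps. It assumes the CPE never answers the first
refresh, so both are due. The function names, the 5 second tolerance, and the
idea of feeding in timestamps by hand are illustrative assumptions, not part
of any Cisco tool.

```python
def expected_refresh_times(last_reply, arp_timeout=480, lead=60):
    """Times (seconds) at which the two unicast refreshes would be expected,
    measured on the same clock as `last_reply` (the last ARP reply seen)."""
    return [last_reply + arp_timeout - lead, last_reply + arp_timeout]


def missing_refreshes(last_reply, observed, arp_timeout=480, lead=60,
                      tolerance=5.0):
    """Return the expected refresh times that have no observed unicast ARP
    request within `tolerance` seconds. `observed` is a list of timestamps
    taken from the SPAN capture."""
    missing = []
    for expected in expected_refresh_times(last_reply, arp_timeout, lead):
        if not any(abs(seen - expected) <= tolerance for seen in observed):
            missing.append(expected)
    return missing


# Both refreshes present in the capture: nothing missing.
print(missing_refreshes(0.0, [421.0, 479.5]))
# Second refresh never seen on the wire.
print(missing_refreshes(0.0, [421.0]))
```

An empty result with no ARP reply in the capture would point downstream; a
non-empty result on a SPAN taken directly off the port would point at the
7600.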


Rodney


> Thanks,
>
> Frank
>
> -----Original Message-----
> From: Rodney Dunn [mailto:rodunn at cisco.com]
> Sent: Tuesday, January 04, 2011 8:01 PM
> To: frnkblk at iname.com
> Cc: 'Keegan Holley'; cisco-nsp at puck.nether.net
> Subject: Re: [c-nsp] ARP strangeness
>
> On 1/3/11 11:13 PM, Frank Bulk - iName.com wrote:
>> The 7609 does stop ARPing after receiving a reply from the CPE, but the
>> 7609 ARPs again 7 minutes later.  One person told me off-list that Cisco
>> doesn't expire an ARP entry before checking its ARP entries by doing an
>> ARP request.  Since ARP timeout is set for 8 minutes, perhaps Cisco's
>> approach is to ARP the host one minute before expiration.
>
> There were some changes to ARP at one point to provide some more
> triggered capability. I don't recall exactly what that was but the
> default behavior for many years was that we send a unicast arp to the
> destination 60 seconds prior to the arp timer set to expire. If we don't
> get a response we send it again when the timer pops and if no response
> we invalidate the ARP entry.
>
>>
>> Because the FTTH product has its own smart-L2 implementation with a MAC
>> address expiration time of 7 minutes, it won't forward directed or
>> broadcast ARP requests from the 7609 once the CPE's MAC address has
>> expired from the FTTH's MAC table.
>
> I wonder if you are missing one and then getting in to a race condition
> of the FTTH product not forwarding. If so it would seem you need to set
> the arp refresh down to a value less than the timeout of the transport gear.
>
>
>> Once the CPE goes deaf, even a full DHCP exchange doesn't "wake up" the
>> connection.  Only power-cycling the BEFSR41 resolves the issue.  The
>> difference between a power cycle and a full DHCP exchange is that the
>> BEFSR41 does an ARP request for the default router (7609) after the
>> DHCP exchange.
>
> If you lose the arp from the 7609 and you ping from it directly we
> should send a broadcast arp.
>
> Not sure where you are getting the packet capture but a span directly
> off the port is what you need to see if we don't send those two
> refreshes (one 60 seconds prior to the timer expiring and the second on
> the timer expiring). If we do send those and no response then data
> driven from the 7609 (packets headed towards that remote IP) should
> trigger the arp broadcast out again.
>
> Rodney
>
>
>>
>> I've extended the MAC address expiration time of the FTTH link to 15
>> minutes (two times the 7 minutes plus 1 minute) and we'll see how that
>> goes.
>>
>> Frank
>>
>> -----Original Message-----
>> From: Keegan Holley [mailto:keegan.holley at sungard.com]
>> Sent: Monday, January 03, 2011 7:14 PM
>> To:<frnkblk at iname.com>
>> Cc:<cisco-nsp at puck.nether.net>
>> Subject: Re: [c-nsp] ARP strangeness
>>
>> The 7600 should stop arping if it has gotten a reply.  What happens when
>> you ping your test CPE both from the router and from another node behind
>> it?  You can also run the command sh ip cef exact-match to see what the
>> router is doing with a specific source-destination ip pair. If nothing
>> strange pops up then it's probably a bug.
>>
>> Sent from my iPhone
>>
>> On Jan 3, 2011, at 12:58 PM, Frank Bulk <frnkblk at iname.com> wrote:
>>
>>> We have over a thousand FTTH customers hanging off a VLAN on our 7609-S
>>> running 12.2SRE3.  Those who have Linksys BEFSR41 (wired-only routers)
>>> are complaining about lack of Internet access after many hours or days
>>> of idle time (not using their PC or other devices).  Those who have
>>> Linksys WRT54G (wireless) have no complaints (my guess is that they're
>>> sending packets out regularly).
>>>
>>> We replicated this in our CO and put a hub between the ONT and the
>>> Linksys CPE so that we could capture those packets.  What we're seeing
>>> in that capture are directed ARP requests every 7 minutes from the 7609
>>> to the Linksys with an ARP response from the Linksys.  After many hours,
>>> the 7609-S stops sending the ARP requests (well, at least we're not
>>> seeing it come in, perhaps it did try).
>>>
>>> We currently have our ARP timeout set to 480 seconds and MAC address
>>> table aging time to 540 seconds.  Why?  We use "mac-address-table
>>> synchronize" which is set to 160 seconds by default.  The recommendation
>>> from that command is to set ARP three times that, so that would be 480.
>>> But it's also recommended that the MAC address table aging time be
>>> greater than the ARP timeout, so we added another 60 seconds on top.
>>>
>>> Two questions:
>>> - why is the 7609 sending any directed ARP requests at all, every 7
>>>   minutes?
>>> - why does it appear to stop sending them after many hours?
>>>
>>> I'm all ears if we should be using different expiration values, but the
>>> numbers I'm using are based on reading a lot of cisco-nsp archives and
>>> Cisco tech articles.
>>>
>>> Frank
>>>
>>> _______________________________________________
>>> cisco-nsp mailing list  cisco-nsp at puck.nether.net
>>> https://puck.nether.net/mailman/listinfo/cisco-nsp
>>> archive at http://puck.nether.net/pipermail/cisco-nsp/
>>>
>>
>>
>