[c-nsp] ECMP failing over time?

John Neiberger jneiberger at gmail.com
Sun Oct 3 01:10:21 EDT 2010


This is entirely multicast. We used s-g-hash basic to lock each S,G to a
link, but we didn't think it through. We really should have started
out using the next-hop-based hash so that the same S,G can be served
by any link in the group. With s-g-hash basic, a given S,G always gets
locked to the same bundle.
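
To make the polarization argument concrete, here is a rough sketch of two
cascaded routers, each choosing among four equal-cost links. It is a toy
model under stated assumptions, not how IOS actually computes either hash:
toy_hash, the addresses, and the "highest score wins" rule for the
next-hop-based variant are all made up for illustration.

```python
import hashlib
from collections import Counter

N_LINKS = 4


def toy_hash(*fields):
    """Deterministic toy hash. NOT the algorithm any real router uses."""
    data = "|".join(fields).encode()
    return int.from_bytes(hashlib.sha256(data).digest()[:4], "big")


def pick_basic(source, group, next_hops):
    # Only S and G feed the hash, so every router makes the same choice.
    return toy_hash(source, group) % len(next_hops)


def pick_next_hop_based(source, group, next_hops):
    # Assumed variant: score each candidate next hop with (S, G, next hop)
    # and take the highest, so routers with different next hops differ.
    scores = [toy_hash(source, group, nh) for nh in next_hops]
    return scores.index(max(scores))


def second_tier_usage(pick, flows, router=0):
    """Count link usage on one second-tier router for the flows it receives."""
    tier1_hops = [f"192.0.2.{k}" for k in range(1, N_LINKS + 1)]
    tier2_hops = [f"198.51.100.{router * N_LINKS + k}"
                  for k in range(1, N_LINKS + 1)]
    usage = Counter()
    for s, g in flows:
        if pick(s, g, tier1_hops) == router:  # first tier sent it here
            usage[pick(s, g, tier2_hops)] += 1
    return usage


# Hypothetical (S, G) pairs, purely for illustration.
FLOWS = [(f"10.0.{i // 256}.{i % 256}", f"232.1.{i // 256}.{i % 256}")
         for i in range(400)]

if __name__ == "__main__":
    print("basic:         ", dict(second_tier_usage(pick_basic, FLOWS)))
    print("next-hop-based:", dict(second_tier_usage(pick_next_hop_based, FLOWS)))
```

In this model the basic hash puts every flow that reaches the second router
on a single link, because the first router already filtered on the same hash
result. Mixing the next hop into the hash breaks that correlation.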

However, I just thought of another potential culprit. I'm going to
have to think it through, though.

On Sat, Oct 2, 2010 at 10:17 PM, Keegan Holley
<keegan.holley at sungard.com> wrote:
> I've seen similar effects.  I'm not sure there's a method to evenly
> distribute traffic for an indefinite period.  I'm also not sure what you're
> routing, but the problems I've seen are usually caused by the fact that each
> flow/hash result differs in size and duration.  Adding extra variables to
> the equation always helps, but it's almost impossible to keep an even
> spread.  I suppose your current goal is to simply stop the outages though.
>
>
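
A quick illustration of Keegan's point that flows differ in size: even if
the hash spreads the number of flows perfectly, the load per link can still
be lopsided. The stream rates below are invented, and the modulo assignment
is only a stand-in for an ideal per-flow hash.

```python
def spread(rates_mbps, n_links):
    """Assign flows to links evenly by count; return (counts, load) per link."""
    counts = [0] * n_links
    load = [0] * n_links
    for i, rate in enumerate(rates_mbps):
        link = i % n_links  # stand-in for a perfectly even per-flow hash
        counts[link] += 1
        load[link] += rate
    return counts, load


# Hypothetical per-stream rates in Mb/s.
RATES_MBPS = [2, 2, 15, 4, 2, 19, 4, 8]

if __name__ == "__main__":
    counts, load = spread(RATES_MBPS, 4)
    print("flows per link:", counts)  # [2, 2, 2, 2]
    print("Mb/s per link: ", load)    # [4, 21, 19, 12]
```

Two flows land on every link, yet the busiest link carries about five times
the traffic of the quietest one.
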
> On Sat, Oct 2, 2010 at 7:17 PM, John Neiberger <jneiberger at gmail.com> wrote:
>>
>> I hate to answer my own question, but I think I figured it out. We're
>> using s-g-hash basic, which is prone to polarization. I think that's
>> what we're seeing. Our traffic has become polarized and has developed
>> an affinity for a subset of links in our "bundles". I'm recommending
>> that we switch to s-g-hash next-hop-based to see if that resolves the
>> problem.
>>
>> On Sat, Oct 2, 2010 at 2:18 PM, John Neiberger <jneiberger at gmail.com>
>> wrote:
>> > We converted several connections last week from Etherchannels to
>> > routed links with ECMP. We verified that traffic was load-sharing over
>> > those links after making the change. Now, a week later, we are seeing
>> > instances where traffic is preferring one or two links out of each
>> > "bundle". In some cases all the traffic is flowing over a single link
>> > in a four-link setup. This is overloading those connections and we
>> > can't figure out why. We are using s-g-hash basic. Should we switch to
>> > s-g-hash next-hop-based?
>> >
>> > This is causing production issues right now, so I've opened up a TAC
>> > case, but I thought I'd ask here, as well, just in case someone had
>> > seen this before.
>> >
>> > Thanks,
>> > John
>> >
>> _______________________________________________
>> cisco-nsp mailing list  cisco-nsp at puck.nether.net
>> https://puck.nether.net/mailman/listinfo/cisco-nsp
>> archive at http://puck.nether.net/pipermail/cisco-nsp/
>>
>>
>
>


