[j-nsp] Verifying Juniper ECMP

philxor at gmail.com philxor at gmail.com
Sat Aug 9 06:57:04 EDT 2014


Hey Chris, I have done a bit of testing with ECMP.  


It is absolutely stateless with no caching.  If you add new links, existing flows will be rebalanced across them immediately.  This holds whether it is standard L3 ECMP or an aggregate bundle; it works the same in either case.  Also, as you said, given the same input criteria, traffic will go out the same link.  The exception is adaptive load balancing, which tweaks the hash in real time based on interface load to try to spread traffic more evenly when you have large flows.
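
To make the "stateless, no caching" point concrete, here is a rough Python sketch. It is not the actual Junos hash: the SHA-256-modulo selection, the `pick_link` helper, the interface names, and the flow tuples are all assumptions made up for illustration. It only shows that when the link is recomputed from the packet fields and the current member list, adding a member moves some existing flows.

```python
import hashlib

def pick_link(flow, links):
    """Stateless choice: hash the flow key and index into the current link list.
    Hypothetical stand-in for a hardware hash; nothing is remembered per flow."""
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    return links[int.from_bytes(digest[:4], "big") % len(links)]

# 1000 made-up flows: (src, dst, protocol, src port, dst port)
flows = [("10.0.0.%d" % (i % 250), "192.0.2.1", 6, 1024 + i, 80) for i in range(1000)]

links_before = ["ge-0/0/0", "ge-0/0/1", "ge-0/0/2"]
links_after = links_before + ["ge-0/0/3"]   # one new member added

before = {f: pick_link(f, links_before) for f in flows}
after = {f: pick_link(f, links_after) for f in flows}

moved = sum(1 for f in flows if before[f] != after[f])
on_new_link = sum(1 for f in flows if after[f] == "ge-0/0/3")
print("flows that changed link:", moved)
print("flows now on the new link:", on_new_link)
```

With a plain modulo like this, more flows move than just the ones landing on the new member. How many move on a real box depends on how the platform maps hash results to members, which this sketch does not try to model.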


Most of the platforms have 64-way ECMP.  You have to explicitly configure it as 16, 32, or 64.  I am not aware of a single doc listing maximums, but the “maximum-ecmp” command has some info.  


Phil

From: Chris Woodfield
Sent: Friday, August 8, 2014 4:56 PM
To: Masood Ahmad Shah
Cc: juniper-nsp at puck.nether.net List

Hi,

Just noticed this thread because I had similar questions. Follow-up questions inline:

On Apr 19, 2014, at 3:54 AM, Masood Ahmad Shah <masoodnt10 at gmail.com> wrote:

> See inline, prefixed [Masood] ...
> 
> 
> On Thu, Apr 17, 2014 at 1:09 AM, John Neiberger <jneiberger at gmail.com> wrote:
> 
>> Another question: if a link in an ECMP "bundle" goes down and then comes
>> back up later, do things end up hashed and balanced the same way they were
>> prior to the link going down, or is there some amount of randomness to it?
>> 
> 
> [Masood] You may not see traffic balanced instantly, because existing flows
> will NOT move to the newly added member. Only new flows will get hashed
> across the members, and then the new member will have its fair share of good
> luck :) However, the following things may happen and make load balancing
> more fun:
> 
> 1. incorrect load balancing by aggregate next hops
> 2. incorrect packet hash computation
> 3. insufficient variance in the packet flow
> 4. incorrect pattern selection
> 

So let's say I'm standing up a set of servers downstream from the device that are all handling TCP traffic for a single VIP and advertising that IP upstream. If I stood up a new device, I'd expect that if the hash was stateless, packets going to pre-existing servers would now go to the new device, breaking those TCP sessions. Are you saying that's not the case? If so, does that mean that the flow-to-next-hop mappings are cached?

If true, this is actually good news, and I'd love to see if someone from Juniper can verify that this is the case across most/all Juniper platforms (MX, EX, T, ...) or point me to docs that show how different platforms handle ECMP internally.

> You may look for "Adaptive Load Balancing", a Juniper method to balance
> traffic across LAG members (which focuses more on the weights, the bandwidth,
> and the packet stream of the link), but that has its own pros and cons.
> 
> 
>> If I check a certain flow and see that it is hashed to a particular link,
>> is it a fair bet that it was hashed to that same link prior to the link
>> going down?
>> 
> 
> [Masood] AFAIK, #Junos does not keep track of it and I wonder if any other
> vendor would do that.
> 
> 

I'd expect that even without tracking, a stateless hashing algorithm would come up with the same answer given the same input. So as long as the next-hops are identical and the incoming packet has the same src/dst address and ports, the outgoing next-hop shouldn't change. The question here, based on the above comment, is whether this hashing is, in fact, stateless or not.
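
To spell out what I mean, here is a small Python sketch of the expectation. It is purely illustrative: the `next_hop` function, the SHA-256-modulo selection, the next-hop names, and the addresses are my own assumptions, not how any Juniper platform actually computes its hash. It also assumes the next-hop list comes back in the same order after the link is restored.

```python
import hashlib

def next_hop(src, dst, sport, dport, next_hops):
    """Pure function of the packet fields and the next-hop list; keeps no state."""
    key = f"{src}|{dst}|{sport}|{dport}".encode()
    digest = hashlib.sha256(key).digest()
    return next_hops[int.from_bytes(digest[:4], "big") % len(next_hops)]

full_set = ["nh-a", "nh-b", "nh-c"]               # hypothetical next-hops
flow = ("198.51.100.7", "192.0.2.1", 40000, 443)  # hypothetical flow

first = next_hop(*flow, full_set)

# nh-b goes down: the flow is hashed over the two survivors.
during_outage = next_hop(*flow, [nh for nh in full_set if nh != "nh-b"])

# nh-b comes back: same inputs, same next-hop list, so same answer as before.
restored = next_hop(*flow, full_set)

print(first, during_outage, restored)
```

If the hashing really is stateless, `first` and `restored` have to match, with no tracking needed.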

Followup question: Does Juniper have a doc that lists the different maximum number of paths available for ECMP on various platforms?

Thanks,

-C


>> 
>> Thanks,
>> John
>> 
>> 
>> On Tue, Apr 15, 2014 at 12:07 PM, John Neiberger <jneiberger at gmail.com> wrote:
>> 
>>> Holy cow. I never would have figured that one out, and the two Juniper
>>> engineers I asked had no idea how to do it. I appreciate the help!
>>> 
>>> Thanks,
>>> John
>>> 
>>> 
>>> On Tue, Apr 15, 2014 at 3:50 AM, Olivier Benghozi <olivier.benghozi at wifirst.fr> wrote:
>>> 
>>>> Hi John,
>>>> 
>>>> As usual with Juniper, it's ridiculously overcomplicated. David Roy wrote
>>>> a fine article about that, at least for MX with DPC:
>>>> 
>>>> 
>>>> http://www.junosandme.net/article-junos-load-balancing-part-3-troubleshooting-109382234.html
>>>> 
>>>> 
>>>> Olivier
>>>> 
>>>> On 15 Apr 2014, at 04:01, John Neiberger <jneiberger at gmail.com> wrote:
>>>>> I know that ECMP is, by default, based on a hash of source and destination
>>>>> IP address, and I know that we can see the available paths by doing "show
>>>>> route forwarding-table destination <prefix>", but is there a way to
>>>>> determine which path a particular flow is using?
>>>>> 
>>>>> For those of you familiar with Cisco, I'm looking for an equivalent to
>>>>> "show cef exact-route".
>>>> 
>>>> _______________________________________________
>>>> juniper-nsp mailing list juniper-nsp at puck.nether.net
>>>> https://puck.nether.net/mailman/listinfo/juniper-nsp
>>> 
>>> 
>>> 
