[j-nsp] MPLS VPN Load-balancing
Christian Martin
christian.martin at teliris.com
Wed Aug 12 14:13:02 EDT 2009
We made the traffic change and now the balancing is even. I am going
to change the config to label-1 payload ip (only) and then shift it
back and see if we can in fact hash on the single label and the IP
header. Hopefully this will suffice!
Cheers,
Chris
On Aug 12, 2009, at 12:50 PM, Harry Reynolds wrote:
> I think the issue is that on ABC/M-series we can either do an MPLS
> hash (up to two labels), or an IP/L4 hash. It does not seem that
> you can do both, but I read somewhere that if you only do a 1-label
> hash then ABC/M-series can hash on the IP. It's so hard to keep this
> straight. Good thing there is only one Junos, else all hope would be
> lost. ;)
>
> T-series platforms with E-FPCs and MX can hash on multiple MPLS
> labels while *also* hashing on L3 and L4.
>
> This seems to jibe with the docs at:
>
> http://www.juniper.net/techpubs/en_US/junos9.6/information-products/topic-collections/config-guide-policy/policy-configuring-load-balancing-based-on-mpls-labels.html
>
> Regards
>
> -----Original Message-----
> From: Christian Martin [mailto:christian.martin at teliris.com]
> Sent: Wednesday, August 12, 2009 9:25 AM
> To: Harry Reynolds
> Cc: Steven Brenchley; juniper-nsp at puck.nether.net
> Subject: Re: [j-nsp] MPLS VPN Load-balancing
>
> Thanks, Harry.
>
> I just checked our routing and noticed that the traffic was entering
> the Juniper via a transit MPLS link to another PE, so the VPN label
> is the only label on the stack due to PHP. As such, if what Steven
> mentions is true re: ABC-chip hardware, then there is no entropy as
> the VPN label is the same. We are shifting the traffic to allow it
> to ingress as IPv4 to see if that changes anything.
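
The no-entropy point above can be illustrated with a minimal Python sketch. This is not Juniper's hashing algorithm, which is not public: `zlib.crc32` stands in as a hypothetical hash, and the label value, the source addresses, the 5 x 5 flow layout, and the function names are all assumptions made for illustration only.

```python
import zlib
from collections import Counter

NUM_LINKS = 2    # the two T3 links
VPN_LABEL = 74   # example label value; identical for every flow in the VRF


def pick_link(key: bytes) -> int:
    """Map a hash key to a link index (crc32 is a stand-in, not the real hash)."""
    return zlib.crc32(key) % NUM_LINKS


# Hypothetical flows: 5 source hosts talking to 5 destination hosts.
flows = [(f"10.1.1.{s}", f"10.160.2.{d}")
         for s in range(1, 6) for d in range(1, 6)]


def spread(include_ip: bool) -> Counter:
    """Count flows per link when the key is the label alone,
    or the label plus the IP source/destination addresses."""
    counts = Counter()
    for src, dst in flows:
        key = str(VPN_LABEL)
        if include_ip:
            key += f"|{src}|{dst}"
        counts[pick_link(key.encode())] += 1
    return counts


label_only = spread(include_ip=False)
label_and_ip = spread(include_ip=True)
print("label only :", dict(label_only))
print("label + IP :", dict(label_and_ip))
```

With the label as the only input, every flow produces the same key and therefore the same link, whatever hash function is used. Adding the IP addresses to the key gives the hash something that varies per flow, which is the behaviour the "label-1 payload ip" setting is hoped to provide.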
>
> There are several VRFs that transit the link, each with a few source
> subnets and a single destination subnet. In these cases, the
> traffic rates are low and periodic (< 500kbps every 10 minutes or
> so, spread around). The meaningful traffic is exchanged between 5
> IPs on two subnets (subnet A .1, .2, .3, .4, .5 to subnet B .1, .2,
> .3, .4, .5). Each stream is around 6Mbps.
>
> Do you know if there is a CFEB upgrade that uses a new chip
> architecture that would support a deeper hash key?
>
> Cheers,
> Chris
>
>
> On Aug 11, 2009, at 5:27 PM, Harry Reynolds wrote:
>
>> "but one particular subnet pair is exchanging quite a bit of
>> traffic). All of the addresses are unique within our domain"
>>
>> Can you clarify the nature of your test traffic to the busy subnet in
>> question?
>>
>> 1. Number of VRF ingress interfaces?
>> 2. Number of source IPs?
>> 3. Number of destination IPs within that subnet?
>>
>> Basically, how many flows do you have heading to that busy subnet?
>>
>> IIRC, as an ingress node we would do an IP hash, and this should use
>> the incoming interface and IP S/D addresses by default. On some
>> platforms when you start adding additional hash inputs, such as an L4
>> port, you may start eating into the number of IP address bits that
>> are actually hashed. Meaning, you gain port entropy but lose some
>> granularity at the IP address level. Hence these knobs have a bit of
>> variance for each person that tests them, as their flow specifics may
>> or may not benefit from some combination.
>>
>> As always, the more variance and the larger the number of streams the
>> better the expected results. If there is a single pair of busy
>> speakers on that subnet that would explain things. If you had 10 (or
>> more) more or less equally active streams, and still had the results
>> below, then that would seem broken IMO. I may be wrong, but if you
>> only have 3 such streams, then each stream is independently hashed,
>> and there is a 12.5% chance (.50 x .50 x .50) that all three land on
>> one given link, so a 25% chance you will get unlucky and find all
>> three on the same link (either one).
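
The odds mentioned above can be checked by brute-force enumeration. The sketch below assumes each flow picks one of two links independently and uniformly, which is an idealisation and not a statement about the actual hardware hash; the function names are made up for illustration. It separates two cases: all flows on one particular link, and all flows on the same link regardless of which one.

```python
from itertools import product


def specific_link_probability(num_flows: int, num_links: int = 2) -> float:
    """Probability that every flow lands on one particular, pre-chosen link."""
    return (1 / num_links) ** num_flows


def same_link_probability(num_flows: int, num_links: int = 2) -> float:
    """Probability that every flow lands on the same link (any of them),
    found by enumerating all equally likely flow-to-link assignments."""
    outcomes = list(product(range(num_links), repeat=num_flows))
    all_same = sum(1 for outcome in outcomes if len(set(outcome)) == 1)
    return all_same / len(outcomes)


print(specific_link_probability(3))  # 0.125
print(same_link_probability(3))      # 0.25
print(same_link_probability(10))     # 0.001953125
```

With three flows, .50 x .50 x .50 = 12.5% is the chance of all three landing on one named link; since either link will do, the chance of a complete pile-up is 25%. With ten equally active flows it drops to about 0.2%, which is why a badly skewed result with many flows would look broken.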
>>
>> HTHs.
>>
>> -----Original Message-----
>> From: juniper-nsp-bounces at puck.nether.net [mailto:juniper-nsp-bounces at puck.nether.net] On Behalf Of Christian Martin
>> Sent: Tuesday, August 11, 2009 1:58 PM
>> To: Steven Brenchley
>> Cc: juniper-nsp at puck.nether.net
>> Subject: Re: [j-nsp] MPLS VPN Load-balancing
>>
>> Steven,
>>
>> Thanks for the response. I was unaware of this limitation in the
>> ABC-chip, but I am still curious as to why the incoming traffic to
>> the PE, which should be hashed at IP only, is not properly balanced
>> across the outbound (MPLS) links. The lookup is done on the IP
>> header only, which should have enough entropy to create a reasonably
>> balanced modulus. Unless the outbound FIB entries play a role
>> somehow? I could see if this were a P and MPLS was coming in and
>> out, but it is
>> IP--->push--push---forward...
>>
>> Also note that the outer label is of course different on the two
>> links (learned via LDP).
>>
>> Cheers,
>> Chris
>>
>>
>>
>>
>>
>> On Aug 11, 2009, at 4:08 PM, Steven Brenchley wrote:
>>
>>> Hi Christian,
>>> The problem you're hitting is a limitation of the M10i chipset.
>>> It can only look at the top two labels, and since both top labels
>>> are the same for all this traffic it's going to look like the same
>>> flow and send it all across the same link. The only way I've been
>>> able to get a semblance of load balancing is by creating multiple
>>> LSPs between the same endpoints and manually pushing different
>>> traffic across the different LSPs. It's really clunky, but there
>>> are no switches that will work around this limitation on the
>>> current M10i CFEB.
>>> If you were using a T-series, M320, M120, or MX router you wouldn't
>>> have this limitation. They can all go deeper into the packet to
>>> determine load balance.
>>> On a semi-brighter side, on the horizon there are some new
>>> I-chip based CFEBs which will not have this limitation. I don't
>>> recall when those will be available, but you could probably get
>>> hold of your SE and get a timetable from them.
>>>
>>> Steven Brenchley
>>>
>>> ===============================
>>>
>>> On Tue, Aug 11, 2009 at 3:16 PM, Christian Martin
>>> <christian.martin at teliris.com> wrote:
>>> NSP-ers,
>>>
>>> I have a Cisco---Juniper pair connected over a pair of T3 links.
>>> The Juniper acts as a PE and is pushing two labels for a specific
>>> route learned on the PE destined to a single remote PE well beyond
>>> the Cisco P. The traffic is destined to several IP addresses
>>> clustered in this subnet (sort of like 10, 11, 12, 13) and the
>>> forwarding table shows that there are two correctly installed
>>> next-hops - same VPN label, different LDP label (we have applied
>>> several different types of hashings and of course have our
>>> forwarding table export policy in place). Nevertheless, the Juniper
>>> is doing a very poor job load-balancing the traffic, and the Cisco
>>> is splitting it almost evenly. There is in fact a larger number of
>>> routes being shared across this link (about 20 or so VPN routes in
>>> different VRFs and thus different VPN labels - all sharing the same
>>> 2 LDP labels, but one particular subnet pair is exchanging quite a
>>> bit of traffic). All of the addresses are unique within our domain.
>>>
>>> Has anyone had issues with load-balancing a single subnet across an
>>> MPLS VPN link pair? Note again that this is a PE-P (J--C) problem
>>> and that the IP addresses are all arranged locally. I know Juniper
>>> are secretive about their hashing algorithm (can't lose any hero
>>> tests, can we?), but we are getting like 5:1 load share if we are
>>> lucky and are bumping up against the T3's capacity. The box is an
>>> M10i.
>>>
>>> As always, any help would be appreciated.
>>>
>>> Cheers,
>>> C
>>>
>>> show route forwarding-table destination 10.160.2.0/24
>>>
>>> Routing table: foo.inet
>>> Internet:
>>> Destination      Type RtRef Next hop   Type     Index  NhRef Netif
>>> 10.160.2.0/24    user     0            indr     262175     2
>>>                                        ulst     262196     2
>>>                                        Push 74     600     1 t3-0/0/0.1000
>>>                                        Push 74     632     1 t3-0/0/1.1000
>>>
>>>
>>> PE-P next-hop count (all showing load-balancing in effect)
>>>
>>> show route next-hop 172.16.255.11 terse | match > | count
>>> Count: 106 lines
>>>
>>>
>>> monitor interface traffic
>>>
>>> Interface   Link    Input bytes       (bps)   Output bytes       (bps)
>>> t3-0/0/0    Up     541252651233  (25667208)   691166913860  (35611752)
>>> t3-0/0/1    Up     279149587856   (8737568)    24893605598     (20112)
>>>
>>>
>>> Note that the Cisco is doing 25/9 Mbps and the Juniper 35/.02.
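
The split can be worked out from the bps figures in the interface output above. This is a small arithmetic sketch; it assumes the Juniper's input rate on each T3 is what the Cisco is sending, and the helper name `shares` is made up for illustration.

```python
def shares(rates_bps):
    """Return each link's fraction of the total rate."""
    total = sum(rates_bps)
    return [rate / total for rate in rates_bps]


# Rates in bps for (t3-0/0/0, t3-0/0/1), taken from the output above.
cisco_to_juniper = [25667208, 8737568]   # Juniper input = Cisco's transmit
juniper_to_cisco = [35611752, 20112]     # Juniper output

for name, rates in (("Cisco", cisco_to_juniper), ("Juniper", juniper_to_cisco)):
    pct = " / ".join(f"{share:.2%}" for share in shares(rates))
    print(f"{name}: {pct}")
```

On these numbers the Cisco sends roughly three quarters of its traffic on one link and one quarter on the other, while the Juniper puts more than 99.9% of its traffic on t3-0/0/0.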
>>>
>>>
>>> _______________________________________________
>>> juniper-nsp mailing list juniper-nsp at puck.nether.net
>>> https://puck.nether.net/mailman/listinfo/juniper-nsp
>>>
>>>
>>>
>>> --
>>> Steven Brenchley
>>> -------------------------------------
>>> There are 10 types of people in the world: those who understand
>>> binary and those who don't.
>>