[c-nsp] BGP multipath load balancing.. broken sessions upon hash change

Pete Lumbis alumbis at gmail.com
Thu Sep 3 17:02:08 EDT 2015


What you need is resilient hashing, which is supported on the Broadcom
Trident 2 chipset by all the vendors that use it (Nexus 3k, Arista
platforms, Dell S4048/S6000 with Cumulus Linux). I'm not aware of Cisco
custom chips that do this.

The way resilient hashing works is that it pre-populates a large number of
buckets, say 1024, and fills them by repeating your list of next hops in
order.

A, B, C, D, A, B, C, D, A, B, C, D....

If a next hop fails, it just plugs the holes with the still-living next
hops. Say B fails:

A, *A*, C, D, A, *C*, C, D, A, *D*, C, D....

Anything that was going to B dies anyway, but you don't have to re-shuffle
the existing buckets, so flows hashed to A, C and D stay where they were.
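
The bucket behaviour described above can be sketched in a few lines of
Python. This is only an illustration of the idea, not how any ASIC or vendor
actually implements it; the function names build_table and fail_next_hop are
made up for this example, and the table is shrunk to 12 buckets so the output
is readable.

```python
from itertools import cycle, islice

def build_table(next_hops, num_buckets=1024):
    """Fill the bucket table by repeating the next-hop list: A, B, C, D, A, ..."""
    return list(islice(cycle(next_hops), num_buckets))

def fail_next_hop(table, dead):
    """Rewrite only the dead next hop's buckets, cycling through the survivors."""
    survivors = []
    for hop in table:
        if hop != dead and hop not in survivors:
            survivors.append(hop)
    if not survivors:
        raise ValueError("no surviving next hops")
    replacement = cycle(survivors)
    # Buckets that did not point at the dead hop are left exactly as they were.
    return [next(replacement) if hop == dead else hop for hop in table]

table = build_table(["A", "B", "C", "D"], num_buckets=12)
after = fail_next_hop(table, "B")
print(", ".join(table))
print(", ".join(after))
```

A flow picks its bucket by hashing the packet header modulo the table size,
and the table size never changes, so a flow that landed on an A, C or D
bucket before the failure still lands on the same bucket afterwards.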

The downside is that if you add a new nexthop you have to shuffle again,
but you get what you pay for :)
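
To get a feel for how disruptive that shuffle can be, here is a rough Python
sketch that assumes the table is simply rebuilt from scratch with the new
next hop appended. That rebuild strategy is an assumption for illustration
(real implementations may move fewer buckets), and build_table is a
hypothetical helper, not a vendor API.

```python
from itertools import cycle, islice

def build_table(next_hops, num_buckets=1024):
    """Fill the bucket table by repeating the next-hop list in order."""
    return list(islice(cycle(next_hops), num_buckets))

before = build_table(["A", "B", "C", "D"])
after = build_table(["A", "B", "C", "D", "E"])  # E is the newly added next hop

# Count buckets whose next hop differs after the naive rebuild.
moved = sum(1 for old, new in zip(before, after) if old != new)
print(f"{moved} of {len(before)} buckets changed next hop")
```

Under this assumption most buckets end up pointing at a different next hop,
which is why established sessions break when a peer is added.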


-Pete

On Wed, Sep 2, 2015 at 4:49 PM, Peter Kranz <pkranz at unwiredltd.com> wrote:

> I’m using bgp maximum-paths and several peers announcing the same /32 to
> create a poor man’s load balancer. This works well with up to 16 peers,
> after which the number of CEF buckets is exceeded.
>
> However, if the number of connected peers changes, all sessions break,
> which I would like to avoid.
>
> For example:
> - 10 machines are advertising a path to the /32
> - SSH is opened to one machine via the advertised IP address
> - 1 machine stops advertising, bringing the pool to 9
> - SSH connection breaks a little while later
>
> Conversely, when adding another machine to the pool, a similar experience:
> - 9 machines are advertising a path to the /32
> - SSH is opened to one machine via the advertised IP address
> - 1 machine starts advertising, bringing the pool to 10
> - SSH connection breaks immediately
>
> Is there a solution to keep the client session sticky to the BGP peer it
> was initially started on? I am using per-destination load balancing. My
> suspicion is that upon a change in the number of connected peers, the CEF
> hash buckets are reset and renumbered, breaking all connections.
>
> Peter Kranz
> www.UnwiredLtd.com
> Desk: 510-868-1614 x100
> Mobile: 510-207-0000
> pkranz at unwiredltd.com

