[c-nsp] BGP Load Sharing

Anton Kapela tkapela at gmail.com
Fri Apr 2 09:55:39 EDT 2010


On Apr 2, 2010, at 8:08 AM, Bunny Singh wrote:

> I am using multihoming for my BGP with two ISP's. I have 10 mbps from one ISP and 4 mbps from the other, connecting to my single 3660 router. I am getting a default route from both of the ISPs

Default only? Not enough routes! I'd recommend spicing it up by asking for default + peer routes, or default + peer + customer routes. Or, heck, if these upstream "providers" support communities, request "full feed + default" -- and then simply match & filter based on community tags upon reception. Even if they don't support communities, you could still request a full view & hack up input filters to your liking.
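
For illustration only, a community-based import policy could look something like the below -- the 64500:100 community value, the route-map name, and the prefix-list are all made up here; use whatever values your upstream actually publishes (and on older code you may need 'ip bgp-community new-format' before the AS:NN syntax behaves as expected):

ip prefix-list default-only seq 5 permit 0.0.0.0/0

ip community-list 20 permit 64500:100

route-map upstream-a-in permit 10
 match community 20

route-map upstream-a-in permit 20
 match ip address prefix-list default-only

Everything else in the full feed falls through to the implicit deny, so you keep default plus whatever the provider tags as customer routes.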

Also, the 3660 'cpu' is roughly an npe-225 equivalent (if not identical parts, even), so it's no slouch and will not be terribly bothered crunching some semi-involved route-map policies. It can also hold 256 megs of DRAM. If you've not already done so, upgrade this box to its max (it will likely be the cheapest upgrade you ever get this much out of), and have it hold more routes.

Moving on -- it may not be obvious, but you needn't hold the "full" table of routes on your box to make better decisions about which upstream's route to install for a particular destination -- even holding 60 to 70% of the table can prove useful.

Here's one policy that has been working on 256-meg and other low-FIB/low-memory boxes for my clients: it permits /23 and longer prefixes only when they originate within 3 AS hops, and permits only /22 and shorter for everything else.

! permit anything /22 or shorter, regardless of distance
ip prefix-list hackslash seq 10 permit 0.0.0.0/0 le 22

! match anything /23 or longer...
ip prefix-list longs seq 5 permit 0.0.0.0/0 ge 23

! ...but only when the AS path is three hops or fewer
ip as-path access-list 10 permit (^_[0-9]+$|^_[0-9]+_[0-9]+$|^_[0-9]+_[0-9]+_[0-9]+$)

route-map transit-in permit 5
 match ip address prefix-list longs
 match as-path 10

route-map transit-in permit 10
 match ip address prefix-list hackslash

! long prefixes from further away fall through to the implicit deny
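
To actually take effect, that map gets applied inbound on the transit sessions -- the neighbor address and AS numbers below are placeholders:

router bgp 64512
 neighbor 192.0.2.1 remote-as 65001
 neighbor 192.0.2.1 route-map transit-in in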

You should also search this list (and nanog) for "prefix filter low memory" and other posts from the ~2006/2007 era, when folks were crossing the 239k/256k TCAM exhaustion thresholds and/or their 256-meg DRAM ceiling. For example, one way to hack-slash a RIB/FIB is with the strict per-/8, RIR-allocation-based filter ruleset:

ftp://ftp-eng.cisco.com/cons/isp/security/Ingress-Prefix-Filter-Templates

Perhaps the only downside to per-/8 filtering (plus exceptions) is the need for semi-frequent updates to the list, as new /8's are allocated by IANA and/or when RIRs change their policies.
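
The general shape of those rules (purely illustrative -- the real boundaries and exceptions come from the template above and current IANA/RIR data) is one entry per allocated /8 with a maximum-length bound, e.g.:

ip prefix-list rir-8s seq 10 permit 198.0.0.0/8 le 24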

> and advertising a /24 public pool towards both of the ISPs. For load sharing I am doing path prepending and setting a weight for outbound traffic, but not getting the load share I want.

To get ECMP from two 0/0 routes, which is about all I can think of that would work in this scenario, you will need to enable "maximum-paths" in your BGP config, like so:

(config)#router bgp 64512
(config-router)#maximum-paths 2 
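
One caveat worth checking on your particular image (an assumption on my part, not gospel): stock maximum-paths wants the candidate paths' attributes -- including the AS path itself -- to match, and your two defaults will arrive with different AS paths. Depending on the release, you may also need the (sometimes hidden) relax knob:

(config-router)#bgp bestpath as-path multipath-relax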

...then the 0/0's from your upstreams, all other things being equal (i.e. set the same metric + local-pref upon receiving them from your providers), will be placed into your FIB as two ECMP-able 0/0 routes.

The router's normal CEF forwarding logic will then do a src+dst IP address hash and determine a next-hop IP address based upon the result. This isn't 'optimal' in the sense that traffic toward a customer of upstream provider A may be sent via a (potentially worse) path through provider B, but it'd sure guarantee your upstream traffic distribution would be fairly equal, assuming your userbase isn't fixated on a narrow subset of the possible internet destination addresses.
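
If you want to see which next hop a given flow will hash to, the exact-route lookup is handy (the addresses below are just placeholders):

#show ip cef exact-route 192.0.2.10 198.51.100.20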

If you are running a 12.4T image, you can also enable full CEF load sharing, which hashes on src+dst IP addresses as well as src+dst port numbers of TCP and UDP packets to make next-hop selection more uniformly distributed. This may confound your users, however, as things like TCP and UDP traceroutes will expose 'both' paths from probe packet to probe packet. In my experience, customers don't mind/notice L4 ECMP on intra-network paths, but on inter-network paths it seems to generate complaints.
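
If memory serves (verify the keywords on your image), the L4-aware knob looks something like:

(config)#ip cef load-sharing algorithm include-ports source destination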

> Now I want to do load sharing and use close to 14 mbps. Is there any way to do load sharing for outbound as well as inbound traffic? I have searched Google but did not find a definitive answer.

CEF ECMP will only get you a statistically 50/50 split based on the src+dst hash, so you'd never see a perfect 10+4; instead, you'd see the 4 mbit link pushed to the ceiling while the 10 mbit link would likely never fill.

The best solution here is to specifically ask for "full table + default," use route maps to filter and express different import policies, and tweak said policies to obtain a reasonable upstream split. 

-Tk

