[c-nsp] need good recommendation for isp gateway nat bgp pbr

Brian Roche brian at bcctv.net
Fri May 27 09:46:45 EDT 2011


Thanks for the thorough analysis.  I share your concerns about NAT,
have managed to limit its use to residential cable modems, and plan to
completely remove it as part of our *eventual* IPv6 migrations/FTTH
strategy.

Also, setting 30-second rate intervals, running traffic captures to look
for microbursts, rate-limiting on the interface, and flow control were all
part of narrowing down the problem and determining that my solution is
another device.  Just for clarification, I do not NAT our higher
bandwidth (and higher paying) business modem and metro ethernet/fiber
customers - the proportionally higher traffic growth of which is what
is causing the overruns.
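
To illustrate why the short rate interval matters, here is a minimal
sketch of how a burst disappears into a 5 minute average. The traffic
figures are hypothetical (not measurements from this network), the helper
names are made up for the example, and counter wrap is ignored.

```python
def rates_mbps(samples):
    """Turn (time_seconds, cumulative_octets) samples into Mbps per interval."""
    out = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        out.append((c1 - c0) * 8 / (t1 - t0) / 1e6)
    return out


def build_counter(per_interval_mbps, step=30):
    """Simulate a cumulative octet counter polled every `step` seconds."""
    t, octets = 0, 0
    samples = [(t, octets)]
    for mbps in per_interval_mbps:
        octets += int(mbps * 1e6 / 8 * step)
        t += step
        samples.append((t, octets))
    return samples


# Hypothetical load: nine 30 s intervals at 100 Mbps, then one at 900 Mbps.
traffic = [100] * 9 + [900]

fine = build_counter(traffic)        # polled every 30 seconds
coarse = [fine[0], fine[-1]]         # same counter, polled once per 5 minutes

fine_rates = rates_mbps(fine)
coarse_rates = rates_mbps(coarse)

print("30 s peak :", max(fine_rates), "Mbps")
print("5 min avg :", coarse_rates[0], "Mbps")
```

Even a 30 second sample is still an average, so bursts shorter than that
only show up in a packet capture.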

So I think you have convinced me to leave NAT where it is for now,
keep it isolated/minimized, and consider not using PBR on the new
device(s).  Definitely gives me more flexibility and may actually
permit me to purchase multiple devices (excellent point on the lack of
redundancy :-)

Thanks again




On Thu, May 26, 2011 at 6:28 PM, Nick Hilliard <nick at foobar.org> wrote:
> On 26/05/2011 18:38, Mark Tinka wrote:
>>
>> ... I guess my point was more about the fact that in case
>> the number of sessions were to oscillate in a much wider
>> range due to customer usage patterns, special events (think
>> the British Royal wedding, et al), etc., I'd be more
>> comfortable with a box like the ASR1000 than the 7200.
>
> ok.  before proceeding any further, let's go back to the original spec:
>
> configuration:
>        - npe-g1
>        - 150Mbps transit
>        - 20k nat sessions
>        - policy routing to two upstreams
>        - trivial use of bgp
>        - hairpin in/out routing over inbound trunk
>        - overruns on trunk port
>
> So, couple of things here:
>
> 1. your use of bgp is completely trivial and is not contributing to your
> load in any way whatever.
>
> 2. Port overruns on your current trunk port at this relatively low traffic
> rate suggest that you will need to be extremely careful about taking new
> cable/fttx traffic over this port in future.  I would advise that you
> monitor this port with a very short timing interval (e.g. 30 seconds) so
> that you can see microbursts which may not otherwise show up on the standard
> 5 minute polling interval.
>
> 3. overruns are a hardware problem caused by a lack of capacity on the
> incoming interface, rather than by a shortage of cpu / resources on the
> router controller.  See the following doc for more information:
>
> https://supportforums.cisco.com/docs/DOC-2613
>
> This means that you have two choices with regard to fixing the problem:
> either you find some way of restructuring your incoming / outgoing network
> traffic so that they aren't both on the same port on your 7246, or else buy
> a border router and route your outbound traffic over that.  If you have the
> port space on the ubr, you could split out the traffic into separate vlans
> and use that to route it over separate ports on the router.  Wouldn't really
> recommend this though.
>
> 4. if you are planning 500 megs transit capacity, it's likely that the number of
> NAT sessions will scale similarly.  Furthermore, unless you take drastic
> steps to remove as much nat as possible from your network now, you're not
> going to be able to do so in future because ARIN will have no ipv4 addresses
> to give you.  This means that future customer acquisition will come in as
> natted customers.
>
> Regarding future scaling:
>
> 5. if you choose PBR instead of dfz routing, I don't know how an npe-g2
> would handle that.  Personally, I wouldn't try it and would strongly
> recommend against this sort of thing on a service provider network.  This is
> an enterprise feature, not a service provider tool, and it's not really ser
>
> 6. if you scale up your nat requirements proportional to your traffic
> estimations, that suggests that you're actually going to be handling upwards
> of 60k nat sessions instead of 20k.  And if this is an average, you may well
> be hitting way more at peak times.  Worse still, if someone DoSs you from
> inside your network, then they could really trash your network. Also, NAT
> means a single point of failure on your network.
>
> 7. while an NPE-G2 will certainly handle the 500mbps traffic requirement you
> have, by adding PBR and NAT into the mix, you're creating the sort of
> scenario where an NPE-G2 will probably not really work for you.  This means
> you'll need to step up to an ASR1k if you want to stick with Ciscos and PBR
> and NAT.
>
> If I were in your position, I would:
>
> 8. remove PBR and move to a full DFZ feed from your upstreams.  PBR causes a
> CPU hit on a software router like an npe-g1/npe-g2, and if you're dealing
> with 3 transit providers, you probably want to use default free bgp routing
> anyway.  It's a lot more efficient and BGP will generally be able to make
> more sensible decisions about routing than you will.
>
> 9. take drastic steps to avoid using NAT and move to public addresses as
> fast as humanly possible so that you don't paint yourself into a nat corner
> in future.  If this isn't an option, then I would take pains to limit NAT to
> specific parts of the network so that you don't have a single giant nat box
> inline in the middle doing absolutely everything. It simply doesn't scale
> and will cause horrendous problems as your network grows.
>
> 10. get at least one and possibly two border routers to talk to your
> upstreams (what if one fails?).  If you want to do just 500 mbps traffic
> with DFZ routing, a c7200/npe-g2 will work nicely.  If for some reason you
> really want to stick NAT on your border router or routers (don't do it!),
> then you'll need an ASR1k, and I would recommend that you get an ESP10 based
> box for reasons mentioned previously.  But see #9.
>
> 11. You _could_ use an asr1001 with ESP5 and use PBR + NAT as your border
> router.  It would probably work but I wouldn't recommend doing it - in any
> situation.
>
> Nick
>
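
The NAT estimate in point 6 of the quoted message is a simple proportional
scaling from the original spec (20k sessions at 150Mbps, growing to 500Mbps).
A rough sketch of that arithmetic follows. The function name and the
peak_factor parameter are assumptions added for illustration; the thread
gives no peak-to-average ratio, so the 2.0 below is only a placeholder.

```python
def scale_sessions(current_sessions, current_mbps, planned_mbps, peak_factor=1.0):
    """Scale NAT session count linearly with transit bandwidth.

    peak_factor is a hypothetical multiplier for peak vs. average load.
    """
    return round(current_sessions * planned_mbps / current_mbps * peak_factor)


# Figures from the thread: 20k sessions at 150 Mbps, planning for 500 Mbps.
average = scale_sessions(20_000, 150, 500)

# Placeholder peak ratio, not a measured value.
peak = scale_sessions(20_000, 150, 500, peak_factor=2.0)

print("estimated average sessions:", average)
print("estimated peak sessions   :", peak)
```

Linear scaling is itself an assumption; session counts depend on customer
mix and applications as much as on raw bandwidth.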


