[c-nsp] Nexus 7707 as Internet Edge Router?

Thu Jul 27 04:18:06 EDT 2017

On 27/Jul/17 00:42, James Jun wrote:

> As I understand it, on MX, if you configure, say, a 2 Gbps policer on a LAG (ae)
> interface, each member interface will receive a pro-rata share of the CIR to meet
> an aggregate rate of 2 Gbps across the whole LAG.
>
> On XR platform, by default, application of 2 Gbps policer on Bundle-Ether will
> replicate the same 2 Gbps on every member 10G interface -- this may or may not
> give you the desired effect you'd expect.
>
> Nevertheless, this is a moot point now.  As of IOS XR 6.0.1, Aggregate Bundled QoS
> is now supported:
>
>   "Aggregated Bundle QoS feature allows the shape, bandwidth, police rates and 
>   burst values to be distributed between the active members of a bundle, where
>   a QoS policy-map is applied."
>  
>   RP/0/RSP0/CPU0:ASR9000(config)# hw-module all qos-mode bundle-qos-aggregate-mode

Very interesting.

As we don't use the ASR9000 for edge routing, I didn't know the feature
had made it to this platform.

I know it has just arrived in IOS XE, which means it has been available
on the ASR1000 platform for a little while now. We enabled it over
there, but found a bug where the router won't export flows when the
feature is enabled. We just tested engineering code yesterday that
confirms a fix, so that is good.

Has anyone actually tested this on the ASR9000 to know if it works as on
the MX? Online documentation is not very clear about how it's set up,
its operational options, etc.
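
For anyone comparing the two behaviours described above, here is a rough
sketch of the arithmetic in Python. It assumes the aggregate mode splits
the policer rate equally across the active members (real platforms may
weight by member bandwidth or round differently), and the function name
is purely illustrative, not anything from either vendor:

```python
def per_member_rates(policer_bps, active_members, aggregate=True):
    """Return (rate programmed on each member, worst-case bundle total).

    aggregate=True  models the pro-rata behaviour (MX, or XR with
                    bundle-qos-aggregate-mode), assuming an equal split.
    aggregate=False models the XR default, where the full policer rate
                    is replicated on every member link.
    """
    if active_members < 1:
        raise ValueError("a bundle needs at least one active member")
    if aggregate:
        per_member = policer_bps / active_members
    else:
        per_member = policer_bps
    return per_member, per_member * active_members


if __name__ == "__main__":
    rate = 2_000_000_000  # the 2 Gbps policer from the example
    for mode in (True, False):
        member, total = per_member_rates(rate, 4, aggregate=mode)
        label = "aggregate" if mode else "replicated"
        print(f"{label}: {member / 1e6:.0f} Mbps per member, "
              f"{total / 1e9:.0f} Gbps across the bundle")
```

Under that model, a 2 Gbps policer on a 4 x 10G bundle works out to
500 Mbps per member in aggregate mode, versus 2 Gbps per member (so up
to 8 Gbps across the bundle) when the rate is replicated.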

> My biggest problem with the ASR9K on revenue boxes is that it takes rocket science
> to upgrade the software on the platform if you care about downtime duration.  Single-homed
> customers are very sensitive to multi-hour maintenance windows, and then there
> is also that FPD upgrade that you have to run on every line card and reload each.
>
> When a TAC-confirmed bug is noted and a software upgrade is advised, it's
> a hassle to get a change window approved due to the complexities of the SW jump.

Yep, big issue.

Also, for some reason, every time we've upgraded code on our CRS boxes,
we end up with some kind of RP, FP or PLIM issue and/or failure. In some
cases, we've had to RMA units, and in others, have had to re-seat line
cards, etc. We are now on IOS XR 6.1.3 for the CRS, which is going well.

To be fair, we recently also saw RE failures when upgrading our MX
platforms to Junos 16.2. But a simple Junos recovery on the failed RE
was the fix, and not a hardware replacement.

> On the flip side, on MX platforms, I'm not a big fan of Juniper's BGP implementation
> myself.
>
> Performance seems to be improving noticeably as of recent SW versions, and BGP
> is all around snappy on the new 64-bit MX RE, but update-group behavior handling
> seems awful.
>
> Why is it that when you de-configure the last remaining peer on a customer-facing
> peer-group, it resets all 175 BGP sessions on the entire chassis when you commit?
> Maybe I'm doing something wrong, but best practice seems to be to configure a fake
> BGP peer to force rpd to operate differently.
>
>   https://www.juniper.net/documentation/en_US/junos/topics/topic-map/bgp-sessions.html
>
>    "When, because of a configuration change, BGP transitions from needing two copies
>    of a route to not needing two copies of a route (or the reverse), all sessions 
>    over which VPN routes are exchanged go down and then come back up."
>
>   "The way to prevent these unnecessary session flaps is to configure an extra RR 
>    client or EBGP session as a passive session with a neighbor address that does 
>    not exist. This example focuses on the EBGP case, but the same workaround works  
>    for the RR case."

Does this only affect VPN routes, or global routes as well? I've not seen
this issue on our side. But then again, we don't run Internet in a VRF, if
that is your scenario.

Mark.