[c-nsp] RPKI extended-community RFC8097

Lukas Tribus lukas at ltri.eu
Sat Dec 19 08:02:00 EST 2020


Hello Jakob,


On Fri, 18 Dec 2020 at 07:58, Jakob Heitz (jheitz) <jheitz at cisco.com> wrote:
>
> Hi Lukas, Mark, Ben,
>
> The default bestpath prefix-validate behavior treats invalid routes
> as unfeasible and prefers valid routes over not-found.
>
> The default bestpath prefix-validate behavior cannot be used unless
> all paths of a net have the correct RPKI validity. That can only
> happen if all EBGP sessions into an AS validate their incoming
> routes and apply the RFC8097 extended community.

First of all let me say that I appreciate your effort to improve the
situation very much, thank you for this.

The problem is that this is not a "default". If this were a
default, we could just disable it. However, we can only disable it by
disabling validation, which is incorrect according to the
documentation (see below).

I understand that this works just fine in a greenfield deployment
(just deploy the network with everything enabled), but I never
understood how this implementation should work in existing networks.

An existing network would need to enable validation on all routers
SIMULTANEOUSLY. If this is a small shop, let's say 5 routers, maybe
they could bypass change management procedures, get both engineers in
a room, have each engineer enable RPKI validation on 2 routers as fast
as possible, and maybe they could get Janice from Accounting to help
on the terminal of the fifth router ("just hit the Enter button,
Janice"). The deployment would be quasi-simultaneous and routing
anomalies would be limited to a short period of time.

But you cannot make a configuration change atomically across an entire
backbone, and even if this were technically possible, nobody
would ever approve this change because the risk is just too great.

"We will shut down our entire fully redundant and geographically
diverse AS for 2 hours on Saturday as we deploy RPKI ROV" <-- that's
not how SPs deploy new features.

Network operators need to be able to roll this out over a period of
time and roll back the configuration on individual nodes, based on the
operational requirements.
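
To illustrate why a partial rollout is a problem, here is a rough
Python sketch. It is a toy model, not Cisco's actual implementation:
the Path class, the rank values and the reduction of best path
selection to "validity first, then local-pref" are all assumptions
made for illustration. It only shows that a router with validation
enabled can pick a different best path than one without it, when an
IBGP path arrives without the RFC8097 community and is therefore seen
as not-found.

```python
from dataclasses import dataclass

# Illustrative ranks only: lower wins. Invalid paths are handled separately.
VALIDITY_RANK = {"valid": 0, "not-found": 1}


@dataclass(frozen=True)
class Path:
    name: str
    validity: str    # validity as seen by the router doing the selection
    local_pref: int


def best_path(paths, validation_enabled):
    """Pick a best path; validity is compared first only if validation is on."""
    if validation_enabled:
        # With validation enabled, invalid paths are treated as unfeasible.
        paths = [p for p in paths if p.validity != "invalid"]

    def key(p):
        rank = VALIDITY_RANK[p.validity] if validation_enabled else 0
        return (rank, -p.local_pref)  # higher local-pref wins within a rank

    return min(paths, key=key)


paths = [
    Path("ebgp-local", "valid", 100),
    # Learned via IBGP from a router that does not attach the community yet.
    Path("ibgp-preferred", "not-found", 200),
]

with_rov = best_path(paths, validation_enabled=True).name
without_rov = best_path(paths, validation_enabled=False).name
print(with_rov, without_rov)  # ebgp-local ibgp-preferred
```

The router with validation enabled ignores the higher local-pref,
while the one without validation honors it, so the two disagree until
every EBGP edge validates and tags its routes.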

I understand that there are configuration knobs that impact best path
selection and that will have such effects, like the bgp bestpath med
knobs. That doesn't mean RPKI ROV has to be one of them.



> If these conditions are not satisfied, then you cannot use the
> bestpath prefix-validate behavior and you must use
> route-maps to process the RPKI validity, like this:
>
> router bgp ...
>  bgp rpki server tcp [...]
>  address-family ipv4
>   bgp bestpath prefix-validate disable
> [...]
> route-map RM_EBGP_IN deny 10
>  match rpki invalid
> [...]

This also disables any visibility into the RPKI status in the routing
table. At least we can drop RPKI invalids this way, although at the
cost of that visibility. There is also the issue you mention in your
subsequent email (route refresh, CPU load, memory exhaustion with
soft-reconfig inbound always).



> I have a proposal to improve the bestpath prefix-validate behavior
> to better match how most operators use it. By a new configuration,
> I would treat valid and not-found with the same preference. Invalid
> would continue to be unfeasible. Then, a received IBGP route without
> the RFC8097 community will be fine.


For years the documentation has claimed that this already works
by just allowing invalids [1]:

> - You can completely disable the validation of prefixes [...]
> - You can allow an invalid prefix to be used as the BGP best path [...]
> [...]
> During BGP best path selection, the default behavior, ***if neither of the
> above options is configured***, is that the system will prefer prefixes in
> the following order:

It's just not actually implemented this way.

If you want to introduce a new configuration knob, please just do it
EXACTLY like IOS-XR: if the new code path is used, there needs to be
NO best path intervention, NONE whatsoever, not even for INVALIDs.

XR only modifies best path selection when the (afaik non-default) option is enabled:

bgp bestpath origin-as use validity

It will not change anything based on the RPKI validation status
without this option enabled, not even for invalids.
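
To make the difference between the three behaviors in this thread
concrete, here is a small Python sketch. The mode names and the
validity_rank helper are hypothetical, made up for this example; this
is a simplified model of the behaviors as described above, not of how
IOS-XE or IOS-XR implement them.

```python
from enum import Enum


class Mode(Enum):
    DEFAULT = "default"        # invalid unfeasible, valid preferred over not-found
    PROPOSED = "proposed"      # invalid unfeasible, valid and not-found tie
    NO_INTERVENTION = "none"   # validity never influences best path selection


def validity_rank(validity, mode):
    """Return None if the path is unfeasible, else a rank (lower wins)."""
    if mode is Mode.NO_INTERVENTION:
        return 0               # every path competes on ordinary attributes only
    if validity == "invalid":
        return None            # dropped from best path consideration
    if mode is Mode.PROPOSED:
        return 0               # valid and not-found are treated alike
    return 0 if validity == "valid" else 1


for mode in Mode:
    print(mode.name, [validity_rank(v, mode) for v in ("valid", "not-found", "invalid")])
```

Only the last mode leaves the decision about invalids entirely to
route policy, which is the behavior being asked for here.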


Let's not introduce yet another band-aid change for one single specific
routing anomaly, that still doesn't fully disable best path
intervention. Let's get it right this time.



Thank you!

Lukas


[1] https://www.cisco.com/c/en/us/td/docs/ios-xml/ios/iproute_bgp/configuration/xe-17/irg-xe-17-book/bgp-origin-as-validation.html

