[c-nsp] RPKI extended-community RFC8097

Jakob Heitz (jheitz) jheitz at cisco.com
Fri Dec 18 15:00:02 EST 2020


There is an issue with route-maps.

Testing the RPKI validity in route-map causes BGP REFRESH messages.
Lots of them.
soft-reconfig helps, but it brings a risk of memory exhaustion
(soft-reconfig saves a copy of all pre-policy incoming routes) and does
not reduce the internal CPU cost of evaluating the needed route-maps.
Using the validity in the bestpath computation does not cause REFRESHes.

Consider a router that tests validity in a route-map and drops a route.
When a ROA update comes from the validator, the route-map needs to be
re-evaluated to determine whether the route can now be used and what
sets need to be applied to it.
That has to happen for all routes, not just one, because when a route is
dropped, all information about which route was dropped is lost.
The REFRESH must go to any router that has dropped at least one incoming
route and that tests RPKI validity in its route-map.
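
To make the reasoning above concrete, here is a toy Python model. It is
only a sketch, not XR code: the class, its methods and the simplified
RFC 6811-style validity check are all made up for illustration, and it
models only the dropped-route case. The router remembers *that* it
dropped something, but not *what*, so a ROA update forces a full replay
of the neighbor's routes.

```python
import ipaddress


def validity(prefix, origin_as, vrps):
    """Simplified RFC 6811-style check: 'valid', 'invalid' or 'not-found'.

    vrps is a list of (prefix, max_length, origin_as) tuples.
    """
    net = ipaddress.ip_network(prefix)
    covered = False
    for vrp_prefix, max_len, vrp_as in vrps:
        vrp_net = ipaddress.ip_network(vrp_prefix)
        if net.version == vrp_net.version and net.subnet_of(vrp_net):
            covered = True
            if net.prefixlen <= max_len and origin_as == vrp_as:
                return "valid"
    return "invalid" if covered else "not-found"


class RouteMapRouter:
    """Hypothetical router that denies invalids in its inbound route-map."""

    def __init__(self, vrps):
        self.vrps = list(vrps)
        self.rib = {}              # prefix -> origin AS, post-policy only
        self.dropped_any = False   # all we know after a deny
        self.refreshes_sent = 0

    def receive(self, prefix, origin_as):
        if validity(prefix, origin_as, self.vrps) == "invalid":
            # "route-map ... deny / match rpki invalid": the route is gone,
            # and nothing records which route it was.
            self.rib.pop(prefix, None)
            self.dropped_any = True
        else:
            self.rib[prefix] = origin_as

    def roa_update(self, vrps, neighbor_routes):
        self.vrps = list(vrps)
        if self.dropped_any:
            # We cannot tell which dropped route might be usable now, so we
            # ask for everything again (modelled as replaying the
            # neighbor's full table through the route-map).
            self.refreshes_sent += 1
            self.dropped_any = False
            for prefix, origin_as in neighbor_routes:
                self.receive(prefix, origin_as)


neighbor = [("192.0.2.0/24", 64500), ("198.51.100.0/24", 64501)]
router = RouteMapRouter(vrps=[("192.0.2.0/24", 24, 64999)])
for prefix, origin_as in neighbor:
    router.receive(prefix, origin_as)

# A corrected ROA arrives; the only way to recover the route is a REFRESH.
router.roa_update([("192.0.2.0/24", 24, 64500)], neighbor)
```

In this sketch the whole neighbor table is replayed even though only one
route changed state, which is the cost described above.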

We have some work going on in XR to reduce the impact of the REFRESH
messages and to reduce the risk of memory exhaustion when using soft-reconfig.

When validity is used in bestpath computation, invalids are not actually
dropped. They are made unfeasible. They are effectively dropped, but can
come back from the dead after a ROA update from the validator.
Importantly, the route-map has already run on it and does not need to run
again when the validity changes. Thus no REFRESH.
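
As a rough illustration of the difference, here is a toy Python model of
the bestpath case. Again, this is a sketch and not XR code: the class,
its methods and the simplified RFC 6811-style validity check are invented
for illustration. Every path is kept after policy has run, with only a
feasibility flag attached, so a ROA update is handled locally.

```python
import ipaddress


def validity(prefix, origin_as, vrps):
    """Simplified RFC 6811-style check: 'valid', 'invalid' or 'not-found'.

    vrps is a list of (prefix, max_length, origin_as) tuples.
    """
    net = ipaddress.ip_network(prefix)
    covered = False
    for vrp_prefix, max_len, vrp_as in vrps:
        vrp_net = ipaddress.ip_network(vrp_prefix)
        if net.version == vrp_net.version and net.subnet_of(vrp_net):
            covered = True
            if net.prefixlen <= max_len and origin_as == vrp_as:
                return "valid"
    return "invalid" if covered else "not-found"


class BestpathRouter:
    """Hypothetical router that uses validity only in bestpath."""

    def __init__(self, vrps):
        self.vrps = list(vrps)
        self.paths = {}            # prefix -> {"origin_as", "feasible"}
        self.refreshes_sent = 0    # never incremented in this model

    def receive(self, prefix, origin_as):
        # The route-map (not modelled) has already run; the path is stored
        # whatever its validity is, and invalids are only marked unfeasible.
        state = validity(prefix, origin_as, self.vrps)
        self.paths[prefix] = {"origin_as": origin_as,
                              "feasible": state != "invalid"}

    def roa_update(self, vrps):
        # All paths are still in memory, so feasibility is recomputed
        # locally without asking any neighbor to resend anything.
        self.vrps = list(vrps)
        for prefix, path in self.paths.items():
            state = validity(prefix, path["origin_as"], self.vrps)
            path["feasible"] = state != "invalid"

    def usable(self):
        return sorted(p for p, path in self.paths.items() if path["feasible"])


router = BestpathRouter(vrps=[("192.0.2.0/24", 24, 64999)])
router.receive("192.0.2.0/24", 64500)      # invalid: kept, but unfeasible
router.receive("198.51.100.0/24", 64501)   # not-found: feasible

# A corrected ROA arrives; the stored path simply becomes feasible again.
router.roa_update([("192.0.2.0/24", 24, 64500)])
```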

Regards,
Jakob.

-----Original Message-----
From: Ben Maddison <benm at workonline.africa> 
Sent: Friday, December 18, 2020 12:26 AM
To: Jakob Heitz (jheitz) <jheitz at cisco.com>
Cc: Lukas Tribus <lukas at ltri.eu>; Mark Tinka <mark.tinka at seacom.mu>; Cisco-nsp <cisco-nsp at puck.nether.net>
Subject: Re: [c-nsp] RPKI extended-community RFC8097

Hi Jakob,

On 12/18, Jakob Heitz (jheitz) wrote:
> Hi Lukas, Mark, Ben,
> 
> The default bestpath prefix-validate behavior treats invalid routes
> as unfeasible and prefers valid routes over not-found.
> 
> The default bestpath prefix-validate behavior cannot be used unless
> all paths of a net have the correct RPKI validity. That can only
> happen if all EBGP sessions into an AS validate their incoming
> routes and apply the RFC8097 extended community.

And, iff all ASBRs have a consistent set of VRPs from the validation
caches, which is a very fragile assumption, and the root of most of the
impact we've seen from this.

> If these conditions are not satisfied, then you cannot use the
> bestpath prefix-validate behavior and you must use
> route-maps to process the RPKI validity, like this:
> 
> router bgp ...
>  bgp rpki server tcp [...]
>  address-family ipv4
>   bgp bestpath prefix-validate disable
> [...]
> route-map RM_EBGP_IN deny 10
>  match rpki invalid
> [...]
> 
> I have a proposal to improve the bestpath prefix-validate behavior
> to better match how most operators use it. By a new configuration,
> I would treat valid and not-found with the same preference. Invalid
> would continue to be unfeasible. Then, a received IBGP route without
> the RFC8097 community will be fine.
> 
> Thoughts?
> 
That certainly sounds like an improvement (a large one), but it doesn't
go far enough for me.

The router should not *act* on validation status at all unless told to
by the operator.
I would suggest that the 'bgp bestpath prefix-validate ...' commands be
deprecated altogether, and be replaced with a single per-afi/safi
command that simply enables rov-checking (i.e. records the status in the
RIB, but takes no policy action).
Everything else can be done in a route-map.

I have heard the argument from other vendors that "operators want a
single command to apply, without touching routing policy".
I think this is false. At least in 2020, you would be very hard pressed
to find an operator who is doing ROV but is scared of writing a
non-trivial routing policy.

Getting from where you are today, to what I'm advocating obviously means
changing existing behaviors, which can be an unwelcome surprise.

I would suggest the following strategy:
1.  Introduce a new address-family mode command (something like 'bgp
    rpki-validate origin', to leave syntax space for aspa and others in
    future).
    Make it a no-op if 'bgp bestpath prefix-validate' is enabled, otherwise
    make it enable rov state checking as above.
2.  Start issuing a warning on the CLI, in logs, etc, when 'bgp bestpath prefix-validate'
    is used.
3.  After some time make 'bgp bestpath prefix-validate' a no-op hidden
    command.
4.  After some more time, error if 'bgp bestpath prefix-validate' is
    used.

Hope that helps.

Cheers,

Ben

