[c-nsp] FIB scale on ASR9001
Mark Tinka
mark at tinka.africa
Thu Nov 11 09:01:35 EST 2021
On 11/11/21 15:43, Lukas Tribus wrote:
> For ROV to work reliably it needs to be able to reconsider previously
> rejected invalids, so I would not recommend disabling
> soft-reconfiguration inbound; instead, I'd suggest setting it to always.
If my router vendor is not automatically applying BGP policy based on
RPKI state, it shouldn't matter what ended up in RIB if I have not set
an explicit policy to act on RPKI state.
So if a previously-Invalid route now becomes Valid because a matching
VRP has appeared, and is then good to go for export toward a neighbor
based on existing policy that matches, why can we not leverage Route
Refresh to cater only for that change?
I'm wondering if we can carry a loose relationship between the VRP
database, and RIB state.
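
For reference, the Invalid-to-Valid transition discussed above can be illustrated with a minimal sketch of origin validation along the lines of RFC 6811. This is not any vendor's implementation; the `VRP` tuple and the `validate` function are hypothetical names, and the prefixes and AS numbers come from the documentation ranges.

```python
from ipaddress import ip_network
from typing import NamedTuple


class VRP(NamedTuple):
    """A Validated ROA Payload: prefix, maximum length, authorised origin AS."""
    prefix: str
    max_length: int
    asn: int


def validate(route_prefix, origin_asn, vrps):
    """Return 'valid', 'invalid' or 'not-found' for a route (RFC 6811 style)."""
    route = ip_network(route_prefix)
    covered = False
    for vrp in vrps:
        vrp_net = ip_network(vrp.prefix)
        # A VRP covers the route if the route sits inside the VRP prefix.
        if route.version != vrp_net.version or not route.subnet_of(vrp_net):
            continue
        covered = True
        # The route matches if the origin AS agrees and it is not too specific.
        if vrp.asn == origin_asn and route.prefixlen <= vrp.max_length:
            return "valid"
    # Covered by at least one VRP but matched by none: Invalid.
    return "invalid" if covered else "not-found"


# Hypothetical example: the route is Invalid until a VRP for its origin arrives.
vrps = [VRP("192.0.2.0/24", 24, 64501)]
before = validate("192.0.2.0/24", 64500, vrps)
vrps.append(VRP("192.0.2.0/24", 24, 64500))   # delivered by an RTR update
after = validate("192.0.2.0/24", 64500, vrps)
```

Here `before` is `"invalid"` and `after` is `"valid"`. The route itself never changed, only the VRP set did, which is why the router needs some way to see the route again.
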
> Of course it would be better to store ROV-rejects separately, instead
> of rejecting them. Better yet: add a new drop statement in RPL, which
> makes the route not eligible for path selection, but doesn't remove it
> from memory - this way the operator has full flexibility.
I still don't get why we need to worry about Invalid routes, if the
operator is doing ROV to drop them. As soon as they become Valid, the
RTR update will tell us that and we can allow for incremental changes
for what we ask our neighbors to accept.
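
One way to picture such incremental handling is sketched below, purely as an assumption and not as something IOS XR or any other platform is known to do. Instead of a full Adj-RIB-In copy, the router remembers only the prefix, origin AS and peer of each ROV-dropped route, and on an RTR update asks for a Route Refresh only from peers whose dropped routes are covered by a changed VRP prefix. `DroppedIndex`, `note_drop` and `peers_to_refresh` are hypothetical names.

```python
from collections import defaultdict
from ipaddress import ip_network


class DroppedIndex:
    """Keeps only (prefix, origin AS) -> peers for routes dropped by ROV."""

    def __init__(self):
        self._dropped = defaultdict(set)

    def note_drop(self, prefix, origin_asn, peer):
        """Record that a route from this peer was dropped as Invalid."""
        self._dropped[(ip_network(prefix), origin_asn)].add(peer)

    def peers_to_refresh(self, changed_vrp_prefixes):
        """Peers with a dropped route covered by any added or withdrawn VRP.

        This is deliberately conservative: any change to a covering VRP
        may alter the validation state, so the peer is asked to resend.
        """
        changed = [ip_network(p) for p in changed_vrp_prefixes]
        peers = set()
        for (net, _asn), who in self._dropped.items():
            if any(net.version == c.version and net.subnet_of(c) for c in changed):
                peers |= who
        return peers


# Hypothetical usage with documentation prefixes and made-up peer names.
idx = DroppedIndex()
idx.note_drop("192.0.2.0/24", 64500, "peer-a")
idx.note_drop("198.51.100.0/24", 64501, "peer-b")
targets = idx.peers_to_refresh(["192.0.2.0/24"])
```

In this example only `peer-a` would be sent a Route Refresh. The trade-off is that some per-route state is still kept, just far less than full path attributes.
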
> I can't
> believe I'm saying this, but a famous CPE vendor from Latvia actually
> supports this (routing filter action: reject vs discard) [1], maybe
> the XR BGP team could steal some ideas from them ;)
:-).
> I don't like to depend on optional or not commonly used BGP features
> and code paths in other people's networks for changes of any kind on
> my end.
In Juniper's case, their default of keeping a copy of Adj-RIB-In had the
unintended side-effect of making ROV deployment less exciting. However,
I still feel that being unable to leverage Route Refresh to avoid this
clumsiness with ROV is below par for what we can design. After all, that
was the whole point of Route Refresh...
> Memory consumption was negligible for my use-case, iirc it was 200 -
> 300 MB for 30 - 40 peers, one of which was transit (so full table -
> this was about a year ago). I believe we had this conversation before,
> and you mentioned that this could be a DoS vector, which is true but
> all the solutions that we can possibly suggest simply implement a
> "selective soft-reconfig inbound always" anyway, so the DoS vector
> needs to be addressed in a different way, in my opinion.
Well, we had several peers disable sessions with us because their CPU
was being impacted by all the refresh messages we were sending. So we
have seen it become a DoS vector toward others, and then by extension,
toward us when we have to lose shorter paths to those peers due to the
session tear-down.
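
As an illustration of how the refresh load on peers could be bounded, here is a minimal sketch that coalesces refresh requests so that each peer receives at most one Route Refresh per interval, however many VRP changes arrive in between. It is an assumption for discussion, not a description of any router's behaviour; `RefreshScheduler`, `request` and `due` are hypothetical names, and time is passed in explicitly as seconds.

```python
class RefreshScheduler:
    """Coalesces Route Refresh requests to at most one per peer per interval."""

    def __init__(self, min_interval):
        self.min_interval = min_interval   # seconds between refreshes per peer
        self._last_sent = {}               # peer -> time of last refresh
        self._pending = set()              # peers waiting for a refresh

    def request(self, peer):
        """Mark a peer as needing a refresh; repeated requests collapse into one."""
        self._pending.add(peer)

    def due(self, now):
        """Return the peers that may be refreshed at time `now` and mark them sent."""
        ready = {
            p for p in self._pending
            if now - self._last_sent.get(p, float("-inf")) >= self.min_interval
        }
        for p in ready:
            self._last_sent[p] = now
        self._pending -= ready
        return ready


# Hypothetical usage: three VRP changes in quick succession touch the same peer.
sched = RefreshScheduler(min_interval=300)
sched.request("peer-a")
sched.request("peer-a")
first = sched.due(now=0)       # peer-a is refreshed once
sched.request("peer-a")
second = sched.due(now=100)    # too soon, nothing is sent
third = sched.due(now=300)     # interval elapsed, peer-a is refreshed again
```

The cost of this approach is delay: a newly Valid route can wait up to one interval before it is re-learned.
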
Mark.