[c-nsp] ASR9k: RIB/FIB convergence

Thomas Schmid schmid at dfn.de
Thu Aug 2 05:13:32 EDT 2018


Hi all,

sort of a heads up ... 

I'd be interested to hear whether, and under which circumstances, others are seeing this
behavior, since the root cause is still unknown.

In the beginning there were some anecdotal complaints from customers that they
experienced persistent reachability problems to some destinations while we were doing
scheduled maintenance elsewhere in our network. Further investigation pointed to
routing inconsistencies during large RIB changes.

To give you some numbers: in our environment, writing 70k BGP changes to the FIB
takes 2-3 min; 700k routes take 20-30 min!!
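
As a rough back-of-the-envelope check, the two data points above can be turned into an
implied FIB write rate. This is only a sketch: it assumes the quoted times are wall-clock
durations for the whole batch, and the function name is made up for illustration.

```python
def fib_write_rate(routes, minutes_low, minutes_high):
    """Return (slowest, fastest) implied routes per second for a reported time range."""
    slowest = routes / (minutes_high * 60)  # longest reported duration
    fastest = routes / (minutes_low * 60)   # shortest reported duration
    return slowest, fastest

# Figures quoted above: 70k changes in 2-3 min, 700k routes in 20-30 min.
for routes, low, high in [(70_000, 2, 3), (700_000, 20, 30)]:
    slow, fast = fib_write_rate(routes, low, high)
    print(f"{routes} routes: {slow:.0f} to {fast:.0f} routes/s")
```

Both cases work out to the same range of a few hundred routes per second, so under these
assumptions the write time scales roughly linearly with the number of changes rather than
degrading further at larger batch sizes.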

During that period, RIB and FIB are not consistent, with all the nasty consequences:
blackholing, routing loops etc.

Convergence time seems to be somehow related to the number of eBGP sessions on the
box. On routers with fewer than 200 sessions, convergence time looks ok; from 300
sessions on, things get bad.

This affects XR 5.3.3 and 6.2.3, on both Typhoon and Tomahawk linecards.

TAC/BU are currently working on this, but they are having a hard time finding out what's
going wrong here. Processing the updates on the RP takes less than 1s,
but writing the updates to the LC takes forever ...

Thanks,

   Thomas


