[c-nsp] FIB insertion issues on Sup2T routers
alumbis at gmail.com
Thu Jan 4 19:08:21 EST 2018
My memory on this is old and fuzzy, but when I was in TAC I worked on some
issues where the TCAM capacity on the Sup2T isn't fixed like on the old
Sup720. It's not a guaranteed number of entries; it depends on the space the
FIB data structure takes up. The data structure is entirely dependent on the
add/remove of routes and prefix distribution over time. We had a large
provider who kept hitting TCAM exceptions at around 40% of stated maximums.
Software fixes improved the data structure for them, but I wouldn't be
surprised if there are still a number of instances where the prefix
distribution + add/remove makes things worse. Larger tables increase the
likelihood of having problems. Lots of adds/removes (i.e., internet facing)
increase the likelihood of problems.
Rebooting the card/box would rebuild the data structure from scratch,
likely resulting in a more optimized version.
I have no idea how to verify if this is the issue you're hitting, but it's
an anecdote that sounds similar.
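It doesn't verify the root cause, but spotting new occurrences can at least
be automated. A minimal sketch, assuming the routers' syslog is collected
somewhere as plain text lines. The function name and the sample lines
(timestamps, hostname) are made up for illustration; only the message
mnemonic comes from the log quoted further down in this thread.

```python
import re

# Mnemonic taken from the log message quoted in this thread.
TCAM_FAIL = re.compile(r"%MLSCEF-4-FIB_TCAM_INSERT_FAIL")


def find_tcam_insert_failures(lines):
    """Return the lines that contain the FIB TCAM insertion failure message."""
    return [line.rstrip("\n") for line in lines if TCAM_FAIL.search(line)]


# Hypothetical sample input: timestamps and hostname are invented.
sample = [
    "Jan  2 09:11:00 rtr1 %MLSCEF-4-FIB_TCAM_INSERT_FAIL: FIB entry insertion "
    "into tcam failed, IPv4 route may be absent from hardware table",
    "Jan  2 09:11:05 rtr1 %SYS-5-CONFIG_I: Configured from console",
]

hits = find_tcam_insert_failures(sample)
for hit in hits:
    print(hit)
```

Feeding it an open log file instead of the sample list works the same way,
since it only iterates over lines.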
On Wed, Jan 3, 2018 at 3:58 PM, Paul <paul at gtcomm.net> wrote:
> Seems to be a bug in many versions of code, we've had it happen on
> numerous sup2t devices, different random line cards and code versions all
> the way from the original sup2t codetrain up until SY5.
> It seems to MOSTLY happen on 6908XL modules as far as I can tell. One router it
> happened after many years uptime. I suspect it's something going on with
> BGP injections lately on the internet, most definitely not reaching the
> capacity of the device but rather data contained inside could be giving
> false positives (a bug!).
> Upgraded to SY10 and haven't seen it since, but that doesn't mean it's not
> still there.
> Never had it happen on non-XL, and never had it happen on the Supervisor
> 2T itself, only the line cards. Already pinged Cisco about it to no
> avail, even asked to simply put in an "if tcam exception happens on a line
> card (and only that card), reboot that line card automatically" option,
> but you know how they are.
> Hoping that it was fixed in SY10+ (some caveats related to 6908 listed
> there that are similar in nature to this)
> On 1/2/2018 9:11 AM, Jeroen van Ingen wrote:
>> Never feels good when you can't find a good explanation / do proper RCA
>> for an incident...
>> And you don't have any active support on the boxes? If you hit it again,
>> let us know; we're running a couple of Sup2T-XL too.
>> Jeroen van Ingen
>> ICT Service Centre
>> University of Twente, P.O.Box 217, 7500 AE Enschede, The Netherlands
>> On 02-01-18 14:51, "Rolf Hanßen" wrote:
>>> on router #1 it happened again.
>>> We then updated it to 15.2(1)SY5 (pure luck) on Dec 6th and configured
>>> prefix limits on all sessions allowing less than 100k above current count.
>>> On router #2 we did nothing.
>>> Router #3 was false positive, issue did not occur at all (human error).
>>> Nothing happened since the updates, no insertion issue, no prefix count
>>> So we have no clue what happened.
>>> kind regards
>>>>> I had 3 incidents within a week in which Sup2T-XL routers switched to
>>>>> software forwarding.
>>>>> I.e. log says:
>>>>> %MLSCEF-4-FIB_TCAM_INSERT_FAIL: FIB entry insertion into tcam failed,
>>>>> IPv4 route may be absent from hardware table
>>>> Haven't seen this one, but I'm interested to hear whether you've had new
>>>> occurrences... We're running newer code though, 15.2(1)SY5 currently,
>>>> because of several bugs in earlier releases.
>>>> Jeroen van Ingen
>>>> ICT Service Centre
>>>> University of Twente, P.O.Box 217, 7500 AE Enschede, The Netherlands
>> cisco-nsp mailing list cisco-nsp at puck.nether.net
>> archive at http://puck.nether.net/pipermail/cisco-nsp/