[c-nsp] 6500 series, FIB exception @ dfc

Igor Smolov igor.smolovr at gmail.com
Thu May 2 12:54:59 EDT 2019


Hi all,

We ran into a serious bug the other day on our Cat6504 with S2T.
This machine has 3 upstreams with full views.  I'd like to get
feedback from the list on what it could be and how to mitigate it.

While running s2t54-adventerprisek9-mz.SPA.152-1.SY5.bin, with over
6 months of uptime, the switch logged the following:

%CFIB-DFC2-7-CFIB_EXCEPTION: FIB TCAM exception, Some entries will be
software switched

The routing issues are at the DFC; the S2T itself seems to be performing
correctly:

#sh platform hardware cef exception status
Current IPv4 FIB exception state = FALSE
Current IPv6 FIB exception state = FALSE
Current MPLS FIB exception state = FALSE
Current EoM/VPLS FIB TCAM exception state = FALSE

#remote command mod 2 sh platform hard cef exception status detail
Current IPv4 FIB exception state = TRUE
...

#remote command mod 2 sh platform hardware cef resource-level
Global watermarks: apply to Fib shared area only.
Protocol watermarks: apply to protocols with non-default max-routes

Fib-size: 1024k (1048576), shared-size: 1016k (1040384), shared-usage: 877k(898387)

Global watermarks:
                    Red_WM: 95%,   Green_WM: 80%,   Current usage: 86%

Protocol watermarks:

 Protocol           Red_WM(%)      Green_WM(%)     Current(%)
 --------           ---------      ----------      ----------
 IPV4                --             --              73% (of shared)
 IPV4-MCAST          --             --              0 % (of shared)
 IPV6                --             --              12% (of shared)
 IPV6-MCAST          --             --              0 % (of shared)
 MPLS                --             --              0 % (of shared)
 EoMPLS              --             --              0 % (of shared)
 VPLS-IPV4-MCAST     --             --              0 % (of shared)
 VPLS-IPV6-MCAST     --             --              0 % (of shared)

#remote command mod 2 show platform hardware cef maximum-routes usage


 Fib-size: 1024k (1048576),     shared-size: 1016k (1040384),     shared-usage: 874k(895227)

 Protocol         Max-routes     Usage              Usage-from-shared
 -------         ----------     -----              -----------------
 IPV4             1017k          762763 (744 k)       761739 (743 k)
 IPV4-MCAST       1017k          6      (0   k)       0      (0   k)
 IPV6             1017k          134512 (131 k)       133488 (130 k)
 IPV6-MCAST       1017k          4      (0   k)       0      (0   k)
 MPLS             1017k          1      (0   k)       0      (0   k)
 EoMPLS           1017k          1      (0   k)       0      (0   k)
 VPLS-IPV4-MCAST  1017k          0      (0   k)       0      (0   k)
 VPLS-IPV6-MCAST  1017k          0      (0   k)       0      (0   k)

Maximum Tcam Routes : 901021
Current Tcam Routes : 897288


The box did not hit any TCAM limits; usage (86%) is below the red
watermark (95%).  The message comes from the DFC card, similar to this bug:
https://quickview.cloudapps.cisco.com/quickview/bug/CSCun81101
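
As a cross-check of the figures above, here is a minimal Python sketch that
recomputes the shared-area usage and the TCAM route headroom from the
"remote command mod 2" output. The constant and function names are
illustrative only. Whether "Maximum Tcam Routes" is the limit that actually
triggers the exception is an assumption, not something the output confirms.

```python
# Figures copied from the "remote command mod 2" output above.
SHARED_SIZE = 1040384        # shared-size: 1016k
SHARED_USAGE = 898387        # shared-usage from "cef resource-level"
RED_WM_PCT = 95              # Red_WM (global watermark)
MAX_TCAM_ROUTES = 901021     # "Maximum Tcam Routes"
CUR_TCAM_ROUTES = 897288     # "Current Tcam Routes"


def pct(used: int, total: int) -> float:
    """Return used/total as a percentage."""
    return 100.0 * used / total


shared_pct = pct(SHARED_USAGE, SHARED_SIZE)
tcam_pct = pct(CUR_TCAM_ROUTES, MAX_TCAM_ROUTES)
tcam_headroom = MAX_TCAM_ROUTES - CUR_TCAM_ROUTES

print(f"shared area usage: {shared_pct:.1f}% (red watermark {RED_WM_PCT}%)")
print(f"TCAM routes used:  {tcam_pct:.1f}% of reported maximum")
print(f"TCAM routes free:  {tcam_headroom}")
```

With these figures the shared area is below the red watermark, while the
current TCAM route count is 3733 below the reported maximum.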


Right after this error we reseated the card and upgraded IOS.  It is now running:

Cisco IOS Software, s2t54 Software (s2t54-ADVENTERPRISEK9-M), Version 15.5(1)SY,
RELEASE SOFTWARE (fc6)
System image file is "bootdisk:s2t54-adventerprisek9-mz.SPA.155-1.SY.bin"

After a short while, we get an identical error:

%CFIB-DFC2-7-CFIB_EXCEPTION: FIB TCAM exception, Some entries will be
software switched

Previously, the box was still able to switch (process-switch) packets, but
this time it froze completely and all traffic was dropped.
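
Because the exception was reported only by the DFC in module 2 and not by the
supervisor, a log watcher has to pick the module out of the message itself.
A minimal Python sketch, assuming the message layout matches the two
occurrences quoted above; the regex, the function name, and the second sample
line are illustrative and not taken from this box:

```python
import re

# Pattern for the module-specific CFIB exception quoted above, e.g.
# "%CFIB-DFC2-7-CFIB_EXCEPTION: ...". The layout on other releases or for
# supervisor-originated messages is an assumption; adjust if it differs.
CFIB_RE = re.compile(
    r"%CFIB-(?P<source>[A-Z]+)(?P<module>\d+)-(?P<severity>\d)-CFIB_EXCEPTION:"
)


def find_fib_exceptions(lines):
    """Yield (source, module, severity) for each CFIB_EXCEPTION line."""
    for line in lines:
        m = CFIB_RE.search(line)
        if m:
            yield m.group("source"), int(m.group("module")), int(m.group("severity"))


sample = [
    "%CFIB-DFC2-7-CFIB_EXCEPTION: FIB TCAM exception, Some entries will be software switched",
    "%SYS-5-CONFIG_I: Configured from console by console",  # unrelated line
]
hits = list(find_fib_exceptions(sample))
print(hits)  # [('DFC', 2, 7)]
```

Note that the message is logged at severity 7 (debugging), so a logging
level that filters out debug messages would presumably hide it.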

#sh inventory

NAME: "WS-C6504-E", DESCR: "Cisco Systems Cisco 6500 4-slot Chassis System"
PID: WS-C6504-E        ,                     VID: V01, SN: xxx

NAME: "1", DESCR: "VS-SUP2T-10G 5 ports Supervisor Engine 2T 10GE w/
CTS Rev. 1.8"
PID: VS-SUP2T-10G      ,                     VID: V05, SN: xxx

NAME: "msfc sub-module of 1", DESCR: "VS-F6K-MSFC5 CPU Daughterboard Rev. 1.6"
PID: VS-F6K-MSFC5      ,                     VID:    , SN: xxx

NAME: "VS-F6K-PFC4XL Policy Feature Card 4 EARL 1 sub-module of 1", DESCR:
"VS-F6K-PFC4XL Policy Feature Card 4 Rev. 1.0"
PID: VS-F6K-PFC4XL     ,                     VID: V01, SN: xxx

NAME: "2", DESCR: "WS-X6848-SFP CEF720 48 port 1000mb SFP Rev. 3.1"
PID: WS-X6848-SFP      ,                     VID: V02, SN: xxx

NAME: "WS-F6K-DFC4-AXL Distributed Forwarding Card 4 EARL 1 sub-module of 2",
DESCR: "WS-F6K-DFC4-AXL Distributed Forwarding Card 4 Rev. 2.0"
PID: WS-F6K-DFC4-AXL   ,                     VID: V04, SN: xxx

Qs:

1. What version of IOS has a fix for this bug?

2. If you encountered this bug, which cards/models were you using?

3. If these BGP peers are connected directly to the VS-SUP2T card, is it correct
to assume this bug is not an issue? Would these TCAM entries then not affect
the DFC card?

4. Since we are utilizing over 70% of TCAM, what is the recommended hardware
platform to move to? 1M IPv4 routes are just around the corner... SUP6T-XL has
the same 1024K limitation. Anything capable of 10 Gb/s+ and over 1M routes,
in 1 to 5U?

Cheers,
Igor Smolov

