[c-nsp] portchannel & dcef?

Rodney Dunn rodunn at cisco.com
Thu Dec 16 17:30:02 EST 2004


On Tue, Dec 07, 2004 at 09:56:40AM -0500, Jon Lewis wrote:
> On Wed, 1 Dec 2004, Jon Lewis wrote:
> 
> > On Wed, 1 Dec 2004, Rodney Dunn wrote:
> >
> > > It shouldn't be process switched but rather CEF switched on the RSP.
> > > That's why the route-cache line increments there.  We don't support
> > > ip accounting in the dCEF path (use netflow) so that's why the
> > > traffic was getting punted to the RSP for switching.
> >
> > Well, there seems to be yet another counter issue, because the portchannel
> > stats said the traffic was being route cache (cef on the RSP) switched,
> > but the portchannel member FEs were saying processor switched.  Can an
> > RSP4 processor switch 50mbit/s from FE's into a POS?
> 
> I've got more strangeness to report with rsp-k91pv-mz.122-18.S5.bin.
> Yesterday, we apparently had a physical cabling failure with one of the
> FE's in the previously mentioned portchannel.  This really made a mess of
> things as the router declared the interface to be line proto down and
> removed it from the portchannel, but the switch (3550) saw nothing wrong
> and kept that FE in the portchannel.
> 
> From what I've read, keepalives are supposed to be on by default.  If
> that's true, shouldn't the switch have noticed that the router had downed
> its end of one of the member interfaces and removed that interface on its
> side?

I'm not an L2 guy, but I did ask.  Apparently, between a router and
a switch, you have no way from the switch's perspective to detect
a protocol down condition.  The router does this via the keepalive
mechanism.

I did see someone bring up an issue a couple days ago about
two devices back to back with FE's (one of them was a 75xx) and
they did an HA switchover and the other side of the FE didn't go
down.  

CSCsa48365
Externally found moderate defect: Awaiting Info (I)
does not making interface down at the time of switchover

But since those are routers back to back, and both should be sending
keepalives, that would imply it's a different type of issue.


> 
> Because we've had dcef induced issues recently, one of the first things I
> did before we found the bad cable was switch dcef off and back on quickly.
> For some reason, this killed dcef.  According to show run and show ip
> interface, dcef/distributed switching was enabled.  According to show int
> stat and show proc, it definitely was not...everything was being
> route-cache switched by the RSP and RSP CPU was up around 95%.  I screwed
> around with route-caching algorithms on the interfaces trying to get them
> to start doing dcef again with no effect.  I flipped dcef off/on several
> more times (conf t, ip cef, ip cef dist) with no effect.  Just before
> giving up, I decided to repeat the above, but with a longer pause between
> cef and cef dist.  That finally got dcef back and RSP CPU load down to
> normal levels.

Strange...sounds like the RSP never got a chance to turn it off on the
LC before you re-enabled it.

> 
> Two more odd things resulted from this.  First, I've started seeing
> Dec  7 05:00:32: %TFIB-7-SCANSABORTED: TFIB scan not completing.
> Dec  7 06:00:45: %TFIB-7-SCANSABORTED: TFIB scan not completing.
> Dec  7 07:00:59: %TFIB-7-SCANSABORTED: TFIB scan not completing.
> Dec  7 08:01:11: %TFIB-7-SCANSABORTED: TFIB scan not completing.
> Dec  7 09:01:24: %TFIB-7-SCANSABORTED: TFIB scan not completing.
> Dec  7 10:01:37: %TFIB-7-SCANSABORTED: TFIB scan not completing.
> Dec  7 11:01:50: %TFIB-7-SCANSABORTED: TFIB scan not completing.
> Dec  7 12:02:03: %TFIB-7-SCANSABORTED: TFIB scan not completing.
> Dec  7 13:02:17: %TFIB-7-SCANSABORTED: TFIB scan not completing.
> Dec  7 14:02:30: %TFIB-7-SCANSABORTED: TFIB scan not completing.
>

I've seen this before, but in my quick searches I don't see anything
matching for 12.2S.  Given the CEF changes in 12.2(25)S, it would
be good to know if it still exists after that.
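
One thing that does stand out in the quoted log is the spacing: the
messages are roughly an hour apart, drifting a few seconds later each
time.  If you want to check that against a longer log, here is a rough
Python sketch.  The function name and regex are made up for illustration,
and it assumes all the lines fall within the same month:

```python
import re

# Matches the timestamp of an IOS log line such as
# "Dec  7 05:00:32: %TFIB-7-SCANSABORTED: TFIB scan not completing."
LOG_RE = re.compile(
    r"(\w{3})\s+(\d{1,2}) (\d{2}):(\d{2}):(\d{2}): %TFIB-7-SCANSABORTED"
)

def scan_abort_intervals(lines):
    """Return the gaps, in seconds, between consecutive SCANSABORTED messages.

    Hypothetical helper: the month name is ignored, so every line is
    assumed to come from the same month.
    """
    stamps = []
    for line in lines:
        m = LOG_RE.search(line)
        if m:
            day, hh, mm, ss = (int(g) for g in m.groups()[1:])
            stamps.append(((day * 24 + hh) * 60 + mm) * 60 + ss)
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]

log = """\
> Dec  7 05:00:32: %TFIB-7-SCANSABORTED: TFIB scan not completing.
> Dec  7 06:00:45: %TFIB-7-SCANSABORTED: TFIB scan not completing.
> Dec  7 07:00:59: %TFIB-7-SCANSABORTED: TFIB scan not completing.
> Dec  7 08:01:11: %TFIB-7-SCANSABORTED: TFIB scan not completing.
"""
print(scan_abort_intervals(log.splitlines()))  # [3613, 3614, 3612]
```

That only confirms the pattern in the log; it says nothing about why the
scan is not completing.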
 
> Second, the issue of the portchannel saying traffic is being dcef switched
> but the FE members saying it's being processor switched (from show int
> stat) is gone.  Now the FE members also say the traffic is dcef switched.
> 
> FastEthernet0/1/0
>           Switching path    Pkts In      Chars In   Pkts Out     Chars Out
>                Processor     629281      62803052     263488      22747508
>              Route cache          0             0          0             0
>        Distributed cache  769990862  341809026014  563387047  147930109164
>                    Total  770620143  341871829066  563650535  147952856672
> 

That's the way we would hope it always shows up.
Since the packet switching really isn't done on the member links, it's
a bit confusing how the counters show up there.  It makes sense that
if the PC shows distributed, the member links do too.
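
If you want to keep an eye on this across the PC and its members, a
small script can do the comparison for you.  This is only a sketch: the
helper names are hypothetical, and it assumes each counter row of the
"show int stat" output is a path name followed by four integers, as in
your paste above.

```python
def parse_int_stats(text):
    """Parse the counter rows of 'show interfaces stats' output into a dict.

    Hypothetical helper: maps each switching path name to a tuple of
    (pkts in, chars in, pkts out, chars out). Header and interface-name
    lines are skipped because their last four fields are not all digits.
    """
    rows = {}
    for line in text.splitlines():
        fields = line.lstrip("> ").split()
        if len(fields) >= 5 and all(f.isdigit() for f in fields[-4:]):
            rows[" ".join(fields[:-4])] = tuple(int(f) for f in fields[-4:])
    return rows

def totals_consistent(rows):
    """True if the per-path counters add up to the Total row."""
    paths = [vals for name, vals in rows.items() if name != "Total"]
    return tuple(sum(col) for col in zip(*paths)) == rows["Total"]

def inbound_share(rows, path):
    """Fraction of inbound packets handled by the given switching path."""
    return rows[path][0] / rows["Total"][0]

sample = """\
> FastEthernet0/1/0
>           Switching path    Pkts In   Chars In   Pkts Out  Chars Out
>                Processor     629281   62803052     263488   22747508
>              Route cache          0          0          0          0
>        Distributed cache  769990862 341809026014  563387047 147930109164
>                    Total  770620143 341871829066  563650535 147952856672
"""
rows = parse_int_stats(sample)
print(totals_consistent(rows))                             # True
print(f"{inbound_share(rows, 'Distributed cache'):.2%}")   # 99.92%
```

Running the same check against the port-channel interface and each member
FE would show quickly whether they agree on the switching path.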

Sorry I didn't have a lot of good answers for you here.

The TFIB scan not completing is a bug.

The counters issue is also a bug; at the very least they should be consistent.

Rodney

 
> 
> ----------------------------------------------------------------------
>  Jon Lewis                   |  I route
>  Senior Network Engineer     |  therefore you are
>  Atlantic Net                |
> _________ http://www.lewis.org/~jlewis/pgp for PGP public key_________

