[nsp] 3550 tacacs bug (was bug in 12.1(20)EA1a 3550)

Mark Boolootian booloo at ucsc.edu
Mon May 3 16:58:20 EDT 2004


The packet loss problem I posted about last Friday is a result of the
bug that Yuval Ben-Ari pointed to (CSCee13768).  The problem is that
tacacs sessions apparently fail to close properly and end up sitting in
CLOSEWAIT state indefinitely.  On the 3550, after a sufficient number of
TCBs pile up, the router starts dropping packets that touch the route
processor.  This bug has apparently been around in the 3550 as far back
as 12.1(11)EA1.

We run rancid hourly, which does a good job of building up the TCBs over
the course of a few days, and explains why most of our 3550s failed in
lockstep.  Switching to a local username for rancid fixed that.

This problem exists on other platforms.  We've seen it on our 6500s
(running 12.1(22)E1, 12.2(14)SX1, and 12.2(17b)SXA), though it hasn't
caused any (obvious) problems so far.  Cisco has two other DDTSs related
to this problem: CSCea08706 and CSCed65285.

You can check for the presence of these sessions with the command:

  show tcp brief

The typical view of a broken system looks like this:

   TCB       Local Address           Foreign Address        (state)
   4525A58C  isp-g-GE1-1.ucsc.22384  beacon.ucsc.edu.49     CLOSEWAIT
   48C72184  isp-g-GE1-1.ucsc.23282  beacon.ucsc.edu.49     CLOSEWAIT
   45D366A8  isp-g-GE1-1.ucsc.22897  beacon.ucsc.edu.49     CLOSEWAIT
   5365A710  isp-g-GE1-1.ucsc.22833  beacon.ucsc.edu.49     CLOSEWAIT
   552B8F74  isp-g-GE1-1.ucsc.22770  beacon.ucsc.edu.49     CLOSEWAIT
   48D221C0  isp-g-GE1-1.ucsc.22706  beacon.ucsc.edu.49     CLOSEWAIT
   520DE9C0  isp-g-GE1-1.ucsc.22579  beacon.ucsc.edu.49     CLOSEWAIT
   46C7D7F4  isp-g-GE1-1.ucsc.22515  beacon.ucsc.edu.49     CLOSEWAIT
   52D98AC4  isp-g-GE1-1.ucsc.22451  beacon.ucsc.edu.49     CLOSEWAIT
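
If you want to watch for this across many boxes, something like the
following might help.  It is only a rough sketch in Python, not anything we
run: it assumes you have already captured the "show tcp brief" text somehow
(the capture step is not shown), and the function name and the SAMPLE
variable are made up for illustration.  It counts sessions in CLOSEWAIT
whose foreign port is 49 (tacacs).

```python
# Sketch: find stuck tacacs sessions in captured "show tcp brief" output.
# SAMPLE and stuck_tacacs_sessions are illustrative names, not part of any tool.

SAMPLE = """\
TCB       Local Address           Foreign Address        (state)
4525A58C  isp-g-GE1-1.ucsc.22384  beacon.ucsc.edu.49     CLOSEWAIT
48C72184  isp-g-GE1-1.ucsc.23282  beacon.ucsc.edu.49     CLOSEWAIT
45D366A8  isp-g-GE1-1.ucsc.22897  beacon.ucsc.edu.49     CLOSEWAIT
"""


def stuck_tacacs_sessions(output, port=49):
    """Return the TCB ids of CLOSEWAIT sessions whose foreign port is `port`."""
    stuck = []
    for line in output.splitlines():
        fields = line.split()
        # Data rows have exactly four columns: TCB, local, foreign, state.
        # The header row splits into more fields and is skipped here.
        if len(fields) != 4:
            continue
        tcb, _local, foreign, state = fields
        # The port is whatever follows the last dot in the address column.
        _host, _sep, foreign_port = foreign.rpartition(".")
        if state == "CLOSEWAIT" and foreign_port == str(port):
            stuck.append(tcb)
    return stuck


if __name__ == "__main__":
    print(len(stuck_tacacs_sessions(SAMPLE)))
```

How many stuck sessions is too many is left to the reader; we never
measured the count at which the 3550 starts dropping packets.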

We were able to clear this state with the command:

  clear tcp tcb *

Be warned that this wipes out every active TCP connection on the box
(including your BGP sessions).  When the state disappears on the 3550s, so
does the packet loss.  Note that it takes several minutes before the state
is removed.

A million thanks to Yuval for solving this mystery for us.  

mb

