[c-nsp] can't be -- output discards on 6500 gig-e

Edward Henigin ed at texas.net
Sun Aug 1 21:24:05 EDT 2004


On Thu, Jul 29, 2004 at 12:47:22PM -0700, Steve Francis said:
> If your 1 minute average is 500Mbps, then your peaks are certainly
> higher. I suspect TAC is correct.

Cisco is doing a terrible job of proving this, at least, then.

We had a situation today where we put a Sparc Solaris box on a 100M
full-duplex port on a 6500, on a WS-X6248-TEL blade.  According to this
page, that blade has a 56KB output buffer:

http://www.cisco.com/en/US/products/hw/switches/ps708/products_white_paper09186a0080131086.shtml

With (mostly) NFS traffic to the server at a rate of 1Mbps - 5Mbps,
peaking around 1,300 packets/sec, we were seeing output discards of up
to 200pps.  That's like 15% packet loss, on a link that's 95% unused.
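
As a quick sanity check on those percentages, here is a minimal Python
sketch of the arithmetic.  The function names are made up for
illustration; the inputs are just the peak figures quoted above.

```python
def loss_ratio(discards_pps, offered_pps):
    """Fraction of offered packets that were discarded."""
    return discards_pps / offered_pps


def unused_fraction(traffic_mbps, link_mbps):
    """Fraction of the link's average capacity left idle."""
    return 1 - traffic_mbps / link_mbps


# Peak figures from the message: 200 pps discarded out of about
# 1,300 pps, and at most 5 Mbps of traffic on a 100 Mbps port.
print(f"loss:   {loss_ratio(200, 1300):.1%}")
print(f"unused: {unused_fraction(5, 100):.1%}")
```

Note that the 5Mbps figure is an average, so the "unused" number says
nothing about what the link looks like over a few milliseconds.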

We moved it to a port on a WS-X6348-RJ-45 blade.  According to the
above page, the 6348 has a 112KB output buffer.  Same traffic levels,
same discards.

The NFS server is on the same 6500 chassis, on a WS-X6408A-GBIC blade.
According to the above page, the input buffer on the gigabit port for
the NFS server is 73KB.

During all of this, according to the 'show int' output, the output queue
is ALWAYS empty.  Does it make sense that, with 15% of our packets being
dropped due to the queue being FULL, the queue appears EMPTY 100% of the
time?
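
For what it's worth, a rough back-of-the-envelope sketch of the
timescales involved, in Python.  It assumes things the page above does
not confirm: that the whole quoted buffer is available to a single
queue, that 1KB means 1024 bytes, and that a burst arrives at full
gigabit line rate while the port drains at 100Mbps.  The function names
are hypothetical.

```python
def buffer_bits(buffer_kb):
    """Buffer size in bits, assuming 1 KB = 1024 bytes."""
    return buffer_kb * 1024 * 8


def fill_time_ms(buffer_kb, in_mbps, out_mbps):
    """Time for a sustained burst to fill an initially empty buffer."""
    net_bps = (in_mbps - out_mbps) * 1e6  # arrival rate minus drain rate
    return buffer_bits(buffer_kb) / net_bps * 1e3


def drain_time_ms(buffer_kb, out_mbps):
    """Time to empty a full buffer once the burst has stopped."""
    return buffer_bits(buffer_kb) / (out_mbps * 1e6) * 1e3


# Buffer sizes quoted for the WS-X6248-TEL (56KB) and the
# WS-X6348-RJ-45 (112KB), fed at 1000 Mbps, drained at 100 Mbps.
for kb in (56, 112):
    print(kb, "KB:",
          f"fill {fill_time_ms(kb, 1000, 100):.2f} ms,",
          f"drain {drain_time_ms(kb, 100):.2f} ms")
```

Under those assumptions the 56KB buffer fills in roughly half a
millisecond and drains in under 5 ms.  This does not show that this is
what is happening on the 6500; it only shows that a queue could
overflow and be empty again within a few milliseconds, which a
periodically sampled 'Output queue: ' value might never catch.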

Finally, we were able to solve the problem, after entering what appears
to be an undocumented command (at least, Cisco's search engine and
Google can't find it): 'wrr-queue group-buffers'.  TAC says that this
will turn on flow control between the backplane and the output interface.
The way I read that, this means that the input interfaces will start
buffering traffic if the output buffer is full.  In theory, I would think
that we would be at risk for seeing input discards on other interfaces,
because clearly our output interface is full 15% of the time.

After enabling that command, all output discards went away, and no input
discards have happened on any other interface.

It still doesn't make sense to me.

The bottom line is that I don't trust what I'm being told, by TAC and/or
the chassis.  I want to be able to see some sort of real-time information
on the output buffer utilization.  Beyond 'show interface' to see 'Total
output drops: ' and 'Output queue: ', I don't know any other commands
that give useful info about what's going on ('show int counter errors'
etc. are all redundant to the above).

Ed

