[c-nsp] WRR Confusion on 6748 blades

Tim Stevenson tstevens at cisco.com
Wed Jun 27 12:38:47 EDT 2012


Hi John, please see inline below:

At 08:58 AM 6/27/2012, John Neiberger pronounced:
>On Wed, Jun 27, 2012 at 8:24 AM, Janez Novak <jnovak123 at gmail.com> wrote:
> > 6748 can't do shaping. Would love to have them do that. So you must be
> > experiencing drops somewhere else and not from WRR BW settings or WRED
> > settings. They both kick in when congestion is happening (queues are
> > filling up). For example, the linecard is oversubscribed, etc.
> >
> > Look at the second bullet:
> > (http://www.cisco.com/en/US/docs/routers/7600/ios/12.2SR/configuration/guide/qos.html#wp1728810).
> >
> > Kind regards,
> > Bostjan
>
>This is very confusing and I'm getting a lot of conflicting
>information. I've been told by three Cisco engineers that these queue
>bandwidth limits


queue-limit and bandwidth values (ratios/weights) are *different* things.

The queue-limit physically sizes the queue. It says how much of the 
total physical buffer on the port is set aside exclusively for each 
class (where class is based on DSCP or COS). Traffic from other 
classes can NEVER get access to the buffer set aside for another 
class, ie, there could be plenty of available buffer in other queues 
even as you're dropping traffic in one of the queues.

The bandwidth ratios, on the other hand, determine how frequently 
each of those queues is serviced, ie, how often the scheduler will 
dequeue/transmit a frame from the queue. If there is nothing sitting 
in one queue, other queues can get access to that bandwidth, ie, 
"bandwidth" is not a hard limit, you can think of it as a minimum 
guarantee when there is congestion/contention.
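
To make that concrete, here's a rough sketch of how the two knobs are 
set independently on a 1p3q8t port like the 6748 (the interface and 
the numbers below are made up purely for illustration; check the exact 
syntax for your line card and IOS version):

   interface GigabitEthernet1/1
    ! queue-limit carves the port's physical buffer into fixed slices,
    ! one per WRR queue - a queue can never borrow another's slice
    wrr-queue queue-limit 50 30 20
    ! bandwidth sets the relative WRR weights, ie how often each queue
    ! gets serviced when more than one queue has traffic waiting
    wrr-queue bandwidth 100 150 200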


>are fairly hard limits. That is in line with what we
>were experiencing because we were seeing output queue drops when the
>interface was not fully utilized. Increasing the queue bandwidth got
>rid of the output queue drops.


What this should do is simply cause the scheduler to service the queue 
more frequently. That could certainly reduce/eliminate drops in the 
event of congestion, but only if there is traffic in the other queues 
that is also contending for the bandwidth.

In other words, if there is only one active queue (ie only one queue 
has traffic in it), then it can & should get full unrestricted access 
to the entire link bandwidth. Can you confirm whether there's traffic 
in the other queues?
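
If it helps, something like 'show queueing interface' on the egress 
port (interface name below is just an example) should show the 
configured weights and queue limits along with the per-queue drop 
counters:

   show queueing interface GigabitEthernet1/1

Drops piling up in only one queue while the other queues show little 
or no activity would point away from scheduler contention and toward 
that queue simply being too small for the bursts hitting it.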


>For one particular application
>traversing this link, that resulted in a file transfer rate increase
>from 2.5 MB/s to 25 MB/s. That's a really huge difference and all we
>did was increase the allocated queue bandwidth. At no point was that
>link overutilized.


We frequently see 'microburst' situations where the average rate 
measured over 30 seconds or so is well under line rate, but at some 
instantaneous moment there is a burst that exceeds line rate and can cause drops if the 
queue is not deep enough. Having a low bandwidth ratio, with traffic 
present in other queues, is another form of the queue not being deep 
enough, ie, the queue may have a lot of space but if packets are not 
dequeued frequently enough that queue can still fill & drop.
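
As a rough back-of-the-envelope illustration (all of these numbers are 
assumed, not taken from your setup): say this queue is effectively 
getting ~10% of a 1Gbps link while the other queues are busy, so it 
drains at roughly 100Mb/s. If a burst then arrives at line rate, the 
queue backs up at about

   1000Mb/s in - 100Mb/s out = ~900Mb/s =~ 112KB per millisecond

so a per-queue buffer of a few hundred KB can overflow within a 
handful of milliseconds, even though the 30sec average utilization 
never looks anywhere near line rate.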


>In fact, during our testing of that particular
>application, the link output never went above 350 Mbps. We used very
>large files so that the transfer would take a while and we'd get a
>good feel for what was happening. Doing nothing but increasing the
>queue bandwidth fixed the problem there and has fixed the same sort of
>issue elsewhere.

This suggests to me that there is traffic in other queues contending 
for the available bandwidth, and that there is periodic, instantaneous 
congestion. Alternatively, you could try sizing this 
queue bigger and using the original bandwidth ratio. Or a combination 
of those two (tweaking both bandwidth & queue-limit).
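
A combined tweak might look something like this (again, the interface 
and the percentages/weights are purely illustrative):

   interface GigabitEthernet1/1
    ! deepen the affected queue's slice of the port buffer...
    wrr-queue queue-limit 40 25 35
    ! ...and/or raise its share of the scheduler
    wrr-queue bandwidth 100 100 200

Keep in mind the weights are relative, so raising one queue's weight 
only takes scheduler time away from the other queues when they also 
have traffic waiting.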

Is there some issue with changing the bandwidth ratio on this queue 
(ie, are you seeing collateral damage)? Otherwise, it seems like you've 
solved the problem already ;)

Hope that helps,
Tim




>I'm still researching this and trying to get to the bottom of it. I
>think we're missing something important that would make this all make
>more sense. I appreciate everyone's help!
>
>John
>_______________________________________________
>cisco-nsp mailing list  cisco-nsp at puck.nether.net
>https://puck.nether.net/mailman/listinfo/cisco-nsp
>archive at http://puck.nether.net/pipermail/cisco-nsp/




Tim Stevenson, tstevens at cisco.com
Routing & Switching CCIE #5561
Distinguished Technical Marketing Engineer, Cisco Nexus 7000
Cisco - http://www.cisco.com
IP Phone: 408-526-6759
********************************************************
The contents of this message may be *Cisco Confidential*
and are intended for the specified recipients only.



