[c-nsp] Help with output drops

Tony td_miles at yahoo.com
Mon Jul 13 18:17:00 EDT 2009


Hi Randy,

I can't answer why it was enabled either; the default on this platform is for QoS to be disabled until you manually enable it with the "mls qos" command. The problem you came across is exactly why it is disabled by default, so that you don't have performance issues "out of the box".

When I originally replied, I was looking for the reference in the Cisco doco that tells you not to enable QoS globally if you're not going to use it, as it will degrade performance. I finally found it, so here it is for the archives (the second "Note" point is the one you want to read):
http://www.cisco.com/en/US/docs/switches/lan/catalyst6500/ios/12.2SX/configuration/guide/qos.html#wp1750716
http://tinyurl.com/mbe65n

If you can't find the relevant section, search the above document for the string "Do not enable PFC QoS globally" and start reading from there.

QoS is used to give different treatment to different types of traffic. The classic example is that you want VoIP packets to be queued and sent before all other traffic, so that your audio calls don't suffer when someone is downloading a large file, which is lower-priority, non-real-time traffic.
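
Purely as a sketch of that classic case (not something you need on a plain L3 uplink, and the exact syntax varies by line card and IOS release), with PFC QoS enabled you would typically trust the markings on the port and let voice ride the strict-priority transmit queue:

   interface GigabitEthernet1/1
    mls qos trust dscp             <- trust the DSCP markings arriving on this port
    priority-queue cos-map 1 5     <- map CoS 5 (voice) onto the strict-priority queue
                                      (1p3q8t ports map CoS 5 there by default anyway)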

AFAIK disabling mls qos globally only affects your ability to use the QoS queueing/policing features and doesn't stop anything else from working. I can't guarantee that it won't break anything else, but it is a fairly targeted command that just enables/disables QoS.
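
For the archives, a minimal sketch of checking and turning it off globally (assuming a Cat6500 on 12.2SX like yours; the prompts are just illustrative):

   rtr1.ash#show mls qos            <- tells you whether QoS is enabled globally
   rtr1.ash#conf t
   rtr1.ash(config)#no mls qos      <- disables PFC QoS globally
   rtr1.ash(config)#end
   rtr1.ash#show mls qos            <- should now report QoS as disabled

As I said in my earlier reply (quoted below), do it in a maintenance window as it might disrupt traffic for a second.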



regards,
Tony.

--- On Mon, 13/7/09, Randy McAnally <rsm at fast-serv.com> wrote:

> From: Randy McAnally <rsm at fast-serv.com>
> Subject: Re: [c-nsp] Help with output drops
> To: "Tony" <td_miles at yahoo.com>, cisco-nsp at puck.nether.net
> Date: Monday, 13 July, 2009, 11:28 PM
> Hi Tony,
> 
> After disabling QoS there are no longer any output
> drops.  Thanks for the
> suggestion.
> 
> Are there any features that rely on QoS, or is it a default
> setting?  I'm
> trying to figure out something reasonable as to why it was
> enabled in the
> first place.
> 
> --
> Randy
> 
> ---------- Original Message -----------
> From: Tony <td_miles at yahoo.com>
> To: cisco-nsp at puck.nether.net,
> Randy McAnally <rsm at fast-serv.com>
> Sent: Sun, 12 Jul 2009 23:21:47 -0700 (PDT)
> Subject: Re: [c-nsp] Help with output drops
> 
> > Hi Randy,
> > 
> > Is QoS enabled? What does "show mls qos" tell you?
> > 
> > Do you need QoS at all? If not, disable it globally (no mls qos)
> > and your problem might just go away if it's being caused by queue
> > threshold defaults.
> > 
> > If it's a production switch, do it during a scheduled maintenance
> > period as it might disrupt traffic for a second.
> > 
> > regards,
> > Tony.
> > 
> > --- On Mon, 13/7/09, Randy McAnally <rsm at fast-serv.com> wrote:
> > 
> > > From: Randy McAnally <rsm at fast-serv.com>
> > > Subject: [c-nsp] Help with output drops
> > > To: cisco-nsp at puck.nether.net
> > > Date: Monday, 13 July, 2009, 1:51 PM
> > > Hi all,
> > > 
> > > I just finished installing and configuring a new 6509 with dual
> > > Sup720-3BXL (12.2(18)SXF15a) and 6724 linecards.  It serves the
> > > simple purpose of maintaining a single BGP session and managing
> > > layer 3 (VLANs) for various access switches.  No end devices are
> > > connected.
> > > 
> > > The problem is that we are getting constant output drops when our
> > > gig-E uplink goes above ~400 Mbps.  Nowhere near the interface
> > > speed!  See below, and take note of the massive 'Total output
> > > drops' with no other errors (on either end):
> > > 
> > > rtr1.ash#sh int g1/1
> > > GigabitEthernet1/1 is up, line protocol is up (connected)
> > >   Hardware is C6k 1000Mb 802.3, address is 00d0.01ff.5800 (bia 00d0.01ff.5800)
> > >   Description: PTP-UPLINK
> > >   Internet address is 209.9.224.68/29
> > >   MTU 1500 bytes, BW 1000000 Kbit, DLY 10 usec,
> > >      reliability 255/255, txload 118/255, rxload 12/255
> > >   Encapsulation ARPA, loopback not set
> > >   Keepalive set (10 sec)
> > >   Full-duplex, 1000Mb/s, media type is T
> > >   input flow-control is off, output flow-control is off
> > >   Clock mode is auto
> > >   ARP type: ARPA, ARP Timeout 04:00:00
> > >   Last input 00:00:00, output 00:00:01, output hang never
> > >   Last clearing of "show interface" counters 05:01:25
> > >   Input queue: 0/1000/0/0 (size/max/drops/flushes); Total output drops: 718023
> > >   Queueing strategy: fifo
> > >   Output queue: 0/100 (size/max)
> > >   30 second input rate 47789000 bits/sec, 30797 packets/sec
> > >   30 second output rate 465362000 bits/sec, 48729 packets/sec
> > >   L2 Switched: ucast: 27775 pkt, 2136621 bytes - mcast: 24590 pkt, 1574763 bytes
> > >   L3 in Switched: ucast: 592150327 pkt, 95608889548 bytes - mcast: 0 pkt, 0 bytes mcast
> > >   L3 out Switched: ucast: 991372425 pkt, 1214882993007 bytes mcast: 0 pkt, 0 bytes
> > >      592554441 packets input, 95674494492 bytes, 0 no buffer
> > >      Received 33643 broadcasts (17872 IP multicasts)
> > >      0 runts, 0 giants, 0 throttles
> > >      0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
> > >      0 watchdog, 0 multicast, 0 pause input
> > >      0 input packets with dribble condition detected
> > >      991006394 packets output, 1214377864373 bytes, 0 underruns
> > >      0 output errors, 0 collisions, 0 interface resets
> > >      0 babbles, 0 late collision, 0 deferred
> > >      0 lost carrier, 0 no carrier, 0 PAUSE output
> > >      0 output buffer failures, 0 output buffers swapped out
> > > 
> > > The CPU usage is nil:
> > > 
> > > rtr1.ash#sh proc cpu sort
> > > 
> > > CPU utilization for five seconds: 1%/0%; one minute: 0%; five minutes: 0%
> > >  PID Runtime(ms)   Invoked      uSecs   5Sec   1Min   5Min TTY Process
> > >    6     3036624    252272      12037  0.47%  0.19%  0.18%   0 Check heaps
> > >  316      195004     99543       1958  0.15%  0.01%  0.00%   0 BGP Scanner
> > >  119      267568   2962884         90  0.15%  0.03%  0.02%   0 IP Input
> > >  172      413528   2134933        193  0.07%  0.03%  0.02%   0 CEF process
> > >    4          16     48214          0  0.00%  0.00%  0.00%   0 cpf_process_ipcQ
> > >    3           0         2          0  0.00%  0.00%  0.00%   0 cpf_process_msg_
> > >    5           0         1          0  0.00%  0.00%  0.00%   0 PF Redun ICC Req
> > >    2         772    298376          2  0.00%  0.00%  0.00%   0 Load Meter
> > >    9       23964    157684        151  0.00%  0.01%  0.00%   0 ARP Input
> > >    7           0         1          0  0.00%  0.00%  0.00%   0 Pool Manager
> > >    8           0         2          0  0.00%  0.00%  0.00%   0 Timers
> > > <<<snip>>>
> > > 
> > > I THINK I have determined the drops are caused by buffer
> > > congestion on the port:
> > > 
> > > rtr1.ash#sh queueing interface gigabitEthernet 1/1
> > > Interface GigabitEthernet1/1 queueing strategy:  Weighted Round-Robin
> > >   Port QoS is enabled
> > >   Port is untrusted
> > >   Extend trust state: not trusted [COS = 0]
> > >   Default COS is 0
> > >     Queueing Mode In Tx direction: mode-cos
> > >     Transmit queues [type = 1p3q8t]:
> > >     Queue Id    Scheduling  Num of thresholds
> > >     -----------------------------------------
> > >        01         WRR                 08
> > >        02         WRR                 08
> > >        03         WRR                 08
> > >        04         Priority            01
> > > 
> > >     WRR bandwidth ratios:  100[queue 1] 150[queue 2] 200[queue 3]
> > >     queue-limit ratios:     50[queue 1]  20[queue 2]  15[queue 3]  15[Pri Queue]
> > > 
> > > <<<snip>>>
> > > 
> > >   Packets dropped on Transmit:
> > > 
> > >     queue     dropped  [cos-map]
> > >     ---------------------------------------------
> > >     1              719527  [0 1 ]
> > >     2                   0  [2 3 4 ]
> > >     3                   0  [6 7 ]
> > >     4                   0  [5 ]
> > > 
> > > So it would appear all of my traffic goes into queue 1.  It would
> > > also seem that 50% buffers for queue 1 isn't enough?  These are
> > > the default settings, by the way.
> > > 
> > > I'm pretty sure that wrr-queue queue-limit and wrr-queue bandwidth
> > > should help us mitigate this frustrating packet loss, but I've no
> > > experience and would like some insight and suggestions before I
> > > start making changes.  I am totally unfamiliar with these features
> > > (I come from a Foundry/Brocade background) and would like any
> > > suggestions or advice you might have before I try anything that
> > > could risk downtime or further issues in a production environment.
> > > 
> > > And lastly, would changing the queue settings cause BGP to drop or
> > > anything else unexpected (like how changing flow control would
> > > reset the interface, etc.)?
> > > 
> > > Thank you!
> > > 
> > > --
> > > Randy
> > > www.FastServ.com
> > > 
> > > _______________________________________________
> > > cisco-nsp mailing list  cisco-nsp at puck.nether.net
> > > https://puck.nether.net/mailman/listinfo/cisco-nsp
> > > archive at http://puck.nether.net/pipermail/cisco-nsp/
> > >
> ------- End of Original Message -------
> 
> 


      


