[c-nsp] Help with output drops

Randy McAnally rsm at fast-serv.com
Sun Jul 12 23:51:11 EDT 2009

Hi all,

I just finished installing and configuring a new 6509 with dual Sup720-3BXLs
(12.2(18)SXF15a) and 6724 linecards.  Its only jobs are maintaining a single
BGP session and handling layer 3 (VLANs) for various access switches.  No end
devices are connected.

The problem is that we get constant output drops whenever our GigE uplink
goes above ~400 Mbps, nowhere near the interface speed.  See below; note the
massive 'Total output drops' count with no other errors (on either end):

rtr1.ash#sh int g1/1
GigabitEthernet1/1 is up, line protocol is up (connected)
  Hardware is C6k 1000Mb 802.3, address is 00d0.01ff.5800 (bia 00d0.01ff.5800)
  Description: PTP-UPLINK
  Internet address is
  MTU 1500 bytes, BW 1000000 Kbit, DLY 10 usec,
     reliability 255/255, txload 118/255, rxload 12/255
  Encapsulation ARPA, loopback not set
  Keepalive set (10 sec)
  Full-duplex, 1000Mb/s, media type is T
  input flow-control is off, output flow-control is off
  Clock mode is auto
  ARP type: ARPA, ARP Timeout 04:00:00
  Last input 00:00:00, output 00:00:01, output hang never
  Last clearing of "show interface" counters 05:01:25
  Input queue: 0/1000/0/0 (size/max/drops/flushes); Total output drops: 718023
  Queueing strategy: fifo
  Output queue: 0/100 (size/max)
  30 second input rate 47789000 bits/sec, 30797 packets/sec
  30 second output rate 465362000 bits/sec, 48729 packets/sec
  L2 Switched: ucast: 27775 pkt, 2136621 bytes - mcast: 24590 pkt, 1574763 bytes
  L3 in Switched: ucast: 592150327 pkt, 95608889548 bytes - mcast: 0 pkt, 0 bytes mcast
  L3 out Switched: ucast: 991372425 pkt, 1214882993007 bytes mcast: 0 pkt, 0 bytes
     592554441 packets input, 95674494492 bytes, 0 no buffer
     Received 33643 broadcasts (17872 IP multicasts)
     0 runts, 0 giants, 0 throttles
     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
     0 watchdog, 0 multicast, 0 pause input
     0 input packets with dribble condition detected
     991006394 packets output, 1214377864373 bytes, 0 underruns
     0 output errors, 0 collisions, 0 interface resets
     0 babbles, 0 late collision, 0 deferred
     0 lost carrier, 0 no carrier, 0 PAUSE output
     0 output buffer failures, 0 output buffers swapped out
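
As a rough cross-check of the counters above, here is a small sketch of the
arithmetic.  The figures are copied from the sh int g1/1 output; the variable
names are only illustrative.

```python
# Figures copied from the "sh int g1/1" output above; names are illustrative.
total_output_drops = 718_023
packets_output = 991_006_394
output_rate_bps = 465_362_000      # 30 second output rate, bits/sec
output_rate_pps = 48_729           # 30 second output rate, packets/sec
link_bw_bps = 1_000_000 * 1000     # "BW 1000000 Kbit"
txload = 118                       # "txload 118/255"

# "Last clearing of show interface counters 05:01:25"
seconds_since_clear = 5 * 3600 + 1 * 60 + 25

drop_ratio = total_output_drops / packets_output
drops_per_second = total_output_drops / seconds_since_clear
utilization = output_rate_bps / link_bw_bps
txload_fraction = txload / 255
avg_packet_bytes = output_rate_bps / 8 / output_rate_pps

print(f"drop ratio since clear : {drop_ratio:.4%}")
print(f"average drops per sec  : {drops_per_second:.1f}")
print(f"utilization (rate/BW)  : {utilization:.1%}")
print(f"utilization (txload)   : {txload_fraction:.1%}")
print(f"average packet size    : {avg_packet_bytes:.0f} bytes")
```

Both utilization figures come out around 46%, so the link is less than half
full on a 30 second average.  That average says nothing about short bursts,
which is presumably where the drops happen.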

The CPU usage is nil:

rtr1.ash#sh proc cpu sort

CPU utilization for five seconds: 1%/0%; one minute: 0%; five minutes: 0%
 PID Runtime(ms)   Invoked      uSecs   5Sec   1Min   5Min TTY Process
   6     3036624    252272      12037  0.47%  0.19%  0.18%   0 Check heaps
 316      195004     99543       1958  0.15%  0.01%  0.00%   0 BGP Scanner
 119      267568   2962884         90  0.15%  0.03%  0.02%   0 IP Input
 172      413528   2134933        193  0.07%  0.03%  0.02%   0 CEF process
   4          16     48214          0  0.00%  0.00%  0.00%   0 cpf_process_ipcQ
   3           0         2          0  0.00%  0.00%  0.00%   0 cpf_process_msg_
   5           0         1          0  0.00%  0.00%  0.00%   0 PF Redun ICC Req
   2         772    298376          2  0.00%  0.00%  0.00%   0 Load Meter
   9       23964    157684        151  0.00%  0.01%  0.00%   0 ARP Input
   7           0         1          0  0.00%  0.00%  0.00%   0 Pool Manager
   8           0         2          0  0.00%  0.00%  0.00%   0 Timers

I THINK I have determined the drops are caused by buffer congestion on the port:

rtr1.ash#sh queueing interface gigabitEthernet 1/1
Interface GigabitEthernet1/1 queueing strategy:  Weighted Round-Robin
  Port QoS is enabled
  Port is untrusted
  Extend trust state: not trusted [COS = 0]
  Default COS is 0
    Queueing Mode In Tx direction: mode-cos
    Transmit queues [type = 1p3q8t]:
    Queue Id    Scheduling  Num of thresholds
       01         WRR                 08
       02         WRR                 08
       03         WRR                 08
       04         Priority            01

    WRR bandwidth ratios:  100[queue 1] 150[queue 2] 200[queue 3]
    queue-limit ratios:     50[queue 1]  20[queue 2]  15[queue 3]  15[Pri Queue]


  Packets dropped on Transmit:

    queue     dropped  [cos-map]
    1                   719527  [0 1 ]
    2                        0  [2 3 4 ]
    3                        0  [6 7 ]
    4                        0  [5 ]

So it would appear all of my traffic goes into queue 1.  It would also seem
that 50% of the buffers for queue 1 isn't enough?  These are the default
settings, by the way.
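
To make that reading of the table explicit, here is a minimal sketch that
restates the sh queueing output as data and checks which queue holds the
drops.  The values are copied from the output above; the dictionary layout
and the helper name queue_for_cos are only illustrative.

```python
# Values copied from the "sh queueing" output above; layout is illustrative.
queue_limit_pct = {1: 50, 2: 20, 3: 15, 4: 15}   # queue 4 is the priority queue
dropped = {1: 719_527, 2: 0, 3: 0, 4: 0}
cos_map = {1: [0, 1], 2: [2, 3, 4], 3: [6, 7], 4: [5]}


def queue_for_cos(cos):
    """Return the transmit queue that a given CoS value maps to."""
    for queue, cos_values in cos_map.items():
        if cos in cos_values:
            return queue
    raise ValueError(f"CoS {cos} is not in the cos-map")


total_dropped = sum(dropped.values())
drop_share = {q: n / total_dropped for q, n in dropped.items()}

print("queue-limit ratios sum to", sum(queue_limit_pct.values()))
print("CoS 0 maps to queue", queue_for_cos(0))
for q in sorted(dropped):
    print(f"queue {q}: {queue_limit_pct[q]}% of buffer, {drop_share[q]:.0%} of drops")
```

Every drop lands in queue 1, which is where CoS 0 and 1 are mapped, while the
other three queues hold the remaining 50% of the buffer and show no drops.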

I'm pretty sure that wrr-queue queue-limit and wrr-queue bandwidth should help
us mitigate this frustrating packet loss, but I have no experience with these
features (I come from a Foundry/Brocade background).  I would appreciate any
insight, suggestions, or advice before I try anything that could risk downtime
or further issues in a production environment.

And lastly, would changing the queue settings cause BGP to drop or anything
else unexpected (the way changing flow control would reset the interface, etc.)?

Thank you!

