[c-nsp] ME3600X Output Drops

Waris Sagheer (waris) waris at cisco.com
Mon Aug 27 00:34:13 EDT 2012


Problem Statement:
When there is a speed mismatch, such as a 10 Gig ingress interface feeding a 1 Gig egress interface, or a 1 Gig ingress interface feeding a 100M egress interface, microbursts can occur.
For example, 10 Gig Ingress interface----->1 Gig Egress Interface
With an average traffic rate of 500 Mbps flowing from the 10 Gig interface toward the 1 Gig interface, an instantaneous burst can exhaust the default queue buffers at the egress interface, resulting in packet drops on the 1 Gig egress interface.

ME3800X/ME3600X/ME3600X-24CX default queue-limit values in time and bytes for 10/100/1000/10000 Mbps Interfaces: 
In Bytes:
10/100/1000/10000 Mbps --> 12/12/12/120 Kbytes respectively
In Time:
10/100/1000/10000 Mbps --> 10000/1000/100/100 usec respectively
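
As a rough back-of-envelope illustration of how quickly these defaults can be exhausted, take the 12 KB default on a 1 Gig egress interface fed from a 10 Gig ingress interface:
  Net queue growth rate = 10 Gbps in - 1 Gbps out = 9 Gbps
  Time to fill 12 KB    = (12,000 bytes x 8 bits) / 9 Gbps  ~= 11 usec
A line-rate burst lasting only about 11 microseconds on the 10 Gig side is therefore enough to overflow the default egress queue and cause tail drops.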

How to fix this problem?
Current Solution:
Increase the queue-limit by using the "queue-limit xx" command.

Queue-Limit Ranges:
200 to 491520 bytes
1 to 3932 us 
1 to 2457 packets (Assuming 1 packet = 200 bytes) [Packet unit is supported in 15.1(2)EY] 

How to pick the right value?
Currently there is no way other than trial and error. It is best to start at 200 KBytes and monitor the drops; increase the queue-limit if drops are still seen.
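
A minimal sketch of that starting point (the policy-map name is only a placeholder, and the value is in bytes; see the policy examples later in this post for where such a leaf class can legally sit in the hierarchy):

policy-map LEAF-QLIMIT
 class class-default
  queue-limit 200000 bytes

Drops can then be watched with "show policy-map interface <interface> output"; raise the queue-limit if they keep appearing.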

Roadmap Queue-limit Feature Enhancements to fix this issue:
Step 1: We are planning to increase the default queue-limit for the 1 Gig interface from 12 KB to 40 KB.
Step 2: We are planning to introduce a feature called flexible queue-limit in Release 15.3(1)S, Q4CY12, which will allow the queue-limit to be configured as a percentage of the buffer.
A similar feature is already supported on the ME3400E:
http://www.cisco.com/en/US/docs/switches/metro/me3400e/software/release/12.2_58_ex/command/reference/cli1.html#wp5095786
Step 3: To help pick the right value, a watermark counter will be introduced in 2HCY13 which will record the maximum tail-drop value. This will enable customers to tune their queue-limit value accordingly. This feature is currently available in IOS XR.
 
Queue-limit Value Configuration:
12.xx Release Output
Example of the user-configurable queue-limit value ranges:

Switch(config-pmap-c)#queue-limit ?
  <1-491520>     <200-491520> in bytes, <1-3932> in us, maximum threshold (in us by default)

15.xx Outputs
There is an issue with the command help, which shows a higher range than the platform supports; this will be fixed in a future release.
ME3800X-H-1(config-pmap-c)#queue-limit ?
  <1-8192000>    in bytes, <1-3400> in ms, <1-8192000> in packets by default << Range is shown higher than the platform can support
  

ME3800X-H-1(config-pmap-c)#queue-limit 8192000 bytes
QOS: Qlimit threshold value is out of range
Min and Max bytes qlimit are 200 & 491520 <<< Valid supported range
queue-limit: platform params check fail


ME3800X-H-1(config-pmap-c)#queue-limit 2500 packets 
QOS: Qlimit threshold value is out of range
Min and Max packets qlimit are 1 & 2457 << Valid supported range
queue-limit: platform params check fail


Queue-Limit Policy Configuration Example:
In many cases a QoS policy will be required only to address the packet-drop issue. The ME platforms support a three-level hierarchy [Port, VLAN & Class], and queue-limit is supported only at the class (third) level.

Valid supported Queue-limit Policy Example:
class-map match-all vlan60
 match vlan  60
!
policy-map EFP-qlimit
 class vlan60                     <<<<< Matching at the VLAN level confirms this class is the second level, so the child policy is the third level
  shape average 100000000
  service-policy COS-OUT-L3-NSP
!
policy-map COS-OUT-L3-NSP
 class class-default
  queue-limit 256 packets

  
interface GigabitEthernet0/5
 switchport trunk allowed vlan none
 switchport mode trunk
 service instance 2 ethernet
  encapsulation dot1q 60
  rewrite ingress tag pop 1 symmetric
  service-policy output EFP-qlimit
  bridge-domain 60
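
To check the applied queue-limit and per-EFP drop counters afterwards, something along the lines of the following is normally used (treat the exact syntax as an assumption; it may vary by release):

show policy-map interface GigabitEthernet0/5 service instance 2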


Three Level Class-default Policy Example:
policy-map leaf
 class class-default
  queue-limit xxxxx bytes

policy-map logical
 class class-default
  service-policy leaf

policy-map root
 class class-default
  service-policy logical
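
One way to attach such a hierarchy (a sketch only; the interface is illustrative, and the queue-limit lives solely in the leaf policy):

interface GigabitEthernet0/1
 service-policy output root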


Invalid Queue-Limit Policy Configuration Example:
This case "class-default" is being considered as the port level.
Following QOS policy configuration failed because the configuration check assumes user is trying to apply the queue-limit at the vlan level which is not supported.

policy-map child-1
 class class-default
  queue-limit 256 packets
!
policy-map VLAN-OUT
 class class-default       <<<<< class-default is assumed to be the port level, so the child policy sits at the second (VLAN) level
  shape average 5000000
  service-policy child-1
!

interface GigabitEthernet0/5
 switchport trunk allowed vlan none
 switchport mode trunk
 !
 service instance 2 ethernet
  encapsulation dot1q 60
  rewrite ingress tag pop 1 symmetric
  bridge-domain 60

  
3600-HL-2-N(config)#interface GigabitEthernet0/5
3600-HL-2-N(config-if)#no service instance 1 ethernet
3600-HL-2-N(config-if-srv)#service-policy output VLAN-OUT
QOS: queue-limit command not supported in non-leaf classes
QoS: Policy attachment failed for policymap VLAN-OUT
*Feb 13 09:55:28.700: %QOSMGR-3-QLIMIT_LEVEL_ERROR: Qlimit command not supported in non-leaf classes
Regards,
Waris


-----Original Message-----
From: cisco-nsp-bounces at puck.nether.net [mailto:cisco-nsp-bounces at puck.nether.net] On Behalf Of Richard Clayton
Sent: Thursday, August 23, 2012 4:31 AM
To: George Giannousopoulos
Cc: cisco-nsp
Subject: Re: [c-nsp] ME3600X Output Drops

George

I believe you will be able to specify a % of the available buffer for queue-limit in a future release and you will also be able to specify 100% of the buffer for each individual queue-limit.

Thanks
Sledge


On 23 August 2012 11:57, George Giannousopoulos <ggiannou at gmail.com> wrote:

> If I remember correctly, 2457 packets is the maximum on this platform.
> We weren't given any specific version for the increased default values.
>
> In case you get anything extra from your SR, it would be nice to share 
> it with us
>
> George
>
> On Thu, Aug 23, 2012 at 12:10 PM, Ivan <cisco-nsp at itpro.co.nz> wrote:
>
> > Thanks George.  I am raising a SR to get some more information too. 
> > Are you able to explain how the queue-limit of 2457 was selected? 
> > Also were you
> > given a version for the increase in the default queue size?  I am
> > running me360x-universalk9-mz.152-2.S1.bin
> >
> > Cheers
> >
> > Ivan
> >
> >
> >
> > On 23/Aug/2012 5:48 p.m., George Giannousopoulos wrote:
> >
> >> Hi Ivan,
> >>
> >> In fact the default queue limit in 3800x/3600x is quite small. We
> >> also had issues with drops in all interfaces, even without
> >> congestion.
> >>
> >> After some research and an SR with Cisco, we have started applying
> >> qos on all interfaces
> >>
> >> policy-map INTERFACE-OUTPUT-POLICY
> >>   class dummy
> >>   class class-default
> >>    shape average X00000000
> >>    queue-limit 2457 packets
> >>
> >>
> >> The dummy class does nothing.
> >> It is just there because IOS wouldn't allow changing queue limit
> >> otherwise
> >>
> >> Also there were issues with the policy counters which should be
> >> resolved after 15.1(2)EY2.
> >> Cisco said they would increase the default queue sizes in the
> >> second half of 2012.
> >> So, I suggest you try the latest IOS version and check again
> >>
> >> 10G interfaces had no drops in our setup too.
> >>
> >> Regards
> >> George
> >>
> >>
> >> On Thu, Aug 23, 2012 at 1:34 AM, Ivan <cisco-nsp at itpro.co.nz> wrote:
> >>
> >>     Replying to my own message....
> >>
> >>     * Adjusting the hold queue didn't help.
> >>
> >>     * Applying QOS and per referenced email stopped the drops
> >>     immediately - I
> >>     used something like the below:
> >>
> >>     policy-map leaf
> >>     class class-default
> >>     queue-limit 491520 bytes
> >>
> >>     policy-map logical
> >>     class class-default
> >>     service-policy leaf
> >>
> >>     policy-map root
> >>     class class-default
> >>     service-policy logical
> >>
> >>     * I would be interested to hear if others have ended up applying a
> >>     similar
> >>     policy to all interfaces.  Any gotchas?  I expect any 10Gbps
> >>     interfaces
> >>     would be okay without the QoS - haven't seen any issue on these
> >>     myself.
> >>
> >>     * Apart from this list I have found very little information around
> >>     this whole issue.  Any pointers to other documentation would be
> >>     appreciated.
> >>
> >>     Thanks
> >>
> >>     Ivan
> >>
> >>     Ivan
> >>
> >>     > Hi,
> >>     >
> >>     > I am seeing output drops on a ME3600X interface as shown below
> >>     >
> >>     > GigabitEthernet0/2 is up, line protocol is up (connected)
> >>     >   MTU 9216 bytes, BW 1000000 Kbit/sec, DLY 10 usec,
> >>     >      reliability 255/255, txload 29/255, rxload 2/255
> >>     >   Encapsulation ARPA, loopback not set
> >>     >   Keepalive set (10 sec)
> >>     >   Full-duplex, 1000Mb/s, media type is RJ45
> >>     >   input flow-control is off, output flow-control is unsupported
> >>     >   ARP type: ARPA, ARP Timeout 04:00:00
> >>     >   Last input 6w1d, output never, output hang never
> >>     >   Last clearing of "show interface" counters 00:12:56
> >>     >   Input queue: 0/75/0/0 (size/max/drops/flushes); Total output
> >>     drops: 231
> >>     >   Queueing strategy: fifo
> >>     >   Output queue: 0/40 (size/max)
> >>     >   30 second input rate 10299000 bits/sec, 5463 packets/sec
> >>     >   30 second output rate 114235000 bits/sec, 12461 packets/sec
> >>     >      3812300 packets input, 705758638 bytes, 0 no buffer
> >>     >      Received 776 broadcasts (776 multicasts)
> >>     >      0 runts, 0 giants, 0 throttles
> >>     >      0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
> >>     >      0 watchdog, 776 multicast, 0 pause input
> >>     >      0 input packets with dribble condition detected
> >>     >      9103882 packets output, 10291542297 bytes, 0 underruns
> >>     >      0 output errors, 0 collisions, 0 interface resets
> >>     >      0 unknown protocol drops
> >>     >      0 babbles, 0 late collision, 0 deferred
> >>     >      0 lost carrier, 0 no carrier, 0 pause output
> >>     >      0 output buffer failures, 0 output buffers swapped out
> >>     >
> >>     > I have read about similar issues on the list:
> >>     > http://www.gossamer-threads.com/lists/cisco/nsp/157217
> >>     > https://puck.nether.net/pipermail/cisco-nsp/2012-July/085889.html
> >>     >
> >>     > 1. I have no QoS policies applied to the physical interface or EVCs.
> >>     > Would increasing the hold queue help?  Is there a recommended
> >>     > value - the maximum configurable is 240000.  What is the impact
> >>     > on the 44MB of packet buffer.
> >>     >
> >>     > 2. If the hold queue isn't an option is configuring QoS required
> >>     > to increase the queue-limit from the default 100us.  Again are
> >>     > there any recommended values and what impact is there on the
> >>     > available 44MB of packet buffer.
> >>     >
> >>     > 3. I have found that when applying policies to the EVCs the
> >>     > "show policy map" output does not have information for the
> >>     > queue-limit as I have seen when applying policies to the
> >>     > physical interface.  Does this mean that EVCs will still
> >>     > suffer from output drops?
> >>     >
> >>     > Thanks
> >>     >
> >>     > Ivan
> >>
> >>
> >>