[j-nsp] ATM PIC and congestion
John Kristoff
jtk@aharp.is-net.depaul.edu
Wed, 30 Oct 2002 17:06:17 -0600
Those of you on the NANOG list may remember an inquiry from me a short
while back asking whether anyone was seeing congestion at the Chicago
SBC/AADS NAP.
After much hair pulling by me, my team, and our vendors, we've finally
tracked down a latency problem that turned out to be due to the way the
Juniper ATM1 PIC handles high traffic loads. The following is a summary
of the problem and some information about ATM buffer management that is
not yet publicly available from Juniper (I was told it was OK to share).
We had an M5 with an OC3c at the Chicago SBC/AADS NAP using the Juniper
ATM1 PIC. Utilization on the link was high overall, and the outbound
direction toward the Internet was maxed out, with the largest share of
the outbound traffic going to our primary upstream. Recently we began
experiencing latency on the order of hundreds of milliseconds. At first
it looked like a latency problem on multiple PVCs, but we eventually
realized it was concentrated on the PVC carrying the most outbound
traffic.
After checking for latency in the ATM switch network and on the far end,
and ruling out problems with our own gear, Juniper support determined
that the cause appeared to be how the ATM interface buffers operate. So
we tweaked those queue lengths and the latency problem went away. Now we
get packet drops instead, but that is more normal and we'll handle it in
other ways (the CoS and ATM2 PIC thread today is very relevant for us,
as you might guess). Below is some information some of you may find
useful. It was apparently written by a Juniper escalation engineer and
hasn't made it into any publicly available documentation yet.
ATM1 PICs contain a transmit buffer pool of 16,382 buffers, which
are shared by all PVCs currently configured on the PIC. Even on
multi-PHY ATM PICs, there is still a single buffer pool shared
by all the PHYs.
By default, the PIC allows PVCs to consume as many buffers as they
require. If the sustained traffic rate for a PVC exceeds its shaped
rate, buffers will be consumed. Eventually, all buffers on the PIC
will be used, which starves the other PVCs. This results in
head-of-line blocking.
The queue-length parameter can be set (on a per-PVC basis) to prevent
this situation. It sets a limit on the number of transmit packets
(and ultimately buffers) that can be queued for a PVC. New packets
that would exceed this limit get dropped (i.e., tail drop).
queue-length, configured under the shaping hierarchy, represents the
maximum number of packets that can be queued for the PVC using the
global buffers. It should be configured for all PVCs whenever more than
one PVC is configured on an ATM1 PIC. It performs two functions (a
rough sketch follows the list):
1) It stops head-of-line blocking from occurring, since it limits the
number of packets, and hence buffers, that can be consumed by each
configured PVC.
2) It bounds the maximum lifetime (queueing time) of packets on the
PVC when traffic has oversubscribed the configured shaping
contract.
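To make both effects concrete, here is a toy Python sketch of the shared
pool. It is purely illustrative, not how the PIC is actually implemented;
the PVC names are made up, and the 10-buffers-per-packet figure assumes
MTU-sized packets at the default 4,482-byte MTU with the 480-byte buffer
size implied by the formula further down.

    class Pic:
        def __init__(self, pool=16382):
            self.free = pool          # transmit buffers shared by all PVCs
            self.queued = {}          # PVC name -> packets currently queued

        def enqueue(self, pvc, buffers, queue_length=None):
            """Try to queue one packet for a PVC; return True if accepted."""
            q = self.queued.get(pvc, 0)
            if queue_length is not None and q >= queue_length:
                return False          # per-PVC cap reached: tail drop
            if self.free < buffers:
                return False          # shared pool exhausted: HOL blocking
            self.free -= buffers
            self.queued[pvc] = q + 1
            return True

    # Default behavior: an oversubscribed PVC with no queue-length drains
    # the pool, and then even one MTU-sized packet for a well-behaved PVC
    # is refused.
    pic = Pic()
    while pic.enqueue("pvc-busy", buffers=10):
        pass
    print(pic.enqueue("pvc-quiet", buffers=10))     # False

    # With queue-length 25 on the busy PVC, it tail-drops at its own cap
    # and the quiet PVC still gets buffers.
    pic = Pic()
    while pic.enqueue("pvc-busy", buffers=10, queue_length=25):
        pass
    print(pic.enqueue("pvc-quiet", buffers=10))     # True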
The total of all the queue-length settings must not be greater than
the total number of packets that can be held in the buffer space
available on the PIC.
The total number of packets that can be held by the buffers depends
on the MTU setting for the interfaces on the PIC. The MTU used should
include all encapsulation overhead and hence is the physical interface
MTU. The following formula gives the total number of packets the
buffer space can hold:
16,382 / ( Round Up ( MTU / 480 ) )
For example, with the default MTU settings for ATM1 PIC interfaces,
the total number of packets that can be held is:
16,382 / ( Round Up ( 4,482 / 480 ) ) = 16,382 / 10 = 1,638 packets.
Thus, when configuring the queue-lengths for each of the PVCs
configured on an ATM1 PIC using the default MTU settings, they must not
total more than 1,638. They can total less.
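If you'd rather script the arithmetic than do it by hand, the formula
above is easy to express in Python (the PVC names and queue-length
values below are made-up examples, not recommendations):

    import math

    ATM1_BUFFERS = 16382      # shared transmit buffers on the ATM1 PIC
    BUFFER_BYTES = 480        # bytes per buffer, per the formula above

    def pic_packet_capacity(mtu):
        """Worst-case number of MTU-sized packets the buffer pool holds."""
        return ATM1_BUFFERS // math.ceil(mtu / BUFFER_BYTES)

    capacity = pic_packet_capacity(4482)          # default MTU -> 1638

    # Hypothetical per-PVC queue-lengths; the sum must not exceed capacity.
    queue_lengths = {"upstream": 800, "peer-a": 400, "peer-b": 400}
    assert sum(queue_lengths.values()) <= capacity
    print(capacity, sum(queue_lengths.values()))  # 1638 1600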
Setting a queue-length to a very low value is possible, but doing so
risks not being able to buffer short bursts of packets transiting
the PVC.
The maximum lifetime that packets transiting a PVC could sustain can
be calculated from the shaping rate configured for the PVC, the
queue-length setting, and the MTU. The following formula can be
used:
( PVC queue-length in packets x MTU in bytes ) /
( PVC shaping rate in bits per second / 8 )
For example, say a PVC is configured on an ATM1 PIC interface with the
default MTU and a CBR shaping rate of 3,840,000bps (10,000 cells per
second). The queue-length has been set to 25 packets. The maximum
lifetime is:
( 25 x 4,482 ) / ( 3,840,000 / 8 ) = 112,050 / 480,000 = 0.233s = 233ms.
This is the worst-case lifetime, assuming all packets in the queue are
MTU-sized and the traffic using the PVC is oversubscribing its
configured shaping contract.
In general it's good design practice to keep this maximum lifetime
under 500ms.
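The delay formula scripts just as easily; here is a small Python check
using the example numbers above and the suggested 500ms ceiling:

    def worst_case_delay_ms(queue_length, mtu, shaping_bps):
        """Longest time an MTU-sized packet can wait on an oversubscribed PVC."""
        return (queue_length * mtu) / (shaping_bps / 8) * 1000

    delay = worst_case_delay_ms(queue_length=25, mtu=4482, shaping_bps=3840000)
    print(round(delay))       # 233 (ms), matching the example above
    assert delay < 500        # the suggested design ceiling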
So, if you've got high load and high latency over your ATM1 PIC, you may
need to tweak your queue-lengths using the info above.
John