[c-nsp] Weird throughput issue

Curtis Piehler cpiehler2 at gmail.com
Sun Jul 24 14:11:46 EDT 2016


Hi Andrew,

Thank you for the reply.  The Cat4900 and 6504 are acting as pure Layer 2
trunking devices.  The carrier NNI comes into the Cat4900 and the customer
site is a circuit off of that carrier NNI.  The WAN side of the site is a
sub-interface on the ASR9k that terminates the transport circuit from the
Cat6504.  The customer circuit on the carrier NNI is a VLAN that is trunked
through these switches to the ASR9K.

I also forgot to mention that in addition to doing IPERF-to-IPERF testing I
have been doing IPERF-to-TTCP testing directly to the router, since the
customer cannot leave a laptop plugged into the router.  I know TTCP has
serious limitations for higher-bandwidth circuits, and Cisco routers treat
traffic directed at them differently than traffic passing through them.  I
have found the 2911 capable of doing at most 25M or so with the processor
maxed out, so I know I should see at least 20-25M to this site from my
IPERF servers.  Latency to one of my IPERF servers is 17ms and to the other
is 35ms.  The IPERF server at 17ms cannot push any more than 5-6M through
the router, and the IPERF server at 35ms cannot push any more than 2M.
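For reference, the ceiling a fixed receive window imposes at those two RTTs
can be estimated with a quick bandwidth-delay-product calculation.  This is
only a sketch: the 64 KB window is an assumed value (the no-window-scaling
default), not something confirmed on these hosts.

```python
# Best-case TCP rate when a fixed receive window is the bottleneck:
#   max_throughput = window_size / round_trip_time
def window_limited_mbps(window_bytes, rtt_ms):
    """Return the window-limited TCP ceiling in Mbit/s."""
    return window_bytes * 8 / (rtt_ms / 1000) / 1e6

# Assumed 64 KB window (no window scaling) at the two measured RTTs:
print(round(window_limited_mbps(65535, 17), 1))  # ~30.8 Mbit/s at 17 ms
print(round(window_limited_mbps(65535, 35), 1))  # ~15.0 Mbit/s at 35 ms
```

Both ceilings sit well above the observed 5-6M and 2M, so a static 64 KB
window alone would not fully explain the numbers, though the RTT-dependent
falloff points in the same direction.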

The other customer site is another 100M circuit that connects directly to
the 6504 in this setup and terminates IP on the same ASR9k as the problem
site.  The latency from both IPERF servers to this trouble-free site is
nearly the same (offset 1-2ms) as to the problem site.  When I run the same
IPERF-to-TTCP test to this site I can push nearly 20-25M with no issues
(the processor maxes out near 25M), consistent with the calculation above.
When I repeat the same test to the problem site (using the same
transmit/receive window) the results are consistent with the previous
paragraph.

The only differences in the setup between the two sites are 1) the
carrier(s) used and 2) the problem site trunks through a Cat4948 before
hitting the 6504.  We have other customers off of the Cat4948 who report no
issues, and I have looked over the Cat4948 config and see nothing that
would hinder TCP traffic.

This whole problem started when the customer ran a speed test from the
problem site and couldn't break 5M on the download, with only 50-60M on the
upload.  When the customer speed tests from the other site that comes into
the Cat6504 and trunks up to the same ASR9K, they can obtain 90/90.

I've also applied 90M shapers on both ends of the WAN to avoid hitting the
carrier policers in the middle, but they had no effect.
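The overall pattern here (UDP clean at 85-90M while TCP collapses with
distance) is also what a small random loss rate would produce.  The classic
Mathis approximation shows how sharply loss-limited TCP throughput falls as
RTT grows.  The loss rate and MSS below are illustrative assumptions, not
measurements from this circuit.

```python
from math import sqrt

# Mathis et al. approximation for loss-limited (Reno-style) TCP:
#   throughput <= (MSS / RTT) * (C / sqrt(p)),  with C ~= 1.22
def mathis_mbps(mss_bytes, rtt_ms, loss_prob, c=1.22):
    """Upper bound on TCP rate in Mbit/s given random loss probability p."""
    return (mss_bytes * 8 / (rtt_ms / 1000)) * (c / sqrt(loss_prob)) / 1e6

# Illustrative only: 1460-byte MSS, 0.1% loss (assumed values)
print(round(mathis_mbps(1460, 17, 0.001), 1))  # ~26.5 Mbit/s at 17 ms
print(round(mathis_mbps(1460, 35, 0.001), 1))  # ~12.9 Mbit/s at 35 ms
```

Even a fraction of a percent of loss cuts the ceiling hard, and the damage
scales with RTT, which matches the "further away, less throughput"
behavior; a capture looking for retransmissions would confirm or rule this
out.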

Curtis

On Sun, Jul 24, 2016 at 1:23 PM, Andrew Miehs <andrew at 2sheds.de> wrote:

> On Sun, Jul 24, 2016 at 3:40 PM, Curtis Piehler <cpiehler2 at gmail.com>
> wrote:
> ...
> >
> > Customer Site (2911 Router) -> ILEC -> Transport Provider -> Carrier NNI
> > (1G Copper) -> Cat4900 -> Cat6504 -> ASR9k
> >
> > The ASR9k terminates the IP for the customer so the in between devices
> are
> > Layer 2 trunking all the way through.
> >
> > Latency from the customer site to the Cat infrastructure is only 3-4ms.
> > Once it trunks over to the ASR9k for IP termination it increases to 17ms
> (A
> > different facility).  We have several IPERF servers to test from however
> > the latency never exceeds 40ms to any given server.
> ..
>
> How are you measuring this latency on the Cat 4900 if the IP remote
> end of the L3 circuit is on the ASR9k?
>
> > When performing IPERF tests we can not pass anything more than 10M to the
> > site (down stream).  Upload from the site maxes out around 50-60M when we
> > test to anything behind the ASR9k IP network.  If we test to an IPERF
> > server off of the Catalyst infrastructure where the NNI comes into we can
> > pass full throughput.  We have tried all different size windows on IPERF
> > which do not yield any improved results.  The further away we get from
> the
> > site (again not exceeding 40ms) the less throughput is seen.  UDP tests
> to
> > the site are ok as we can force 85-90M and the site receives it with very
> > little packet loss.  The issue is just TCP.
>
> As you are only seeing this with TCP, it is usually a factor of
> receiver window size (and latency) or packet loss causing congestion
> avoidance to kick in. The fact that higher latency worsens the problem
> indicates that it could very likely be a receiver window issue. Check
> whether window scaling has been enabled in the server and client
> operating systems.
>
> You might want to try out this calculator, and use 65535 for "TCP
> window (RWND) size"
>
> http://wintelguy.com/wanperf.pl
>
>
>
> -- Andrew
>

