[j-nsp] duplicate acks, EX3300 VC
Phil Mayers
p.mayers at imperial.ac.uk
Thu May 17 13:30:54 EDT 2012
On 17/05/12 17:16, Mike Williams wrote:
> Hey all,
>
> Before I punt this to JTAC, has anyone had any experience with
> poor/highly-variable TCP throughput from a small stack of EX3300s?
This is *through* the switch, yes? Not *to* it?
> We've got a stack of 3, one 48 port, and two 24 ports, and since they went in
> we can't get reliable TCP transfers transatlantic.
> Linux-Linux can go really fast, but involve Windows and we get a pitiful
> ~100KBps, regardless of tuning done.
> Junos is 11.4R2.14.
I have no experience of that platform (or indeed any Juniper switch) but
this sounds awfully like packet drops due to small buffers.
Linux has a whole bunch of pluggable/selectable TCP congestion control
algorithms, and the defaults are, usually, much better behaved in the
face of packet loss than those on Windows, which could explain why you
see different behaviour with different OSes.
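
If you want to confirm what the Linux end is actually using, something like the following reports the congestion control algorithm a new TCP socket gets. This is only a sketch: it assumes Python 3.6+ on Linux (where `socket.TCP_CONGESTION` is exposed), and the helper name `current_congestion_control` is made up for illustration.

```python
import socket


def current_congestion_control():
    """Return the congestion control algorithm name for a new TCP socket.

    Returns None where the option is unavailable (e.g. non-Linux hosts).
    """
    opt = getattr(socket, "TCP_CONGESTION", None)
    if opt is None:
        return None
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            raw = sock.getsockopt(socket.IPPROTO_TCP, opt, 16)
        except OSError:
            return None
    # The kernel returns a NUL-padded name such as b"cubic\x00\x00...".
    return raw.split(b"\x00", 1)[0].decode("ascii") or None


if __name__ == "__main__":
    print(current_congestion_control())
```

Running it on both the fast and the slow Linux hosts at least rules out a difference in algorithm between them.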
>
> It's taken us *forever* to home in on the issue possibly being the EXs,
> because who'd have thought a switch couldn't handle packets at a few 10s of
> megabytes per second (10-20k PPS x 3).
That is (presumably) the bulk/aggregate throughput. The instantaneous
throughput might be (a lot) higher, depending on the TCP window size,
the inter-packet spacing, whether TCP segmentation offload is in use,
and so forth.
If the switch has small buffers (which cheap switches often do) then an
instantaneous burst to line rate, combined with traffic to/from other
ports, can cause drops. These drops can KILL TCP performance without
adequate TCP stack tuning, and a decent congestion control algorithm.
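
To put rough numbers on the burst argument, here is a back-of-envelope sketch. It is a simple fluid model, not a description of how the EX3300 actually allocates buffers: a burst arrives back-to-back at the ingress rate while the egress port drains at whatever rate is left over after other traffic. The function names and every figure in the example are assumptions.

```python
def queue_backlog_bytes(burst_bytes, ingress_bps, egress_bps):
    """Bytes left queued at the moment a back-to-back burst has fully arrived.

    Fluid model: the burst arrives at ingress_bps while the egress port
    drains at egress_bps. All inputs are assumed values.
    """
    if ingress_bps <= egress_bps:
        return 0  # egress keeps up, nothing accumulates
    return burst_bytes * (ingress_bps - egress_bps) // ingress_bps


def drops(burst_bytes, ingress_bps, egress_bps, buffer_bytes):
    """True if the backlog exceeds the buffer available to that port."""
    backlog = queue_backlog_bytes(burst_bytes, ingress_bps, egress_bps)
    return backlog > buffer_bytes


if __name__ == "__main__":
    # Hypothetical numbers: a 256 KiB window sent at 1 Gbit/s towards a
    # port with only 100 Mbit/s of spare drain rate and 128 KiB of buffer.
    print(queue_backlog_bytes(256 * 1024, 10**9, 10**8))
    print(drops(256 * 1024, 10**9, 10**8, 128 * 1024))
```

The point is only that the backlog scales with the burst size, not with the average rate, so a modest average can still overflow a small buffer.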
>
> To cut a looooooooong story short;
> <internet><srx650><ex3300><linux firewall><same ex3300><server>
> Linux firewall sees the 2 initial TCP packets correctly, but the server
> generally only gets the second one, or if it gets the first it's after the
> second. Then we're into a bazillion duplicate acks, out-of-order packets, and
> TCP retransmissions.
Roughly how many dropped packets are you seeing, as a ratio?
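
The ratio matters because, for Reno-style congestion control, the well-known Mathis approximation puts a ceiling on throughput of roughly (MSS / RTT) * sqrt(3/2) / sqrt(p), where p is the loss ratio. A minimal sketch of that estimate follows; the MSS, RTT and loss figures are placeholders, not measurements from this network, and other algorithms will behave differently.

```python
import math


def mathis_throughput_bps(mss_bytes, rtt_s, loss_ratio):
    """Approximate steady-state TCP throughput ceiling in bits per second.

    Mathis et al. approximation for Reno-style congestion control:
    rate ~= (MSS / RTT) * sqrt(3/2) / sqrt(p)
    """
    if not 0 < loss_ratio <= 1:
        raise ValueError("loss_ratio must be in (0, 1]")
    if rtt_s <= 0:
        raise ValueError("rtt_s must be positive")
    return 8 * mss_bytes / rtt_s * math.sqrt(1.5) / math.sqrt(loss_ratio)


if __name__ == "__main__":
    # Placeholder values: 1460-byte MSS, 100 ms RTT, 1% loss.
    bps = mathis_throughput_bps(1460, 0.100, 0.01)
    print(f"{bps / 8 / 1000:.0f} kB/s")
```

With a transatlantic RTT, even a loss ratio of a percent or so is enough to land in the low hundreds of kB/s with this model.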
Out-of-order packets are a bit odd; are you doing something peculiar like
per-packet load balancing?
>
> I found the 'show system statistics tcp' command a short while ago and it's,
> well, "interesting".
>
>
>> show system statistics tcp
> fpc0:
> --------------------------------------------------------------------------
> Tcp:
> 84769061 packets sent
> 16676437 data packets (2039615568 bytes)
> 1416 data packets retransmitted (1526176 bytes)
Are you sure this command shows what you think it does?
This looks awfully like statistics for the local operating system, i.e.
the TCP stack on the switch itself, used to handle telnet/SSH/other
management. Gathering these kinds of stats for *forwarded* traffic would
require the switch to inspect TCP headers and track connection state,
which is unlikely.
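
The end hosts do keep that state, so the retransmission rate for the real transfers is easier to measure there. On the Linux boxes the counters live in /proc/net/snmp as a "Tcp:" header line followed by a "Tcp:" value line. A minimal sketch of pulling the retransmit ratio out of that file follows; the function name is invented here, and the counters are host-wide totals since boot, not per-connection.

```python
def tcp_retransmit_ratio(snmp_text):
    """Return RetransSegs / OutSegs from the text of /proc/net/snmp.

    The file holds a 'Tcp:' line of field names followed by a 'Tcp:'
    line of values; the two are zipped into a dict here.
    """
    rows = [line.split() for line in snmp_text.splitlines()
            if line.startswith("Tcp:")]
    if len(rows) < 2:
        raise ValueError("no Tcp: header/value pair found")
    header, values = rows[0][1:], rows[1][1:]
    stats = dict(zip(header, (int(v) for v in values)))
    out_segs = stats["OutSegs"]
    return stats["RetransSegs"] / out_segs if out_segs else 0.0


if __name__ == "__main__":
    with open("/proc/net/snmp") as fh:
        print(f"{tcp_retransmit_ratio(fh.read()):.4%}")
```

Sampling it before and after a test transfer, and taking the difference of the two counters, gives a figure for that transfer alone.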