[c-nsp] Performance issue on link

Beck, Andre cisco-nsp at ibh.net
Wed Apr 10 15:35:34 EDT 2013


Hi,

On Thu, Apr 04, 2013 at 09:52:30AM +1100, CiscoNSP List wrote:
> If I run 1000 pings (1500 byte, df-bit set) in both directions, I am not seeing any packet loss?

IOS ping sends a new packet only after the previous one was answered, or
after a timeout (2s by default). You're not going to push an otherwise
empty link into packet loss that way. On Linux, a flood ping gets you
closer to the point where drops start, but even that is a challenge at
40Mbps unless you use supersized pings, which drag fragmentation into the
picture, and that is better kept out of it.
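
For what it's worth, on a Linux box something along these lines gets you
much closer to line rate than the IOS ping (iputils ping, run as root;
<remote> is just a placeholder for the far-end address, and -s 1472 gives
1500-byte packets on the wire once the 8-byte ICMP and 20-byte IP headers
are added):

# ping -f -M do -s 1472 -c 10000 <remote>

-f floods (sends as fast as replies come back, with a floor of 100
packets/s), -M do sets the DF bit, and -c caps the run at 10000 packets.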
 
> #tcp window size in the "bad" direction:
> 
> starts at win=6912 and grows to 165504 (This is where I start to see a heap of TCP Dup ACK and TCP segment of a reassembled PDU), then increases to 353408 (Again, more TCP Dup ACK's), and looks to max out at 669056.
> 
> #tcp window size in the "good" direction:
> 
> starts at win=6912 and looks to max out at 1995392 and has very few TCP Dup ACK and TCP segment of a reassembled PDU

I think what you're seeing is two different Linux kernel versions, one of
them from the ancient times before the TCP window scaling patches allowed
for 2MB+ window sizes. AFAIK 2.6.18 (of RHEL5 fame) was around the time
those patches hit mainline (dunno if just before or after, though), while
RHEL4 (2.6.9) predates them. There have also been a lot of changes between
2.6.18 and more contemporary kernels; around the late 2.6.20s and early
2.6.30s, for instance, a bunch of alternative TCP congestion control
mechanisms were committed. All of that can easily explain your issues.
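
If you want to confirm that theory, a quick look on both boxes usually
tells the story (just uname and sysctl, nothing exotic assumed):

# uname -r
# sysctl net.ipv4.tcp_window_scaling net.ipv4.tcp_congestion_control

If tcp_window_scaling is 0 on one end, or the two ends run very different
congestion control algorithms, that alone explains a lot.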

Just remember the first answer you got: BDP. If you really have 65ms
latency (in other words a 130ms RTT) on a 40Mbps link, that is a very
high and atypical BDP for that bandwidth. The native serialization delay
of a 1500 byte frame on a 40Mbps interface is 0.3ms; you're two orders of
magnitude above that. 65ms at 40Mbps means a BDP of 325kB - the
"capacitance" of the "line" (or rather of the buffer-bloated construct),
i.e. the amount of data you can put into it before the first byte falls
out at the other end. Large RWINs are absolutely mandatory to deal with
that, and 669056 might well be too small (to repeat the earlier question:
have you factored the negotiated window scaling into these numbers?).
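
To make the arithmetic explicit (plain awk, using the numbers from above):

# awk 'BEGIN { print 1500*8/40e6*1000, "ms serialization per 1500 byte frame" }'
0.3 ms serialization per 1500 byte frame
# awk 'BEGIN { print 40e6/8*0.065, "bytes in flight at 65 ms one-way delay" }'
325000 bytes in flight at 65 ms one-way delay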

There are a lot of controls:

# grep '' /proc/sys/net/ipv4/tcp_*
[...]
/proc/sys/net/ipv4/tcp_adv_win_scale:1
/proc/sys/net/ipv4/tcp_allowed_congestion_control:cubic reno
/proc/sys/net/ipv4/tcp_app_win:31
/proc/sys/net/ipv4/tcp_available_congestion_control:cubic reno
/proc/sys/net/ipv4/tcp_base_mss:512
/proc/sys/net/ipv4/tcp_challenge_ack_limit:100
/proc/sys/net/ipv4/tcp_congestion_control:cubic
[...]

My 3.8.6 here shows a whopping 54 tcp_ control files; the above is just
a teaser, but it starts with one of the central settings (advanced window
scaling behavior) and also shows the available, allowed and currently
active congestion control algorithms.
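
Purely as an illustration (example values only, not a recommendation for
your boxes): if the negotiated window tops out below what the BDP asks
for, the rmem/wmem limits are the usual place to look, along the lines of

# sysctl -w net.core.rmem_max=4194304
# sysctl -w net.core.wmem_max=4194304
# sysctl -w net.ipv4.tcp_rmem='4096 87380 4194304'
# sysctl -w net.ipv4.tcp_wmem='4096 65536 4194304'

The last value in tcp_rmem/tcp_wmem is the per-socket maximum the buffer
autotuning may grow to; 4MB is comfortably above what this line asks for.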

Then again, you may have tuned your Linux servers to cope well with the
line, but sadly that doesn't help much with clients (and even servers)
you don't have control over. They will not like that line, and in the
case of old Windows versions there's really not much you can do about it,
apart from hoping that, statistically, the line will fill up anyway...

HTH,
Andre.
-- 
                    Cool .signatures are so 90s...

-> Andre Beck    +++ ABP-RIPE +++      IBH IT-Service GmbH, Dresden <-

