[c-nsp] TCP behavior under strict CAR rate-limiting

Christopher Hunt chunt at reachone.com
Thu Jun 19 19:16:19 EDT 2008


Bill,
    I am starting to see.  I also realize now that the TCP congestion 
window (cwnd) is not the same as the RWIN.  The cwnd is like a 
sender-side RWIN for the slow_start state.  The cwnd value is not in the 
packets themselves (forgive me if you already know this. Posterity may 
not...), so it is best inferred from the pattern of acks (see 
http://marc.info/?l=wireshark-users&m=118832373316262&w=2 ). 
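
For posterity, here is a rough sketch of what "inferring from the pattern 
of acks" could look like on a sender-side capture: track the highest byte 
sent and the highest byte acked, and treat the difference as the data in 
flight.  The function name and the event format are made up for 
illustration, and the sequence numbers are copied by hand from Bill's 
trace quoted below.  The result is only a lower bound on 
min(cwnd, RWIN); it ignores SACK blocks and retransmissions.

```python
def bytes_in_flight(events):
    """Estimate unacknowledged bytes after each event in a sender-side trace.

    events: list of ("data", end_seq) or ("ack", ack_no) tuples in capture
    order. This event format is a hypothetical convenience, not tcpdump output.
    """
    highest_sent = None
    highest_acked = None
    samples = []
    for kind, value in events:
        if kind == "data":
            highest_sent = value if highest_sent is None else max(highest_sent, value)
        elif kind == "ack":
            highest_acked = value if highest_acked is None else max(highest_acked, value)
        # Only record a sample once both sides of the subtraction are known.
        if highest_sent is not None and highest_acked is not None:
            samples.append(max(0, highest_sent - highest_acked))
    return samples


# Sequence/ack numbers transcribed from the 23:05:46.80-46.81 part of the trace.
trace = [
    ("data", 117313), ("ack", 117313),
    ("data", 121657), ("ack", 120209),
    ("data", 124553), ("ack", 120209),
    ("data", 126001), ("ack", 120209),
    ("data", 127449),
]
in_flight = bytes_in_flight(trace)
print(max(in_flight))  # the sender was willing to keep at least this many bytes outstanding
```

The largest sample is the most the sender let go unacknowledged in that 
stretch, so the cwnd was at least that big at that moment.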
    It would appear from the sender's counters and from the SNMP checks 
on the router interface that the interface never hits 10 Mbps even for a 
second, but the rate-limiting counters do show tail drops.  I guess it is 
difficult to get the sub-second granularity necessary to see the process 
in action, and by the time the counters are read again, they've averaged 
out over the second.  I know, for example, that the stats provided by 
"ifconfig" under RedHat only seem to update every 1-2 seconds.  
Similarly, SNMP is polled at most once per second, and while it shows no 
spikes, the interface must be receiving spikes > CAR (even if only for 
microseconds?) in order to drop packets, right?  I wonder how the 
rate-limiting counters really work on the Cisco.  It obviously doesn't 
do the math each second; instead, I guess it does the math each time 
a packet arrives, and the 1-second interface counters obviously don't 
show bursts that last a few microseconds.
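
To make the "math each time a packet arrives" guess concrete, here is a 
minimal single token-bucket policer.  It is a simplification and not 
Cisco's actual CAR code (CAR also has an extended burst, which is left 
out here).  The function name, the 16000-byte bucket, and the 20-packet 
burst are all invented for illustration.

```python
def police(packets, rate_bps, burst_bytes):
    """Single-rate token-bucket policer (simplified; no extended burst).

    packets: list of (arrival_time_s, size_bytes) in arrival order.
    Returns (bytes_sent, bytes_dropped).
    """
    tokens = float(burst_bytes)            # bucket starts full
    last = packets[0][0] if packets else 0.0
    sent = dropped = 0
    for t, size in packets:
        # Refill on arrival: credit for the time since the previous packet,
        # capped at the bucket depth.
        tokens = min(float(burst_bytes), tokens + (t - last) * rate_bps / 8.0)
        last = t
        if size <= tokens:
            tokens -= size                 # conform: transmit
            sent += size
        else:
            dropped += size                # exceed: drop
    return sent, dropped


# Hypothetical burst: 20 x 1500-byte packets back to back at 100 Mbps
# line rate (120 microseconds apart), then silence for the rest of the second.
burst = [(i * 0.00012, 1500) for i in range(20)]
sent, dropped = police(burst, rate_bps=10_000_000, burst_bytes=16000)
offered_bps_over_1s = sum(size for _, size in burst) * 8   # averaged over one second
print(sent, dropped, offered_bps_over_1s)
```

Averaged over the full second, this burst offers far less than 10 Mbps, 
yet the policer still drops part of it because the bucket empties within 
a couple of milliseconds.  A once-per-second counter would never show 
the spike.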
    I wish I could see the cwnd in action, but I guess I'll have to 
content myself with what we can see.  Bill et al., thanks very much for 
checking this out.  I hope to be useful to others some day ;-)

Christopher Hunt


bill fumerola wrote:
> On Thu, Jun 19, 2008 at 03:07:27PM -0700, Christopher Hunt wrote:
>   
>>    I am familiar with TCP's concept of Slow Start, but my understanding 
>> is that it is the RWIN that is slow to start.  The packet does show the 
>> first packet as 24 Byte payload, but even then the client RWIN is 5888 
>> (scaled x7) (CentOS running 2.6.18 kernel).   The "server" is XP Pro 
>> running an RWIN 65535 with scaling disabled.  As far as I can tell, TCP 
>> slow start is not happening.  What other signs of Slow Start should I 
>> be looking for?
>>     
>
> every second (+/- .1s) the drops occur and SACK kicks in:
>
> 23:05:46.809550 IP 192.168.10.2.33538 > 10.180.55.211.commplex-link: . 115865:117313(1448) ack 1 win 46 <nop,nop,timestamp 754051101 746671>
> 23:05:46.810977 IP 10.180.55.211.commplex-link > 192.168.10.2.33538: . ack 117313 win 65535 <nop,nop,timestamp 746671 754051101>
> 23:05:46.810997 IP 192.168.10.2.33538 > 10.180.55.211.commplex-link: . 117313:121657(4344) ack 1 win 46 <nop,nop,timestamp 754051103 746671>
>
> ack 120209 begins, backlog to 126001 occurs:
>
> 23:05:46.812489 IP 10.180.55.211.commplex-link > 192.168.10.2.33538: . ack 120209 win 65535 <nop,nop,timestamp 746671 754051103>
> 23:05:46.812508 IP 192.168.10.2.33538 > 10.180.55.211.commplex-link: . 121657:124553(2896) ack 1 win 46 <nop,nop,timestamp 754051104 746671>
>
> SACK kicks in:
>
> 23:05:46.813864 IP 10.180.55.211.commplex-link > 192.168.10.2.33538: . ack 120209 win 65535 <nop,nop,timestamp 746671 754051103,nop,nop,sack 1 {121657:123105}>
> 23:05:46.813883 IP 192.168.10.2.33538 > 10.180.55.211.commplex-link: . 124553:126001(1448) ack 1 win 46 <nop,nop,timestamp 754051105 746671>
> 23:05:46.814051 IP 10.180.55.211.commplex-link > 192.168.10.2.33538: . ack 120209 win 65535 <nop,nop,timestamp 746671 754051103,nop,nop,sack 1 {121657:124553}>
> 23:05:46.814070 IP 192.168.10.2.33538 > 10.180.55.211.commplex-link: . 126001:127449(1448) ack 1 win 46 <nop,nop,timestamp 754051106 746671>
> 23:05:46.815196 IP 10.180.55.211.commplex-link > 192.168.10.2.33538: . ack 120209 win 65535 <nop,nop,timestamp 746671 754051103,nop,nop,sack 1 {121657:126001}>
>
> ack of last sack packet:
>
> 23:05:46.815214 IP 192.168.10.2.33538 > 10.180.55.211.commplex-link: . 120209:121657(1448) ack 1 win 46 <nop,nop,timestamp 754051107 746671>
>
> .4 second delay, retransmit of ack of last sack packet:
>
> 23:05:47.237107 IP 192.168.10.2.33538 > 10.180.55.211.commplex-link: . 120209:121657(1448) ack 1 win 46 <nop,nop,timestamp 754051529 746671>
>
> acks 120209-126001 finally occur, 10.180.55.211 moves on, .45 seconds later
>
> 23:05:47.238740 IP 10.180.55.211.commplex-link > 192.168.10.2.33538: . ack 126001 win 65535 <nop,nop,timestamp 746675 754051529>
> 23:05:47.238771 IP 192.168.10.2.33538 > 10.180.55.211.commplex-link: . 126001:127449(1448) ack 1 win 46 <nop,nop,timestamp 754051530 746675>
>
> this pattern repeats frequently. sometimes with a retransmit, sometimes
> without, always taking .3-.5 seconds or so. hence, the crap performance.
>
>
> -- bill
>   

