[c-nsp] strange RTT increase in ASR1006
Tassos Chatzithomaoglou
achatz at forthnetgroup.gr
Sat Feb 2 06:33:41 EST 2013
I have the following setup on an ASR1006 (RP2/ESP40/SIP40) and I'm trying to find out whether
the following behavior is expected.
+-----+
Te1/0/0 (4G/1G) ---| |---Te1/1/0 (2G/8G)
| |
Te1/2/0 (4G/1G) ---| |
+-----+
When the output of Te1/1/0 goes above 8G, RTT for packets flowing from Te1/0/0 to Te1/2/0
increases by 50-100ms.
The same happens in the following scenario on another ASR1006 (RP2/ESP20/SIP10): when the
output of Te1/1/0 goes above 6G, RTT for packets flowing from Te1/0/0 to the router's loopback
increases by 50-100ms (the RP is at ~30% the whole time).
+-----+
Te1/0/0 (6G/1G) ---| |---Te1/1/0 (1G/6G)
| |
| |
+-----+
Most of the time, the RTT increase is accompanied by packet loss.
This reminds me of HOL blocking, but I had the impression that was applicable mostly to
switches with small buffers.
At the same time, the ESP is sending thousands of flow control signals to the SIP, signaling
that it can't cope with this traffic rate.
ASR1006#show platform hardware slot 1 serdes statistics
From Slot F0-Link A
Pkts High: 1687702827 Low: 391241384970 Bad: 0 Dropped: 0
Bytes High: 326483900940 Low: 291059939598187 Bad: 0 Dropped: 0
Pkts Looped: 0 Error: 0
Bytes Looped 0
Qstat count: 0 Flow ctrl count: 40306518521 <===
From Slot F1-Link A
Pkts High: 0 Low: 0 Bad: 0 Dropped: 0
Bytes High: 0 Low: 0 Bad: 0 Dropped: 0
Pkts Looped: 0 Error: 0
Bytes Looped 0
Qstat count: 0 Flow ctrl count: 80093
-after 1 sec-
ASR1006#show platform hardware slot 1 serdes statistics
From Slot F0-Link A
Pkts High: 1687721691 Low: 391244370772 Bad: 0 Dropped: 0
Bytes High: 326487553571 Low: 291062458884384 Bad: 0 Dropped: 0
Pkts Looped: 0 Error: 0
Bytes Looped 0
Qstat count: 0 Flow ctrl count: 40307432319 <===
From Slot F1-Link A
Pkts High: 0 Low: 0 Bad: 0 Dropped: 0
Bytes High: 0 Low: 0 Bad: 0 Dropped: 0
Pkts Looped: 0 Error: 0
Bytes Looped 0
Qstat count: 0 Flow ctrl count: 80094
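For scale, the flow-control rate implied by the two F0-Link A snapshots above works out to
roughly 900k signals per second; a minimal sketch of the arithmetic (counter values copied
from the output, and the ~1-second gap between snapshots is assumed to be exact):

```python
# Flow ctrl counters for slot F0-Link A, taken from the two
# "show platform hardware slot 1 serdes statistics" snapshots above.
before = 40306518521
after = 40307432319

interval_s = 1.0  # assumed: snapshots taken exactly 1 second apart

rate = (after - before) / interval_s
print(f"~{rate:,.0f} flow control signals/sec")  # ~913,798 signals/sec
```

That's close to one flow-control event per forwarded packet at these traffic rates, which is
why it looks like sustained back-pressure rather than an occasional blip.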
ASR1006#show platform hardware slot 1 plim status internal
FCM Status
XON/XOFF 0x0000000000000003
ECC Status
Data Path Config
MaxBurst1 256, MaxBurst2 128, DataMaxT 32768
Cal Length RX 0x0002, TX 0x0002
Repetitions RX 0x0010, TX 0x0010
Data Path Status
RX in sync, TX in sync
Spi4 Channel 0, Rx Channel Status Full, Tx Channel Status Hungry <===
Spi4 Channel 1, Rx Channel Status Starving, Tx Channel Status Starving
RX Pkts 391387121421 Bytes 285048619167994
TX Pkts 393073127218 Bytes 291507927293959
Hypertransport Status
RX Pkts 0 Bytes 0
TX Pkts 0 Bytes 0
TAC is talking about microbursts (how unusual), and although I can't measure 10G traffic
per ms, the QFP's 5-sec data doesn't agree with them.
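To be fair to both sides, a 5-second average genuinely can't rule microbursts in or out.
A hypothetical back-of-the-envelope sketch (all numbers assumed, not measured): a line-rate
10G burst lasting 50 ms on top of the ~7.5 Gbps background shown in the QFP output barely
moves the 5-second average:

```python
# Hypothetical scenario: steady 7.5 Gbps background (roughly matching
# the QFP totals above) plus a 10 Gbps microburst lasting 50 ms,
# averaged over a 5-second measurement window.
window_s = 5.0
background_bps = 7.5e9
burst_bps = 10e9
burst_s = 0.050

bits = background_bps * (window_s - burst_s) + burst_bps * burst_s
avg_bps = bits / window_s
print(f"5-sec average: {avg_bps / 1e9:.3f} Gbps")  # ~7.525 Gbps
```

So a burst long enough to overrun the ESP's buffers could still be invisible at 5-second
granularity; it just doesn't prove TAC's theory either.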
ASR1006#show platform hardware qfp active data utilization
CPP 0: Subdev 0 5 secs 1 min 5 min 60 min
Input: Priority (pps) 884 840 838 835
(bps) 783344 750704 746360 725664
Non-Priority (pps) 1291098 1282015 1261073 1265900
(bps) 7544814904 7465944936 7322679592 7345058240
Total (pps) 1291982 1282855 1261911 1266735
(bps) 7545598248 7466695640 7323425952 7345783904
Output: Priority (pps) 9065 9191 9184 8897
(bps) 11485520 11659776 11730880 11357888
Non-Priority (pps) 1281141 1271903 1251199 1256512
(bps) 7560289312 7481701984 7338684976 7360568224
Total (pps) 1290206 1281094 1260383 1265409
(bps) 7571774832 7493361760 7350415856 7371926112
Processing: Load (pct) 59 59 59 59
At the same time, different IOS releases give different results (15.2(4)S2 is far worse than
15.1(3)S2), and I'm starting to believe the ASR1006 is another hype scheduled to go down...
--
Tassos