[c-nsp] low throughput of 10GbE line
Nick Hilliard
nick at inex.ie
Mon Mar 15 17:17:24 EDT 2010
On 15/03/2010 16:26, Jirí Procházka wrote:
> When traffic on this link reaches approximately 6Gbps, latency to servers
> gets rapidly worse (about 100-150ms, about 2ms before)
The 6708 card has 200MB of buffers per port. Doing the sums, a full buffer
draining at 10Gb/s works out at about 160ms of latency, assuming you're
seeing a 10Gb microburst. So at a superficial level, it looks like you're
seeing added latency and packet drops because of full buffers.
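
The sum above can be checked with a minimal sketch. It assumes "200 megs" means 200 * 10^6 bytes and that the buffer drains at the full 10Gb/s line rate; the function name is made up for illustration.

```python
def buffer_drain_ms(buffer_bytes: float, line_rate_bps: float) -> float:
    """Milliseconds needed to drain a completely full buffer at line rate."""
    return buffer_bytes * 8 / line_rate_bps * 1000


BUFFER_BYTES = 200e6   # assumption: "200 megs" read as 200 * 10^6 bytes
LINE_RATE_BPS = 10e9   # 10GbE line rate

latency_ms = buffer_drain_ms(BUFFER_BYTES, LINE_RATE_BPS)
print(f"worst-case queueing delay: {latency_ms:.0f} ms")
```

This only gives the worst case for a full buffer; a partly filled buffer adds proportionally less delay.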
Also, you're running the card in oversubscription mode. How much traffic
is te4/7 pushing? I'd hazard a guess that you're running into
over-subscription problems on the blade.
I can't find the more detailed guide to the 6708 architecture on the cisco
web site, but there's a brief overview here:
> http://www.cisco.com/en/US/prod/collateral/switches/ps5718/ps708/prod_white_paper0900aecd80673385.html#wp9000681
While it's not going to give you exact details on tiny microbursts, I'd
consider installing RTG or YRTG with a 30 second poll interval, and monitor
all the ports on blade 4, along with the following aggregates:
te4/1 + te4/2
te4/3 + te4/4
te4/5 + te4/6
te4/7 + te4/8
te4/1 + te4/2 + te4/3 + te4/4
te4/5 + te4/6 + te4/7 + te4/8
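
As a rough sketch of how those aggregates could be computed from per-port rates once they are being polled: the port groupings are the ones listed above, but the limits are assumptions (10Gb/s per pair of ports, and 20Gb/s per group of four, matching the 20G fabric channel speed in the quoted "show platform hardware capacity fabric" output). The function names, the 80% threshold and the te4/7 figure are hypothetical; the te4/8 figure is the 30 second input rate from the quoted "show int te4/8".

```python
# Aggregates exactly as listed above.
AGGREGATES = [
    ("te4/1", "te4/2"),
    ("te4/3", "te4/4"),
    ("te4/5", "te4/6"),
    ("te4/7", "te4/8"),
    ("te4/1", "te4/2", "te4/3", "te4/4"),
    ("te4/5", "te4/6", "te4/7", "te4/8"),
]

PAIR_LIMIT_BPS = 10e9   # assumption: ceiling for each pair of ports
QUAD_LIMIT_BPS = 20e9   # assumption: one 20G fabric channel per four ports


def aggregate_rates(port_bps):
    """Sum per-port rates (bits/sec) per aggregate; missing ports count as 0."""
    return {
        " + ".join(ports): sum(port_bps.get(p, 0) for p in ports)
        for ports in AGGREGATES
    }


def near_limit(aggregates, threshold=0.8):
    """Return aggregate names running at or above threshold * assumed limit."""
    hot = []
    for name, bps in aggregates.items():
        limit = PAIR_LIMIT_BPS if name.count("+") == 1 else QUAD_LIMIT_BPS
        if bps >= threshold * limit:
            hot.append(name)
    return hot


# te4/8 rate is from the quoted output; te4/7 is a made-up value.
sample = {"te4/7": 4.0e9, "te4/8": 4942942000}
print(near_limit(aggregate_rates(sample)))
```

Note that 30 second averages will hide microbursts, so an aggregate sitting well under its limit doesn't rule out short-lived oversubscription.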
Do you have a second 6708 blade? You may need to consider running these
ports in non-oversubscribed mode.
Nick
> and speed is
> unpredictably slowing. Servers are able to generate much more than
> 10Gbps. I have tried to assign an IP from this VLAN directly to the
> vlan-interface on the 3750 and latency is bad as well.
>
>
> The two problems which I can see on the 7606 are the following:
>
> 1) Input queue drops at the interface. They appear at the same time as
> the high latency. I tried to set lower hold-queue, but no difference.
> Any type of qos or other bandwidth limiting methods are applied.
>
> sitel-edge-new#show int te4/8
> TenGigabitEthernet4/8 is up, line protocol is up (connected)
> Hardware is C7600 10Gb 802.3, address is 001e.f7f7.bd5f (bia 001e.f7f7.bd5f)
> Description: SITEL-TTC-New10GbE
> MTU 1500 bytes, BW 10000000 Kbit, DLY 10 usec,
> reliability 255/255, txload 3/255, rxload 126/255
> Encapsulation ARPA, loopback not set
> Keepalive set (10 sec)
> Full-duplex, 10Gb/s
> Transport mode LAN (10GBASE-R, 10.3125Gb/s)
> input flow-control is off, output flow-control is off
> ARP type: ARPA, ARP Timeout 04:00:00
> Last input 00:00:07, output 00:00:41, output hang never
> Last clearing of "show interface" counters 18:56:25
> Input queue: 0/4096/96151289/0 (size/max/drops/flushes); Total output drops: 0
> Queueing strategy: fifo
> Output queue: 0/4096 (size/max)
> 30 second input rate 4942942000 bits/sec, 410813 packets/sec
> 30 second output rate 143433000 bits/sec, 241176 packets/sec
> 34434311308 packets input, 51018973201607 bytes, 0 no buffer
> Received 21472 broadcasts (17607 multicasts)
> 0 runts, 0 giants, 0 throttles
> 0 input errors, 0 CRC, 0 frame, 96151289 overrun, 0 ignored
> 0 watchdog, 0 multicast, 0 pause input
> 0 input packets with dribble condition detected
> 19623750753 packets output, 3094225873410 bytes, 0 underruns
> 0 output errors, 0 collisions, 0 interface resets
> 0 babbles, 0 late collision, 0 deferred
> 0 lost carrier, 0 no carrier, 0 pause output
> 0 output buffer failures, 0 output buffers swapped out
>
>
> the other side of the line looks ok
>
> TTC-3750-MAIN#show int te2/0/2
> TenGigabitEthernet2/0/2 is up, line protocol is up (connected)
> Hardware is Ten Gigabit Ethernet, address is 001e.7a4f.fb9e (bia 001e.7a4f.fb9e)
> Description: TTC-SITEL-New10GbE
> MTU 1600 bytes, BW 10000000 Kbit, DLY 10 usec,
> reliability 255/255, txload 129/255, rxload 3/255
> Encapsulation ARPA, loopback not set
> Keepalive not set
> Full-duplex, 10Gb/s, link type is auto, media type is 10GBase-SR
> Media-type configured as connector
> input flow-control is off, output flow-control is unsupported
> ARP type: ARPA, ARP Timeout 04:00:00
> Last input 00:00:51, output 00:00:16, output hang never
> Last clearing of "show interface" counters 01:36:19
> Input queue: 0/4096/0/0 (size/max/drops/flushes); Total output drops: 0
> Queueing strategy: fifo
> Output queue: 0/4096 (size/max)
> 30 second input rate 149330000 bits/sec, 249506 packets/sec
> 30 second output rate 5062720000 bits/sec, 420886 packets/sec
> 1400794396 packets input, 104446053346 bytes, 0 no buffer
> Received 0 broadcasts (1250 multicasts)
> 0 runts, 0 giants, 0 throttles
> 0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
> 0 watchdog, 1250 multicast, 0 pause input
> 0 input packets with dribble condition detected
> 2427526833 packets output, 3649915537744 bytes, 0 underruns
> 0 output errors, 0 collisions, 0 interface resets
> 0 babbles, 0 late collision, 0 deferred
> 0 lost carrier, 0 no carrier, 0 PAUSE output
> 0 output buffer failures, 0 output buffers swapped out
>
>
>
>
> 2) It looks like the transceiver concerned is a little bit overheated, but I
> don't trust these sensors much.
>
> sitel-edge-new#show interfaces transceiver
> Transceiver monitoring is disabled for all interfaces.
>
>                                              Optical   Optical
>            Temperature  Voltage  Current   Tx Power  Rx Power
> Port       (Celsius)    (Volts)  (mA)      (dBm)     (dBm)
> ---------  -----------  -------  --------  --------  --------
> Te4/1      39.3         0.00     36.8      -2.8      -1.0
> Te4/2      34.6         0.00     43.1      -2.6      -3.3
> Te4/3      36.8         0.00     29.7      -2.5      -1.0
> Te4/4      35.3         0.00     5.9       -1.9      -1.7
> Te4/5      44.1         0.00     45.4      -2.0      -7.7
> Te4/6      40.0         0.00     36.0      -3.4      -2.2
> Te4/7      40.8         0.00     34.0      -3.3      -0.5 +
> Te4/8      71.9 +       0.00     6.0       -3.2      -3.4
>
>
>
>
> some more debug info:
>
> 7606 ->
>
> sitel-edge-new#show platform hardware capacity fabric
> Switch Fabric Resources
> Bus utilization: current: 35%, peak was 47% at 19:53:03 CET Thu Mar 11 2010
> Fabric utilization:     Ingress                     Egress
>   Module  Chanl  Speed  rate  peak                  rate  peak
>   1       0      20G    14%   21% @18:26 11Mar10    19%   31% @18:20 12Mar10
>   2       0      20G    25%   39% @17:50 11Mar10     2%   10% @02:07 12Mar10
>   2       1      20G    10%   26% @19:41 11Mar10    33%   49% @18:06 11Mar10
>   4       0      20G    25%   63% @13:00 14Mar10     5%   20% @18:15 11Mar10
>   4       1      20G    45%   79% @18:07 11Mar10    50%   73% @09:43 13Mar10
>   5       0      20G     2%    5% @14:22 12Mar10    12%   19% @19:08 11Mar10
> Switching mode:
>   Module  Switching mode
>   1       truncated
>   2       truncated
>   4       compact
>   5       flow through
>
>
>
>
> I'm going to replace the "overheated" transceiver in the 7606 tonight and
> hope it's the solution, but I don't trust it much.
>
>
> Any advice would be really appreciated!
>
>
> Best regards,
>
>
> Jiri Prochazka
--
Network Ability Ltd. | Head of Operations | Tel: +353 1 6169698
3 Westland Square | INEX - Internet Neutral | Fax: +353 1 6041981
Dublin 2, Ireland | Exchange Association | Email: nick at inex.ie