[c-nsp] performance problems / overruns on a 6500/sup720/dfc's
bas
kilobit at gmail.com
Thu Jul 23 17:53:58 EDT 2009
Hello All,
I hope you guys can help me with the following issue.
It started a couple of weeks ago when one customer reported degraded
performance.
The customer has ~30 servers on a WS-C3750E-48TD, which in turn has a
single 10GE link to the 6500 in question.
The 10GE link on the 6500 has a service policy configured to limit IP
traffic to 8 Gbps (via an aggregate policer).
Before the problems started, the customer was able to push 8 Gbps on the
link for 16 hours a day; the rest of the time the customer has fewer
visitors to their service.
The issue arises every day at the time the router starts to forward 7.5
- 8 Mpps (approx 50 Gbps).
When that moment comes, the interface facing the customer drops down to
5 - 6 Gbps.
In the interface counters we can see the number of overruns increasing very fast.
This continues until about 23:00, when the total traffic forwarded
drops below 8 Mpps.
mod1: WS-X6708-10GE
mod2: WS-X6748-SFP
mod3: WS-X6704-10GE
mod4: WS-X6748-GE-TX
mod5: WS-X6748-GE-TX
mod6: WS-SUP720-3BXL
Initially running 12.2(18)SXF15a
Currently running 12.2(33)SXI1
The customer was connected to Te1/7 and is currently on Te3/2.
Things we have investigated or changed (none of which resolved the issue):
- We saw through "sh plat hard cap fab" that some of the fabric
channels were (nearly) congested.
We swapped around a couple of TenG interfaces between channels and
slots 1 and 3.
- We suspected a possible relation to Cisco bugs CSCeh08451 or
CSCsl70634. Even though both are resolved in SXF12, we upgraded to SXI1.
- Possibly hitting some bottleneck in the PFC/fabric, so we upgraded
modules 2 and 3 (the most heavily utilized modules) with DFC-3BXL.
- Tried different hold-queue values, in and out
- Several fabric buffer-reserve settings
- Disabling all netflow
- Removing the policy-map(s)
- Enabling/disabling send/receive flowcontrol on several ports and
also on the customer 3750.
More customers are noticing degraded performance: lower speeds and 5 -
20% packet loss.
The router has enough memory available, and SP and RP CPU utilization is
always below 30%.
Below is the sh int output for the first customer that reported issues.
TenGigabitEthernet3/2 is up, line protocol is up (connected)
Hardware is C6k 10000Mb 802.3, address is 000f.35bb.0b40 (bia 000f.35bb.0b40)
Description: XXX001 - MO08
Internet address is xx.xx.240.126/26
MTU 1500 bytes, BW 10000000 Kbit, DLY 10 usec,
reliability 255/255, txload 6/255, rxload 202/255
Encapsulation ARPA, loopback not set
Keepalive set (10 sec)
Full-duplex, 10Gb/s, media type is 10Gbase-LR
input flow-control is off, output flow-control is off
ARP type: ARPA, ARP Timeout 00:30:00
Last input 00:00:00, output 00:00:00, output hang never
Last clearing of "show interface" counters 00:56:37
Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
Queueing strategy: fifo
Output queue: 0/40 (size/max)
30 second input rate 7935497000 bits/sec, 665152 packets/sec
30 second output rate 239985000 bits/sec, 438880 packets/sec
L2 Switched: ucast: 32 pkt, 2048 bytes - mcast: 1052 pkt, 318283 bytes
L3 in Switched: ucast: 2016175646 pkt, 2998867098833 bytes - mcast: 0 pkt, 0 bytes mcast
L3 out Switched: ucast: 1483531972 pkt, 115723597149 bytes mcast: 0 pkt, 0 bytes
2228491744 packets input, 3314752535506 bytes, 0 no buffer
Received 3005 broadcasts (0 IP multicasts)
0 runts, 0 giants, 0 throttles
0 input errors, 0 CRC, 0 frame, 206532318 overrun, 0 ignored
0 watchdog, 0 multicast, 0 pause input
0 input packets with dribble condition detected
1482844739 packets output, 115625721402 bytes, 0 underruns
0 output errors, 0 collisions, 0 interface resets
0 babbles, 0 late collision, 0 deferred
0 lost carrier, 0 no carrier, 0 PAUSE output
0 output buffer failures, 0 output buffers swapped out
As you can see, no problems are reported other than overruns (approx 10%).
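As a rough cross-check of the figures in the output above, the overrun share and the average inbound packet size can be computed directly from the counters. This is only a minimal sketch: the variable names are made up for illustration, and whether overruns are already included in "packets input" on this platform is an assumption, so both ratios are shown.

```python
# Counters copied from the sh int output for TenGigabitEthernet3/2 above.
packets_input = 2_228_491_744
overruns = 206_532_318
input_rate_bps = 7_935_497_000   # 30 second input rate, bits/sec
input_rate_pps = 665_152         # 30 second input rate, packets/sec

# Assumption A: overruns are already included in "packets input".
overrun_pct_a = 100 * overruns / packets_input

# Assumption B: overruns were dropped before being counted as input.
overrun_pct_b = 100 * overruns / (packets_input + overruns)

# Average inbound packet size implied by the 30 second rates.
avg_packet_bytes = input_rate_bps / 8 / input_rate_pps

print(f"overruns vs packets input:       {overrun_pct_a:.1f}%")
print(f"overruns vs input plus overruns: {overrun_pct_b:.1f}%")
print(f"average inbound packet size:     {avg_packet_bytes:.0f} bytes")
```

Under either assumption the overrun share lands between 8% and 10%, and the inbound traffic averages close to 1490 bytes per packet, i.e. mostly full-size frames.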
Output of "sh plat hard cap forwarding":
Forwarding engine load:
  Module       pps   peak-pps   peak-time
  1        2852591    4416215   18:21:12 CEST Thu Jul 23 2009
  2        1422180    1645505   22:42:03 CEST Thu Jul 23 2009
  3         903195    1018577   11:28:05 CEST Wed Jul 22 2009
  6        1756281    8244268   01:36:29 CEST Sat Jul 18 2009
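To put the per-module numbers in context, the rows above can be summed to estimate the total packet rate at the moment the command was run. This is a minimal sketch: `parse_engine_load` is a hypothetical helper name (not anything from IOS), it assumes the row layout shown above, and the total assumes each packet is counted by exactly one forwarding engine.

```python
# Rows copied from the "Forwarding engine load" table above.
ENGINE_LOAD = """\
1 2852591 4416215 18:21:12 CEST Thu Jul 23 2009
2 1422180 1645505 22:42:03 CEST Thu Jul 23 2009
3 903195 1018577 11:28:05 CEST Wed Jul 22 2009
6 1756281 8244268 01:36:29 CEST Sat Jul 18 2009
"""


def parse_engine_load(text):
    """Return {module: (pps, peak_pps)} from whitespace-separated rows."""
    load = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3 or not fields[0].isdigit():
            continue  # skip headers and blank lines
        load[int(fields[0])] = (int(fields[1]), int(fields[2]))
    return load


load = parse_engine_load(ENGINE_LOAD)
total_pps = sum(pps for pps, _ in load.values())

print(f"total: {total_pps / 1e6:.2f} Mpps")
for module, (pps, peak) in sorted(load.items()):
    share = 100 * pps / total_pps
    print(f"module {module}: {pps / 1e6:.2f} Mpps now, "
          f"{peak / 1e6:.2f} Mpps peak ({share:.0f}% of total)")
```

With these rows the total comes to about 6.93 Mpps, somewhat below the 7.5 - 8 Mpps range at which the problem is reported to start.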
We're pretty much stuck.
Thanks for reading if you've gotten this far.
Any help would be much appreciated.
Kind regards,
Bas
P.S. The box peaks at approx 35 Mbps of IPv6 traffic; that shouldn't
affect IPv4 forwarding performance, right?