[c-nsp] All multicast punting to CPU on 6500
Christian Meutes
christian at errxtx.net
Mon Dec 17 00:26:23 EST 2012
Checkpoint clusters are based on the same poor HA forwarding design, but try telling that to the firewall guys...
Back to the issue:
Broadcasts and (non-programmed) multicasts are always handled by interrupts and thus consume CPU resources. CoPP can't handle that (at least not on PFC3), so you rely entirely on the h/w-based rate-limiters available on your PFC platform.
For example, if there wasn't an HSRP rate-limiter shipped with the SX code, you could quite easily kill the box with a few Megs of HSRP. The same is true for all other B/Mcasts.
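
To put "a few Megs" into perspective, here is a rough back-of-the-envelope sketch in Python. It assumes minimum-size 64-byte Ethernet frames plus 20 bytes of preamble and inter-frame gap per frame; the real HSRP hello size and the point at which the CPU gives up are not stated in this thread, so treat the numbers only as an order of magnitude for how many packets per second would need rate-limiting.

```python
# Rough packets-per-second estimate for a given bit rate of small frames.
# Assumption: minimum-size Ethernet frames; sizes are standard Ethernet values.
ETH_MIN_FRAME = 64   # bytes on the wire, including FCS
ETH_OVERHEAD = 20    # bytes: 7 preamble + 1 SFD + 12 inter-frame gap


def packets_per_second(bits_per_second, frame_bytes=ETH_MIN_FRAME):
    """Return whole frames per second for frames of frame_bytes (FCS included)."""
    # Frames shorter than the Ethernet minimum are padded up to 64 bytes.
    wire_bits = (max(frame_bytes, ETH_MIN_FRAME) + ETH_OVERHEAD) * 8
    return bits_per_second // wire_bits


if __name__ == "__main__":
    for mbps in (1, 5, 10):
        pps = packets_per_second(mbps * 1_000_000)
        print(f"{mbps} Mbit/s -> {pps} pps")
```

Under these assumptions every megabit per second of small frames is roughly 1,500 packets per second that each need individual handling if nothing limits them in hardware.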
--
Christian
On 16.12.2012, at 23:22, Tony Varriale <tvarriale at comcast.net> wrote:
> On 12/16/2012 5:59 AM, Robert Williams wrote:
>> Hi, I'll try to go into some additional detail on the traffic and other router config elements now.
>>
>> The traffic is basically made up of a randomly generated packet which is almost identical to the below.
>>
>> The 'random' element is that the source port is different every time.
>>
>> This packet was 10.0.5.200 (00:50:56:a6:00:23) -> 10.0.5.88 (01:00:5e:7f:05:77)
>>
>> The test interface on the 6500 is currently on 10.0.5.123.
>>
>> The below packet was captured on the control-plane going towards the Route-Processor CPU.
>>
>> ----------------------------------------------------------------------------------
>> ----------------------------------------------------------------------------------
>> No. Time Source Destination Protocol Length Info
>> 23985 2023.684297 10.0.5.200 10.0.5.88 TCP 60 config-port > 0 [<None>] Seq=1 Win=512 Len=0
>>
>> Frame 23985: 60 bytes on wire (480 bits), 60 bytes captured (480 bits)
>> Arrival Time: Dec 16, 2012 11:36:32.951556000 UTC
>> Epoch Time: 1355657792.951556000 seconds
>> [Time delta from previous captured frame: 0.000300000 seconds]
>> [Time delta from previous displayed frame: 0.000300000 seconds]
>> [Time since reference or first frame: 2023.684297000 seconds]
>> Frame Number: 23985
>> Frame Length: 60 bytes (480 bits)
>> Capture Length: 60 bytes (480 bits)
>> [Frame is marked: True]
>> [Frame is ignored: False]
>> [Protocols in frame: eth:ip:tcp]
>> [Coloring Rule Name: TCP]
>> [Coloring Rule String: tcp]
>> Ethernet II, Src: Vmware_a6:00:23 (00:50:56:a6:00:23), Dst: IPv4mcast_7f:05:77 (01:00:5e:7f:05:77)
>> Destination: IPv4mcast_7f:05:77 (01:00:5e:7f:05:77)
>> Address: IPv4mcast_7f:05:77 (01:00:5e:7f:05:77)
>> .... ...1 .... .... .... .... = IG bit: Group address (multicast/broadcast)
>> .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
>> Source: Vmware_a6:00:23 (00:50:56:a6:00:23)
>> Address: Vmware_a6:00:23 (00:50:56:a6:00:23)
>> .... ...0 .... .... .... .... = IG bit: Individual address (unicast)
>> .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
>> Type: IP (0x0800)
>> Trailer: 000000000000
>> Internet Protocol Version 4, Src: 10.0.5.200 (10.0.5.200), Dst: 10.0.5.88 (10.0.5.88)
>> Version: 4
>> Header length: 20 bytes
>> Differentiated Services Field: 0x00 (DSCP 0x00: Default; ECN: 0x00: Not-ECT (Not ECN-Capable Transport))
>> 0000 00.. = Differentiated Services Codepoint: Default (0x00)
>> .... ..00 = Explicit Congestion Notification: Not-ECT (Not ECN-Capable Transport) (0x00)
>> Total Length: 40
>> Identification: 0x7b6e (31598)
>> Flags: 0x00
>> 0... .... = Reserved bit: Not set
>> .0.. .... = Don't fragment: Not set
>> ..0. .... = More fragments: Not set
>> Fragment offset: 0
>> Time to live: 64
>> Protocol: TCP (6)
>> Header checksum: 0xe042 [correct]
>> [Good: True]
>> [Bad: False]
>> Source: 10.0.5.200 (10.0.5.200)
>> Destination: 10.0.5.88 (10.0.5.88)
>> Transmission Control Protocol, Src Port: config-port (3577), Dst Port: 0 (0), Seq: 1, Len: 0
>> Source port: config-port (3577)
>> Destination port: 0 (0)
>> [Stream index: 3651]
>> Sequence number: 1 (relative sequence number)
>> Acknowledgement number: Broken TCP. The acknowledge field is nonzero while the ACK flag is not set
>> Header length: 20 bytes
>> Flags: 0x000 (<None>)
>> 000. .... .... = Reserved: Not set
>> ...0 .... .... = Nonce: Not set
>> .... 0... .... = Congestion Window Reduced (CWR): Not set
>> .... .0.. .... = ECN-Echo: Not set
>> .... ..0. .... = Urgent: Not set
>> .... ...0 .... = Acknowledgement: Not set
>> .... .... 0... = Push: Not set
>> .... .... .0.. = Reset: Not set
>> .... .... ..0. = Syn: Not set
>> .... .... ...0 = Fin: Not set
>> Window size value: 512
>> [Calculated window size: 512]
>> [Window size scaling factor: -1 (unknown)]
>> Checksum: 0xd021 [validation disabled]
>> [Good Checksum: False]
>> [Bad Checksum: False]
>> ----------------------------------------------------------------------------------
>> ----------------------------------------------------------------------------------
>>
>> As for the multicast configuration on the box - it doesn't run any end-user multicast services, other than VRRP/HSRP between itself and a partner 6500 (for gateway resilience).
>>
>> As such there is no multicast configuration. In fact, if anything it would be ideal if the box dropped all multicast traffic apart from the HSRP/VRRP to be honest.
>>
>> The reason I think this may be causing issues is because it is destined to a non-multicast IP, but with a multicast MAC....?
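
The mismatch described above can be checked mechanically. The following is a minimal Python sketch, not anything the 6500 itself runs: it applies the standard IPv4-multicast-to-MAC mapping (01:00:5e followed by the low 23 bits of the group address) and classifies a destination IP/MAC pair. The function names are made up for illustration.

```python
import ipaddress


def multicast_mac_for(group_ip):
    """Map an IPv4 multicast group to its Ethernet MAC: 01:00:5e + low 23 bits."""
    addr = ipaddress.IPv4Address(group_ip)
    if not addr.is_multicast:
        raise ValueError(f"{group_ip} is not in 224.0.0.0/4")
    low23 = int(addr) & 0x7FFFFF
    octets = [0x01, 0x00, 0x5E, (low23 >> 16) & 0x7F, (low23 >> 8) & 0xFF, low23 & 0xFF]
    return ":".join(f"{o:02x}" for o in octets)


def is_ipv4_multicast_mac(mac):
    """True if the MAC lies in 01:00:5e:00:00:00 - 01:00:5e:7f:ff:ff."""
    octets = [int(part, 16) for part in mac.split(":")]
    return octets[:3] == [0x01, 0x00, 0x5E] and octets[3] < 0x80


def describe(dst_ip, dst_mac):
    """Classify a destination IP/MAC pair (hypothetical helper)."""
    ip_is_mcast = ipaddress.IPv4Address(dst_ip).is_multicast
    mac_is_mcast = is_ipv4_multicast_mac(dst_mac)
    if ip_is_mcast and mac_is_mcast:
        if multicast_mac_for(dst_ip) == dst_mac.lower():
            return "multicast, consistent"
        return "multicast, MAC does not match group"
    if mac_is_mcast:
        return "unicast IP inside a multicast frame"
    if ip_is_mcast:
        return "multicast IP inside a non-multicast frame"
    return "unicast"


if __name__ == "__main__":
    # The pair from the capture above.
    print(describe("10.0.5.88", "01:00:5e:7f:05:77"))
    # A group address that would legitimately map to the same MAC.
    print(multicast_mac_for("239.255.5.119"))
```

For the captured packet this reports a unicast IP inside a multicast frame, which is exactly the combination in question: the frame is flooded like multicast at layer 2, while the layer 3 destination is not a group address.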
>>
>> I also tried the suggestion of disabling CoPP and the traffic was still hitting the CPU at the same rate.
>>
>> To answer the other questions, the TTL on these test packets is 64 and the router has "IP options drop" set globally. There are also rate-limits for TTL expired and all interfaces in question have "no ip unreachables" set. In fact, the test interface config is currently:
>>
>> interface Vlan10
>>  ip address 10.0.5.123 255.255.255.0
>>  ip access-group test in
>>  no ip redirects
>>  no ip unreachables
>>  no ip proxy-arp
>>
>> I have also tried enabling/disabling these on the vlan interface:
>> ip pim snooping
>> ip igmp version 3
>>
>> But no impact was seen.
>>
>> There is also a test ACL I have been experimenting with to try and match the test traffic, which (after receiving 100,000 test packets) shows the following:
>>
>> Extended IP access list test
>>     10 deny ip host 10.0.5.200 any (9 matches)
>>     20 deny ip any host 10.0.5.88
>>     30 deny ip any 224.0.0.0 0.15.255.255 (4 matches)
>>     1000 permit ip any any (504 matches)
>>
>> So even though I've specifically matched the traffic source and destination IPs, I'm not getting matches or drops.
>>
>> (The "permit ip any any" is matching other random traffic we have on that test network at the moment and increments normally without the test packets)
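
As a sanity check on the ACL logic itself, here is a small Python sketch of Cisco-style wildcard matching (bits set in the wildcard are "don't care"). It only models the address comparison, not how or where the 6500 counts matches, and the helper names are invented for illustration. By this logic the test packet matches entry 10 on its source address, so the low counters are not explained by the entries being written wrongly. As a side observation, the wildcard in entry 30 as written covers 224.0.0.0 through 224.15.255.255, not the whole 224.0.0.0/4 multicast range.

```python
import ipaddress


def _to_int(dotted):
    return int(ipaddress.IPv4Address(dotted))


def wildcard_match(address, base, wildcard):
    """Cisco-style match: bits set in the wildcard are ignored."""
    care = ~_to_int(wildcard) & 0xFFFFFFFF
    return (_to_int(address) & care) == (_to_int(base) & care)


def wildcard_range(base, wildcard):
    """Lowest and highest address covered (exact range for contiguous wildcards)."""
    w = _to_int(wildcard)
    lo = _to_int(base) & ~w & 0xFFFFFFFF
    hi = lo | w
    return str(ipaddress.IPv4Address(lo)), str(ipaddress.IPv4Address(hi))


if __name__ == "__main__":
    # Entry 10: "host 10.0.5.200" is base 10.0.5.200 with wildcard 0.0.0.0.
    print(wildcard_match("10.0.5.200", "10.0.5.200", "0.0.0.0"))
    # Entry 20: destination host 10.0.5.88.
    print(wildcard_match("10.0.5.88", "10.0.5.88", "0.0.0.0"))
    # Entry 30: what the wildcard 0.15.255.255 actually covers.
    print(wildcard_range("224.0.0.0", "0.15.255.255"))
    # The same base with a /4-equivalent wildcard, for comparison.
    print(wildcard_range("224.0.0.0", "15.255.255.255"))
```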
>>
>> Some additional background info:
>>
>> The situation arose in the real world when a Windows NLB cluster went offline and there was a load of traffic heading to its shared IPv4 address ('not' a multicast IP, but 'is' a multicast MAC) - so the switch flooded to all ports, including the 6500 upstream, triggering high CPU.
>>
>> Thanks again!
>>
>>
>>
>>
>> Robert Williams
>> Custodian Data Centre
>> Email: Robert at CustodianDC.com
>> http://www.CustodianDC.com
>>
>>
>> Robert Williams
>> Backline / Operations Team
>> Custodian DataCentre
>> tel: +44 (0)1622 230382
>> email: Robert at CustodianDC.com
>> http://www.custodiandc.com/disclaimer.txt
>>
>> -----Original Message-----
>> From: cisco-nsp-bounces at puck.nether.net [mailto:cisco-nsp-bounces at puck.nether.net] On Behalf Of Phil Mayers
>> Sent: 16 December 2012 11:26
>> To: cisco-nsp at puck.nether.net
>> Subject: Re: [c-nsp] All multicast punting to CPU on 6500
>>
>> I think the implication is that it's possible for a CoPP policy to prevent the forwarding hardware "seeing" the multicast and installing the hardware shortcuts to drop uninteresting traffic.
>>
>> You might try disabling CoPP to see if that changes things.
>>
>> You weren't very specific about the type of multicast traffic and the multicast config on the box. I'm going to guess it's IPv4/IPv6 multicast from the MAC addresses, but is the 6500 configured for multicast routing, and is it enabled on that interface? If so, what does "sh ip mr <the group>" say?
>>
>> I assume you've eliminated the really obvious things like TTL=1 and IP options / special packet stuff?
> This covers the issue well.
>
> http://www.cisco.com/en/US/products/hw/switches/ps708/products_configuration_example09186a0080a07203.shtml
>
> Highly recommended to stay away from MS NLB. It's been designed poorly for over 7 years (that I know of personally).
>
> I think your test is invalid. You should come up with a real use case(s). In the world of networking, most of us can come up with tests that would crush and boggle any box.
>
> tv