[c-nsp] All multicast punting to CPU on 6500
Tony Varriale
tvarriale at comcast.net
Sun Dec 16 11:22:51 EST 2012
On 12/16/2012 5:59 AM, Robert Williams wrote:
> Hi, I'll try to go into some additional detail on the traffic and other router config elements now.
>
> The traffic is basically made up of a randomly generated packet which is almost identical to the below.
>
> The 'random' element is that the source port is different every time.
>
> This packet was 10.0.5.200 (00:50:56:a6:00:23) -> 10.0.5.88 (01:00:5e:7f:05:77)
>
> The test interface on the 6500 is currently on 10.0.5.123.
>
> The below packet was captured on the control-plane going towards the Route-Processor CPU.
>
> ----------------------------------------------------------------------------------
> ----------------------------------------------------------------------------------
> No. Time Source Destination Protocol Length Info
> 23985 2023.684297 10.0.5.200 10.0.5.88 TCP 60 config-port > 0 [<None>] Seq=1 Win=512 Len=0
>
> Frame 23985: 60 bytes on wire (480 bits), 60 bytes captured (480 bits)
> Arrival Time: Dec 16, 2012 11:36:32.951556000 UTC
> Epoch Time: 1355657792.951556000 seconds
> [Time delta from previous captured frame: 0.000300000 seconds]
> [Time delta from previous displayed frame: 0.000300000 seconds]
> [Time since reference or first frame: 2023.684297000 seconds]
> Frame Number: 23985
> Frame Length: 60 bytes (480 bits)
> Capture Length: 60 bytes (480 bits)
> [Frame is marked: True]
> [Frame is ignored: False]
> [Protocols in frame: eth:ip:tcp]
> [Coloring Rule Name: TCP]
> [Coloring Rule String: tcp]
> Ethernet II, Src: Vmware_a6:00:23 (00:50:56:a6:00:23), Dst: IPv4mcast_7f:05:77 (01:00:5e:7f:05:77)
> Destination: IPv4mcast_7f:05:77 (01:00:5e:7f:05:77)
> Address: IPv4mcast_7f:05:77 (01:00:5e:7f:05:77)
> .... ...1 .... .... .... .... = IG bit: Group address (multicast/broadcast)
> .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
> Source: Vmware_a6:00:23 (00:50:56:a6:00:23)
> Address: Vmware_a6:00:23 (00:50:56:a6:00:23)
> .... ...0 .... .... .... .... = IG bit: Individual address (unicast)
> .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
> Type: IP (0x0800)
> Trailer: 000000000000
> Internet Protocol Version 4, Src: 10.0.5.200 (10.0.5.200), Dst: 10.0.5.88 (10.0.5.88)
> Version: 4
> Header length: 20 bytes
> Differentiated Services Field: 0x00 (DSCP 0x00: Default; ECN: 0x00: Not-ECT (Not ECN-Capable Transport))
> 0000 00.. = Differentiated Services Codepoint: Default (0x00)
> .... ..00 = Explicit Congestion Notification: Not-ECT (Not ECN-Capable Transport) (0x00)
> Total Length: 40
> Identification: 0x7b6e (31598)
> Flags: 0x00
> 0... .... = Reserved bit: Not set
> .0.. .... = Don't fragment: Not set
> ..0. .... = More fragments: Not set
> Fragment offset: 0
> Time to live: 64
> Protocol: TCP (6)
> Header checksum: 0xe042 [correct]
> [Good: True]
> [Bad: False]
> Source: 10.0.5.200 (10.0.5.200)
> Destination: 10.0.5.88 (10.0.5.88)
> Transmission Control Protocol, Src Port: config-port (3577), Dst Port: 0 (0), Seq: 1, Len: 0
> Source port: config-port (3577)
> Destination port: 0 (0)
> [Stream index: 3651]
> Sequence number: 1 (relative sequence number)
> Acknowledgement number: Broken TCP. The acknowledge field is nonzero while the ACK flag is not set
> Header length: 20 bytes
> Flags: 0x000 (<None>)
> 000. .... .... = Reserved: Not set
> ...0 .... .... = Nonce: Not set
> .... 0... .... = Congestion Window Reduced (CWR): Not set
> .... .0.. .... = ECN-Echo: Not set
> .... ..0. .... = Urgent: Not set
> .... ...0 .... = Acknowledgement: Not set
> .... .... 0... = Push: Not set
> .... .... .0.. = Reset: Not set
> .... .... ..0. = Syn: Not set
> .... .... ...0 = Fin: Not set
> Window size value: 512
> [Calculated window size: 512]
> [Window size scaling factor: -1 (unknown)]
> Checksum: 0xd021 [validation disabled]
> [Good Checksum: False]
> [Bad Checksum: False]
> ----------------------------------------------------------------------------------
> ----------------------------------------------------------------------------------
>
> As for the multicast configuration on the box - it doesn't run any end-user multicast services, other than VRRP/HSRP between itself and a partner 6500 (for gateway resilience).
>
> As such there is no multicast configuration. In fact, if anything it would be ideal if the box dropped all multicast traffic apart from the HSRP/VRRP to be honest.
>
> The reason I think this may be causing issues is because it is destined to a non-multicast IP, but with a multicast MAC....?
>
> I also tried the suggestion of disabling CoPP and the traffic was still hitting the CPU at the same rate.
>
> To answer the other questions, the TTL on these test packets is 64 and the router has "IP options drop" set globally. There are also rate-limits for TTL expired and all interfaces in question have "no ip unreachables" set. In fact, the test interface config is currently:
>
> interface Vlan10
> ip address 10.0.5.123 255.255.255.0
> ip access-group test in
> no ip redirects
> no ip unreachables
> no ip proxy-arp
>
> I have also tried enabling/disabling these on the vlan interface:
> ip pim snooping
> ip igmp version 3
>
> But no impact was seen.
>
> There is also a test ACL I have been experimenting with to try and match the test traffic, which (after receiving 100,000 test packets) shows the following:
>
> Extended IP access list test
> 10 deny ip host 10.0.5.200 any (9 matches)
> 20 deny ip any host 10.0.5.88
> 30 deny ip any 224.0.0.0 0.15.255.255 (4 matches)
> 1000 permit ip any any (504 matches)
>
> So even though I've specifically matched the traffic source and destination IPs, I'm not getting matches or drops.
>
> (The "permit ip any any" is matching other random traffic we have on that test network at the moment and increments normally without the test packets)
>
> Some additional background info:
>
> The situation arose in the real world when a Windows NLB cluster went offline and there was a load of traffic heading to its shared IPv4 address ('not' a multicast IP, but 'is' a multicast MAC) - so the switch flooded to all ports, including the 6500 upstream, triggering high CPU.
>
> Thanks again!
>
>
>
>
> Robert Williams
> Custodian Data Centre
> Email: Robert at CustodianDC.com
> http://www.CustodianDC.com
>
>
> Robert Williams
> Backline / Operations Team
> Custodian DataCentre
> tel: +44 (0)1622 230382
> email: Robert at CustodianDC.com
> http://www.custodiandc.com/disclaimer.txt
>
> -----Original Message-----
> From: cisco-nsp-bounces at puck.nether.net [mailto:cisco-nsp-bounces at puck.nether.net] On Behalf Of Phil Mayers
> Sent: 16 December 2012 11:26
> To: cisco-nsp at puck.nether.net
> Subject: Re: [c-nsp] All multicast punting to CPU on 6500
>
> I think the implication is that it's possible for a CoPP policy to prevent the forwarding hardware "seeing" the multicast and installing the hardware shortcuts to drop uninteresting traffic.
>
> You might try disabling CoPP to see if that changes things.
>
> You weren't very specific about the type of multicast traffic and the multicast config on the box. I'm going to guess it's IPv4/IPv6 multicast from the MAC addresses, but is the 6500 configured for multicast routing, and is it enabled on that interface? If so, what does "sh ip mr <the group>" say?
>
> I assume you've eliminated the really obvious things like TTL=1 and IP options / special packet stuff?
>
>
This document covers the issue well:
http://www.cisco.com/en/US/products/hw/switches/ps708/products_configuration_example09186a0080a07203.shtml
I'd highly recommend staying away from MS NLB. It has been poorly
designed for over 7 years (that I know of personally).
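To make the mismatch in the quoted capture concrete, here is a minimal
Python sketch (standard library only) that compares a destination IP
with its destination MAC. The function names are made up for
illustration, and the addresses passed to classify() are the ones from
the capture above. The 01:00:5e mapping is the standard IPv4 multicast
one, which copies the low 23 bits of the group address into the MAC.

```python
import ipaddress

def mac_is_group(mac: str) -> bool:
    """True if the I/G bit (least significant bit of the first octet) is set."""
    first_octet = int(mac.split(":")[0], 16)
    return bool(first_octet & 0x01)

def ipv4_group_to_mac(group: str) -> str:
    """Map an IPv4 multicast group to its 01:00:5e MAC (low 23 bits of the IP)."""
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not an IPv4 multicast address")
    low23 = int(addr) & 0x7FFFFF
    return "01:00:5e:%02x:%02x:%02x" % (
        low23 >> 16, (low23 >> 8) & 0xFF, low23 & 0xFF)

def classify(dst_ip: str, dst_mac: str) -> str:
    """Describe how the L3 destination relates to the L2 destination."""
    ip_mcast = ipaddress.IPv4Address(dst_ip).is_multicast
    mac_group = mac_is_group(dst_mac)
    if mac_group and not ip_mcast:
        return "unicast IP in a group-addressed frame"
    if ip_mcast and mac_group:
        return "multicast IP, group MAC"
    if ip_mcast:
        return "multicast IP in a unicast frame"
    return "plain unicast"

# Addresses taken from the quoted capture.
print(classify("10.0.5.88", "01:00:5e:7f:05:77"))
```

The captured frame lands in the first case: 10.0.5.88 is outside
224.0.0.0/4, yet the frame carries a group MAC. For comparison, a real
group such as 239.255.5.119 (my example, not from the thread) maps to
that same MAC via ipv4_group_to_mac().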
I think your test is invalid. You should come up with one or more real
use cases. In the world of networking, most of us can come up with tests
that would crush and boggle any box.
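A side note on the quoted test ACL: entry 30 uses the wildcard
0.15.255.255 against 224.0.0.0, which covers only 224.0.0.0 through
224.15.255.255 and not the whole 224.0.0.0/4 multicast range. It has no
bearing on entries 10 and 20, which match on the unicast addresses. The
Python sketch below shows the difference. wildcard_matches() is a
hypothetical helper that mimics IOS wildcard semantics, and the sample
addresses other than 10.0.5.88 are my own picks (224.0.0.18 is the VRRP
group).

```python
import ipaddress

def wildcard_matches(addr: str, base: str, wildcard: str) -> bool:
    """IOS-style wildcard match: bits set in the wildcard are 'don't care'."""
    a = int(ipaddress.IPv4Address(addr))
    b = int(ipaddress.IPv4Address(base))
    care = ~int(ipaddress.IPv4Address(wildcard)) & 0xFFFFFFFF
    return (a & care) == (b & care)

# Compare the wildcard from the quoted ACL with the one covering 224.0.0.0/4.
for dst in ("224.0.0.18", "224.15.255.255", "239.255.5.119", "10.0.5.88"):
    print(dst,
          wildcard_matches(dst, "224.0.0.0", "0.15.255.255"),
          wildcard_matches(dst, "224.0.0.0", "15.255.255.255"))
```

With 0.15.255.255 an address such as 239.255.5.119 falls through to the
next entry. With 15.255.255.255 it would be matched.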
tv