[c-nsp] All multicast punting to CPU on 6500

Tony Varriale tvarriale at comcast.net
Sun Dec 16 11:22:51 EST 2012


On 12/16/2012 5:59 AM, Robert Williams wrote:
> Hi, I'll try to go into some additional detail on the traffic and other router config elements now.
>
> The traffic is basically made up of a randomly generated packet which is almost identical to the below.
>
> The 'random' element is that the source port is different every time.
>
> This packet was 10.0.5.200 (00:50:56:a6:00:23) -> 10.0.5.88 (01:00:5e:7f:05:77)
>
> The test interface on the 6500 is currently on 10.0.5.123.
>
> The below packet was captured on the control-plane going towards the Route-Processor CPU.
>
> ----------------------------------------------------------------------------------
> ----------------------------------------------------------------------------------
> No.     Time        Source                Destination           Protocol Length Info
>    23985 2023.684297 10.0.5.200            10.0.5.88             TCP      60     config-port > 0 [<None>] Seq=1 Win=512 Len=0
>
> Frame 23985: 60 bytes on wire (480 bits), 60 bytes captured (480 bits)
>      Arrival Time: Dec 16, 2012 11:36:32.951556000 UTC
>      Epoch Time: 1355657792.951556000 seconds
>      [Time delta from previous captured frame: 0.000300000 seconds]
>      [Time delta from previous displayed frame: 0.000300000 seconds]
>      [Time since reference or first frame: 2023.684297000 seconds]
>      Frame Number: 23985
>      Frame Length: 60 bytes (480 bits)
>      Capture Length: 60 bytes (480 bits)
>      [Frame is marked: True]
>      [Frame is ignored: False]
>      [Protocols in frame: eth:ip:tcp]
>      [Coloring Rule Name: TCP]
>      [Coloring Rule String: tcp]
> Ethernet II, Src: Vmware_a6:00:23 (00:50:56:a6:00:23), Dst: IPv4mcast_7f:05:77 (01:00:5e:7f:05:77)
>      Destination: IPv4mcast_7f:05:77 (01:00:5e:7f:05:77)
>          Address: IPv4mcast_7f:05:77 (01:00:5e:7f:05:77)
>          .... ...1 .... .... .... .... = IG bit: Group address (multicast/broadcast)
>          .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
>      Source: Vmware_a6:00:23 (00:50:56:a6:00:23)
>          Address: Vmware_a6:00:23 (00:50:56:a6:00:23)
>          .... ...0 .... .... .... .... = IG bit: Individual address (unicast)
>          .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
>      Type: IP (0x0800)
>      Trailer: 000000000000
> Internet Protocol Version 4, Src: 10.0.5.200 (10.0.5.200), Dst: 10.0.5.88 (10.0.5.88)
>      Version: 4
>      Header length: 20 bytes
>      Differentiated Services Field: 0x00 (DSCP 0x00: Default; ECN: 0x00: Not-ECT (Not ECN-Capable Transport))
>          0000 00.. = Differentiated Services Codepoint: Default (0x00)
>          .... ..00 = Explicit Congestion Notification: Not-ECT (Not ECN-Capable Transport) (0x00)
>      Total Length: 40
>      Identification: 0x7b6e (31598)
>      Flags: 0x00
>          0... .... = Reserved bit: Not set
>          .0.. .... = Don't fragment: Not set
>          ..0. .... = More fragments: Not set
>      Fragment offset: 0
>      Time to live: 64
>      Protocol: TCP (6)
>      Header checksum: 0xe042 [correct]
>          [Good: True]
>          [Bad: False]
>      Source: 10.0.5.200 (10.0.5.200)
>      Destination: 10.0.5.88 (10.0.5.88)
> Transmission Control Protocol, Src Port: config-port (3577), Dst Port: 0 (0), Seq: 1, Len: 0
>      Source port: config-port (3577)
>      Destination port: 0 (0)
>      [Stream index: 3651]
>      Sequence number: 1    (relative sequence number)
>      Acknowledgement number: Broken TCP. The acknowledge field is nonzero while the ACK flag is not set
>      Header length: 20 bytes
>      Flags: 0x000 (<None>)
>          000. .... .... = Reserved: Not set
>          ...0 .... .... = Nonce: Not set
>          .... 0... .... = Congestion Window Reduced (CWR): Not set
>          .... .0.. .... = ECN-Echo: Not set
>          .... ..0. .... = Urgent: Not set
>          .... ...0 .... = Acknowledgement: Not set
>          .... .... 0... = Push: Not set
>          .... .... .0.. = Reset: Not set
>          .... .... ..0. = Syn: Not set
>          .... .... ...0 = Fin: Not set
>      Window size value: 512
>      [Calculated window size: 512]
>      [Window size scaling factor: -1 (unknown)]
>      Checksum: 0xd021 [validation disabled]
>          [Good Checksum: False]
>          [Bad Checksum: False]
> ----------------------------------------------------------------------------------
> ----------------------------------------------------------------------------------
>
> As for the multicast configuration on the box - it doesn't run any end-user multicast services, other than VRRP/HSRP between itself and a partner 6500 (for gateway resilience).
>
> As such there is no multicast configuration. In fact, if anything it would be ideal if the box dropped all multicast traffic apart from the HSRP/VRRP to be honest.
>
> I suspect this is causing issues because the packet is destined to a non-multicast IP but carries a multicast destination MAC.
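
To make the mismatch concrete, here is a minimal Python sketch (an illustration, not part of the original thread) that checks whether a frame pairs a group-addressed destination MAC with a non-multicast destination IP. The function names are hypothetical; the checks rely only on the I/G bit of the first MAC octet and on the 224.0.0.0/4 range.

```python
import ipaddress


def mac_is_group(mac: str) -> bool:
    """True when the I/G bit (lowest bit of the first octet) is set."""
    first_octet = int(mac.split(":")[0], 16)
    return bool(first_octet & 0x01)


def ip_is_multicast(ip: str) -> bool:
    """True when the IPv4 address falls inside 224.0.0.0/4."""
    return ipaddress.IPv4Address(ip).is_multicast


def is_mismatched(dst_mac: str, dst_ip: str) -> bool:
    """Group-addressed MAC carrying a non-multicast destination IP."""
    return mac_is_group(dst_mac) and not ip_is_multicast(dst_ip)


# Destination MAC and IP taken from the captured frame above.
print(is_mismatched("01:00:5e:7f:05:77", "10.0.5.88"))  # True
```

For the captured frame this reports a mismatch, which is the same combination described for the NLB shared address.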
>
> I also tried the suggestion of disabling CoPP and the traffic was still hitting the CPU at the same rate.
>
> To answer the other questions, the TTL on these test packets is 64 and the router has "IP options drop" set globally. There are also rate-limits for TTL expired and all interfaces in question have "no ip unreachables" set. In fact, the test interface config is currently:
>
> interface Vlan10
>   ip address 10.0.5.123 255.255.255.0
>   ip access-group test in
>   no ip redirects
>   no ip unreachables
>   no ip proxy-arp
>
> I have also tried enabling/disabling these on the vlan interface:
>   ip pim snooping
>   ip igmp version 3
>
> But no impact was seen.
>
> There is also a test ACL I have been experimenting with to try and match the test traffic, which (after receiving 100,000 test packets) shows the following:
>
> Extended IP access list test
>      10 deny ip host 10.0.5.200 any (9 matches)
>      20 deny ip any host 10.0.5.88
>      30 deny ip any 224.0.0.0 0.15.255.255 (4 matches)
>      1000 permit ip any any (504 matches)
>
> So even though I've specifically matched the traffic source and destination IPs, I'm not getting matches or drops.
>
> (The "permit ip any any" is matching other random traffic we have on that test network at the moment and increments normally without the test packets)
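
As a side check on entry 30 of the ACL, the following Python sketch (an illustration, not part of the original thread) expands an IOS-style wildcard mask into the address range it covers. The helper name is hypothetical and it assumes a contiguous wildcard mask. It suggests that `224.0.0.0 0.15.255.255` covers only 224.0.0.0 through 224.15.255.255, not the whole 224.0.0.0/4 multicast block, which would need a wildcard of 15.255.255.255.

```python
import ipaddress


def wildcard_range(base: str, wildcard: str) -> ipaddress.IPv4Network:
    """Convert an IOS-style base address and contiguous wildcard mask to a network."""
    wc = int(ipaddress.IPv4Address(wildcard))
    netmask = ipaddress.IPv4Address(wc ^ 0xFFFFFFFF)  # invert the wildcard bits
    return ipaddress.IPv4Network(f"{base}/{netmask}")


net = wildcard_range("224.0.0.0", "0.15.255.255")
print(net)                   # 224.0.0.0/12
print(net[0], "-", net[-1])  # 224.0.0.0 - 224.15.255.255

# The test destination from the capture lies outside this range either way.
print(ipaddress.IPv4Address("10.0.5.88") in net)  # False
```

This only describes which addresses the entry can match; it does not explain why entries 10 and 20 show so few hits.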
>
> Some additional background info:
>
> The situation arose in the real world when a Windows NLB cluster went offline and there was a load of traffic heading to its shared IPv4 address ('not' a multicast IP, but 'is' a multicast MAC) - so the switch flooded to all ports, including the 6500 upstream, triggering high CPU.
>
> Thanks again!
>
>
>
>
> Robert Williams
> Custodian Data Centre
> Email: Robert at CustodianDC.com
> http://www.CustodianDC.com
>
>
> Robert Williams
> Backline / Operations Team
> Custodian DataCentre
> tel: +44 (0)1622 230382
> email: Robert at CustodianDC.com
> http://www.custodiandc.com/disclaimer.txt
>
> -----Original Message-----
> From: cisco-nsp-bounces at puck.nether.net [mailto:cisco-nsp-bounces at puck.nether.net] On Behalf Of Phil Mayers
> Sent: 16 December 2012 11:26
> To: cisco-nsp at puck.nether.net
> Subject: Re: [c-nsp] All multicast punting to CPU on 6500
>
> I think the implication is that it's possible for a CoPP policy to prevent the forwarding hardware "seeing" the multicast and installing the hardware shortcuts to drop uninteresting traffic.
>
> You might try disabling CoPP to see if that changes things.
>
> You weren't very specific about the type of multicast traffic and the multicast config on the box. I'm going to guess it's IPv4/IPv6 multicast from the MAC addresses, but is the 6500 configured for multicast routing, and is it enabled on that interface? If so, what does "sh ip mr <the group>" say?
>
> I assume you've eliminated the really obvious things like TTL=1 and IP options / special packet stuff?
>
>
This covers the issue well.

http://www.cisco.com/en/US/products/hw/switches/ps708/products_configuration_example09186a0080a07203.shtml

Highly recommended to stay away from MS NLB.  It's been designed poorly 
for over 7 years (that I know of personally).

I think your test is invalid.  You should come up with a real use 
case(s).  In the world of networking, most of us can come up with tests 
that would crush and boggle any box.

tv

