[c-nsp] tcp 179 traffic causing high cpu on 3750/3560
Dan Armstrong
dan at beanfield.com
Tue Mar 20 22:16:23 EST 2007
We've had high(ish) interrupt CPU issues on 3550s and ME3400s, causing
customers on these switches to complain about "slow" service, and I've
never really been able to nail it down.
I'm definitely going to keep my eye on this.
Calin VELEA wrote:
> Hello cisco-nsp,
>
> I came across a special case of high CPU usage
> involving 3750s/3560s, and I was wondering if
> anyone has seen this happening before.
>
> I am using 3750Gs to route traffic
> in an ISP environment. They are used with
> "sdm prefer routing", and hold around 3000
> bgp+ospf routes (which is well within the
> TCAM limits).
>
> At various times, I noticed temporary
> CPU usage spikes from 5-6% to 70-80%
> which lasted random amounts of time.
> During these intervals, telnet access
> to the switch CLI was noticeably
> slower. Since the IP Input process was
> always at the top and the increase in CPU
> was due mostly to interrupts, I ran:
>
> sh buffers input-interface
>
> to get a look at the packets that were being
> routed in software.
>
> It turned out that all the packets I could
> capture this way during the high-CPU period
> had one thing in common: TCP source or
> destination port 179 (BGP).
>
> Printing the packet path using
> "sh platform forward" showed that any TCP
> src/dst port 179 traffic was indeed being forwarded
> to the CPU queue and routed in software, regardless
> of the source/dst IP. So not only BGP traffic TO
> the switch, but any port 179 traffic routed
> THROUGH the switch.
>
> As far as I can see in the output of "sh platform
> forward", the cause seems to be some ACLs hardcoded
> in the switch which direct BGP traffic to the CPU queue.
>
> I ran
>
> "sh platform tcam table acl index 8081 detail" and
> "sh platform tcam table acl index 8096 detail"
>
> on several 3750Gs/3560Gs with various IOS versions
> and the ACLs were always present even if there weren't
> any ACLs defined in the config.
>
>
> Below, you can see one of the ACLs which matches on
> "l4Destination value 0xB3 mask 0xFFFF" and redirects
> traffic to an index which "sh platform forward" identifies
> as the CPU queue.
>
> Core-Metro#sh platform tcam table acl index 8081 detail
>
> =============================================================================
> ACL Cam Table (#entries: 8192, startIndex: 20864)
>
> Index  ACL CAM Table                                ACL
> -----------------------------------------------------------------------------
> mask-> F8_00000000_00000000-FF_C500FFFF_00000000
>  8081  40_00000000_00000000-00_810000B3_00000000    02001F6C
>
> l3CamInputAclDescriptor   Value         Mask
> -----------------------------------------------------------------------------
> lookupType:               4             F
> cos:                      0             0
> inputAclType:             0             1
> l3Destination:            00.00.00.00   00.00.00.00
> l3Source:                 00.00.00.00   00.00.00.00
> inputAclLabel:            0             FF
> l4Select:                 2             3
> l3DontFragment:           0             0
> l3MoreFragments:          0             0
> l3SmallOffset:            0             0
> l3NotFirstFragment:       0             1
> l3ReservedFlag:           0             0
> l2Bcast:                  1             1
> l3Dscp:                   0             0
> l3TosReserved:            0             0
> l4Destination:            B3            FFFF
> l4Source:                 0             0
> l4Map:                    0             0
>
> AclDescriptor
> -----------------------------------------------------------------------------
> aclStatisticsIndex: 2
> aclLogIndex: 0
> denyRoute: 0
> denyBridge: 0
> spanDest0En: 0
> spanDest1En: 0
> redirectIndex: 1F6C
>
>
> I guess this can be exploited to keep the CPU usage high on 3750s
> just by passing this kind of traffic through. However, it seems that
> there is a limit on CPU interrupt usage, since I haven't seen this
> go over 80% so far.
>
>
>
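For anyone who wants to sanity-check the quoted entry: 0xB3 is 179 decimal, and a mask of 0xFFFF means all 16 bits of the destination port are compared, while the all-zero masks on l3Source, l3Destination and l4Source mean those fields are ignored. The Python below is only a rough sketch of that value/mask comparison. The function names and the dictionary are made up for illustration and say nothing about how the switch hardware actually evaluates the entry; only the hex values come from the output above.

```python
# Illustrative sketch of a value/mask (ternary) match, using the l4 fields
# from the quoted entry 8081. The helper names are hypothetical; this is
# not Cisco code and only models the comparison conceptually.

BGP_PORT = 179  # 0xB3


def ternary_match(field: int, value: int, mask: int) -> bool:
    """True if field equals value on every bit position where mask is 1."""
    return (field & mask) == (value & mask)


# (value, mask) pairs copied from the "sh platform tcam" output above.
ENTRY_8081 = {
    "l4Destination": (0xB3, 0xFFFF),  # every bit compared: port must be 179
    "l4Source": (0x0, 0x0),           # mask 0: any source port matches
}


def entry_matches(src_port: int, dst_port: int) -> bool:
    """Check a packet's TCP ports against the two l4 fields of the entry."""
    fields = {"l4Source": src_port, "l4Destination": dst_port}
    return all(
        ternary_match(fields[name], value, mask)
        for name, (value, mask) in ENTRY_8081.items()
    )


if __name__ == "__main__":
    print(0xB3 == BGP_PORT)           # True
    print(entry_matches(40000, 179))  # True: destination port is 179
    print(entry_matches(179, 40000))  # False: this entry checks destination only
```

Since this entry only looks at the destination port, the source-port case described in the post would have to be covered by another entry, possibly index 8096, whose contents are not shown here.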