[c-nsp] tcp 179 traffic causing high cpu on 3750/3560

Dan Armstrong dan at beanfield.com
Tue Mar 20 22:16:23 EST 2007


We've had high(ish) interrupt issues on 3550s and ME3400s, causing 
customers on these switches to complain about "slow" service, and I've 
never really been able to nail it down.

I'm definitely going to keep my eye on this.





Calin VELEA wrote:
> Hello cisco-nsp,
>
>    I came across a special case of high CPU usage
> involving 3750s/3560s, and I was wondering if
> anyone has seen this happening before.
>
>    I am using 3750Gs to route traffic
> in an ISP environment. They are used with
> "sdm prefer routing", and hold around 3000
> bgp+ospf routes (which is well within the
> TCAM limits).
>
>    At various times, I noticed temporary 
> CPU usage spikes from 5-6% to 70-80% 
> which lasted random amounts of time. 
> During these intervals, telnet access 
> to the switch CLI was noticeably 
> slower. Since the IP Input process was
> always at the top and the increase in CPU 
> was due mostly to interrupts, I ran:
>
> sh buffers input-interface 
>
> to get a look at the packets that were being
> routed in software.
>
>    It turned out that all the packets I could
> capture this way during the high CPU period
> had one thing in common: TCP source or destination
> port 179 (bgp).
>
>   Printing the packet path with
> "sh platform forward" confirmed that any TCP
> src/dst port 179 traffic was being forwarded to
> the CPU queue and routed in software, regardless
> of the source/dst IP. So it is not only bgp traffic TO 
> the switch that is affected, but any port 179 traffic
> routed THROUGH the switch.
>
>   As far as I can see in the output of "sh platform
> forward", the cause seems to be some ACLs hardcoded
> in the switch that direct bgp traffic to the CPU queue.
>
> I ran
>
> "sh platform tcam table acl index 8081 detail" and
> "sh platform tcam table acl index 8096 detail"
>
> on several 3750Gs/3560Gs with various IOS versions
> and the ACLs were always present even if there weren't
> any ACLs defined in the config.
>
>
>   Below, you can see one of the ACLs, which matches on
> "l4Destination   value 0xB3 mask 0xFFFF" (0xB3 is 179
> decimal) and redirects traffic to an index which
> "sh platform forward" identifies as the CPU queue.
>
> Core-Metro#sh platform tcam table acl index 8081 detail
>
> =============================================================================
> ACL Cam Table (#entries: 8192, startIndex: 20864)
>
> Index  ACL CAM Table                               ACL
> -----------------------------------------------------------------------------
> mask-> F8_00000000_00000000-FF_C500FFFF_00000000
> 8081   40_00000000_00000000-00_810000B3_00000000   02001F6C
>
>   l3CamInputAclDescriptor       Value               Mask
> -----------------------------------------------------------------------------
>   lookupType:                   4                   F
>   cos:                          0                   0
>   inputAclType:                 0                   1
>   l3Destination:                00.00.00.00         00.00.00.00
>   l3Source:                     00.00.00.00         00.00.00.00
>   inputAclLabel:                0                   FF
>   l4Select:                     2                   3
>   l3DontFragment:               0                   0
>   l3MoreFragments:              0                   0
>   l3SmallOffset:                0                   0
>   l3NotFirstFragment:           0                   1
>   l3ReservedFlag:               0                   0
>   l2Bcast:                      1                   1
>   l3Dscp:                       0                   0
>   l3TosReserved:                0                   0
>   l4Destination:                B3                  FFFF
>   l4Source:                     0                   0
>   l4Map:                        0                   0
>
>   AclDescriptor
> -----------------------------------------------------------------------------
>   aclStatisticsIndex:           2
>   aclLogIndex:                  0
>   denyRoute:                    0
>   denyBridge:                   0
>   spanDest0En:                  0
>   spanDest1En:                  0
>   redirectIndex:                1F6C
>
>
>   I guess this can be exploited to keep the CPU usage high on 3750s
> just by passing this kind of traffic through. However, it seems that
> there is a limit on CPU interrupt usage, since I haven't seen it
> go over 80% so far.


