[c-nsp] tcp 179 traffic causing high cpu on 3750/3560
Calin VELEA
vcalinus at hertza.ro
Tue Mar 20 21:13:52 EST 2007
Hello cisco-nsp,
I came across a special case of high CPU usage
involving 3750s/3560s, and I was wondering if
anyone has seen this happening before.
I am using 3750Gs to route traffic
in an ISP environment. They are used with
"sdm prefer routing", and hold around 3000
bgp+ospf routes (which is well within the
TCAM limits).
At various times, I noticed temporary
CPU usage spikes from 5-6% to 70-80%
that lasted for varying lengths of time.
During these intervals, telnet access
to the switch CLI was noticeably
slower. Since the IP Input process was
always the top process and the increase in CPU
was mostly due to interrupts, I ran:
sh buffers input-interface
to get a look at the packets that were being
routed in software.
It turned out that all the packets I could
capture this way during the high CPU period
had one thing in common: TCP source or destination
port 179 (bgp).
Printing the packet path using
"sh platform forward" confirmed that any TCP
src/dst port 179 traffic was being forwarded to
the CPU queue and routed in software, regardless
of the source/dst IP. This affects not only bgp traffic TO
the switch, but also any port 179 traffic routed
THROUGH the switch.
As far as I can see in the output of "sh platform
forward", the cause seems to be some ACLs hardcoded
in the switch which direct bgp traffic to the CPU queue.
I ran
"sh platform tcam table acl index 8081 detail" and
"sh platform tcam table acl index 8096 detail"
on several 3750Gs/3560Gs with various IOS versions
and the ACLs were always present even if there weren't
any ACLs defined in the config.
Below, you can see one of the ACLs which matches on
"l4Destination value 0xB3 mask 0xFFFF" and redirects
traffic to an index which "sh platform forward" identifies
as the CPU queue.
Core-Metro#sh platform tcam table acl index 8081 detail
=============================================================================
ACL Cam Table (#entries: 8192, startIndex: 20864)
Index ACL CAM Table ACL
-----------------------------------------------------------------------------
mask-> F8_00000000_00000000-FF_C500FFFF_00000000
8081 40_00000000_00000000-00_810000B3_00000000 02001F6C
l3CamInputAclDescriptor   Value         Mask
-----------------------------------------------------------------------------
lookupType:               4             F
cos:                      0             0
inputAclType:             0             1
l3Destination:            00.00.00.00   00.00.00.00
l3Source:                 00.00.00.00   00.00.00.00
inputAclLabel:            0             FF
l4Select:                 2             3
l3DontFragment:           0             0
l3MoreFragments:          0             0
l3SmallOffset:            0             0
l3NotFirstFragment:       0             1
l3ReservedFlag:           0             0
l2Bcast:                  1             1
l3Dscp:                   0             0
l3TosReserved:            0             0
l4Destination:            B3            FFFF
l4Source:                 0             0
l4Map:                    0             0
AclDescriptor
-----------------------------------------------------------------------------
aclStatisticsIndex: 2
aclLogIndex: 0
denyRoute: 0
denyBridge: 0
spanDest0En: 0
spanDest1En: 0
redirectIndex: 1F6C
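To make the entry above easier to read, here is a minimal Python
sketch of how such a value/mask pair would match a packet. It assumes
the usual TCAM semantics, (field & mask) == (value & mask), and models
only the four address/port fields from the output; lookupType,
l4Select and the other fields are ignored. The class and function
names are made up for illustration and are not anything from IOS.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class TernaryField:
    """One value/mask pair as printed in the descriptor (assumed semantics)."""
    value: int
    mask: int

    def matches(self, field: int) -> bool:
        # Only bits set in the mask are compared; a zero mask matches anything.
        return (field & self.mask) == (self.value & self.mask)


# Values copied from the index 8081 output above.
L3_SOURCE = TernaryField(value=0x00000000, mask=0x00000000)       # any source IP
L3_DESTINATION = TernaryField(value=0x00000000, mask=0x00000000)  # any destination IP
L4_SOURCE = TernaryField(value=0x0, mask=0x0)                     # any source port
L4_DESTINATION = TernaryField(value=0xB3, mask=0xFFFF)            # exactly 0xB3 = 179


def entry_8081_matches(src_ip: str, dst_ip: str,
                       src_port: int, dst_port: int) -> bool:
    """Return True if the modelled fields of entry 8081 match the packet."""
    src = int(ipaddress.IPv4Address(src_ip))
    dst = int(ipaddress.IPv4Address(dst_ip))
    return (L3_SOURCE.matches(src)
            and L3_DESTINATION.matches(dst)
            and L4_SOURCE.matches(src_port)
            and L4_DESTINATION.matches(dst_port))


if __name__ == "__main__":
    # Transit traffic between two arbitrary hosts, destination port 179.
    print(entry_8081_matches("10.0.0.1", "192.0.2.1", 40000, 179))
    # Same hosts, destination port 80.
    print(entry_8081_matches("10.0.0.1", "192.0.2.1", 40000, 80))
```

Because the l3Source and l3Destination masks are all zero, the match
does not depend on the IP addresses at all, which agrees with transit
traffic being affected. This entry only covers the destination port;
the entry at index 8096 is not shown here, and reading it as the
l4Source counterpart is an assumption.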
I guess this can be exploited to keep the CPU usage high on 3750s,
just by passing this kind of traffic through. However, it seems that
there is a limit on CPU interrupt usage, since I haven't seen it
go over 80% so far.
--
Best regards,
Calin mailto:vcalinus at hertza.ro