[cisco-bba] need help on troubleshooting high cpu on 7206NPE300 LNS
antnada at hotmail.com
Fri Jan 26 09:21:29 EST 2007
The situation seems to be under control now, as the CPU level has come down to an average of 30-something percent, even at peak. Once again, I would like to thank everyone for sharing their thoughts.
I believe a combination of a misconfiguration on our RADIUS server (passing the compression attribute) + wrong MTU sizing (it was set to 1492) + VPDN IP UDP checksums (not disabled) contributed to my problem.
The wrong MTU sizing seems to have contributed the most: once it was lowered to 1460, CPU utilization dropped immediately.
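The arithmetic behind the MTU fix can be sketched as follows. The header sizes below are typical L2TPv2 values and are an assumption on my part; the exact overhead depends on which L2TP options are negotiated.

```python
OUTER_IP = 20   # outer IPv4 header added on the L2TP tunnel path
UDP      = 8    # UDP header (L2TP runs over UDP port 1701)
L2TP     = 8    # L2TPv2 data header with the Length field, no sequencing
PPP      = 4    # PPP address/control + protocol fields

def tunneled_size(ppp_payload: int) -> int:
    """Size of the encapsulated packet as it leaves the LNS."""
    return ppp_payload + OUTER_IP + UDP + L2TP + PPP

print(tunneled_size(1492))  # 1532: exceeds a 1500-byte Ethernet MTU -> fragmented
print(tunneled_size(1460))  # 1500: fits, so no fragmentation/reassembly work
```

With a 1492-byte PPP MTU, every full-size packet must be fragmented on the tunnel path and reassembled in the process path, which matches the huge Frags counters in the "sh ip traffic" output further down the thread.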
Thank you all again!
From: antnada at hotmail.com
To: cisco-bba at puck.nether.net
Date: Thu, 25 Jan 2007 12:29:27 -0500
Subject: Re: [cisco-bba] need help on troubleshooting high cpu on 7206NPE300 LNS
Thank you Arie & Paul for your suggestions. I have lowered the MTU to 1460, and I shall read up on the docs again. Anthony
Subject: RE: [cisco-bba] need help on troubleshooting high cpu on 7206NPE300 LNS
Date: Thu, 25 Jan 2007 18:02:46 +0100
From: phorrock at cisco.com
To: antnada at hotmail.com; cisco-bba at puck.nether.net
Whilst you wait for the peak period, have a look at the URLs below; they may assist if it does point to reassembly.
From: cisco-bba-bounces at puck.nether.net [mailto:cisco-bba-bounces at puck.nether.net] On Behalf Of Anthony Law
Sent: Thursday, January 25, 2007 2:25 PM
To: cisco-bba at puck.nether.net
Subject: Re: [cisco-bba] need help on troubleshooting high cpu on 7206NPE300 LNS
Hi,

Thanks for all of your input again. Since this is just the start of the day, our traffic is low at this time, and "sh proc cpu" shows:

CPU utilization for five seconds: 55%/37%; one minute: 55%; five minutes: 56%
  5   484808196   103563445  4681   0.49%  0.64%  0.86%   0 Pool Manager
 37  1148142684  1072956389  1070  17.50% 17.17% 18.04%   0 IP Input

Below is how "sh ip traffic" looks:

sh ip traffic
IP statistics:
  Rcvd:  674456349 total, 3035990691 local destination
         9258 format errors, 3285179 checksum errors, 6694426 bad hop count
         2 unknown protocol, 159176 not a gateway
         0 security failures, 57 bad options, 293393 with options
  Opts:  0 end, 148 nop, 615 basic security, 0 loose source route
         0 timestamp, 0 extended security, 148 record route
         0 stream ID, 0 strict source route, 292573 alert, 0 cipso, 0 ump
         0 other
  Frags: 3012940604 reassembled, 3424934 timeouts, 118523 couldn't reassemble
         2998380890 fragmented, 3205560 couldn't fragment
  Bcast: 5550941 received, 3022 sent
  Mcast: 0 received, 0 sent
  Sent:  302118429 generated, 3616922117 forwarded
  Drop:  6396472 encapsulation failed, 163 unresolved, 0 no adjacency
         4485 no route, 0 unicast RPF, 4426667 forced drop
  Drop:  0 packets with source IP address zero

ICMP statistics:
  Rcvd: 10 format errors, 120 checksum errors, 469 redirects, 11499 unreachable
        3762935 echo, 2838 echo reply, 0 mask requests, 0 mask replies, 5 quench
        0 parameter, 65 timestamp, 1 info request, 225 other
        1 irdp solicitations, 5 irdp advertisements
  Sent: 246725 redirects, 3280755 unreachable, 3853 echo, 3762867 echo reply
        0 mask requests, 0 mask replies, 0 quench, 65 timestamp
        1 info reply, 5222083 time exceeded, 3 parameter problem
        0 irdp solicitations, 0 irdp advertisements

UDP statistics:
  Rcvd: 3031423679 total, 53 checksum errors, 5498341 no port
  Sent: 289151419 total, 0 forwarded broadcasts

TCP statistics:
  Rcvd: 785273 total, 1727 checksum errors, 2886 no port
  Sent: 450601 total

Probe statistics:
  Rcvd: 0 address requests, 0 address replies
        0 proxy name requests, 0 where-is requests, 0 other
  Sent: 0 address requests, 0 address replies (0 proxy)
        0 proxy name replies, 0 where-is replies

BGP statistics:
  Rcvd: 0 total, 0 opens, 0 notifications, 0 updates
        0 keepalives, 0 route-refresh, 0 unrecognized
  Sent: 0 total, 0 opens, 0 notifications, 0 updates
        0 keepalives, 0 route-refresh

EGP statistics:
  Rcvd: 0 total, 0 format errors, 0 checksum errors, 0 no listener
  Sent: 0 total

IGRP statistics:
  Rcvd: 0 total, 0 checksum errors
  Sent: 0 total

OSPF statistics:
  Rcvd: 0 total, 0 checksum errors
        0 hello, 0 database desc, 0 link state req
        0 link state updates, 0 link state acks
  Sent: 0 total

IP-IGRP2 statistics:
  Rcvd: 0 total
  Sent: 0 total

PIMv2 statistics: Sent/Received
  Total: 0/0, 0 checksum errors, 0 format errors
  Registers: 0/0, Register Stops: 0/0, Hellos: 0/0
  Join/Prunes: 0/0, Asserts: 0/0, grafts: 0/0
  Bootstraps: 0/0, Candidate_RP_Advertisements: 0/0
  State-Refresh: 0/0

IGMP statistics: Sent/Received
  Total: 0/0, Format errors: 0/0, Checksum errors: 0/0
  Host Queries: 0/0, Host Reports: 0/0, Host Leaves: 0/0
  DVMRP: 0/0, PIM: 0/0

ARP statistics:
  Rcvd: 15597477 requests, 294820 replies, 0 reverse, 0 other
  Sent: 4637290 requests, 27974487 replies (1776972 proxy), 0 reverse

> Are there still users connected who received a framed-compression attribute before you made the change?
After making the changes to our RADIUS, I reset all tunnels, which bumped everyone off their VPDN sessions, and I have verified that they are no longer receiving "compression". I'll post some more stats during the peak period. Thanks. Anthony
Subject: RE: [cisco-bba] need help on troubleshooting high cpu on 7206NPE300 LNS
Date: Thu, 25 Jan 2007 10:13:20 +0100
From: oboehmer at cisco.com
To: ariev at vayner.net; antnada at hotmail.com; cisco-bba at puck.nether.net
Encapsulating/decapsulating L2TP packets should not happen in the IP Input process; this is done in the interrupt path.
Anthony: Something is preventing your interfaces from interrupt-switching the traffic. Another possibility is packet reassembly (which would show up in "show ip traffic", as Paul just suggested). Do a "clear counters" and then check "show int stat" to see which interface(s) send the majority of packets in the process path. Are there still users connected who received a framed-compression attribute before you made the change?
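As a sketch of the workflow Oliver describes ("show interfaces stats" is the unabbreviated form of "show int stat"; the exact output layout varies by IOS release):

```
LNS# clear counters
Clear "show interface" counters on all interfaces [confirm]
LNS# show interfaces stats
```

In the output, each interface lists per-switching-path counters: packets counted under "Processor" are being process-switched (and show up as IP Input CPU), while "Route cache" packets are handled at interrupt level. An interface whose counts keep accumulating under "Processor" is the one forcing traffic into the process path.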
From: cisco-bba-bounces at puck.nether.net [mailto:cisco-bba-bounces at puck.nether.net] On Behalf Of Arie Vayner
Sent: Thursday, January 25, 2007 8:38 AM
To: Anthony Law; cisco-bba at puck.nether.net
Subject: Re: [cisco-bba] need help on troubleshooting high cpu on 7206NPE300 LNS
On 1/25/07, Arie Vayner <ariev at vayner.net> wrote:
Anthony,

The high CPU in IP Input is normal, as this is where the L2TP work is being done. Also note that you have a high rate of CPU being used in interrupts (91%/44% means that 44% is used for interrupts). Interrupts on Cisco routers are usually linked directly to a high rate of traffic (on centralized-CPU devices). I would assume your box is very close to its limit of how much traffic it can handle.

Could you please send some of the "show interface" outputs (for the FastEthernet/GigE/ATM ports you might have)? This would allow us to get a better estimation.

You need to take into account that this is a centralized-CPU platform, and all traffic is handled by the CPU. This means that the scale factor is not only a question of how many sessions you have concurrently, but also how much traffic (mostly in PPS, not BPS) they transmit.

Thanks
Arie
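Arie's PPS-versus-BPS point can be made concrete with a rough back-of-the-envelope calculation (the throughput and packet sizes below are purely illustrative, not measurements from this box):

```python
def estimated_pps(bits_per_sec: float, avg_packet_bytes: int) -> float:
    """Rough packet rate for a given throughput and mean packet size."""
    return bits_per_sec / (avg_packet_bytes * 8)

# On a centralized-CPU platform every packet costs CPU time, so the
# same 50 Mbit/s is roughly 20x more work when packets are small:
print(round(estimated_pps(50e6, 1400)))  # -> 4464 pps
print(round(estimated_pps(50e6, 64)))    # -> 97656 pps
```

This is why fragmentation hurts twice: it not only adds reassembly work, it also roughly doubles the packet count for the same byte volume.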
On 1/25/07, Anthony Law <antnada at hotmail.com> wrote:
Dear all,

Thank you for all of your input. I configured "vpdn ip udp ignore checksum" and I have corrected a misconfiguration on our RADIUS server (it was passing the compression attribute to the Cisco). The L2TP data daemon is now running normally, but I am still facing high CPU in Pool Manager and IP Input. Any more suggestions?

CPU utilization for five seconds: 91%/44%; one minute: 91%; five minutes: 86%
 PID  Runtime(ms)     Invoked  uSecs    5Sec   1Min   5Min TTY Process
   1            4         175     22   0.00%  0.00%  0.00%   0 Chunk Manager
   2       487964     5014024     97   0.00%  0.00%  0.00%   0 Load Meter
   3      1606476      870141   1846   0.00%  0.00%  0.00%   0 CEF Scanner
   4     22428792     3318958   6757   0.00%  0.06%  0.05%   0 Check heaps
   5    481842360   102963163   4679   9.05%  9.70%  7.90%   0 Pool Manager
  37   1127506012  1049358292   1074  36.02% 35.07% 32.40%   0 IP Input

Thank you
Anthony
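For anyone finding this thread later, the IOS side of the fix Anthony describes is a single global command (a sketch; the compression fix itself is made on the RADIUS server, not in IOS):

```
! Global config: accept received L2TP packets without validating the
! UDP checksum, so they are not punted to the process path.
vpdn ip udp ignore checksum
```

The RADIUS-side change is to stop returning a Framed-Compression attribute (for example Van Jacobson TCP/IP header compression) in Access-Accepts for these sessions, so the LNS never negotiates per-packet compression.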
> Date: Wed, 24 Jan 2007 02:37:10 +0200
> From: nitzan.tzelniker at gmail.com
> To: antnada at hotmail.com
> Subject: Re: [cisco-bba] need help on troubleshooting high cpu on 7206 NPE300 LNS
> CC: cisco-bba at puck.nether.net
>
> You can try
>
> vpdn ip udp ignore checksum
>
> Nitzan
>
> On 1/24/07, Anthony Law <antnada at hotmail.com> wrote:
> > Dear all,
> >
> > We have a 7206 w/NPE300 running as an LNS terminating PPPoE sessions from our
> > telco. We are concurrently running around 360 PPPoE sessions.
> >
> > Recently, I noticed that our 7206 is having extremely high CPU, at times
> > going to 100%, please see below:
> >
> > CPU utilization for five seconds: 99%/42%; one minute: 99%; five minutes: 99%
> >  PID  Runtime(ms)     Invoked  uSecs    5Sec   1Min   5Min TTY Process
> >    1            0          75      0   0.00%  0.00%  0.00%   0 Chunk Manager
> >    5    472509060   101324023   4663   7.65%  8.80%  8.84%   0 Pool Manager
> >   37   1081054788  1019294234   1060  22.79% 25.16% 25.51%   0 IP Input
> >  101    705044020   800103660    881  18.89% 21.35% 19.34%   0 L2TP data daemon
> >  102     53153196    10197928   5212   2.19%  0.46%  0.45%   0 L2TP mgmt daemon
> >
> > It seems that Pool Manager + IP Input + L2TP data daemon together are
> > causing this issue. I was searching for documents regarding this on Google
> > and came to this mailing list. I am wondering if you guys can help me out by
> > identifying the misconfiguration that I have on my end, as it is my
> > understanding that a 7206 should handle at least close to 1000 PPPoE sessions.
> >
> > Thank you in advance for your input.
> >
> > hostname LNS
> > !
> > boot system slot1:c7200-is-mz.122-32.bin
> > boot system slot1:c7200-is-mz.120-3.T3
> > aaa new-model
> > aaa authentication login default local
> > aaa authentication login no_rad line
> > aaa authentication ppp default group radius local
> > aaa authentication ppp vpdn group radius
> > aaa authorization network default group radius
> > aaa authorization configuration default group radius
> > aaa accounting delay-start
> > aaa accounting exec default start-stop group radius
> > aaa accounting network default start-stop group radius
> > enable secret 5 XXXXXXXXXXXXXXXXXXXXXXXXXXX
> > !
> > clock timezone EST -5
> > clock summer-time EDT recurring
> > ip subnet-zero
> > no ip source-route
> > ip cef
> > !
> > ip name-server XXXXXX
> > ip name-server XXXXXX
> > ip name-server XXXXXX
> > !
> > vpdn enable
> > !
> > vpdn-group XXXXXXXX
> >  accept-dialin
> >   protocol l2tp
> >   virtual-template 1
> >  terminate-from hostname XXXXXX
> >  local name XXXXXXX
> >  lcp renegotiation always
> > !
> > interface FastEthernet0/0
> >  ip address X.X.X.X 255.255.255.192
> >  no ip mroute-cache
> >  duplex full
> > !
> > interface FastEthernet1/0
> >  no ip address
> >  no ip mroute-cache
> >  duplex full
> > !
> > interface FastEthernet1/0.401
> >  description !!XXXXXXXXXXXXXXXXXXXXXXXX!!
> >  encapsulation dot1Q 401
> >  ip address 10.70.X.X 255.255.255.252
> >  no ip mroute-cache
> > !
> > interface FastEthernet2/0
> >  description !!Internet Feed!!
> >  ip address Y.Y.Y.Y 255.255.255.252
> >  no ip mroute-cache
> >  duplex full
> > !
> > interface Virtual-Template1
> >  mtu 1492
> >  ip unnumbered FastEthernet2/0
> >  peer default ip address pool internet1 internet2
> >  ppp authentication pap vpdn
> > !
> > ip local pool internet1 A.A.A.A B.B.B.B
> > ip local pool internet2 C.C.C.C D.D.D.D
> > ip classless
> > ip route 0.0.0.0 0.0.0.0 Y.Y.Y.Y
> > no ip http server
> > !
> > ip radius source-interface FastEthernet0/0
> > radius-server host X.X.X.X auth-port 1645 acct-port 1646
> > radius-server host X.X.X.X auth-port 1645 acct-port 1646
> > radius-server key 7 ZZZZZZZZZZZZZZZ
> >
> > Anthony
> >
> > _______________________________________________
> > cisco-bba mailing list
> > cisco-bba at puck.nether.net
> > https://puck.nether.net/mailman/listinfo/cisco-bba