[c-nsp] CEF Scanner eating CPU in Supervisor 720
Ian Dickinson
iand at eng.pipex.net
Fri Jun 9 16:23:07 EDT 2006
Is there HSRP on any of the interfaces where there were duplicate IPs?
I've seen Sups die a 100% CPU death in this scenario, and not recover when
the duplicate was removed.
Ian
Rodney Dunn wrote:
> Peter,
>
> Someone mentioned to me (and they were watching this alias and
> our thread) that they were seeing a similar issue.
>
> I tried to recreate it in the lab with this:
>
> R1 --
> R2 -- UUT --- R3 -- R4
>
> I sent 1k routes from R2 to UUT as EBGP routes.
> I originated 1k OSPF routes and 1k iBGP routes from R4.
> I had MPLS on between UUT and R3.
>
> I then turned up R1 with a duplicate IP address.
>
> I do see the adj flap as the two devices argue about who should
> own that ip address.
>
> And I see the adj update in 'sh ip cef ev' but I'm not able to
> get the CEF scanner to run high.
>
> On Monday I'll see if I can get a 76xx with SXE5 on it and try.
>
> If you could try without MPLS and/or contact me offline and get
> me remote access, I'll look at it with you if you can recreate it.
>
> In newer code that is coming out the CEF scanner doesn't exist
> anymore. But I'd still like to understand what's happening here
> because even if one adj is flapping it doesn't seem normal that the
> scanner would constantly run like that.
>
> Rodney
>
> On Fri, Jun 09, 2006 at 08:39:21AM -0400, Rodney Dunn wrote:
>
>>On Fri, Jun 09, 2006 at 08:56:47AM +0200, Peter Salanki wrote:
>>
> I think the problem is not at all MPLS related. I did a perl hack that
> parsed the output of sh ip cef event new and added static arp on the
> hosts which flapped rapidly; 7 IPs had abnormal activity. The load is
> now down to a more acceptable level of 20% avg. I could remove the
> statics and disable MPLS on the core facing interfaces just to make
> sure that MPLS has nothing to do with it if you want.
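
A script of the kind described above can be sketched as follows. This is not the original Perl hack, which was not posted to the list; it is a minimal Python illustration that assumes event lines in the format quoted elsewhere in this thread ("+HH:MM:SS.mmm: a.b.c.d/32 ADJ (VlN) update"). The names count_adj_updates and flapping_hosts and the threshold value are hypothetical, chosen only for this example.

```python
import re
from collections import Counter

# Matches lines such as
#   +00:00:00.304: 81.170.149.246/32 ADJ (Vl4001) update
# optionally preceded by mail quote markers (">"). The format is assumed
# from the 'sh ip cef event' output pasted in this thread.
ADJ_RE = re.compile(
    r"^[>\s]*\+[\d:.]+:\s+(\d{1,3}(?:\.\d{1,3}){3})/32\s+ADJ\s+\((\S+)\)\s+update"
)


def count_adj_updates(lines):
    """Count /32 ADJ update events per (ip, interface) pair."""
    counts = Counter()
    for line in lines:
        m = ADJ_RE.match(line)
        if m:
            counts[(m.group(1), m.group(2))] += 1
    return counts


def flapping_hosts(lines, threshold=2):
    """Return (ip, interface, count) for hosts seen at least `threshold` times.

    The threshold is an illustrative assumption; on a busy box it would
    need tuning against what normal ARP refresh activity looks like.
    """
    return [
        (ip, intf, n)
        for (ip, intf), n in count_adj_updates(lines).most_common()
        if n >= threshold
    ]


if __name__ == "__main__":
    # A few lines taken from the event table quoted in this thread.
    sample = [
        "+00:00:00.240: 81.170.148.118/32 ADJ (Vl4001) update",
        "[OK]",
        "+00:00:00.304: 81.170.149.246/32 ADJ (Vl4001) update",
        "[OK]",
        "+00:00:00.736: [Default] 199.3.108.0/24 NBD modified",
        "+00:00:01.008: 81.170.149.246/32 ADJ (Vl4001) update",
    ]
    for ip, intf, n in flapping_hosts(sample):
        print(f"{ip} on {intf}: {n} ADJ updates")
```

Feeding it several captures of the event table taken a few seconds apart would give a rough ranking of which directly connected hosts are re-resolving most often. It only identifies candidates; the static ARP entries themselves still need the correct MAC for each host.
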
>>>
>>>I hate asking for things from production networks but if you could
>>>do that without too much trouble that would be a good data point
>>>to have.
>>>
>>>Also, those macs are they on the downstream interfaces (non mpls
>>>enabled interfaces)?
>>>
>>>Are there any routes resolving through those arp/mac's on those interfaces?
>>>
>>>
>>>How are the arps changing?
>>>
>>>
> Do you have any case about this, and/or any plans of "fixing" it?
>>>
>>>I need to understand a little more about what the root problem is first.
>>>
> I don't like the thought of directly connected kiddies being able to
> drain all cpu on my (imo. not cheap) sup720-3bxl by just stealing each
> other's IP addresses.
>>>
>>>Help me understand a bit more about what is actually going on to trigger
>>>it and we'll see what we can do.
>>>
>>>
>>>I'm not a l2 person but isn't there stuff about securing macs on ports, etc.?
>>>
>>>
> On 9 Jun 2006 at 00:32, Rodney Dunn wrote:
>
>
>>One trick is you can do a 'sh ip cef ev new' and do it over and
>>over. See which ones are flapping.
>
>>How many routes do you have?
>
>>Can you turn off logging to the console: no logg con
>
>>and run a couple of mpls debugs and let's see what that says:
>
>>debug mpls lfib cef
>>debug mpls lfib enc
>
>>Set the log to a couple of meg.
>
>>Rodney
>
>>On Thu, Jun 08, 2006 at 11:13:02PM +0200, Peter Salanki wrote:
>
>>>The CEF Scanner is now eating almost all CPU :/
>>>
>>>The events table doesn't look unusual to me,
>>>--SNAP--
>>>
>>>+00:00:00.000: 81.170.148.226/32 ADJ (Vl4001) update
>>>[OK]
>>>+00:00:00.024: 195.178.160.138/32 ADJ (Vl19) update
>>>[OK]
>>>+00:00:00.052: 81.170.138.13/32 ADJ (Vl604) update
>>>[OK]
>>>+00:00:00.232: 81.170.152.129/32 ADJ (Vl4003) update
>>>[OK]
>>>+00:00:00.240: 81.170.148.118/32 ADJ (Vl4001) update
>>>[OK]
>>>+00:00:00.304: 81.170.149.246/32 ADJ (Vl4001) update
>>>[OK]
>>>+00:00:00.320: 81.170.152.50/32 ADJ (Vl4003) update
>>>[OK]
>>>+00:00:00.380: 81.170.154.117/32 ADJ (Vl4004) update
>>>[OK]
>>>+00:00:00.388: 213.136.56.90/32 ADJ (Vl39) update
>>>[OK]
>>>+00:00:00.400: 81.170.136.79/32 ADJ (Vl504) update
>>>[OK]
>>>+00:00:00.416: 195.178.160.173/32 ADJ (Vl19) update
>>>[OK]
>>>+00:00:00.512: 81.170.164.163/32 ADJ (Vl4009) update
>>>[OK]
>>>+00:00:00.728: 81.170.130.75/32 ADJ (Vl204) update
>>>[OK]
>>>+00:00:00.736: [Default] 199.3.108.0/24 NBD modified
>>>[OK]
>>>+00:00:00.736: [Default] 199.3.109.0/24 NBD modified
>>>[OK]
>>>+00:00:00.820: 195.178.186.24/32 ADJ (Vl666) update
>>>[OK]
>>>+00:00:00.832: 81.170.160.3/32 ADJ (Vl4007) update
>>>[OK]
>>>+00:00:00.868: 81.170.164.33/32 ADJ (Vl4009) update
>>>[OK]
>>>+00:00:00.944: 81.170.132.159/32 ADJ (Vl304) update
>>>[OK]
>>>+00:00:00.952: 81.170.128.77/32 ADJ (Vl104) update
>>>[OK]
>>>+00:00:01.008: 81.170.149.246/32 ADJ (Vl4001) update
>>>[OK]
>>>+00:00:01.128: 194.68.123.141/32 ADJ (Vl15) update
>>>[OK]
>>>--More--
>>>
>>>
>>>On 8 Jun 2006 at 19:40, Rodney Dunn wrote:
>>>
>>>
>>>>Are you running MPLS on the box?
>
>>>>Check the sh ip cef event output and see if you have a /32 ADJ
>>>>for a mac constantly changing. That's the most common trigger
>>>>I've seen for the scanner running high.
>
>>>>You are forcing CEF to constantly reresolve prefixes.
>
>>>>Rodney
>
>>>>On Thu, Jun 08, 2006 at 02:23:22PM +0200, Peter Salanki wrote:
>
>>>>>-----BEGIN PGP SIGNED MESSAGE-----
>>>>>Hash: SHA1
>>>>>
>>>>>Hello,
>>>>>
>>>>>Process "CEF Scanner" is eating an average of 60% of the CPU on one of
>>>>>my Sup720-3BXLs. This leads to SNMP responses being delayed and full BGP
>>>>>updates taking a long time. I have not seen this on any of my other
>>>>>sup720s. What sets this box apart from the rest is that it has a
>>>>>lot of directly connected hosts: ~10 SVIs with 300 hosts each (on /23
>>>>>subnets). I have tried setting arp timeout to 1200 on those SVIs,
>>>>>which resulted in a small CPU utilization decrease. What can I do to
>>>>>calm down the CEF Scanner? I'm running 12.2(18)SXF4.
>>>>>
>>>>>CPU utilization for five seconds: 44%/4%; one minute: 38%; five minutes: 38%
>>>>> PID Runtime(ms)  Invoked   uSecs   5Sec   1Min   5Min TTY Process
>>>>> 119   103495040   719635  143819 35.40% 23.87% 21.54%   0 CEF Scanner
>>>>>
>>>>>Sincerely
>>>>>
>>>>>Peter Salanki
>>>>>Chief Network Engineer
>>>>>Bahnhof AB (AS8473)
>>>>>www.bahnhof.se
>>>>>Office: +46855577132
>>>>>Cell: +46709174932
>>>>>
>>>>>
>>>>>