[c-nsp] Best practice - Core vs Access Router

Wed Feb 10 16:55:13 EST 2010

These are great!  Thanks Leif

On Feb 10, 2010, at 1:03 PM, Leif Sawyer wrote:

> Here's some of my common aliases.   top is the one that you'll probably use
> 
> !# Global Aliases (should work on all platforms)
> !
> alias exec ifsum sho int sum | incl ^\*|Interface|: |------
> 
> alias exec sib show ip interface brief | exclude (down|unass)
> alias exec sid show interface description | exclude (admin|unass)
> 
> alias exec top sho proc cpu sort 5sec | excl 0.00%  0.00%  0.00%
> 
> alias exec ip6 show ipv6
> 
> !# Cisco 3750 series, for qos asic monitoring
> # the next line will wrap, so replace underscores with spaces
> alias_exec_drops_show_platform_port-asic_stats_drop_|_excl_((e|s|:)_0|=|_Que|Statistics|Frames|^$)
> privilege exec level 1 show platform port-asic stats drop
> 
> 
> 
>> -----Original Message-----
>> From: cisco-nsp-bounces at puck.nether.net
>> [mailto:cisco-nsp-bounces at puck.nether.net] On Behalf Of David Prall
>> Sent: Wednesday, February 10, 2010 10:19 AM
>> To: 'Andy B.'; 'Phil Mayers'
>> Cc: 'nsp-cisco'
>> Subject: Re: [c-nsp] Best practice - Core vs Access Router
>> 
>> Andy,
>> By excluding 0.00 you're excluding those that have had 0.00
>> anywhere in the time list. Just use sort and look at the top
>> few, although the result is most likely the same.
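
A rough sketch of David's point, in Python rather than on the router. The sample rows are invented for illustration (they are not from Andy's box), and the function names are hypothetical. It approximates "| ex 0.00" as a regex match against the whole line and compares that with simply sorting on the 5Sec column and taking the top few.

```python
import re

# Hypothetical rows in "show proc cpu sorted" layout; values are invented.
SAMPLE = """\
PID Runtime(ms)   Invoked      uSecs   5Sec   1Min   5Min TTY Process
123   823552748 891845755        923  1.35%  1.32%  1.24%   0 IP Input
 77        1200       300       4000  9.50%  0.00%  0.00%   0 Burst Proc
286    95557652  68837887       1388  0.31%  4.77%  4.27%   0 BGP Router
 12         100        50       2000  0.00%  0.00%  0.00%   0 Idle Proc
"""


def exclude_filter(text, pattern=r"0.00"):
    """Approximate '| ex 0.00': drop every line the pattern matches anywhere."""
    return [ln for ln in text.splitlines() if not re.search(pattern, ln)]


def top_n(text, n=3):
    """Skip the header, sort rows by the 5Sec column, return the n busiest."""
    rows = text.splitlines()[1:]

    def five_sec(ln):
        # Column 5 (index 4) is the 5Sec percentage, e.g. '9.50%'.
        return float(ln.split()[4].rstrip("%"))

    return sorted(rows, key=five_sec, reverse=True)[:n]


if __name__ == "__main__":
    print("\n".join(exclude_filter(SAMPLE)))
    print("---")
    print("\n".join(top_n(SAMPLE)))
```

With these made-up rows the exclude filter hides "Burst Proc", which is the busiest process over the last five seconds but shows 0.00% in the 1Min and 5Min columns, while the sort-and-take-top approach keeps it.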
>> 
>> If you have a number of large Ethernet subnets with few
>> systems on them, then "sh ip arp" will contain a number of
>> incompletes. If an entire subnet is filled with incompletes,
>> then someone is looking for all of your systems, most likely
>> with a ping sweep, and enabling "mls rate-limit unicast cef
>> glean" will be worthwhile. I believe this load shows up under
>> both Adj Manager and ARP Input.
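
One way to check for the pattern David describes is to count Incomplete entries per subnet from saved "sh ip arp" output. This is only a sketch: the sample text is invented, the function name is hypothetical, and grouping by /24 is an assumption (the real subnet sizes would come from your interface config).

```python
import ipaddress
from collections import Counter

# Hypothetical "sh ip arp" output; addresses and MACs are invented.
ARP_SAMPLE = """\
Protocol  Address          Age (min)  Hardware Addr   Type   Interface
Internet  10.1.2.1                -   0011.2233.4455  ARPA   Vlan12
Internet  10.1.2.10               0   Incomplete      ARPA
Internet  10.1.2.11               0   Incomplete      ARPA
Internet  10.1.2.12               0   Incomplete      ARPA
Internet  10.1.3.5               12   0011.2233.4466  ARPA   Vlan13
Internet  10.1.3.9                0   Incomplete      ARPA
"""


def incomplete_by_subnet(text, prefixlen=24):
    """Count Incomplete ARP entries per subnet (assumed /24 by default)."""
    counts = Counter()
    for line in text.splitlines():
        fields = line.split()
        # Data rows start with 'Internet'; the fourth field holds the MAC
        # address or the literal word 'Incomplete'.
        if len(fields) >= 4 and fields[0] == "Internet" and fields[3] == "Incomplete":
            net = ipaddress.ip_network(f"{fields[1]}/{prefixlen}", strict=False)
            counts[str(net)] += 1
    return counts


if __name__ == "__main__":
    for subnet, n in incomplete_by_subnet(ARP_SAMPLE).most_common():
        print(subnet, n)
```

A subnet whose Incomplete count approaches its size would be the candidate for a sweep.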
>> 
>> The other one is if you've run out of TCAM space, because
>> you're over the limits with the number of routes you have.
>> Don't know if you're running an XL or not.
>> 
>> CPU doesn't look out of order currently. Need to capture it
>> ongoing to see what process is pushing it to 24%, and even
>> then it should still be forwarding traffic.
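
For capturing it on an ongoing basis, the summary line of "sh proc cpu" is easy to pull apart from logged output. In "7%/3%" the first figure is total CPU over five seconds and the second is the share spent at interrupt level. A minimal sketch follows; the function and key names are hypothetical, and how you collect the output (script, expect, SNMP) is left open.

```python
import re

# Matches the summary line printed at the top of "show processes cpu".
CPU_LINE = re.compile(
    r"CPU utilization for five seconds: (\d+)%/(\d+)%; "
    r"one minute: (\d+)%; five minutes: (\d+)%"
)


def parse_cpu_summary(line):
    """Return the CPU summary figures as a dict, or None if the line does not match."""
    m = CPU_LINE.search(line)
    if m is None:
        return None
    total, interrupt, one_min, five_min = (int(g) for g in m.groups())
    return {
        "five_sec_total": total,
        "five_sec_interrupt": interrupt,
        # Whatever is not interrupt-level time was spent in processes.
        "five_sec_process": total - interrupt,
        "one_min": one_min,
        "five_min": five_min,
    }


if __name__ == "__main__":
    line = ("CPU utilization for five seconds: 7%/3%; "
            "one minute: 24%; five minutes: 23%")
    print(parse_cpu_summary(line))
```

Logging these figures with a timestamp every few seconds would make it possible to line up a CPU spike with the moment sessions drop.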
>> 
>> You might need to look at the DFCs as well, to see if one is
>> having issues:
>> remote command module X sh proc cpu sort
>> 
>> David
>> 
>> --
>> http://dcp.dcptech.com
>> 
>> 
>>> -----Original Message-----
>>> From: cisco-nsp-bounces at puck.nether.net
>>> [mailto:cisco-nsp-bounces at puck.nether.net] On Behalf Of Andy B.
>>> Sent: Wednesday, February 10, 2010 1:44 PM
>>> To: Phil Mayers
>>> Cc: nsp-cisco
>>> Subject: Re: [c-nsp] Best practice - Core vs Access Router
>>> 
>>> I am currently facing this strange behaviour once again. Nothing
>>> suspicious in terms of CPU:
>>> 
>>> #sh proc cpu sort | ex 0.00
>>> CPU utilization for five seconds: 7%/3%; one minute: 24%; five minutes: 23%
>>> PID Runtime(ms)   Invoked      uSecs   5Sec   1Min   5Min TTY Process
>>> 123   823552748 891845755        923  1.35%  1.32%  1.24%   0 IP Input
>>> 142    42990360 548209142         78  0.63%  0.15%  0.06%   0 IP SNMP
>>> 176    81597832 313530395        260  0.63%  0.20%  0.12%   0 SNMP ENGINE
>>> 286    95557652  68837887       1388  0.31%  4.77%  4.27%   0 BGP Router
>>>  46        8724      6895       1265  0.31%  0.33%  0.24%   2 SSH Process
>>> 169    98755140   5844411      16897  0.31%  0.31%  0.31%   0 Adj Manager
>>>   9    92740444 222352412        417  0.23%  0.40%  0.41%   0 ARP Input
>>> 320    20411156 140247526        145  0.15%  1.64%  1.57%   0 BGP I/O
>>> 180    64470940  51288798       1257  0.15%  0.58%  0.44%   0 CEF process
>>> 167    27190044 390437731         69  0.15%  0.12%  0.10%   0 IPv6 Input
>>> 
>>> #remote command switch sh proc cpu sort | ex 0.00
>>> CPU utilization for five seconds: 10%/0%; one minute: 14%; five minutes: 20%
>>> PID Runtime(ms)   Invoked      uSecs   5Sec   1Min   5Min TTY Process
>>> 102   577414400  14603714      39539  5.19%  2.76%  2.58%   0 Vlan Statistics
>>>  42  11702922242664309865          0  3.91%  3.83%  3.87%   0 slcp process
>>> 257    79620728  46604862       1708  0.23%  1.31%  0.92%   0 CEF process
>>> 152    24224440  35123075        689  0.15%  0.08%  0.07%   0 CEF LC Stats
>>>  33    29231032 224654615        130  0.15%  0.08%  0.07%   0 SCP Download Lis
>>> 131    39865856   1338254      29789  0.07%  0.08%  0.11%   0 TCAM Manager pro
>>> 127    37865260 135955648        278  0.07%  0.07%  0.07%   0 Spanning Tree
>>> 187    12366092   3103775       3984  0.07%  0.04%  0.05%   0 v6fib stat colle
>>> 239    11888108   8600338       1382  0.07%  0.04%  0.03%   0 LTL MGR cc
>>> 
>>> Packet loss to the router (nothing behind it) is around 25%.
>>> And still losing random BGP and OSPF sessions. SNMP graphs are not
>>> being generated either.
>>> 
>>> Currently feeling quite desperate, because I have no clue where to
>>> look next...
>>> 
>>> Andy
>>> 
>>> On Tue, Feb 9, 2010 at 6:56 PM, Phil Mayers <p.mayers at imperial.ac.uk> wrote:
>>>> On 09/02/10 17:39, Church, Charles wrote:
>>>>> 
>>>>> I was going by the 'show proc cpu hist' he gave for both the SP and RP.
>>>>> Both looked pretty bad across the board.
>>>> 
>>>> His graphs don't look that dissimilar to mine, and we have no such
>>>> problems. The peak/avg CPU don't look so unreasonable to me given the
>>>> load and setup he's described.
>>>> 
>>>> To summarise in this thread, it has been suggested:
>>>> 
>>>> 1. Netflow is the problem - to which the OP said he's already tried
>>>> disabling it
>>>> 
>>>> 2. CPU punts, specifically gleans, are the problem - in which case CoPP or
>>>> MLS rate limiters can be tried, but the OP really IMHO needs to confirm
>>>> this with a span of the CPU
>>>> 
>>>> 3. The 6500 is just no good, buy a Juniper or ASR1k (!) - which I strongly
>>>> dispute. It may be awkward and have odd limits, but it OUGHT TO HANDLE the
>>>> load we've been told about; therefore something is wrong
>>>> 
>>>> ...and lots more besides. I'm exhausted from following the thread, but my
>>>> advice to the OP is to determine what is hitting the CPU *during an
>>>> outage*, then proceed from there.
>>>> 
>>>> I'm going to stop reading now.
>>>> _______________________________________________
>>>> cisco-nsp mailing list  cisco-nsp at puck.nether.net
>>>> https://puck.nether.net/mailman/listinfo/cisco-nsp
>>>> archive at http://puck.nether.net/pipermail/cisco-nsp/