[c-nsp] Sup720 CPU spikes, an academic question
Alexander Clouter
alex at digriz.org.uk
Tue May 3 17:09:09 EDT 2011
Peter Rathlev <peter at rathlev.dk> wrote:
>
> I know a single 5 second interval of 100% CPU utilization now and then
> is rather irrelevant seen from an operational perspective. That's
> probably even more true when looking at a 600 MHz MIPS on a Sup720. This
> thing has me puzzled though. :-)
>
A burst of SNMPv3 with cryptographic operations can hurt a poor MIPS
chip. We run torrus[1] and it took me a while to realise the obvious:
polling all our kit with 3DES/MD5 was probably a bad idea (it was brutal
enough on the system that was doing the polling), so we went with just
SNMPv2c.
> The following is the output from "show proc cpu" (slightly reformatted)
> from a device that exceeded a 90% warning threshold we've configured.
>
You really want to be looking at the '5min' sorted graph.
> CPU utilization for five seconds: 100%/0%; one min: 10%; five min: 4%
> PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min Process
> 8 870373628 51977035 16745 1.27% 0.59% 0.64% Check heaps
> 487 20306096 67521163 300 0.15% 0.04% 0.04% Port manager per
> 2 9688 5187559 1 0.07% 0.00% 0.00% Load Meter
> 358 18902200 40236967 469 0.07% 0.03% 0.02% CEF: IPv4 proces
> 23 85574908 641372631 133 0.00% 0.12% 0.08% IPC Seat Manager
> 51 111228136 4913752 22636 0.00% 0.07% 0.05% Per-minute Jobs
> 272 28800268 228265577 126 0.00% 0.10% 0.07% IP Input
> 561 55288392 590654988 93 0.00% 0.13% 0.09% ISIS Adj
> 578 16540192 166947095 99 0.00% 0.05% 0.04% HSRP IPv4
>
> I've excluded processes with 0% utilization for all three periods. To me
> the above means that 0% time (?) was spent interrupt switching,
>
...in the previous 5sec interval.
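
A minimal sketch of how that header line reads, assuming the usual IOS
convention that the figure before the slash is total CPU and the figure
after it is time spent at interrupt level. The function name is made up
for illustration; the numbers are the ones quoted above.

```python
import re

HEADER = "CPU utilization for five seconds: 100%/0%; one min: 10%; five min: 4%"

# 5Sec column of every process quoted above (zero-everywhere rows were excluded).
LISTED_5SEC = [1.27, 0.15, 0.07, 0.07, 0.00, 0.00, 0.00, 0.00, 0.00]


def parse_cpu_header(line):
    """Split the 'show proc cpu' header into its components (hypothetical helper)."""
    m = re.search(
        r"five seconds: (\d+)%/(\d+)%; one min: (\d+)%; five min: (\d+)%", line
    )
    if m is None:
        raise ValueError("not a 'show proc cpu' header line")
    total, interrupt, one_min, five_min = map(int, m.groups())
    return {
        "total_5s": total,
        "interrupt_5s": interrupt,
        # Whatever is not interrupt level is process level.
        "process_5s": total - interrupt,
        "one_min": one_min,
        "five_min": five_min,
    }


cpu = parse_cpu_header(HEADER)
listed = round(sum(LISTED_5SEC), 2)
unaccounted = round(cpu["process_5s"] - listed, 2)
print(cpu)
print("listed processes: %.2f%%, unaccounted: %.2f%%" % (listed, unaccounted))
```

The listed processes add up to well under 2% for the 5sec column, so
the per-process table on its own does not explain the 100% in the
header.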
> The spikes do not seem to correlate with a lot of traffic, neither
> traffic for the RP nor traffic generally being forwarded by the box. It
> also does not correlate with IGP or BGP events or anything I'd consider
> relevant. Even the odd loop or ridiculous multicast flooding doesn't tax
> the CPU under normal circumstances.
>
Multicast from a VLAN directly connected to the router, with the TTL of
the packets set to 1, is how you can mount multicast 'attacks' on routers.
Might be something occasionally firing up (Norton Ghost) probing for a
suitable TTL to put in its multicast payload...but this I would expect
to appear in your ring buffer.
> What puzzles me is: What causes the RP to max out at 100% utilization in
> a case like this? Should I just ignore it altogether?
>
The sysadmin in me says look at the *runtime*/*uSecs* columns.
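
A rough sketch of what those columns say, using the rows quoted above.
It assumes uSecs is simply Runtime(ms) * 1000 / Invoked, i.e. the
average CPU time per invocation, which the quoted figures bear out.
The variable names are made up for illustration.

```python
# Rows copied from the quoted "show proc cpu" output.
ROWS = """\
8 870373628 51977035 16745 1.27% 0.59% 0.64% Check heaps
487 20306096 67521163 300 0.15% 0.04% 0.04% Port manager per
2 9688 5187559 1 0.07% 0.00% 0.00% Load Meter
358 18902200 40236967 469 0.07% 0.03% 0.02% CEF: IPv4 proces
23 85574908 641372631 133 0.00% 0.12% 0.08% IPC Seat Manager
51 111228136 4913752 22636 0.00% 0.07% 0.05% Per-minute Jobs
272 28800268 228265577 126 0.00% 0.10% 0.07% IP Input
561 55288392 590654988 93 0.00% 0.13% 0.09% ISIS Adj
578 16540192 166947095 99 0.00% 0.05% 0.04% HSRP IPv4
"""

procs = []
for row in ROWS.splitlines():
    # Seven whitespace-separated fields, then the process name (may contain spaces).
    pid, runtime_ms, invoked, usecs, _5s, _1m, _5m, name = row.split(None, 7)
    runtime_ms, invoked, usecs = int(runtime_ms), int(invoked), int(usecs)
    procs.append({
        "pid": int(pid),
        "name": name,
        "runtime_ms": runtime_ms,
        "reported_usecs": usecs,
        # Average microseconds of CPU per invocation, recomputed from the raw counters.
        "computed_usecs": runtime_ms * 1000 // invoked,
    })

# Processes that hold the CPU longest per invocation come first.
ranked = sorted(procs, key=lambda p: p["computed_usecs"], reverse=True)
for p in ranked:
    print("%-18s %8d uSecs/invocation  %12d ms total"
          % (p["name"], p["computed_usecs"], p["runtime_ms"]))
```

Sorted this way, "Per-minute Jobs" and "Check heaps" stand out at
roughly 22ms and 17ms per invocation, with everything else well under
1ms.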
Good Hunting.
[1] http://torrus.org/
--
Alexander Clouter
.sigmonster says: pain, n.:
One thing, at least it proves that you're alive!