[nsp] Counting bps...

Gert Doering gert at greenie.muc.de
Mon Mar 17 15:19:53 EST 2003


Hi,

On Mon, Mar 17, 2003 at 02:20:04PM +0100, Iva Cabric wrote:
> - bps calculated by router on load-interval (these should correspond
>   with "show interface" statistics)

They don't.  We have seen up to 30% difference (both in individual 
samples, and also in "take a sample every 5 minutes and average over the
day").

[..]
> I have done some measurements, with above counters (load-interval was
> set to 30 seconds, values were collected every 60 seconds), and results
> were almost identical. Differences between different counter type were
> less than 0.3% (sums of all bps values collected in 20 hours).

Lucky you :-) - various IOS releases have very interesting misbehaviour
in various parts of the whole counter mess.

One of the most spectacular ones is in early 12.0S, where you can
have "5 minute output" values well over a Gbit/s on a FastEthernet
interface.

[..]
> Does anyone have other (or more) experience with bps statistics gathering
> and would like to share information about it and ways of doing it?

We gather both 5 minute average and byte counters, both from "show int"
and from SNMP, and compare all 4 values against each other.  
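
A rough sketch of that cross-check, in Python.  How the four values are
collected is left out; the source names, the 5% tolerance and the sample
numbers below are made up for illustration, not what we actually run.

```python
from itertools import combinations

def disagreements(readings, tolerance=0.05):
    """Return pairs of sources whose bps values differ by more than
    `tolerance`, measured relative to the larger of the two values."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(sorted(readings.items()), 2):
        larger = max(a, b)
        if larger == 0:
            continue  # both sources report zero, nothing to compare
        rel = abs(a - b) / larger
        if rel > tolerance:
            flagged.append((name_a, name_b, rel))
    return flagged

# Hypothetical readings for one interface, all in bits per second.
sample = {
    "cli_rate": 41_000_000,    # "5 minute ... rate" from "show int"
    "cli_bytes": 30_500_000,   # derived from "show int" byte counters
    "snmp_rate": 40_200_000,   # rate value fetched via SNMP
    "snmp_bytes": 30_900_000,  # derived from SNMP byte counters
}

for name_a, name_b, rel in disagreements(sample):
    print(f"{name_a} vs {name_b}: {rel:.1%} apart")
```

In this made-up sample the two byte-counter values agree with each other
and the two rate values agree with each other, but every rate/counter
pair gets flagged.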

Most of the time, byte counters from "show int" and SNMP agree pretty
well, except for a couple of problem areas:

 - "show int" counters are only 32 bit wide

 - on some platforms and interfaces, querying the 64 bit counters always 
   returns "0" or (much worse) some leftover value from another interface

 - "show int" counters occasionally get stuck (fairly reproducible, but
   seems to be fixed in 12.0(21)S5 and 12.0(23))

 - SNMP counters get stuck (new issue, as far as I can see, in 12.0(high)S)

 - "show int" counters occasionally hiccup and output a value that is
   lower than on the previous query, or much higher than it should be.  
   The next query is back to normal.  Does wonders to averaging.

 - sometimes, "show int" counters need a "show int ... accounting" call
   to be updated - otherwise they just stop moving
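
One way to keep wraparounds and one-off hiccups out of the averages
could look like the Python sketch below.  It assumes byte counters
polled as (timestamp, value) pairs and a known line rate; the function
names and the "drop anything above line rate" rule are illustrative
choices, not a description of any particular tool.

```python
def rate_bps(prev, curr, seconds, line_rate_bps, bits=32):
    """Bits per second between two byte-counter readings, or None if
    the result is implausible (faster than the interface can go)."""
    if seconds <= 0:
        return None
    # Modular difference: a decrease is treated as a single wraparound.
    delta = (curr - prev) % (2 ** bits)
    bps = delta * 8 / seconds
    # A counter that briefly jumped backwards or far ahead looks like a
    # huge delta here, so it ends up above line rate and gets rejected.
    if bps > line_rate_bps:
        return None
    return bps

def rates(samples, line_rate_bps, bits=32):
    """samples: list of (timestamp_seconds, byte_counter) tuples.
    Returns a list of (timestamp, bps), skipping implausible samples."""
    results = []
    last_t, last_c = samples[0]
    for t, c in samples[1:]:
        bps = rate_bps(last_c, c, t - last_t, line_rate_bps, bits)
        if bps is None:
            continue  # keep the old baseline; the next good sample spans the gap
        results.append((t, bps))
        last_t, last_c = t, c
    return results

# Made-up samples: the third reading is a hiccup (lower than the previous one).
polls = [(0, 1_000), (60, 751_000), (120, 500), (180, 2_251_000)]
print(rates(polls, line_rate_bps=100_000_000))
```

This does not detect stuck counters (a delta of zero looks the same as
an idle interface), and it only works if the polling interval stays
below the time the counter needs to wrap at line rate.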


The "5 minute ... rate" values are nice to look at, but have been very
unreliable in the past.  11.2P sometimes doubled the output rate if 
route-cache thrashing occurred, 12.0(x)S sometimes injects multi-Gbit spikes
into the average calculation upon 32-bit wraparound.  SNMP and "show int"
values disagree.  And so on.
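
For a feeling of how often a 32-bit byte counter wraps, the arithmetic
is simple enough to sketch in Python (the function name is made up, and
a fully loaded link is assumed):

```python
def wrap_seconds(rate_bps, bits=32):
    """Seconds until a byte counter of the given width wraps around
    when the interface carries a constant rate_bps bits per second."""
    return (2 ** bits) * 8 / rate_bps

for label, rate in (("FastEthernet", 100_000_000), ("GigE", 1_000_000_000)):
    print(f"{label}: wraps after about {wrap_seconds(rate):.0f} s at line rate")
```

So a busy FastEthernet can wrap a 32-bit byte counter in under six
minutes, and a busy gigabit link in roughly half a minute, which is
shorter than most polling intervals.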

gert

-- 
USENET is *not* the non-clickable part of WWW!
                                                           //www.muc.de/~gert/
Gert Doering - Munich, Germany                             gert at greenie.muc.de
fax: +49-89-35655025                        gert.doering at physik.tu-muenchen.de

