[c-nsp] BFD for monitoring? (was BFD expectations)

Thu Sep 23 08:29:54 EDT 2010

> Message: 1
> Date: Wed, 22 Sep 2010 19:19:57 -0400
> From: Chris Evans <chrisccnpspam2 at gmail.com>
> To: Phil Mayers <p.mayers at imperial.ac.uk>
> Cc: cisco-nsp at puck.nether.net
> Subject: Re: [c-nsp] BFD expectations
> Message-ID:
> 	<AANLkTim5OLaoyYnscw1ErSdTQ9Tg71ZRRKuOZu6Vv1Ry at mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
> 
> Phil you bring up a great point. Until SXI, BFD code was crap on the 6500.
> We have done extensive testing at the ECATS lab. We concluded that 450ms is
> a good number on this platform with its centralized architecture. We tested
> this with approx 35 peers and had no issues under heavy CPU load.

This might seem a little silly, but would it be reasonable to use BFD,
say in conjunction with EEM, as a form of link-monitoring mechanism? 

I have 6500s which only have a handful of links, so presumably I could
push the timer down to, say, 200-300ms. I've been looking for a cheapish
way to do link state monitoring (I need to know when there's a blip, even
a very momentary one). Somewhere down the road I'd like to put in
Accedian boxes and really get the big picture. On a smaller scale I'm
considering nuttcp between boxes at each node to push streams around and
look for retransmits, but BFD could work too. I don't want to actually act
on anything: that requires human intervention, and too often it is just a
subsecond blip for which a down-and-reconverge is inappropriate. But if
I know it happened, that information can be passed up to the app team
and they can say "oh ok" and not do a ton of digging.
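
To make the "just tell me it happened" idea concrete, here is a rough
sketch of the reporting side only. It assumes the BFD up/down events
have already been collected somewhere (by EEM, syslog, or anything else)
as one line per event in a made-up format of
"<ISO timestamp> <neighbor> <UP|DOWN>". That format, the function name
find_blips, and the sample addresses are all hypothetical; this is not
real IOS log output.

```python
from datetime import datetime


def find_blips(lines):
    """Pair each DOWN event with the next UP for the same neighbor."""
    down_since = {}  # neighbor -> time of the still-unmatched DOWN event
    blips = []       # (neighbor, down_time, duration_in_seconds)
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # not "<timestamp> <neighbor> <state>", skip it
        stamp, neighbor, state = parts
        when = datetime.fromisoformat(stamp)
        if state == "DOWN":
            # Keep the earliest DOWN if several arrive before an UP.
            down_since.setdefault(neighbor, when)
        elif state == "UP" and neighbor in down_since:
            start = down_since.pop(neighbor)
            blips.append((neighbor, start, (when - start).total_seconds()))
    return blips


# Hypothetical sample events; the last neighbor never comes back up,
# so it is not reported as a completed blip.
sample = [
    "2010-09-23T08:29:54.120 10.0.0.2 DOWN",
    "2010-09-23T08:29:54.870 10.0.0.2 UP",
    "2010-09-23T09:10:00.000 10.0.0.6 DOWN",
]

for neighbor, start, seconds in find_blips(sample):
    print(f"{neighbor} blipped at {start.isoformat()} for {seconds:.3f}s")
```

Nothing here acts on the network; it only turns event pairs into a
list of "neighbor X was down for N seconds at time T" that could be
handed to the app team.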

Is this an unreasonable approach? All the boxes are Sup720-3Bs with DFCs,
running SXH7, and we're talking about gig metro-E links, mostly
dedicated-path but a few MPLS/VPLS pseudowires.

Thanks,
-bacon