[c-nsp] Best practice for redundancy

Thu Sep 21 14:04:23 EDT 2006

We also run UDLD, and it has never "done something" for us when the
problem falls outside what UDLD was designed to detect.
We do, however, have a guy here who coded up an application that checks
every interface on all of our devices for CRC errors and runts,
compiles the counts, and sends out an email when a set threshold is
exceeded. In our case we tolerate zero errors on trunks and
a few hundred on access ports (generally duplex mismatches). It's saved
our tails a few times, especially on serial interfaces.

An email looks roughly like this:
Device       : 172.x.x.x 6/33 
Daily Errors : 5255
Weekly Errors: 5308
Unit MAC Addr: N/A
Unit IP  Addr: N/A
Unit NB Name : N/A 
Port Name    : PC Drop D-013 
Inf Index    : 83 
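
For anyone wanting to try something similar, here is a minimal sketch of the threshold check and alert formatting described above. It is not the actual application: the names (PortCounters, THRESHOLDS, exceeds_threshold, format_alert) are made up for illustration, the 300-error access-port limit is just a stand-in for "a few hundred", and the error counts are assumed to have been collected already by whatever polling method you use.

```python
from dataclasses import dataclass

# Assumed limits: zero errors on trunks (from the post); 300 is a
# placeholder for "a few hundred" on access ports.
THRESHOLDS = {"trunk": 0, "access": 300}


@dataclass
class PortCounters:
    """Hypothetical record of already-collected CRC + runt counts."""
    device: str
    port: str
    port_name: str
    if_index: int
    kind: str           # "trunk" or "access"
    daily_errors: int
    weekly_errors: int


def exceeds_threshold(p: PortCounters) -> bool:
    """True when the daily error count is above the limit for this port type."""
    return p.daily_errors > THRESHOLDS[p.kind]


def format_alert(p: PortCounters) -> str:
    """Build an alert body in the same layout as the sample email.

    The "Unit ..." lines from the sample are left out of this sketch.
    """
    return "\n".join([
        f"Device       : {p.device} {p.port}",
        f"Daily Errors : {p.daily_errors}",
        f"Weekly Errors: {p.weekly_errors}",
        f"Port Name    : {p.port_name}",
        f"Inf Index    : {p.if_index}",
    ])


# Values taken from the sample email; treating the port as an access
# port is an assumption based on its name.
sample = PortCounters("172.x.x.x", "6/33", "PC Drop D-013", 83,
                      "access", 5255, 5308)
if exceeds_threshold(sample):
    print(format_alert(sample))
```

Keeping the limits in one table per port type makes it easy to be strict on trunks while staying quiet about the odd duplex mismatch on an access port.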

Michael Balasko
CCSP,CCDA,MCSE,MCNE,SCP
Network Specialist II
City of Henderson
240 Water St. 
Henderson, NV 89015
p. 702-267-4337
f.  702-267-4302

-----Original Message-----
From: cisco-nsp-bounces at puck.nether.net
[mailto:cisco-nsp-bounces at puck.nether.net] On Behalf Of Phil Mayers
Sent: Thursday, September 21, 2006 2:01 AM
To: Asbjorn Hojmark - Lists
Cc: 'cisco-nsp'
Subject: Re: [c-nsp] Best practice for redundancy

Asbjorn Hojmark - Lists wrote:
>> If anyone has experience of UDLD succeeding or failing to detect 
>> errors on links, I'd like to hear about them.
> 
> In my experience it works fine.
> 
> To test it, take two active (up/up) links and swap the two rx or tx 
> fibers at one end. (UDLD also works for defective GBICs, but you may 
> have trouble finding one).

Oh, it certainly works very well for that class of "hard" errors, and is
on by default on all our links that will support it.

I was thinking of more pernicious errors, e.g. high bit-error rates where
the link still passes traffic but much of it ends up discarded. That is
particularly disastrous with the various load-balancing schemes used for
aggregates: approx. 50% of your hosts at the other end of the link end
up with symptoms similar to duplex mismatches! We've seen fibre patches
(particularly single mode) occasionally go bad and exhibit these
symptoms.
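
To illustrate where the "approx. 50%" comes from, here is a small sketch of hash-based load balancing over a two-link aggregate with one bad member. The hash used (XOR of the last address octets, modulo the link count) is an assumption chosen for simplicity, not any particular platform's algorithm, and the addresses and function names are invented for the example.

```python
def pick_link(src_ip: str, dst_ip: str, n_links: int = 2) -> int:
    """Choose an aggregate member from a toy src/dst hash.

    Assumption: XOR of the last octets, modulo the number of links.
    Real hardware uses its own (often configurable) hash inputs.
    """
    src_low = int(src_ip.rsplit(".", 1)[1])
    dst_low = int(dst_ip.rsplit(".", 1)[1])
    return (src_low ^ dst_low) % n_links


def hosts_on_link(hosts, peer, link, n_links=2):
    """Return the hosts whose traffic to `peer` hashes onto `link`."""
    return [h for h in hosts if pick_link(h, peer, n_links) == link]


# 100 made-up hosts talking to one made-up peer across a 2-link aggregate.
hosts = [f"10.0.0.{i}" for i in range(1, 101)]
peer = "10.0.1.1"
bad_link = 1  # pretend member 1 has the high bit-error rate

affected = hosts_on_link(hosts, peer, bad_link)
print(f"{len(affected)} of {len(hosts)} hosts hash onto the bad link")
```

Because the hash is per flow rather than per packet, the same hosts keep landing on the bad member, which is why they see persistent trouble while their neighbours see none.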

Running multiple layers of protocols (e.g. UDLD, BFD, OSPF, BGP) over
such a link is clearly more likely to produce a failure than running one
layer, so arguably running the aggregate as two separate layer 3 links
might detect such faults faster.
Frankly I doubt it's worth the marginal gain, but was wondering if
anyone had such comparative experience.
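
As a back-of-the-envelope sketch of the "more layers, more likely to trip" argument: assume each hello is lost independently with the same probability, and that a protocol declares its neighbour down after a fixed number of consecutive lost hellos. Both assumptions are simplifications, and the loss rate, multipliers and function names below are invented for illustration rather than taken from any protocol's defaults.

```python
from math import prod


def p_declares_down(loss_rate: float, dead_multiplier: int) -> float:
    """Chance that `dead_multiplier` consecutive hellos are all lost,
    assuming independent losses at `loss_rate`."""
    return loss_rate ** dead_multiplier


def p_any_declares_down(loss_rate: float, dead_multipliers) -> float:
    """Chance that at least one of several independent protocols trips
    within one detection window."""
    p_none = prod(1 - p_declares_down(loss_rate, k) for k in dead_multipliers)
    return 1 - p_none


# Hypothetical numbers: 50% loss, every protocol needs 3 missed hellos.
one_layer = p_any_declares_down(0.5, [3])
four_layers = p_any_declares_down(0.5, [3, 3, 3, 3])
print(f"one layer: {one_layer:.3f}, four layers: {four_layers:.3f}")
```

Under these assumptions the extra layers do raise the chance that something notices, but the gain depends heavily on the loss rate and timers involved.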