[c-nsp] c6k msfc - hsrp flapping

Michael Davis michael at michael-davis.com
Thu May 18 16:34:02 EDT 2006


Hi Lee,
 
CPU at 49% seems significantly high for a distributed system.  Most of the packet forwarding ought to be happening in hardware and shouldn't be affecting that number.  I've seen a case where 15 Mbps of unfragmented traffic (which needed fragmentation to egress the switch) took a 7600's CPU utilization from 0-1% to 40%.
 
The fact that IPC counters are incrementing suggests that the Cat's line cards are having trouble talking to the MSFC, or vice versa.
 
EIGRP and HSRP both generate multicast traffic.  Multicast traffic has to be process switched, as do the HSRP and EIGRP control-plane functions themselves, all of which is certainly contributing to your CPU levels.  High CPU levels cause IPC errors, and IPC errors can result in HSRP, EIGRP or other control-plane failures, input queue drops, etc., because the line card is left waiting for the CPU to tell it how to switch the packet.
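If it helps, a quick check (just a sketch; exact output and process names vary by IOS release) is whether the load is process-driven rather than interrupt-driven, and whether the ibc errors keep moving while the CPU is busy:

  show processes cpu
  show ibc

The "CPU utilization for five seconds: X%/Y%" line is the one to watch - the figure after the slash is interrupt-level (fast-path) load, so a big gap between X and Y means the time is going to processes such as IP Input, HSRP or EIGRP rather than to forwarding.  Running 'show ibc' a couple of times a minute apart and comparing the error counters tells you whether they only increment during the flaps.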
 
Long and short, if you have a pair of 6500/7600s aggregating many VLANs, with HSRP and EIGRP adjacencies between all of them, you may be asking for trouble.  For starters, and for some immediate relief, drastically limit (via passive-interface) the EIGRP adjacencies to only those that are actually required.
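Something along these lines, purely as a sketch (the AS number matches the "IP-EIGRP 1" in your logs, but the VLAN interface names are placeholders):

  router eigrp 1
   passive-interface default
   no passive-interface Vlan10
   no passive-interface Vlan20

'passive-interface default' stops EIGRP hellos (and therefore adjacencies) on every interface while still advertising the connected networks; you then re-enable hellos only where you actually want neighbors, e.g. the links between the two cores.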
 
High CPU:
http://www.cisco.com/en/US/partner/products/hw/routers/ps359/products_tech_note09186a00801c2af0.shtml#causes
 
Campus design:
 
http://www.cisco.com/application/vnd.ms-powerpoint/en/us/guest/netsol/ns432/c649/cdccont_0900aecd802e9b1a.ppt
 
HTH,
 
Mike

________________________________

From: cisco-nsp-bounces at puck.nether.net on behalf of lee.e.rian at census.gov
Sent: Wed 5/17/2006 11:15 AM
To: Michael K. Smith
Cc: cisco-nsp at puck.nether.net
Subject: Re: [c-nsp] c6k msfc - hsrp flapping



Hi Mike

"Michael K. Smith" <mksmith at adhost.com> wrote on 05/17/2006 10:12:27 AM:

> Hello Lee
>
> On 5/17/06 4:01 AM, "lee.e.rian at census.gov" <lee.e.rian at census.gov> wrote:
>
> > We're having a problem on only one cat6000 MSFC where all the standby HSRP
> > interfaces go active and routing adjacencies drop.  It's like the packets
> > are being dropped somewhere inside the switch before they get to the MSFC.
> >
> > The supervisor isn't complaining about any problems like links flapping or
> > spanning tree changes and none of the regular error counters on the
> > supervisor or MSFC look all that bad - but I did see some of the show ibc
> > counters incrementing.
> >
> <snip>
>
> Here is an excellent thread on the ins and outs of HSRP timers.  I would
> take a look at your timers, particularly in relationship to your Spanning
> Tree timers.

I appreciate the reply, but if it was a problem with the HSRP timers or
spanning tree it seems like we'd be seeing problems on both of the core
6500s.

Neither of the core switches is having link flaps, topology changes, etc.
It's just the one MSFC that has all of its standby HSRP interfaces go
active, and then less than a minute later they're all back in standby mode.
The only problems logged by msfc1 are about losing the adjacency w/ msfc2
  %DUAL-5-NBRCHANGE: IP-EIGRP 1: Neighbor xxx is down: peer restarted
  %DUAL-5-NBRCHANGE: IP-EIGRP 1: Neighbor xxx is up: new adjacency

I've been in msfc2 when it started logging all of the hsrp state changes &
done a 'sh proc cpu'.  The highest cpu utilization I've seen was 49% so it
doesn't look like it's a cpu issue.  Input queue drops aren't that bad...
the worst case is a bit over 2000 drops in 23 hours.

The fiber links to the access layer switches come into two different cards
so I don't think it's an issue with a line card.

TAC hasn't been able to figure out the problem.  They haven't explained the
ibc counters either so I was hoping someone on the list knew what they
meant.

Thanks,
Lee

