[nsp] HSRP/6509/CatOS latency and packet loss

tony at tonymucker.com
Fri Oct 31 02:25:38 EST 2003


I have two 6509s (we'll call them switch A and B), each with two SUP1A/MSFC2s
(one active, one standby), running CatOS.  There are three HSRP groups: one for
the internal back-end network, one for the public network (which is behind a
firewall), and one for the firewall inside VLAN, which is used to talk to the
two other 6509s w/FWSMs that do our public-side routing and BGP.  The MSFCs in
switch A and B handle the routing for the internal network.

Today we were getting these switches ready for an upcoming data center move.
The idea was to migrate hosts from the B switch to the A switch, then take down
the B switch to use it for the move.  When we got in this morning, the B switch
was the active router for the internal back-end network and the public network.
The A switch was active for the VLAN that spans the routers to the switches.
One of the first things we did was change the HSRP priority on switch A to
pre-empt B, in anticipation of bringing down B, and to make sure there were no
problems with A.  This was done around 11AM this morning (Thursday).
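
For readers less familiar with why a priority change plus preemption moves the
active role, here is a minimal Python sketch of the HSRP election rule as it is
generally documented: highest priority wins, highest interface IP breaks a tie,
and a sitting active router is only displaced by a higher-priority router that
has preempt configured.  The Router class, the elect_active function, and the
addresses and priority values are all hypothetical and are not taken from the
switches described here.  The sketch also ignores timers, tracking, and hello
exchange.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address
from typing import List, Optional


@dataclass(frozen=True)
class Router:
    name: str
    priority: int          # HSRP priority (default is 100; higher wins)
    ip: IPv4Address        # interface address, used only as a tie-breaker
    preempt: bool = False  # preemption is off unless explicitly configured


def elect_active(current: Optional[Router], candidates: List[Router]) -> Router:
    """Return the router that should hold the active role for one group."""
    if current is None or current not in candidates:
        # No active router yet: highest priority wins, then highest IP.
        return max(candidates, key=lambda r: (r.priority, r.ip))
    # An existing active router keeps the role unless a strictly
    # higher-priority router has preempt enabled (simplified model).
    challengers = [
        r for r in candidates if r.preempt and r.priority > current.priority
    ]
    if challengers:
        return max(challengers, key=lambda r: (r.priority, r.ip))
    return current


if __name__ == "__main__":
    # Hypothetical values: B is active at the default priority, A is raised.
    b = Router("B", 100, IPv4Address("10.0.0.3"))
    a = Router("A", 110, IPv4Address("10.0.0.2"), preempt=True)
    print(elect_active(b, [a, b]).name)
```

In this model, raising A's priority without preempt would leave B active until
B went away, which is why the two settings usually go together.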

We began to migrate hosts after lunch.  There are somewhere on the order of 90
hosts, and we didn't notice a problem until we had around 30 left, which was
around 5pm.  We cleared the CAM tables many times (thinking a problem with our
F5 load balancers had cropped up again), and rebooted the F5s just to be sure.
We ended up finding out that the A switch was at fault.

Any traffic that passed through the A switch was subject to packet loss and
high latency (500-700ms).  This included hosts that were physically on switch A
and in the same VLAN, as well as my laptop, which was plugged directly into a
copper port on switch A and pinging the switch itself.  We remotely logged
into a host on switch B, and pinged another host on switch B with no problems.
Making switch B the active HSRP router was our last resort before failing the
SUP module on switch A over to its standby.
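
To compare a path through switch A against one that stays on switch B in
numbers rather than by eye, ping samples can be summarized as in the rough
Python sketch below.  The summarize_pings function and the sample values are
hypothetical and are not measurements from these switches.  A lost packet is
represented as None.

```python
from statistics import mean
from typing import Dict, List, Optional


def summarize_pings(rtts_ms: List[Optional[float]]) -> Dict[str, Optional[float]]:
    """Summarize ping samples; None marks a lost packet (timeout)."""
    if not rtts_ms:
        raise ValueError("need at least one sample")
    replies = [r for r in rtts_ms if r is not None]
    lost = len(rtts_ms) - len(replies)
    return {
        "sent": len(rtts_ms),
        "lost": lost,
        "loss_pct": 100.0 * lost / len(rtts_ms),
        # Latency figures only make sense if at least one reply came back.
        "avg_ms": mean(replies) if replies else None,
        "max_ms": max(replies) if replies else None,
    }


if __name__ == "__main__":
    # Made-up samples for illustration: one timeout and two slow replies.
    print(summarize_pings([1.0, None, 600.0, 700.0]))
```

Running the same summary against a host on each switch would give a baseline
to hand to TAC along with the description of the symptom.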

I'm leaning towards a bad SUP module on switch A.  Tomorrow I plan on contacting
Cisco, and next time we try anything with these units I'm going to have them on
the call, just in case.  But before then I figured I'd appeal to the real font
of knowledge, this mailing list.

Thanks
Tony
