[nsp] Stable 6500 hybrid code?

Nicola Foggi nfoggi@depaul.edu
Thu, 21 Nov 2002 15:46:39 -0600


Stable code for the 6500?

We haven't seen any yet :)

12.1(8b)E9 had numerous bugs that we ran into with IPX (yeah, we're
trying to get rid of it from our network) and Multicast Boundary access
lists... they supposedly fixed it in 11b something (sorry, I don't
remember exactly, this has been an ongoing saga) but then 11b caused one
of our MSFC2's to crash every time it got some multicast join message
(TAC wouldn't tell us any more than that) so they gave us engineering
code to temporarily fix it... which it has...

They supposedly have put all the fixes in 12.1(13)E1 but as we saw, it
was quickly deferred only a couple of days after release (I actually did
get to download it before they deferred it!) so now we're waiting to
hear if 12.1(13)E2 is stable for the MSFC2...

We are running 6.3(5) on our Sup2's but will be upgrading to 6.3(8)...

If anyone hears that they actually release stable code one of these
days, I hope they post it to the list... so far their track record with
us has been horrible at best... with every code release they put out to
fix one of our bugs, they manage to create at least 2 others...

Nicola

>>> Steve Francis <steve@expertcity.com> 11/20/02 10:47PM >>>
What are the current recommendations anyone has for stable 6500 code, 
for hybrid mode SupII/MSFC2?

(Fairly vanilla BGP, OSPF, HSRP, with some PBR)

We have been running 6.3(6) CatOS,  12.1(8b)E9 IOS.

However, this morning we got inconsistency on the CEF tables in the
switch and the router.  At first it looked like an RPF error (the switch
would inconsistently drop packets only if the source address was routed
out one particular peering). Yet RPF counters did not increment.

To avoid that, we reloaded the router, then basically nothing worked,
and we had to admin down almost all interfaces to get a working network.
(While you could ping an interface of the router via a router on a local
subnet, and things like the loopback of the router were being advertised
in OSPF, you could not ping the loopback from even an adjacent, shared
interface router.)  An ACL with the log keyword made individual IPs
work, forcing CPU switching.

At this point the TAC engineer on the router tried "no mls ip unicast",
which caused the whole switch to crash with TLB Exception. (And even
more fun - not respond to the console except with garbled hex. Needed a
power cycle.)

I cannot find any bugs matching what we experienced, so I can't see what
versions fix them.

Most importantly, anyone have recommendations for stable CatOS and IOS?

Anyone recognize the above bugs?

Anyone have any idea how to make a 6500 run again if it crashes, and 
outputs this:
TLB Exception (load/instruction fetch) occurred.
Software version =  6.3(6)
Process ID #1b, Name = Fib
EPC: 809EFC54
{stack trace}
GDB: TLB Exception (load/instruction fetch)
GDB: The system has trapped into the debugger.
GDB: It will hang until examined with gdb.
Please use normal gdb. special gdb will not work on this apollo+ board
||||$S10#b4

Getting remote staff to power cycle remote core switches (which,
incidentally, failed in such an interesting way that I could still talk
to some nodes attached to it, but it seemed to take out most nodes on
its functionally paired switch) was not the quickest way to restore
service.

Thx
