[c-nsp] CEF fun in SXF

Gert Doering gert at greenie.muc.de
Tue Dec 13 04:09:44 EST 2005


Hi,

On Tue, Dec 13, 2005 at 12:44:32AM -0800, Roman Sokolov wrote:
> Monday, December 12, 2005, 11:58:51 PM, you wrote:
> GD> Hmmm, I'm not sure that applies.  This seems to be just "special" stuff,
> GD> not "generic unicast packets travelling through the box".
> urpf (surprise! surprise!)

There is little uRPF on that box - it's a core router with only one
customer-facing link.  So packets that don't arrive on that interface
shouldn't be subject to uRPF checks, no?

> GD> Well, the main problem was a quite unexpected memory leak in the BGP
> GD> process.  The box has more than sufficient memory for the number of routes
> GD> it's carrying (only 170k "Internet" routes, and some 200 VRF routes), 
> GD> but as BGP was leaking like hell, we got caught by surprise.
> 3BXL? And you've lost whole gig??

3B, non-XL, but 512 MB is normally more than enough - after reboot and
BGP convergence, we have 140 MB of free memory.  Over time, this went
down to about 25 MB, which was low enough that we had scheduled a reboot
for the "next maintenance window" - but the remaining memory was
fragmented badly enough that it made CEF seriously unhappy.  Or so.
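
To illustrate why the free-memory total alone is misleading, here is a toy
sketch.  It is not how the IOS allocator actually works; the function name,
the block sizes and the 4 MB request are made-up assumptions, chosen only so
that both free lists add up to the 25 MB mentioned above.

```python
def can_allocate(free_blocks_mb, request_mb):
    """Return True if a single contiguous free block can hold the request."""
    return any(block >= request_mb for block in free_blocks_mb)


# Two hypothetical free lists with the same total (25 MB each).
contiguous = [25.0]        # one large free block
fragmented = [0.5] * 50    # fifty scattered 0.5 MB fragments

# A hypothetical 4 MB contiguous request, e.g. for rebuilding a large table.
request_mb = 4.0

print(sum(contiguous), can_allocate(contiguous, request_mb))
print(sum(fragmented), can_allocate(fragmented, request_mb))
```

Both lists report the same amount of free memory, but only the first one can
satisfy the request, which is the kind of situation that would upset anything
needing large contiguous chunks.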

> GD> Fast reboot isn't available on Sup720 - the box will always do a 
> GD> very-slow-reboot...  especially with the huge SXE and SXF images :(
> Some checks show that the difference in loading time between -wan and
> -lan software with an almost clean config is just about 30 seconds. But
> sometimes unexpected delays can occur.

I haven't timed it recently, but the facts that SP and RP boot the same
image sequentially, and that compact flash access is MUCH slower than
internal flash (which is not big enough to hold SXE/SXF WAN images),
really add up...

> GD> Sure, but that doesn't keep me from trying to improve things :)
> Sometimes it's much better to leave it on its own :)

How much worse can it get...?  A now-current BGP implementation that's worse
(stability-wise) than what was available 5 years ago?  A nice and shiny
hardware router that has more bugs in its distributed FIB tables than 
"ip route-cache distributed" had, 10 years ago?  A router that's shipped
from the factory with insufficient internal flash to hold the image
that was bought with it?

(Well, to be fair, overall the Sup720 is a nice device.  But some parts
of Cisco quality control suck big time)

gert
-- 
USENET is *not* the non-clickable part of WWW!
                                                           //www.muc.de/~gert/
Gert Doering - Munich, Germany                             gert at greenie.muc.de
fax: +49-89-35655025                        gert at net.informatik.tu-muenchen.de

