[j-nsp] iBGP convergence time

Mon Feb 19 10:27:40 EST 2007

We are seeing the same on M10i's with RE-850 after upgrading to 7.6R2.6, and also on M7i's (same software and RE).

A fix would be nice indeed.

cheers

On Mon, Feb 19, 2007 at 02:25:55PM +0000, Ian MacKinnon wrote:
> Well, we are running M40 and seeing very slow BGP convergence times, which
> this thread seems to describe well.
> 
> We first noticed this after we upgraded to 8.0
> 
> A fix would be nice :-)
> 
> 
> 
> Richard A Steenbergen wrote:
> > On Fri, Feb 02, 2007 at 04:13:27AM -0500, Richard A Steenbergen wrote:
> >> You know I haven't had any free time to run real tests or anything, but I 
> >> noticed a significant increase in BGP convergence time when upgrading from 
> >> JUNOS 7.2/7.3 (and I think some 7.4) to 7.6. When you make a policy 
> >> change, the routes take several minutes (from 2 to 7) to install. If you 
> >> do a show route you can see the new routes sitting in + and the old routes 
> >> sitting in - for minutes, RPD is spinning its wheels at 100% CPU, and the 
> >> packets continue to forward over the old path while it is processing.
> > 
> > Ok so, after about a dozen people contacted me privately to confirm that 
> > they were seeing similar issues that hadn't been fully acknowledged, I ran 
> > off and did a little testing to replicate the issue. The one thing I can 
> > definitely confirm right now is that it only appears to affect M160 (or at 
> > least, not M5-M40).
> > 
> > On an RE-2.0 on a single switch board platform, performance is about what 
> > you would expect from a 7+ year old routing engine running modern code on 
> > a modern routing table. It syncs a full table inbound (from empty) on an 
> > otherwise unused RE-2.0 in just under 7 minutes, and it processes a switch 
> > of best path and installed routes on a full table in just barely under 2 
> > minutes (policy-statement with "then local-preference" only, no other 
> > processing). However, on an M160 the switch of best path, which leads to 
> > installing new routes in the HW, takes between 8 and 15 minutes in my tests. 
> > I haven't yet had time to go through every version one at a time to find 
> > exactly where this starts, but the behavior is definitely evident.
> > 
> > The following is based on absolutely nothing except my random guess as to 
> > what is happening, so someone please let me know if I'm warm or cold. It 
> > seems that the easiest way to replicate the state of new route showing up 
> > as "+" and the old route showing up as "-" is to intentionally break 
> > connectivity between the RE and PFE (easy with an m40 :P). My guess is 
> > that this is a kind of transactional FIB installation system, where the RE 
> > doesn't update its RIB to reflect that the new route has been installed 
> > until the switch board processes it and confirms it (and allowing it to 
> > retry the install if necessary), to prevent Customer Enragement Feature 
> > with Juniper's move towards distributed forwarding tables on the Gibson 
> > architecture. Whatever is going on with the M160 on RE-2.0 however, it is 
> > significantly slower. Maybe this just wasn't sufficiently regression 
> > tested on the M160 platform, or maybe it is just a natural effect of 
> > having 4 switch boards which all need to be updated, but it is very 
> > noticeable. The official Juniper line seems to be "just upgrade your REs", 
> > but it would be nice if we had an alternate option.
> > 
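The guess above (a transactional FIB install, where the RE only marks a route as fully active once every switch board has confirmed it) can be sketched roughly as below. This is a toy model of the poster's speculation, not a description of how RPD actually works. The class name, the board names (sfm0 to sfm3) and the ack mechanism are all hypothetical; the "+", "-" and "*" markers are used in the sense of the show route legend (active, last active, both).

```python
from dataclasses import dataclass

ACTIVE = "+"        # new active route, not yet confirmed by all boards
LAST_ACTIVE = "-"   # previously active route, still used for forwarding
BOTH = "*"          # active in the RIB and confirmed in forwarding


@dataclass
class Route:
    prefix: str
    next_hop: str
    state: str


class TransactionalRib:
    """Toy RIB (hypothetical) that commits a change only after every board acks it."""

    def __init__(self, boards):
        self.boards = list(boards)
        self.routes = {}        # prefix -> list of Route
        self.pending_acks = {}  # prefix -> set of boards that have not acked yet

    def install(self, prefix, next_hop):
        # Drop any earlier unconfirmed candidate, demote the confirmed route to "-".
        kept = [r for r in self.routes.get(prefix, []) if r.state != ACTIVE]
        for route in kept:
            if route.state == BOTH:
                route.state = LAST_ACTIVE
        kept.append(Route(prefix, next_hop, ACTIVE))
        self.routes[prefix] = kept
        self.pending_acks[prefix] = set(self.boards)

    def ack(self, prefix, board):
        waiting = self.pending_acks.get(prefix)
        if waiting is None:
            return
        waiting.discard(board)
        if not waiting:
            # Every board confirmed: the new route becomes "*", the old one is dropped.
            self.routes[prefix] = [
                Route(r.prefix, r.next_hop, BOTH)
                for r in self.routes[prefix]
                if r.state == ACTIVE
            ]
            del self.pending_acks[prefix]

    def show(self, prefix):
        return [f"{r.state} {r.prefix} via {r.next_hop}" for r in self.routes.get(prefix, [])]
```

In this model the route sits in "+"/"-" until the slowest board answers, so four boards instead of one means four acknowledgements to wait for per change. Whether that is what really happens on the M160 is exactly the open question in the post.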
> > So, two things. The most obvious question is, is there a way to turn this 
> > behavior off or revert it to the previous behavior (if in fact my guess as 
> > to the cause is correct :P)? The next question is, I noticed in the 
> > release notes for 8.2 that there is a new option to support indirect 
> > next-hops which may significantly reduce the number of FIB updates. My 
> > take on this feature is that you are changing from installing "BGP route -> 
> > physical next-hop" to "BGP route -> BGP next-hop" plus "BGP next-hop -> 
> > physical next-hop", so that when you make a routing change to the BGP 
> > next-hop you only have to update the 1 entry instead of the potentially 
> > thousands of entries for the BGP route itself (which I kinda thought 
> > Juniper had done since forever :P). Am I correct in that interpretation, 
> > or is there something else going on there?
> > 
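To make the indirect next-hop interpretation above concrete, here is a minimal sketch comparing the two table layouts and counting how many entries have to be rewritten when the path to a BGP next-hop changes. It only illustrates the idea as described in the post; the class names, prefixes, addresses and interface names are made up for the example and say nothing about the real JUNOS data structures.

```python
class DirectFib:
    """Each prefix stores its resolved physical next-hop directly."""

    def __init__(self):
        self.entries = {}  # prefix -> (bgp_next_hop, physical_next_hop)

    def install(self, prefix, bgp_next_hop, physical_next_hop):
        self.entries[prefix] = (bgp_next_hop, physical_next_hop)

    def reresolve(self, bgp_next_hop, new_physical):
        """Point a BGP next-hop at a new physical next-hop; return entries rewritten."""
        writes = 0
        for prefix, (nh, _old) in list(self.entries.items()):
            if nh == bgp_next_hop:
                self.entries[prefix] = (nh, new_physical)
                writes += 1
        return writes

    def lookup(self, prefix):
        return self.entries[prefix][1]


class IndirectFib:
    """Prefixes point at a shared BGP next-hop entry, which points at the physical one."""

    def __init__(self):
        self.routes = {}    # prefix -> bgp_next_hop
        self.indirect = {}  # bgp_next_hop -> physical_next_hop

    def install(self, prefix, bgp_next_hop, physical_next_hop):
        self.routes[prefix] = bgp_next_hop
        self.indirect[bgp_next_hop] = physical_next_hop

    def reresolve(self, bgp_next_hop, new_physical):
        """Only the single shared entry changes, however many prefixes use it."""
        self.indirect[bgp_next_hop] = new_physical
        return 1

    def lookup(self, prefix):
        return self.indirect[self.routes[prefix]]


def compare(num_prefixes=1000):
    """Install the same routes in both layouts and count writes for one change."""
    direct, indirect = DirectFib(), IndirectFib()
    for i in range(num_prefixes):
        prefix = f"10.{i // 256}.{i % 256}.0/24"  # made-up prefixes
        direct.install(prefix, "192.0.2.1", "so-0/0/0.0")
        indirect.install(prefix, "192.0.2.1", "so-0/0/0.0")
    return (
        direct.reresolve("192.0.2.1", "so-1/0/0.0"),
        indirect.reresolve("192.0.2.1", "so-1/0/0.0"),
    )
```

With the defaults, compare() rewrites one entry per prefix in the direct layout and a single entry in the indirect layout, which is the saving the post is asking about. Note that this only helps when the BGP next-hop's resolution changes; a best-path change that moves prefixes to a different BGP next-hop still touches every affected prefix in either layout.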

-- 
andy    andy at shady.org
-----------------------------------------------
Never argue with an idiot. They drag you down 
to their level, then beat you with experience.
-----------------------------------------------