[j-nsp] 8.2R2.4 -> 8.4R2.4 route installation delay

Ian MacKinnon ian.mackinnon at lumison.net
Thu Dec 13 04:16:02 EST 2007


We are seeing something very similar with JUNOS 8.0R2 on an M40.


Richard A Steenbergen wrote:
> On Wed, Dec 12, 2007 at 08:50:37PM -0600, Kevin Day wrote:
>> Tonight we upgraded from 8.2R2.4 to 8.4R2.4 on a production M20.  
>> Everything seemed to go well except for one problem that I'm not sure  
>> I can explain. I did a full reboot of the router after the upgrade.  
>> BGP sessions started coming up fine, but the router was sending  
>> "network unreachable" messages for routes that "show route" was  
>> displaying. Doing a "show route extensive" showed that many routes  
>> were in the state "<Record Pending>". The header of "show bgp sum"  
>> said that there were 150,000+ routes stuck in the "pending" column.
> 
> Hehe, welcome to my hell. I've been dealing with this issue across many 
> platforms (ranging from old M160s w/RE-2.0 to new MX960s w/2GHz REs) for 
> quite some time now. I actually posted about this behavior on this list a 
> while back even, at the time I thought it was strictly an RE-2.0 issue, 
> but tests on newer platforms and REs seem to indicate that it isn't. It 
> definitely started somewhere in the mid 7.x's at any rate, never saw this 
> issue in earlier code.
> 
> It looks like the actual issue is with the installation of the routes to 
> the PFE. BGP has no problem selecting the new paths quickly, but something 
> causes it to block the installation of the new paths to hardware (for 
> anywhere from a few minutes to MANY minutes) until eventually it seems to 
> go pop and install all the pending updates. If you look at a specific 
> route (show route) when this is happening, you'll see a + entry for the 
> newly selected path, and a - entry on the old path it's trying to remove. 
> As long as it's in this state, the hardware is still forwarding on the old 
> path (which is a really bad thing if that old path is now down, the router 
> WILL sit there and blackhole your bits for extended periods of time).
> 
> I've been beating my head (and Juniper's :P) on this one for well over a 
> year now, and despite a few attempted fixes so far nobody seems to have a 
> clue what the real issue is. I know I'm not the only one seeing this, I've 
> had a dozen other people tell me privately about seeing the exact same 
> behavior, but I can't seem to find anything which we do or don't have in 
> common that would be causing it. It's also difficult to reproduce on 
> demand, sometimes there is no issue at all, 5 minutes later you can do the 
> same routing change and see major impact on a few or a large number of 
> routes. I've seen it block on installation of anything from a full table 
> after a major policy change, reboot, RE swap, etc, to 50 routes blocking 
> for 10 minutes after clearing a small bgp session on an otherwise unloaded 
> router. I've even seen it happen when I added a static route. :)
> 
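
For anyone who wants to put a number on how long routes stay stuck, here is a
rough sketch in Python. It assumes you have saved the text of "show route
extensive" at intervals yourself (how you collect it is up to you), and it
relies only on the literal "Record Pending" marker mentioned above, not on any
other detail of the output layout. The function names and the sample text are
made up for illustration.

```python
# Marker string quoted earlier in this thread. Only the literal text is
# assumed; nothing else about the CLI output layout is relied upon.
PENDING_MARKER = "Record Pending"


def count_pending(output):
    """Return how many lines of captured CLI text contain the marker."""
    return sum(1 for line in output.splitlines() if PENDING_MARKER in line)


def first_clear_time(samples):
    """Given (seconds_since_start, captured_text) pairs in time order,
    return the first timestamp at which no line is pending, else None."""
    for seconds, text in samples:
        if count_pending(text) == 0:
            return seconds
    return None


if __name__ == "__main__":
    # Hypothetical captures, invented for illustration; not real router output.
    samples = [
        (0, "route A\n  State: <Record Pending>\nroute B\n  State: <Record Pending>\n"),
        (60, "route A\n  State: <Record Pending>\nroute B\n  State: <Active>\n"),
        (120, "route A\n  State: <Active>\nroute B\n  State: <Active>\n"),
    ]
    for seconds, text in samples:
        print(seconds, count_pending(text))
    print("cleared at", first_clear_time(samples))
```

Logging the count per sample alongside the time of the routing change should
at least show whether the backlog drains gradually or clears all at once.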




