[j-nsp] Question about Routing Engine Redundancy on MX

Richard A Steenbergen ras at e-gerbil.net
Wed Jan 9 16:23:44 EST 2013


On Wed, Jan 09, 2013 at 03:43:04PM -0500, Paul Stewart wrote:
> Thanks RAS.. that's interesting as we've never actually tried that...
> 
> Have you tried this in a production environment or would you?  Do we 
> have any idea on whether or not JTAC would support this configuration 
> officially?
> 
> I realize these are loaded questions - just really curious on this 
> topic as it opens up some "new possibilities" for us in some 
> deployments...  Our SE basically told us to "run" from this idea 
> previously...

The difference between an RE-2000 and an RE-1300 is pretty minimal. Yeah,
the RE-1300 has a slightly slower CPU, and maybe less RAM, but the
architecture itself is still basically the same. If you configure NSR
you'll be passing a lot of internal state in raw form back and forth, so
you have no hope of making it work between completely different
architectures, like the REs which run JUNOS 64. But technically speaking
there is nothing that would prevent something like an RE-1300 and an
RE-2000 from talking and working together.

Personally I wouldn't run any of it in production, after having been 
bitten by way too many extremely severe bugs in NSR/GRES over the 
years. I've probably suffered 1000x more operational impact from 
NSR-related bugs than I've EVER saved from NSR working correctly, and 
don't even get me started on the massive design flaws of GRES. At this 
point you're MUCH more likely to make your router work correctly if you 
turn on as few knobs as possible, and NSR is a pretty darn complex thing 
to actually make work correctly.

Plus, I don't think I've actually had NSR work correctly in about 4-5 
years now. There are hard-coded timeouts in the NSR sync process that 
runs after the backup RE reboots, and if your network is big enough that 
you carry some decent number of BGP paths, the sync takes so long that 
it hits the timeout and the entire process fails. I once had a case open 
about this issue, but after about 1.5 years of being unable to explain 
it to the idiot in JTAC I just gave up. I checked several years later, 
and it was still broken in exactly the same way, so I'm going to guess 
that no other large network dares to run NSR either. :)
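
To make the timeout problem concrete, here is a rough back-of-the-envelope
sketch in Python. The sync rate and the timeout value are made-up numbers
for illustration only, not actual JUNOS values, and the function name is
hypothetical. It only assumes the sync time grows roughly linearly with
the number of BGP paths.

```python
def nsr_sync_completes(bgp_paths, paths_per_second, timeout_seconds):
    """Estimate whether a backup-RE sync finishes before a fixed timeout.

    All inputs are illustrative assumptions; none of them are real
    JUNOS parameters. Returns (completes, estimated_seconds).
    """
    if paths_per_second <= 0:
        raise ValueError("paths_per_second must be positive")
    # Assume sync time scales linearly with the number of BGP paths.
    estimated_seconds = bgp_paths / paths_per_second
    return estimated_seconds <= timeout_seconds, estimated_seconds


# Hypothetical numbers: 5,000 paths/s sync rate, 300 s hard-coded timeout.
small_network = nsr_sync_completes(500_000, 5_000, 300)
large_network = nsr_sync_completes(4_000_000, 5_000, 300)

print("small network:", small_network)
print("large network:", large_network)
```

With a fixed timeout, the small network finishes in time and the large one
never does, no matter how many times the sync is retried. The only ways out
are a faster sync or a timeout that scales with the amount of state.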

-- 
Richard A Steenbergen <ras at e-gerbil.net>       http://www.e-gerbil.net/ras
GPG Key ID: 0xF8B12CBC (7535 7F59 8204 ED1F CC1C 53AF 4C41 5ECA F8B1 2CBC)


