[j-nsp] MX960 Redundant RE problem

Daniel Roesen dr at cluenet.de
Wed Feb 15 13:56:57 EST 2012


On Wed, Feb 15, 2012 at 12:24:50PM -0500, Stefan Fouant wrote:
> The cool thing is the Backup RE is actually listening to all the
> control plane messages coming on fxp1 destined for the Master RE
> and formulating its own decisions, running its own Dijkstra,
> BGP Path Selection, etc. This is preferred over simply mirroring
> routing state from the Primary to the Backup because it eliminates
> fate sharing: where there may be a bug on the Primary RE, we don't
> want to create a carbon copy of that on the Backup.

I don't really buy that argument. Running the same code with the same
algorithm against the same data usually leads to the same results.
You'll get full bug redundancy - I'd expect both REs to crash
simultaneously. Did NSR protect against any of the recent BGP bugs?
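
To illustrate the point, here is a toy sketch in Python. It has nothing to
do with actual Junos internals: select_best_path, run_replica and the
"empty AS path" bug are all hypothetical, made up only to show that two
replicas independently running the same deterministic code on the same
input hit the same bug.

```python
def select_best_path(routes):
    """Toy path selection (hypothetical): shortest AS path wins.

    Contains a deliberate bug: it blows up on an empty AS path.
    """
    best = None
    for prefix, as_path in routes:
        if len(as_path) == 0:
            raise RuntimeError("bug triggered by malformed update")
        if best is None or len(as_path) < len(best[1]):
            best = (prefix, as_path)
    return best


def run_replica(name, updates):
    """Each replica computes its own result from the same updates."""
    try:
        return (name, "ok", select_best_path(updates))
    except RuntimeError as exc:
        return (name, "crashed", str(exc))


good = [("10.0.0.0/8", [65001, 65002]), ("10.0.0.0/8", [65003])]
bad = good + [("10.0.0.0/8", [])]  # same malformed update seen by both REs

results_good = [run_replica(n, good) for n in ("master", "backup")]
results_bad = [run_replica(n, bad) for n in ("master", "backup")]

for result in results_good + results_bad:
    print(result)
```

With the good input both replicas agree on the result; with the bad input
both fail the same way. Independent computation alone buys no protection
against a deterministic bug.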

The advantages I see are lower-impact failovers in case of a) hardware
failure of the active RE, or b) data structure corruption happening on
both REs [same code => same bugs], but leading to a crash of the active
RE sooner than of the backup RE, or c) race conditions being triggered
sufficiently differently timing-wise that only the active RE crashes.

Am I missing something?

Best regards,
Daniel

-- 
CLUE-RIPE -- Jabber: dr at cluenet.de -- dr at IRCnet -- PGP: 0xA85C8AA0

