[j-nsp] SRX Active/Active
Brian Spade
bitkraft at gmail.com
Sun Jun 26 23:54:43 EDT 2016
Hi Aaron,
On Sun, Jun 26, 2016 at 2:08 PM, Aaron Dewell <aaron.dewell at gmail.com>
wrote:
>
> Hi Brian,
>
> Those all are good mitigation steps for an RG0 failover. There are some
> caveats about graceful restart on the SRXs, but those should have been
> fixed a while ago. Just to be sure, I’d get a recommendation on Junos
> version from your local SE.
>
> Another option is to not use a cluster at all, and make them active/active
> via routing protocols. Then, a control plane failure only kills one side.
> But then failovers are stateless which has more impact.
>
> However, control plane failures are rare, so it’s chasing a very small
> probability in the end.
>
> Aaron
>
>
Wow, very good idea. I actually didn't consider using them independently.
This might be a good way to achieve our goals. As you mention, I will be
trading one type of failure for another. But I like this idea since I'd
have more control with standard routing protocols.
Thanks.
/bs
> On Jun 26, 2016, at 12:40 PM, Brian Spade <bitkraft at gmail.com> wrote:
>
> Hi Aaron,
>
> On Sun, Jun 26, 2016 at 11:19 AM, Aaron Dewell <aaron.dewell at gmail.com>
> wrote:
> >
> > You are correct - RG0 will always be active/passive. A full control
> plane failover will always be painful.
> >
> > SRX active/active is more about the interfaces in use. You can arrange
> for half of your traffic to prefer FW1 vs. FW2 and achieve active/active in
> that way so you’ll take less of a hit when an interface fails (or a
> neighbor device goes down). So that’s really what you are protecting
> against, which seems like you’ve done that.
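
[Editor's sketch] The traffic split described above can be pictured with a toy Python model. It assumes only that the node with the highest priority among the available nodes becomes primary for a redundancy group. The priority values (200/100) and the helper names are hypothetical, and the model ignores preemption, interface-monitor weights, and RG0 entirely, so treat it as an illustration rather than a description of actual Junos behavior.

```python
# Toy model of redundancy group (RG) primacy in a two-node cluster.
# Priorities and function names are hypothetical, for illustration only.

RG_PRIORITIES = {
    "RG1": {"FW1": 200, "FW2": 100},  # FW1 preferred for RG1
    "RG2": {"FW1": 100, "FW2": 200},  # FW2 preferred for RG2
}


def primary_for(priorities, unavailable=frozenset()):
    """Return the available node with the highest priority, or None."""
    candidates = {n: p for n, p in priorities.items() if n not in unavailable}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


def placement(rg_priorities, unavailable=frozenset()):
    """Map each RG to the node that would be primary for it."""
    return {rg: primary_for(p, unavailable) for rg, p in rg_priorities.items()}


def moved_groups(rg_priorities, unavailable):
    """List the RGs whose primary changes when some nodes become unavailable."""
    before = placement(rg_priorities)
    after = placement(rg_priorities, unavailable)
    return sorted(rg for rg in before if before[rg] != after[rg])


if __name__ == "__main__":
    print(placement(RG_PRIORITIES))
    print(moved_groups(RG_PRIORITIES, {"FW1"}))
```

In this model, losing FW1 moves only RG1, so only the traffic that preferred FW1 has to fail over. That is the "less of a hit" effect, under the simplifying assumptions stated above.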
> >
>
> Thanks for your feedback. It will be a lot of configuration, but I was
> thinking I could do the following to limit the impact of an RG0 failure
> (or a southbound Core failure):
>
> - /31 transit VLAN per link (per VRF). So the total number of /31
> transits needed will be 4 * # of VRFs (28 /31s in my case).
> - Graceful restart configured on the SRX to limit the impact of an RG0
> failure.
> - Core1 failure (or Core2 failure) should be limited with graceful
> restart and all uplinks having an OSPF adjacency.
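
[Editor's sketch] The /31 count in the first bullet can be checked, and a numbering plan drafted, with Python's standard `ipaddress` module. The figure of 7 VRFs is inferred from 28 / 4. The pool `10.255.0.0/26`, the VRF names, and the link names are hypothetical placeholders, not values from the thread.

```python
import ipaddress

# Hypothetical inputs: 4 FW-to-Core links (full mesh) and 7 VRFs (28 / 4).
LINKS = ["FW1-Core1", "FW1-Core2", "FW2-Core1", "FW2-Core2"]
VRFS = [f"vrf{i}" for i in range(1, 8)]
POOL = "10.255.0.0/26"  # placeholder pool; a /26 holds 32 /31 subnets


def allocate_transits(pool, vrfs, links):
    """Assign one /31 from the pool to every (VRF, link) pair, in order."""
    subnets = ipaddress.ip_network(pool).subnets(new_prefix=31)
    plan = {}
    for vrf in vrfs:
        for link in links:
            try:
                plan[(vrf, link)] = next(subnets)
            except StopIteration:
                raise ValueError(f"pool {pool} is too small") from None
    return plan


if __name__ == "__main__":
    plan = allocate_transits(POOL, VRFS, LINKS)
    print(f"{len(plan)} transit /31s needed")
    for (vrf, link), net in plan.items():
        # The two addresses of a /31 are the two ends of the link.
        print(f"{vrf:6} {link:10} {net}  {net[0]} <-> {net[1]}")
```

Allocating per VRF first keeps each VRF's four transits contiguous, which may make summarizing or filtering easier; that ordering is a design choice of this sketch, not something the thread specifies.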
>
> Anyway, I'm just wondering what your thoughts are on this. I will
> probably just have to lab it to see how it performs.
>
> If active/active is not a good way, I might have to add in two MX border
> routers... That seems like a waste since I just need a default route via
> BGP.
>
> Thanks.
> /bs
>
> >> On Jun 26, 2016, at 12:15 PM, Brian Spade <bitkraft at gmail.com> wrote:
> >>
> >> Hi,
> >>
> >> I'm trying to figure out the best way to set up an SRX cluster as
> >> active/active. I have attached a diagram of the topology, but it's a
> >> full mesh of links. The ISP links are local interfaces and the
> >> southbound interfaces to the core routers are reth's. Core1 is HSRP
> >> primary for all VLANs. FW1 is primary for RG1 and FW2 is primary for
> >> RG2. The IGP is OSPF, but I have many VRFs that are connected to the FW
> >> with transit VLANs to bind the sub-interface to virtual router & zone.
> >>
> >> The issue I have is Core2 has no active OSPF neighbors in this setup.
> >> Therefore, if Core1 fails, there will be a control plane outage as Core2
> >> establishes OSPF adjacencies.
> >>
> >> So I'm thinking it might be better to remove the reth's and use local
> >> interfaces on the FW/CORE links. This way I can have a full mesh of
> >> OSPF adjacencies and no control plane loss when Core1 fails.
> >>
> >> Does anyone have thoughts on this or recommend the best way to achieve
> >> this active/active full mesh setup? If there's good reason to not use
> >> active/active, I'd welcome the feedback.
> >>
> >> Thanks.
> >> /bs
> >> _______________________________________________
> >> juniper-nsp mailing list juniper-nsp at puck.nether.net
> >> https://puck.nether.net/mailman/listinfo/juniper-nsp
> >
>
>
>