[j-nsp] How reliable is EX multichassis? 3300 and 8200 switches

Morgan McLean wrx230 at gmail.com
Wed Oct 31 02:38:24 EDT 2012


Tried this with 12.2 and it took eight seconds to switch over: for eight
seconds my traffic was not reaching the next switch. With the way my
network is set up now, I typically lose maybe one ping at a 1s interval, so
I can assume about 1 second of loss. Eight seconds seems like quite a bit
more downtime for a device failure, which usually happens when a
dumb coworker bumps one of my two redundant top of rack switches....

I was running an AE with LACP over one link per chassis to another single
switch, both set to fast. Had nonstop routing, nonstop bridging, no split
detection.
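
A minimal sketch of what the VC side of such a test could look like. The
member ports (ge-0/0/47, ge-1/0/47), the ae0 name, the device count, and the
trunk/VLAN settings are assumptions for illustration, not the actual config
used here:

# One aggregated interface on the VC (assumed count)
set chassis aggregated-devices ethernet device-count 1
# One member link per chassis (assumed port names)
set interfaces ge-0/0/47 ether-options 802.3ad ae0
set interfaces ge-1/0/47 ether-options 802.3ad ae0
# LACP active, fast periodic timers
set interfaces ae0 aggregated-ether-options lacp active
set interfaces ae0 aggregated-ether-options lacp periodic fast
# Plain L2 trunk (assumed VLAN membership)
set interfaces ae0 unit 0 family ethernet-switching port-mode trunk
set interfaces ae0 unit 0 family ethernet-switching vlan members all

The single switch at the far end would need a matching ae bundle, also with
lacp periodic fast.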

This is with no routing, just straight L2 forwarding. Eight seconds isn't
bad, but still a pretty significant gap.

I also noticed it took *63* seconds to recover from a catastrophic failure
of the master node. I can induce a "lights on but nobody home" situation by
killing process 43 on an EX3300 switch, which is devrt_kernel_thread.
This is useful since it can replicate a total crash of a switch. Link
states seem to remain up when this happens, so it's a pretty good method.

Is there maybe a timeout that I could tweak to get that number lower? I
haven't had many devices crash before, but it happens.

Thanks,
Morgan

On Tue, Oct 30, 2012 at 10:27 PM, Doug Hanks <dhanks at juniper.net> wrote:

> Make sure the platform + software + configuration supports GRES + NSR +
> NSB and you're good to go.
>
>
> On 10/30/12 8:58 PM, "Luca Salvatore" <Luca at ninefold.com> wrote:
>
> >Yep I'm aware, but why are my OSPF neighbours going down when one switch
> >reboots?
> >
> >Luca
> >
> >
> >-----Original Message-----
> >From: Doug Hanks [mailto:dhanks at juniper.net]
> >Sent: Wednesday, 31 October 2012 2:42 PM
> >To: Luca Salvatore; Morgan McLean; EXT - bdale at comlinx.com.au
> >Cc: juniper-nsp at puck.nether.net
> >Subject: Re: [j-nsp] How reliable is EX multichassis? 3300 and 8200
> >switches
> >
> >GR is mutually exclusive with NSR.
> >
> >
> >You want NSR.
> >
> >On 10/30/12 5:44 PM, "Luca Salvatore" <Luca at ninefold.com> wrote:
> >
> >>I'm just playing around with this now since I have a few new EX
> >>switches not in production just yet. I have a pretty simple setup with
> >>two EX4500s in VC mode connected to another two EX4500s in VC mode.
> >>I'm running OSPF between them.
> >>
> >>I rebooted the master member while running a ping and it took around 40
> >>seconds to come back up. I noticed that my OSPF adjacency went down
> >>and the delay was waiting for the OSPF neighbours to come back up.
> >>
> >>I have:
> >>nonstop-routing configured under routing-options
> >>graceful-switchover configured under chassis redundancy
> >>nonstop-bridging configured under ethernet-switching-options
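> >>
> >>In set form, that combination would look roughly like the lines below.
> >>This is a sketch, not a verified config for this platform and release.
> >>The commit synchronize line is an assumption that is not part of the
> >>config listed above; Junos normally requires it before nonstop-routing
> >>will commit:
> >>
> >># GRES: keep the backup RE in sync with the master
> >>set chassis redundancy graceful-switchover
> >># NSR: preserve routing protocol state across a switchover
> >>set routing-options nonstop-routing
> >># NSB: preserve L2 protocol state across a switchover
> >>set ethernet-switching-options nonstop-bridging
> >># Assumed addition: apply commits to both REs
> >>set system commit synchronize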
> >>
> >>Would graceful-restart be a better config than nonstop-routing?
> >>
> >>Luca
> >>
> >>
> >>-----Original Message-----
> >>From: juniper-nsp-bounces at puck.nether.net
> >>[mailto:juniper-nsp-bounces at puck.nether.net] On Behalf Of Morgan McLean
> >>Sent: Wednesday, 31 October 2012 11:00 AM
> >>To: Ben Dale
> >>Cc: juniper-nsp at puck.nether.net
> >>Subject: Re: [j-nsp] How reliable is EX multichassis? 3300 and 8200
> >>switches
> >>
> >>Neither of these two options shows up as a configurable flag:
> >>
> >>set routing-options nonstop-routing
> >>set ethernet-switching-options nonstop-bridging
> >>
> >>I'm running 11.4R2.14 on the ex3300-48t switches.
> >>
> >>Granted, right now the VC is broken so maybe it doesn't allow me to
> >>configure it? I can head to the datacenter and upgrade these two
> >>devices to recommended release and report back tomorrow as well.
> >>
> >>Morgan
> >>
> >>On Tue, Oct 30, 2012 at 4:30 PM, Ben Dale <bdale at comlinx.com.au> wrote:
> >>
> >>> Hi Morgan,
> >>>
> >>> On 31/10/2012, at 9:06 AM, Morgan McLean <wrx230 at gmail.com> wrote:
> >>>
> >>> > Can anybody give me an idea regarding typical failover times if the
> >>> > master in a two switch pair were to die? The quickest I've seen in my
> >>> > testing with EX3300's is 45 seconds, just for L2 forwarding to
> >>> > continue working, no routing. All the ports drop link as well on the
> >>> > secondary switch while things switch over. I can have my laptop
> >>> > connected to the secondary switch, passing traffic up an uplink on
> >>> > the secondary, and if the master dies it creates a 45 second
> >>> > interruption.
> >>> >
> >>> > Normal?
> >>> >
> >>>
> >>> Yes, but add the following to your configuration:
> >>>
> >>> set virtual-chassis no-split-detection    (you may already have this)
> >>> set routing-options nonstop-routing
> >>> set ethernet-switching-options nonstop-bridging
> >>>
> >>> and try again.  In your testing, put a 3rd switch in place with LACP
> >>> and one leg to each member.
> >>>
> >>> My testing (45/42xx) has shown L2 should be pretty much hitless under
> >>> most circumstances (except if your STP topology needs to re-converge),
> >>> and L3 should be around the 1-4 second mark (for violent failures of
> >>> the master RE).
> >>>
> >>> The worst case scenario though is re-merging a split VC, which can
> >>> take the best part of 45 seconds, so avoid split-brain scenarios
> >>> whenever possible with redundant VCP/VCPe or schedule their repair
> >>> during planned outage windows.
> >>>
> >>> Cheers,
> >>>
> >>> Ben
> >>>
> >>>
> >>>
> >>>
> >>> > Morgan
> >>> >
> >>> > On Sun, Oct 28, 2012 at 2:24 PM, Giuliano Medalha <giuliano at wztech.com.br> wrote:
> >>> >
> >>> >> Robert,
> >>> >>
> >>> >> It was released by juniper one or two weeks ago I think.
> >>> >>
> >>> >> Take a look:
> >>> >>
> >>> >>
> >>> >> https://www.juniper.net/us/en/products-services/routing/mx-series/mx2000/
> >>> >>
> >>> >> MX2010
> >>> >> MX2020
> >>> >>
> >>> >>
> >>> >>
> >>> >> https://www.juniper.net/us/en/products-services/routing/mx-series/mx2000/#specifications
> >>> >>
> >>> >> But I really don't know if it will support virtual chassis without
> >>> >> JCS.
> >>> >>
> >>> >> Att,
> >>> >>
> >>> >> Giuliano
> >>> >>
> >>> >>
> >>> >>
> >>> >> On Sun, Oct 28, 2012 at 3:47 PM, Robert Hass <robhass at gmail.com> wrote:
> >>> >>
> >>> >>> On Fri, Oct 26, 2012 at 11:44 PM, Giuliano Medalha
> >>> >>> <giuliano at wztech.com.br> wrote:
> >>> >>>> Considering the MX family (240, 480 and 960 with TRIO 3D) and
> >>> >>>> the new MX-L
> >>> >>>
> >>> >>> Hi
> >>> >>> What is the new MX-L? Can you write a little more? MX80 successor?
> >>> >>>
> >>> >>> Rob
> >>> >>>
> >>> >>
> >>> >>
> >>> > _______________________________________________
> >>> > juniper-nsp mailing list juniper-nsp at puck.nether.net
> >>> > https://puck.nether.net/mailman/listinfo/juniper-nsp
> >>> >
> >>>
> >>>
> >
> >
> >
>
>
>

