[j-nsp] Rock-solid JUNOS for QFX5100

Karl Gerhard karl_gerh at gmx.at
Mon Aug 19 07:53:17 EDT 2019


Just here to tell you that we've had an issue that sounds close to what you saw:
After a commit, a LAG would stop moving packets. Renaming the LAG (e.g. from ae24 to ae35) would fix the issue; renaming it back to ae24 would trigger it again.
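
For anyone who wants to try the same thing, the rename looks roughly like the sketch below in configuration mode. The member port name (xe-0/0/10) is only a placeholder, and I'm assuming nothing else in the config references the old ae name. As far as I know, "rename" does not update references elsewhere, so the member links and anything else pointing at ae24 have to be changed by hand.

```text
[edit]
user@switch# rename interfaces ae24 to ae35
# placeholder member port: repoint each member link at the new bundle name
user@switch# set interfaces xe-0/0/10 ether-options 802.3ad ae35
# roll back automatically after 5 minutes unless the commit is confirmed
user@switch# commit confirmed 5
```

Using "commit confirmed" here is just a precaution, so the box rolls itself back if the change cuts you off.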

This happened on a device that was only used for switching; no IP addresses were configured apart from the management port. A reboot fixed the issue. This was before we used them in prod, and since then we've upgraded to 17.3R3-S1 and never saw that issue again. What you experienced sounds scary. If this happens to us outside of a maintenance window we'll have to throw out all of our QFX5100s. :/

Regards
Karl

------------------------------------------------------------------------
*From:* Ross Halliday [mailto:ross.halliday at wtccommunications.ca]
*Sent:* Monday, August 12, 2019, 3:19 PM
*To:* juniper-nsp at puck.nether.net
*Subject:* [j-nsp] Rock-solid JUNOS for QFX5100

> Dear List,
>
> I'm curious if anybody can recommend a JUNOS release for QFX5100 that is seriously stable. Right now we're on the previously-recommended version 17.3R3-S1.5. Everything's been fine in testing, and suddenly out of the blue there will be weird issues when I make a change. I suspect maybe they are related to VSTP or LAG, or both.
>
> 1. Added a VLAN to a trunk port, and all the access ports on that VLAN completely stopped moving packets. Running "disable" and then "delete disable" on each of the broken interfaces restored function. This happened during the day. I opened a JTAC ticket and they'd never heard of an issue like this; of course, we couldn't reproduce it. I no longer recall with confidence, but I think the trunk port may have been a one-member LAG (replacement of a downstream switch).
>
> 2. New trunk port (a two-port LACP LAG) not sending VSTP BPDUs for some VLANs. I'm not sure if it was coincidence or always broken, as I had recently begun feeding new VSTP BPDUs (thus the root bridge changed) before I even looked at this. Other trunk ports did not exhibit the same issue. Completely deleted the LAG and rolled back to fix. This was on a fresh turnup and luckily wasn't in a topology that could form a loop.
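>
> For reference, the disable/delete disable bounce from item 1 looks roughly like the sketch below. The interface name (ge-0/0/5) is only a placeholder for one of the affected access ports, and the two separate commits are an assumption about how the bounce was applied.
>
> ```text
> [edit]
> user@switch# set interfaces ge-0/0/5 disable
> user@switch# commit
> user@switch# delete interfaces ge-0/0/5 disable
> user@switch# commit
> ```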
>
> Features I'm using include:
>
> - BGP
> - OSPF
> - PIM
> - VSTP
> - LACP
> - VRRP
> - IGMPv2 and v3
> - Routing-instance
> - CoS for multicast
> - CoS for unicast
> - CoS classification by ingress filter
> - IPv4-only
> - ~7k routes in FIB (total of all tables)
> - ~1k multicast groups
>
>
> There are no automation features, no MPLS, no MC-LAG, no EVPN, VXLAN, etc. These switches are L3 boxes that hand off IP to an MX core. Management is in the default instance/table, everything else is in a routing instance.
>
> These boxes have us scared to touch them outside of a window as seemingly basic changes risk blowing the whole thing up. Is this a case where an ancient version might be a better choice or is this release a lemon? I recall that JTAC used to recommend two releases, one being for if you didn't require "new features". I find myself stuck between the adages of "If it ain't broke, don't fix it" and "Software doesn't age like wine". Given how poorly multicast seems to be understood by JTAC I'm very hesitant to upgrade to significantly newer releases.
>
> If anybody can give advice or suggestions I would appreciate it immensely!
>
> Thanks
> Ross
>
> _______________________________________________
> juniper-nsp mailing list juniper-nsp at puck.nether.net
> https://puck.nether.net/mailman/listinfo/juniper-nsp


