[j-nsp] SRX1500 cluster issues
Floris Termorshuizen
floris at nedcomp.nl
Mon Jan 20 04:50:10 EST 2020
OK, that looks good. On my production devices I don't see ping loss on a dataplane failover, so it might be a virtual-lab issue; not sure 😊.
But I would remove reth0 from the RG interface monitor. I'm not sure why there is a message in the logs, but I do not see the need to monitor the reth. The reth would only go down in case both your connections (ge-0/0/4 and ge-7/0/4) go down, and I cannot imagine what a failover would add or resolve in that case. All config examples only add physical interfaces to the monitor, never a virtual interface like a reth.
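Something like this should be enough to take it out again (a minimal sketch, assuming the monitor entry is exactly the one you added in your lab config):
delete chassis cluster redundancy-group 1 interface-monitor reth0
commit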
Best regards,
Floris
From: Khan Muddassir <mechkhans at gmail.com>
Sent: Monday, 20 January 2020 10:34
To: Floris Termorshuizen <floris at nedcomp.nl>; Muhammad Atif Jauhar <atif.jauhar at gmail.com>
Cc: juniper-nsp at puck.nether.net
Subject: Re: [j-nsp] SRX1500 cluster issues
I set this up in a virtual lab (so I'm not sure how accurate the results are):
i) a monitored interface with weight 255 on the primary node goes down (I disabled it).
ii) the min-links message did not show up in the logs.
iii) failover of RG1 takes place, and 'show chassis cluster status' shows node1 as primary now and node0 as secondary.
reth0 flaps, i.e., goes down and comes back up within a second, and my ping from the host loses one packet with the default timeout (dataplane looks fine).
iv) The now-primary node is the previous secondary node, and things work as expected.
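For anyone reproducing this, a rough sketch of how the failover can be triggered and checked in the lab (standard Junos commands; disabling the monitored child link via configuration is just one way to simulate the failure):
set interfaces ge-0/0/4 disable (configuration mode, simulates the monitored link on node0 failing)
commit
show chassis cluster status (operational mode, should now show node1 as primary for RG1)
show chassis cluster interfaces (lists the monitored interfaces with their weights and status)
show log jsrpd | last 50 (jsrpd logs the interface-monitor event and the failover)
delete interfaces ge-0/0/4 disable (restore the link afterwards)
commit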
There were two config misses in this setup:
i) reth0 was not tied to an RG; I kept seeing `no RG is attached to reth0` in `show log jsrpd`, so I configured it as below:
set chassis cluster redundancy-group 1 interface-monitor reth0 weight 255
ii) monitoring objects were not set with correct weight: changed from 100 to 255.
set chassis cluster redundancy-group 1 interface-monitor ge-0/0/4 weight 255
set chassis cluster redundancy-group 1 interface-monitor ge-7/0/4 weight 255
So with this, things look to work. Thanks for your help. :-)
On Mon, Jan 20, 2020 at 4:48 PM Khan Muddassir <mechkhans at gmail.com> wrote:
Thanks, Floris. The weight of 255 makes perfect sense.
I did come across this, which answers it: the default minimum-links value is 1 for reths, and it happens to monitor only child links on the primary node.
https://www.juniper.net/documentation/en_US/junos/topics/topic-map/security-chassis-cluster-redundant-ethernet-lag-interfaces.html
Redundant Ethernet interface configuration also includes a minimum-links setting that allows you to set a minimum number of physical child links on the primary node in a given redundant Ethernet interface that must be working for the interface to be up. The default minimum-links value is 1. Note that the minimum-links setting only monitors child links on the primary node. Redundant Ethernet interfaces do not use physical interfaces on the backup node for either ingress or egress traffic.
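For reference, that knob lives under the reth itself; a minimal sketch with the default of 1 made explicit (not something that needs changing here):
set interfaces reth0 redundant-ether-options minimum-links 1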
I am confused about "Interfaces on the passive node will not pass traffic". Is the chronology the following?
i) a monitored interface with weight 255 on the primary node goes down.
ii) min-links by default requires one interface, and that on the primary node, so reth0 will go down.
iii) failover of RG1 takes place, and 'show chassis cluster status' shows node1 as primary now and node0 as secondary.
iv) The now-primary node is the previous passive node, as you write. Does the traffic stall? That does not make sense, right? If it is not forwarding, then why fail over? :-) And if reth0 remains down, then it is basically an outage, as it holds the ownership of acting as the default gateway for multiple VLANs.
What I am looking for is: if one of the monitored interfaces goes down, reth0 remains up and keeps working, irrespective of which node holds the primary/secondary ownership.
So from a config perspective, I understand there are gaps:
i) the weight not being 255.
The config is plain and simple:
set chassis cluster reth-count 2 (although only one is in use)
set chassis cluster heartbeat-interval 1000
set chassis cluster heartbeat-threshold 3
set chassis cluster redundancy-group 1 node 0 priority 100
set chassis cluster redundancy-group 1 node 1 priority 1
set chassis cluster redundancy-group 1 preempt
set chassis cluster redundancy-group 1 gratuitous-arp-count 4
set chassis cluster redundancy-group 1 interface-monitor xe-0/0/16 weight 100
set chassis cluster redundancy-group 1 interface-monitor xe-7/0/16 weight 100
set chassis cluster redundancy-group 0 node 0 priority 100
set chassis cluster redundancy-group 0 node 1 priority 1
set interfaces reth0 vlan-tagging
set interfaces reth0 redundant-ether-options redundancy-group 1
I have multiple VLANs on reth0, all with the config below:
set interfaces reth0 unit X vlan-id X
set interfaces reth0 unit X family inet filter input PB
set interfaces reth0 unit X family inet sampling input
set interfaces reth0 unit X family inet sampling output
set interfaces reth0 unit X family inet address x.x.x.x/yy
thanks :)
On Mon, Jan 20, 2020 at 4:23 PM Floris Termorshuizen <floris at nedcomp.nl> wrote:
Hi Muddasir,
Two things to keep in mind:
- The redundancy group has a threshold of 255; when it reaches 0 (the weight of each failed interface configured under interface-monitor gets subtracted from it), the RG fails over to the other node.
- Interfaces on the passive node will not pass traffic (as far as I know).
With your current configuration, if interface xe-0/0/16 goes down (on the primary node), no failover occurs because it only has a weight of 100, and the reth goes down (or stays up but does not pass traffic; I'm not sure which happens).
The solution is to make sure the RG1 failover threshold gets reached when needed. This depends on your exact configuration and your wishes. So if you want to fail over when one interface goes down, configure a weight of 255; if you have four interfaces connected to two switches, you might configure a weight of 128 per interface (so when two interfaces go down the total weight is 256 and the threshold is reached).
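In set form, the two variants would look roughly like this (a sketch only; xe-0/0/17 and xe-7/0/17 in the second variant are made-up interface names for illustration):
Fail over as soon as any single monitored link goes down:
set chassis cluster redundancy-group 1 interface-monitor xe-0/0/16 weight 255
set chassis cluster redundancy-group 1 interface-monitor xe-7/0/16 weight 255
Four links towards two switches, fail over only once two of them are down (128 + 128 = 256 >= 255):
set chassis cluster redundancy-group 1 interface-monitor xe-0/0/16 weight 128
set chassis cluster redundancy-group 1 interface-monitor xe-0/0/17 weight 128
set chassis cluster redundancy-group 1 interface-monitor xe-7/0/16 weight 128
set chassis cluster redundancy-group 1 interface-monitor xe-7/0/17 weight 128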
Now about the LACP: there is some form of LACP involved in the reth interfaces. For example, if you create a reth with four interfaces connected to two switches, you need to configure two LACP bundles (one per firewall node) on the switches. So I'm not surprised you would see this in the logs.
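For what it's worth, explicit LACP on a reth would look roughly like this (a sketch; the ae0 name, the switch port names and the EX/QFX-style switch syntax are assumptions, not taken from your setup):
On the SRX cluster, the reth carries the LACP config for its child links:
set interfaces reth0 redundant-ether-options lacp active
On each switch, one bundle towards the child links of the local firewall node:
set chassis aggregated-devices ethernet device-count 1
set interfaces xe-0/0/1 ether-options 802.3ad ae0
set interfaces xe-0/0/2 ether-options 802.3ad ae0
set interfaces ae0 aggregated-ether-options lacp active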
HTH,
Floris
-----Original Message-----
From: juniper-nsp <juniper-nsp-bounces at puck.nether.net> On Behalf Of Khan Muddassir
Sent: Monday, 20 January 2020 05:18
To: juniper-nsp at puck.nether.net
Subject: [j-nsp] SRX1500 cluster issues
Hello,
I run a chassis cluster of 2x SRX1500 devices and monitor two interfaces (one from each node) in redundancy-group 1:
set chassis cluster redundancy-group 1 interface-monitor xe-0/0/16 weight 100
set chassis cluster redundancy-group 1 interface-monitor xe-7/0/16 weight 100
An issue recently took down xe-0/0/16, and the reth0 interface went down! I was expecting that xe-7/0/16 would keep the reth interface up and running. I do not have LACP enabled on this cluster; however, I can see in the log that the kernel throws out this message stating minimum links are not met. I'm confused as to how Junos decides to show this without LACP or any sort of min-links config for reth0 (there is no min-links config on the box at all):
/kernel: ae_bundlestate_ifd_change: bundle reth0: bundle IFD minimum bandwidth or minimum links not met, Bandwidth (Current : Required) 0 : 1 Number of links (Current : Required) 0 : 1
Is this expected, i.e. does reth0 internally run some sort of min-links code? If it does, that is clearly incorrect, as another interface is available for its operation.
set interfaces xe-0/0/16 gigether-options redundant-parent reth0
set interfaces xe-7/0/16 gigether-options redundant-parent reth0
any thoughts?
thanks in advance,
-muddasir
_______________________________________________
juniper-nsp mailing list
juniper-nsp at puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp