[j-nsp] SRX1500 cluster issues
Khan Muddassir
mechkhans at gmail.com
Mon Jan 20 04:34:20 EST 2020
I set this up in a virtual lab (so I am not sure how accurate the results
are):
i) A monitored interface with weight 255 on the primary node goes down
(I disabled it).
ii) The min-links message did not show up in the logs.
iii) Failover of RG1 takes place, and 'show chassis cluster status' shows
node1 as primary now and node0 as secondary.
reth0 flaps, i.e., goes down and comes back up within a second, and my ping
from the host loses 1 ping with the default timeout (dataplane looks fine).
iv) The new primary node is the previous secondary node, and things
work as expected.
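For what it's worth, here is how I picture the reth state, as a minimal
Python sketch. It assumes the rule from the Juniper doc quoted further down
(a reth is up when the number of working child links on the node that is
primary for its RG is at least minimum-links, default 1). The function name
and the data layout are made up for illustration; this is not how Junos
implements it.

```python
# Toy model of reth state; names and data layout are illustrative only.
# Assumption (from the Juniper doc quoted in this thread): only child links
# on the node that is currently primary for the RG count toward minimum-links.

def reth_is_up(child_links, primary_node, minimum_links=1):
    """child_links maps interface name -> (node, link_is_up)."""
    up_on_primary = sum(
        1 for node, is_up in child_links.values()
        if node == primary_node and is_up
    )
    return up_on_primary >= minimum_links

# Case from the original post: xe-0/0/16 (node0) is down, xe-7/0/16 (node1) is up.
links = {"xe-0/0/16": (0, False), "xe-7/0/16": (1, True)}

print(reth_is_up(links, primary_node=0))  # False: node0 is still primary
print(reth_is_up(links, primary_node=1))  # True: once RG1 has failed over
```

Under that assumption, the reth only recovers if the RG actually moves to
the node whose child link is still up.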
There were two config misses in this setup:
i) reth0 was not tied to an RG; I kept seeing `no RG is attached to reth0`
in `show log jsrpd`. I configured it as below:
set chassis cluster redundancy-group 1 interface-monitor reth0 weight 255
ii) The monitored objects were not set with the correct weight; I changed
it from 100 to 255.
set chassis cluster redundancy-group 1 interface-monitor ge-0/0/4 weight 255
set chassis cluster redundancy-group 1 interface-monitor ge-7/0/4 weight 255
So with this, things look to work. Thanks for your help. :-)
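
In case it helps anyone else, a rough Python sketch of the weight
arithmetic Floris described: the RG starts at a threshold of 255, the
weight of every monitored interface that is down gets subtracted, and the
RG fails over when the result reaches 0. The function name is made up, and
this is only a model of the behaviour as I understand it from this thread,
not Junos code.

```python
# Toy model of interface-monitor weights; not Junos code.
RG_THRESHOLD = 255  # threshold value as described by Floris in this thread

def rg_fails_over(monitor_weights, down_interfaces):
    """Return True when the weights of the down interfaces use up the threshold."""
    remaining = RG_THRESHOLD - sum(
        monitor_weights[ifname] for ifname in down_interfaces
    )
    return remaining <= 0

old = {"xe-0/0/16": 100, "xe-7/0/16": 100}  # weights from my original config
new = {"xe-0/0/16": 255, "xe-7/0/16": 255}  # weights after the fix

print(rg_fails_over(old, ["xe-0/0/16"]))  # False: 255 - 100 = 155 left
print(rg_fails_over(new, ["xe-0/0/16"]))  # True: 255 - 255 = 0
```

The same function covers Floris's four-interface example: with a weight of
128 each, one failure leaves 127 and a second one triggers the failover.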
On Mon, Jan 20, 2020 at 4:48 PM Khan Muddassir <mechkhans at gmail.com> wrote:
> Thanks, Floris. The weight of 255 makes perfect sense.
>
> I did come across this, which confirms that the default minimum-links
> value is 1 for reths, and that it only monitors child links on the
> primary node.
>
>
> https://www.juniper.net/documentation/en_US/junos/topics/topic-map/security-chassis-cluster-redundant-ethernet-lag-interfaces.html
>
> Redundant Ethernet interface configuration also includes a minimum-links
> setting that allows you to set a minimum number of physical child links on
> the primary node in a given redundant Ethernet interface that must be
> working for the interface to be up. The default minimum-links value is 1.
> Note that the minimum-links setting only monitors child links on the
> primary node. Redundant Ethernet interfaces do not use physical interfaces
> on the backup node for either ingress or egress traffic.
>
> I am confused about "Interfaces on the passive node will not pass traffic".
> Is the chronology as follows?
>
> i) A monitored interface with weight 255 on the primary node goes down.
> ii) minimum-links by default requires one interface, and that on the
> primary node, so reth0 will go down.
> iii) Failover of RG1 takes place, and 'show chassis cluster status' shows
> node1 as primary now and node0 as secondary.
> iv) The new primary node is the previous passive node, as you write. Does
> the traffic stall? That does not make sense: if it is not forwarding,
> then why fail over :-) And if reth0 remains down, then it is basically
> an outage, as reth0 acts as the default gateway for multiple VLANs.
>
> What I am looking for is this: if one of the monitored interfaces goes
> down, reth0 remains up and keeps working irrespective of which node holds
> the primary/secondary role.
>
> So from a config perspective, I understand there are gaps:
>
> i) The weight is not 255.
>
> The config is plain and simple:
>
> set chassis cluster reth-count 2 (although only one is in use)
> set chassis cluster heartbeat-interval 1000
> set chassis cluster heartbeat-threshold 3
> set chassis cluster redundancy-group 1 node 0 priority 100
> set chassis cluster redundancy-group 1 node 1 priority 1
> set chassis cluster redundancy-group 1 preempt
> set chassis cluster redundancy-group 1 gratuitous-arp-count 4
> set chassis cluster redundancy-group 1 interface-monitor xe-0/0/16 weight 100
> set chassis cluster redundancy-group 1 interface-monitor xe-7/0/16 weight 100
> set chassis cluster redundancy-group 0 node 0 priority 100
> set chassis cluster redundancy-group 0 node 1 priority 1
>
> set interfaces reth0 vlan-tagging
> set interfaces reth0 redundant-ether-options redundancy-group 1
>
> I have multiple VLANs on reth0, all with the config below:
>
> set interfaces reth0 unit X vlan-id X
> set interfaces reth0 unit X family inet filter input PB
> set interfaces reth0 unit X family inet sampling input
> set interfaces reth0 unit X family inet sampling output
> set interfaces reth0 unit X family inet address x.x.x.x/yy
>
> thanks )
>
>
> On Mon, Jan 20, 2020 at 4:23 PM Floris Termorshuizen <floris at nedcomp.nl>
> wrote:
>
>> Hi Muddasir,
>>
>> Two things to keep in mind:
>> - The redundancy group has a threshold of 255. The weight of each failed
>> interface configured under interface-monitor gets subtracted from it,
>> and when it reaches 0 the RG fails over to the other node.
>> - Interfaces on the passive node will not pass traffic (as far as I know).
>>
>> With your current configuration, if interface xe-0/0/16 goes down (on the
>> primary node), no failover occurs because it only has a weight of 100,
>> and the reth goes down (or is up but not passing traffic, I am not sure
>> which).
>>
>> The solution is to make sure the RG1 failover threshold gets reached when
>> needed. This might depend on your exact configuration and your wishes. So
>> if you want to fail over when 1 interface goes down, configure a weight of
>> 255. If you have 4 interfaces connected to two switches, you might
>> configure a weight of 128 per interface (so when two interfaces go down
>> the total weight is 256 and the threshold is reached).
>>
>> Now about the LACP: there is some form of LACP involved in the reth
>> interfaces. For example, if you create a reth with 4 interfaces connected
>> to two switches, you need to configure two LACP bundles (one per firewall
>> node) on the switches. So I'm not surprised you would see this in the logs.
>>
>> HTH,
>> Floris
>>
>> -----Original Message-----
>> From: juniper-nsp <juniper-nsp-bounces at puck.nether.net> On Behalf Of
>> Khan Muddassir
>> Sent: maandag 20 januari 2020 05:18
>> To: juniper-nsp at puck.nether.net
>> Subject: [j-nsp] SRX1500 cluster issues
>>
>> Hello,
>>
>> I run a chassis cluster of 2x SRX1500 devices and monitor two interfaces
>> (one from each node) in redundancy-group 1:
>>
>> set chassis cluster redundancy-group 1 interface-monitor xe-0/0/16 weight 100
>> set chassis cluster redundancy-group 1 interface-monitor xe-7/0/16 weight 100
>>
>> An issue recently took down xe-0/0/16 and the reth0 interface went down!
>> I was expecting that xe-7/0/16 would keep the reth interface up and
>> running. I do not have LACP enabled on this cluster; however, I can see
>> in the log that the kernel throws this message stating min-links not met.
>> I am confused as to how Junos decides to show this without LACP or any
>> sort of min-links config for reth0 (there is no min-links config anywhere
>> on the box).
>>
>> /kernel: ae_bundlestate_ifd_change: bundle reth0: bundle IFD minimum
>> bandwidth or minimum links not met, Bandwidth (Current : Required) 0 : 1
>> Number of links (Current : Required) 0 : 1
>>
>> Is this expected, i.e. does reth0 internally run some sort of min-link
>> code? If it does, that looks incorrect to me, as another interface is
>> available for its operation.
>>
>> set interfaces xe-0/0/16 gigether-options redundant-parent reth0
>> set interfaces xe-7/0/16 gigether-options redundant-parent reth0
>>
>> Any thoughts?
>>
>> thanks in advance,
>> -muddasir
>> _______________________________________________
>> juniper-nsp mailing list juniper-nsp at puck.nether.net
>> https://puck.nether.net/mailman/listinfo/juniper-nsp
>>
>