[j-nsp] trouble setting up link agg between clustered SRX 550 and Cisco 6509

Andy Litzinger Andy.Litzinger at theplatform.com
Mon Aug 19 19:42:38 EDT 2013


I've had some progress while working with JTAC and thought I'd share.

JTAC pointed out that one of the interfaces I was trying to LAG was not even coming online.  The Cisco side seemed to think it auto-negotiated happily to 1000/Full (but still added it to an independent port-channel);  the Juniper side was marking it down with auto-negotiated speed/duplex at 10/Half.

Obviously we suspected a cabling issue.  Rather than drive to the datacenter (a good engineer never leaves their desk, right?) I decided to try to replicate the config on the other node in the cluster.  So I removed the broken interface from LAG1 on the Cisco end and made node0 the primary node for RG1.  I then configured the LAG on node1 and, lo and behold, it came online immediately.  So I failed RG1 back to node1 and traffic flowed with no issues.  node0's issue is almost certainly cable related, right?

Well...

So I decided to play around with LAG1 on node0 again for a bit to see if I could get it to work.  No dice.  While messing with the config I left both ports out of the LAG on the Cisco side and forgot to put the working link back into the LAG.

I decided to press my luck with my new discovery and configure the 2nd LAG group I needed online.  As before, I wanted to first set the LAG up on the standby node (node0 at this point) and make sure LACP was happy before putting traffic on it.  The 2nd LAG on node0 came up with both interfaces with no issues.  I failed RG1 back to node0, and because I had forgotten to put my port back into LAG1 on the Cisco side I started blackholing traffic.  No problem: I failed the RG back to node1 and then went to add the previously working port back to LAG1 on the Cisco side.

At this point the interface refused to come up on the SRX side.  ge-0/0/4, which had been working previously, suddenly was acting just like ge-0/0/6.  Now the 'bad-cable' idea was kind of out the window.  I spent another hour on the phone with JTAC.  They wanted me to go down and swap in new cables which I said I could do tomorrow.

In the meantime I made a discovery: if I removed the 'redundant-parent reth0' statement from both ge-0/0/4 and ge-0/0/6 and committed, both interfaces immediately came up, though not as part of the LAG of course.  Then if I re-added them to reth0 and re-committed the config, both links stayed up and immediately formed a proper LACP LAG with the 6509.  WTH?!!
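
Roughly, that workaround looks like this in Junos configuration mode.  This is only a sketch using the interface names from the config quoted below; it describes what happened to work here and is not a confirmed fix.  First detach the members and commit:

    delete interfaces ge-0/0/4 gigether-options redundant-parent
    delete interfaces ge-0/0/6 gigether-options redundant-parent
    commit

Once both links show up, re-attach them and commit again:

    set interfaces ge-0/0/4 gigether-options redundant-parent reth0
    set interfaces ge-0/0/6 gigether-options redundant-parent reth0
    commit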

So I failed RG1 over to node0 and things ran smoothly.
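
For anyone following along, a manual RG1 failover like the ones described here is normally driven with the standard chassis cluster operational commands.  A sketch, assuming redundancy group 1 and a move to node0:

    request chassis cluster failover redundancy-group 1 node 0
    show chassis cluster status redundancy-group 1
    request chassis cluster failover reset redundancy-group 1

The reset clears the manual failover flag, which otherwise blocks the next manual failover of that group.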

Now I only had one more LAG to configure, LAG2 on node1.  I added the second interface on both sides of the config and had the same issue: the 6509 thought the new link was a-ok and 1000/Full, but added it as an independent member of the LAG.  The SRX marked the link down with 10/Half.  Once again I removed the 'redundant-parent reth1' statement from both interfaces and committed.  Once again both interfaces came right up (or stayed up in the case of the already working interface).  Next I re-added the statements and re-committed and voila, a working LAG...

I'll be working more with JTAC tomorrow.  Although things are working I worry that they are in a fragile state and any failure of a LAG member may cause an LACP disagreement between the 6509 and the SRX or force me to remember the weird workaround to get things back online.  Also, although I don't know how reproducible this is for others, it seems like I may have hit a bug somewhere.
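
If a member drops out again, these are the kinds of checks that should show the disagreement from both ends.  This is a sketch; the interface and channel numbers are taken from the config quoted below, so substitute your own.

On the SRX:

    show lacp interfaces reth0
    show interfaces ge-0/0/6 extensive
    show chassis cluster status

On the 6509:

    show etherchannel 10 summary
    show lacp neighbor
    show interfaces GigabitEthernet8/2 flowcontrol

The things to compare are the negotiated speed/duplex on each side and whether the Cisco side flags the port as bundled or standalone.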

-andy


> -----Original Message-----
> From: juniper-nsp [mailto:juniper-nsp-bounces at puck.nether.net] On Behalf
> Of Andy Litzinger
> Sent: Thursday, August 15, 2013 3:55 PM
> To: juniper-nsp at puck.nether.net
> Subject: [j-nsp] trouble setting up link agg between clustered SRX 550 and
> Cisco 6509
> 
> Has anyone had any difficulty creating a port channel between an SRX cluster
> (in this case, SRX 550s) and Cisco switches (in this case 6509s, non-VSS)?
> 
> When I tried to bring up a second link in the link agg group the cisco side put it
> in state "I" which means:  standalone.  It also logged this message:
> %EC-SP-5-CANNOT_BUNDLE_LACP: Gi8/2 is not compatible with aggregators
> in channel 10 and cannot attach to them (flow control send of Gi8/2 is on,
> Gi8/1 is off)
> 
> I did some googling and found a couple articles that seemed to say that the
> SRX doesn't support flow-control so I tried turning it off on the Cisco side:
> interface 8/1 flowcontrol send off
> interface 8/2 flowcontrol send off
> interface po10 flowcontrol send off
> 
> This didn't help and when I shut/no shut the port channel on the cisco side
> the whole thing went offline and wouldn't come back until I rebuilt it.
> 
> any ideas?
> 
> SRX-A connects to 6509-A with 2 physical links bundled into reth0
> SRX-B connects to 6509-B with 2 physical links bundled into reth0
> 
> SRX side config:
> > show configuration interfaces ge-0/0/4
> gigether-options {
>     redundant-parent reth0;
> }
> > show configuration interfaces ge-0/0/6
> gigether-options {
>     redundant-parent reth0;
> }
> > show configuration interfaces ge-9/0/4
> gigether-options {
>     redundant-parent reth0;
> }
> > show configuration interfaces ge-9/0/6
> gigether-options {
>     redundant-parent reth0;
> }
> 
> > show configuration interfaces reth0
> vlan-tagging;
> redundant-ether-options {
>     redundancy-group 1;
>     lacp {
>         active;
>         periodic fast;
>     }
> }
> unit x {
>     vlan-id x;
>     family inet {
>         address x.x.x.x/zz;
>     }
> }
> unit y {
>     vlan-id y;
>     family inet {
>         address x.x.x.x/zz;
>     }
> }
> 
> 
> cisco side on 6509-A:
> interface GigabitEthernet8/1
> description srx01-g0/4
> switchport
> switchport trunk encapsulation dot1q
> switchport trunk allowed vlan x,y
> switchport mode trunk
> switchport nonegotiate
> spanning-tree portfast edge trunk
> channel-group 10 mode active
> end
> 
> interface GigabitEthernet8/2
> description srx01-g0/6
> switchport
> switchport trunk encapsulation dot1q
> switchport trunk allowed vlan x,y
> switchport mode trunk
> switchport nonegotiate
> shutdown
> spanning-tree portfast edge trunk
> channel-group 10 mode passive
> end
> 
> interface Port-channel10
> description srx01-internal
> switchport
> switchport trunk encapsulation dot1q
> switchport trunk allowed vlan x,y
> switchport mode trunk
> switchport nonegotiate
> spanning-tree portfast edge trunk
> end
> 
> the 6509-B config is identical
> 
> thanks!
> -andy
> 
