[j-nsp] MC-LAG reliability

Vincent Bernat bernat at luffy.cx
Mon Jan 9 04:10:02 EST 2017


 ❦ 22 December 2016 15:15 +0100, Vincent Bernat <bernat at luffy.cx>:

> How reliable should MC-LAG be considered on EX and QFX series (in a pure
> L2 setup)?
>
> I had a few bad experiences with virtual chassis where a hiccup usually
> translates to both switches becoming unavailable. This is pretty rare of
> course. MC-LAG would avoid those coordinated faults but is it otherwise
> as reliable as virtual chassis?

Some quick feedback on my tests with MC-LAG: it doesn't work well for
me. While pure L2 operations worked fine, I had huge difficulties
running BGP sessions terminated at the IRB, whether using MAC
synchronization, VRRP, or neither of them. Local IP delivery on the
IRB seems patchy: the MAC address for local delivery to the remote
switch is learned on the ICCP VLAN instead of the IRB VLAN.

I didn't try too hard, so maybe I missed something. I didn't want an
HA setup for local delivery, so I first tried with neither MAC sync nor
VRRP. Then I tried MAC synchronization (which was an improvement
from "reproducibly doesn't work for some BGP sessions" to "works for all
BGP sessions but breaks after one hour"), then VRRP (without MAC
synchronization). After reverting everything, even L2 operations
didn't work correctly until a reboot. Maybe only MAC synchronization is
broken and VRRP would have worked.

The issues also affected ICMP handling.

Tested with 14.1X53-D35 on a pair of QFX5100.
-- 
Don't comment bad code - rewrite it.
            - The Elements of Programming Style (Kernighan & Plauger)

