[j-nsp] EVPN/VXLAN experience (was: EX4600 or QFX5110)

Richard McGovern rmcgovern at juniper.net
Fri Mar 22 12:52:13 EDT 2019

Sebastian, a couple of questions.

1.  Is your design pure QFX5100 Leaf/Spine today?  If so, I assume you have a single flat VXLAN network, that is, no L3 VXLAN.  Is that right?
2.  You stated you need 17.4 for improved LACP operation.  Which exact 17.4 release are you using, and which version were you on previously?  I am wondering whether you were ever on 17.3R3-S3.

Many thanks, Rich

Richard McGovern
Sr Sales Engineer, Juniper Networks 

On 3/22/19, 4:39 AM, "Sebastian Wiesinger" <sebastian at karotte.org> wrote:

    * Andrey Kostin <ankost at podolsk.ru> [2019-03-15 20:50]:
    > I'm interested to hear about experience of running EVPN/VXLAN, particularly
    > with QFX10k as L3 gateway and QFX5k as spine/leaves. As per docs, it should
    > be immune to any single switch downtime, so might be a candidate to really
    > redundant design.
    All right here it goes:
    I can't speak for QFX10k as spine but we have QFX5100 Leaf/Spine
    setups with EVPN/VXLAN running right now. Switch downtime is no
    problem at all, we unplugged a running switch, shut down ports,
    unplugged cables between leaf & spine or leaf & client all while there
    was storage traffic (NFS) active in the setup. Worst thing that
    happend was that IOPS went down from 400k/s to 100k/s for 1-3 seconds.
    What did bother us was that you are limited (at least on QFX5100) in
    the number of "VLANs" (VNIs). We were testing with 30 client
    full-trunk ports per leaf, and with that many ports you can only
    provision around 500 VLANs before you get errors; it seems you
    basically run out of memory for bridge domains on the switch. This
    appears to be a limitation of the chips used in the QFX5100, at least
    that's what I was told when I asked about it.
    You can check the current usage if you know where to look:
    root@SW-A:RE:0% ifsmon -Id | grep IFBD
             IFBD                       :    12884      0
    root@SW-A:RE:0% ifsmon -Id | grep Bridge
             Bridge Domain              :     3502       0
    These numbers combined need to be <= 16382.
    And if you get over the limit these nice errors occur:
    dcf_ng_get_vxlan_ifbd_hw_token: Max vxlan ifbd hw token reached 16382
    ifbd_create_node: VXLAN IFBD hw token couldn't be allocated for <xe-...>
    The workaround is to reduce the number of VLANs or trim the trunk config.
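
The check above can be scripted. What follows is only a rough sketch in Python, not anything from the original setup: it assumes the first numeric column of each ifsmon line is the current count, and the function names (parse_counter, token_headroom) are made up for illustration. The limit of 16382 is the value from the error message quoted above.

```python
import re

# Limit taken from the "Max vxlan ifbd hw token reached 16382" error above.
VXLAN_IFBD_TOKEN_LIMIT = 16382


def parse_counter(output: str, label: str) -> int:
    """Return the first numeric column of the line that starts with `label`.

    Assumption: that column is the current usage count.
    """
    pattern = re.compile(r"\s*" + re.escape(label) + r"\s*:\s*(\d+)")
    for line in output.splitlines():
        match = pattern.match(line)
        if match:
            return int(match.group(1))
    raise ValueError(f"counter {label!r} not found")


def token_headroom(output: str) -> int:
    """Tokens left before the limit; a negative value means over the limit."""
    used = parse_counter(output, "IFBD") + parse_counter(output, "Bridge Domain")
    return VXLAN_IFBD_TOKEN_LIMIT - used


# The two counter lines exactly as pasted in the post.
SAMPLE = """\
         IFBD                       :    12884      0
         Bridge Domain              :     3502       0
"""

if __name__ == "__main__":
    headroom = token_headroom(SAMPLE)
    state = "OK" if headroom >= 0 else "OVER LIMIT"
    print(f"headroom: {headroom} ({state})")
```

With the counters shown above, the sum is 12884 + 3502 = 16386, which is 4 over the limit, so the sketch reports a negative headroom.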
    Also you absolutely NEED LACP from servers to the fabric. 17.4 has
    enhancements which will put the client ports in LACP standby when the
    leaf gets separated from all spines.
    > As a downside I see the more complex configuration at least. Adding
    > vlan means adding routing instance etc. There are also other
    > questions, about convergence, scalability, how stable it is and code
    > maturity.
    We have it automated with Ansible. Management access happens over OOB
    (Mgmt) ports and everything is pushed by Ansible playbooks. Ansible
    generates configuration from templates and pushes it to the switches
    via NETCONF. I would never want to do this by hand. This demands a
    certain level of structure from every team (network, people doing the
    cabling, server team) but it works out well for structured setups.
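
To give an idea of the "configuration from templates" step, here is a minimal sketch in Python. It is not the Ansible/Jinja2 setup described above, just the same idea using the standard library. The group name, the vNNN VLAN naming and the VNI numbering (vni_base + VLAN ID) are all assumptions made up for illustration, and pushing the result over NETCONF is left out.

```python
from string import Template

# One VLAN stanza with its VNI mapping, indented to sit inside
# groups > <group> > vlans.
VLAN_TEMPLATE = Template(
    "            $name {\n"
    "                vlan-id $vlan_id;\n"
    "                vxlan {\n"
    "                    vni $vni;\n"
    "                }\n"
    "            }\n"
)


def render_group(group_name, vlan_ids, vni_base=10000):
    """Render a Junos configuration group holding one stanza per VLAN.

    Assumption: VNI = vni_base + VLAN ID. This is a hypothetical
    numbering scheme, not one taken from the post.
    """
    vlans = "".join(
        VLAN_TEMPLATE.substitute(name=f"v{vid}", vlan_id=vid, vni=vni_base + vid)
        for vid in sorted(vlan_ids)
    )
    return (
        "groups {\n"
        f"    {group_name} {{\n"
        "        vlans {\n"
        f"{vlans}"
        "        }\n"
        "    }\n"
        "}\n"
    )


if __name__ == "__main__":
    print(render_group("evpn-vlans", [200, 100]))
```

Keeping the generated statements inside a group means the tool owns that group completely and can replace it wholesale on every run.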
    Our switch config looks like this:
    user@sw1-spine-pod1> show configuration
    ## Last commit: 2019-03-11 03:13:49 CET by user
    ## Image name: jinstall-host-qfx-5-flex-17.4R2-S2.3-signed.tgz
    version 17.4R1-S3.3;
    groups {
        /* Created by Ansible */
        evpn-defaults { /* OMITTED */ };
        /* Created by Ansible */
        evpn-spine-defaults { /* OMITTED */ };
        /* Created by Ansible */
        evpn-spine-1 { /* OMITTED */ };
        /* Created by Ansible - Empty group for maintenance operations */
    }
    apply-groups [ evpn-defaults evpn-spine-defaults evpn-spine-1 ];
    So everything Ansible does is contained in apply-groups and is hidden. You can
    immediately spot if something is configured by hand.
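
The "spot it immediately" check can also be automated. The following Python sketch is an illustration under stated assumptions, not part of the setup described above: it walks `show configuration` output by brace depth and reports any top-level statement other than version, groups and apply-groups. The expected set and the function names are assumptions, and braces inside quoted strings are not handled.

```python
import re

# Assumption: on a fully automated switch only these appear at top level.
EXPECTED_TOP_LEVEL = {"version", "groups", "apply-groups"}


def top_level_statements(config: str) -> list:
    """Return the first keyword of every top-level statement."""
    # Drop /* ... */ annotations so braces inside them are not counted.
    text = re.sub(r"/\*.*?\*/", "", config, flags=re.S)
    names = []
    depth = 0
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and "## Last commit" style headers
        if depth == 0:
            names.append(line.split()[0].rstrip(";"))
        depth += line.count("{") - line.count("}")
    return names


def hand_configured(config: str) -> list:
    """Top-level statements that did not come from the expected set."""
    return [n for n in top_level_statements(config) if n not in EXPECTED_TOP_LEVEL]


# Hypothetical config: the interfaces stanza stands in for a manual change.
SAMPLE = """\
## Last commit: 2019-03-11 03:13:49 CET by user
version 17.4R1-S3.3;
groups {
    /* Created by Ansible */
    evpn-defaults { /* OMITTED */ };
}
apply-groups [ evpn-defaults ];
interfaces {
    xe-0/0/1 {
        description "added by hand";
    }
}
"""

if __name__ == "__main__":
    print(hand_configured(SAMPLE))
```

Anything the function returns is a candidate for either removal or for moving into a template.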
    For code, we're currently running the 17.4 train, which works mostly
    fine. We had a few problems with third-party 40G optics, but these
    should be fixed in the newest 17.4 service release.
    We also had a problem where new Spine/Leaf links did not come up, but
    it vanished after rebooting/upgrading the spines.
    In daily operations it proves to be quite stable.
    Best Regards
    GPG Key: 0x58A2D94A93A0B9CE (F4F6 B1A3 866B 26E9 450A  9D82 58A2 D94A 93A0 B9CE)
                -- Terry Pratchett, The Fifth Elephant
