[j-nsp] EVPN/VXLAN experience

Andrey Kostin ankost at podolsk.ru
Fri Mar 22 09:46:52 EDT 2019


Thank you, Sebastian, for sharing your very valuable experience.

Kind regards,
Andrey

Sebastian Wiesinger wrote on 2019-03-22 04:39:
> * Andrey Kostin <ankost at podolsk.ru> [2019-03-15 20:50]:
>> I'm interested to hear about experience of running EVPN/VXLAN,
>> particularly with QFX10k as L3 gateway and QFX5k as spine/leaves. As
>> per the docs, it should be immune to any single switch downtime, so it
>> might be a candidate for a truly redundant design.
> 
> All right here it goes:
> 
> I can't speak for the QFX10k as spine, but we have QFX5100 leaf/spine
> setups running EVPN/VXLAN right now. Switch downtime is no problem at
> all: we unplugged a running switch, shut down ports, and pulled cables
> between leaf & spine or leaf & client, all while storage traffic (NFS)
> was active in the setup. The worst thing that happened was that IOPS
> dropped from 400k/s to 100k/s for 1-3 seconds.
> 
> What did bother us is that you are limited (at least on the QFX5100)
> in the number of "VLANs" (VNIs). We were testing with 30 client
> full-trunk ports per leaf, and at that scale you can only provision
> around 500 VLANs before you get errors; essentially you run out of
> memory for bridge domains on the switch. This seems to be a limitation
> of the chipset used in the QFX5100, at least that's what I was told
> when I asked about it.
> 
> You can check it if you know where to look:
> 
> root@SW-A:RE:0% ifsmon -Id | grep IFBD
>          IFBD                       :    12884      0
> 
> root@SW-A:RE:0% ifsmon -Id | grep Bridge
>          Bridge Domain              :     3502       0
> 
> These numbers combined need to be <= 16382.
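> 
> As a rough back-of-the-envelope check (assuming every trunk port
> carries every VLAN): 30 full-trunk ports x ~500 VLANs is about 15,000
> IFBD entries, plus the ~500 bridge domains themselves, which already
> comes close to that 16382 ceiling.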
> 
> And if you go over the limit, these nice errors occur:
> 
> dcf_ng_get_vxlan_ifbd_hw_token: Max vxlan ifbd hw token reached 16382
> ifbd_create_node: VXLAN IFBD hw token couldn't be allocated for <xe-...>
> 
> The workaround is to reduce the number of VLANs or prune the trunk
> config.
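> 
> A minimal sketch of a pruned trunk (interface and VLAN names here are
> made up):
> 
>   set interfaces xe-0/0/30 unit 0 family ethernet-switching interface-mode trunk
>   set interfaces xe-0/0/30 unit 0 family ethernet-switching vlan members [ v100 v101 v102 ]
> 
> i.e. only list the VLANs a port actually needs instead of stretching
> every VNI to every trunk port.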
> 
> Also, you absolutely NEED LACP from the servers to the fabric. 17.4 has
> enhancements that put the client ports into LACP standby when a leaf
> gets separated from all spines.
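> 
> A minimal sketch of such a multihomed client LAG on a leaf (the ESI,
> LACP system-id and interface names are made up; both leaves the server
> connects to would share the same ESI and system-id):
> 
>   set interfaces xe-0/0/10 ether-options 802.3ad ae10
>   set interfaces ae10 esi 00:01:01:01:01:01:01:01:01:01
>   set interfaces ae10 esi all-active
>   set interfaces ae10 aggregated-ether-options lacp active
>   set interfaces ae10 aggregated-ether-options lacp system-id 00:00:00:01:01:10
>   set interfaces ae10 unit 0 family ethernet-switching interface-mode trunk
>   set interfaces ae10 unit 0 family ethernet-switching vlan members v100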
> 
>> As a downside I see the more complex configuration, at least: adding a
>> VLAN means adding a routing instance, etc. There are also other
>> questions about convergence, scalability, stability, and code
>> maturity.
> 
> We have it automated with Ansible. Management access happens over the
> OOB (mgmt) ports and everything is pushed by Ansible playbooks: Ansible
> generates the configuration from templates and pushes it to the
> switches via NETCONF. I would never want to do this by hand. It demands
> a certain level of structure from every team (network, the people doing
> the cabling, the server team), but it works out well for structured
> setups.
> 
> Our switch config looks like this:
> 
> --------------------------------------------------------------------------
> user@sw1-spine-pod1> show configuration
> ## Last commit: 2019-03-11 03:13:49 CET by user
> ## Image name: jinstall-host-qfx-5-flex-17.4R2-S2.3-signed.tgz
> 
> version 17.4R1-S3.3;
> groups {
>     /* Created by Ansible */
>     evpn-defaults { /* OMITTED */ };
>     /* Created by Ansible */
>     evpn-spine-defaults { /* OMITTED */ };
>     /* Created by Ansible */
>     evpn-spine-1 { /* OMITTED */ };
>     /* Created by Ansible - Empty group for maintenance operations */
>     client-interfaces;
> }
> apply-groups [ evpn-defaults evpn-spine-defaults evpn-spine-1 ];
> --------------------------------------------------------------------------
> 
> So everything Ansible does is contained in the apply-groups and stays
> hidden; you can immediately spot if something was configured by hand.
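> 
> And if you want to see what the groups actually expand to,
> 
> user@sw1-spine-pod1> show configuration | display inheritance
> 
> shows the inherited statements in place.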
> 
> For code, we're currently running the 17.4 train, which works mostly
> fine. We had a few problems with third-party 40G optics, but these
> should be fixed in the newest 17.4 service release.
> 
> We also had a problem where new spine/leaf links did not come up, but
> that went away after rebooting/upgrading the spines.
> 
> In daily operation it has proven to be quite stable.
> 
> 
> Best Regards
> 
> Sebastian


