[j-nsp] EX4550 (Un-)known unicast flooding at session start for up to 100ms

Brant Ian Stevens branto at argentiumsolutions.com
Thu Aug 17 09:53:14 EDT 2017


I've run into some particularly nasty unicast flooding issues on the 
QFX5100 when the Hypervisor and Guest OS versions of JunOS were 
mismatched (14.x and 13.x respectively).

> Tim <mailto:derherrwagner+jnsp at gmail.com>
> August 14, 2017 at 4:04 AM
> Hi Pavel,
>
> Not sure if it's related, but it is very interesting. I checked the MAC
> learning log on several 4550s and found the learn/delete indicator
> recurring minute by minute. I think we will increase the entry count per
> index and see whether anything gets better (or worse).
>
> Regards,
> Tim
> _______________________________________________
> juniper-nsp mailing list juniper-nsp at puck.nether.net
> https://puck.nether.net/mailman/listinfo/juniper-nsp
> Pavel Lunin <mailto:plunin at gmail.com>
> August 12, 2017 at 5:54 AM
> Not sure this is specifically related to the OP but there is a known
> hardware limitation/feature of some Broadcom chips used in many EX
> switches of the previous generations (including EX4500). They use a hash
> table to reduce MAC lookup length, which was too small in some older
> JUNOS versions, leading to hash collisions and consequent MAC learning
> failures in some scenarios.
>
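To illustrate the failure mode described above, here is a minimal sketch of a bucketed MAC table with a fixed number of entries per index. It is only a toy model: the CRC32 hash, the table layout and the names `BucketedMacTable` and `count_failures` are assumptions for illustration, not what the Broadcom silicon or JUNOS actually do. The `bucket_depth` parameter loosely plays the role of the per-index entry count discussed in this thread.

```python
import zlib


class BucketedMacTable:
    """Toy MAC table: a hash picks a bucket, each bucket holds a few entries."""

    def __init__(self, num_buckets: int, bucket_depth: int):
        self.buckets = [[] for _ in range(num_buckets)]
        self.bucket_depth = bucket_depth

    def _index(self, mac: str, vlan: int) -> int:
        # Stand-in hash (CRC32); the real hardware hash is not known here.
        key = f"{vlan}:{mac.lower()}".encode()
        return zlib.crc32(key) % len(self.buckets)

    def learn(self, mac: str, vlan: int) -> bool:
        """Try to learn an address; return False if its bucket is full."""
        bucket = self.buckets[self._index(mac, vlan)]
        entry = (vlan, mac.lower())
        if entry in bucket:
            return True
        if len(bucket) >= self.bucket_depth:
            # Hash collision with a full bucket: the address stays unknown,
            # so frames sent to it would keep being flooded.
            return False
        bucket.append(entry)
        return True

    def is_known(self, mac: str, vlan: int) -> bool:
        return (vlan, mac.lower()) in self.buckets[self._index(mac, vlan)]


def count_failures(num_macs: int, num_buckets: int, bucket_depth: int) -> int:
    """Learn num_macs synthetic addresses in VLAN 10 and count the failures."""
    table = BucketedMacTable(num_buckets, bucket_depth)
    failures = 0
    for i in range(num_macs):
        mac = "02:00:00:%02x:%02x:%02x" % ((i >> 16) & 0xFF, (i >> 8) & 0xFF, i & 0xFF)
        if not table.learn(mac, vlan=10):
            failures += 1
    return failures


if __name__ == "__main__":
    for depth in (2, 4, 8):
        failed = count_failures(num_macs=2000, num_buckets=512, bucket_depth=depth)
        print(f"entries per index = {depth}: {failed} addresses not learned")
```

In this model a deeper bucket can only reduce the number of addresses that fail to be learned, at the cost of a longer lookup per index.
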
> Links worth checking:
>
> https://www.juniper.net/documentation/en_US/junos/topics/reference/configuration-statement/mac-lookup-length-edit-ethernet-switching-options.html
> https://forums.juniper.net/t5/Ethernet-Switching/EX4500-show-ethernet-switching-hash-collisions/td-p/204849
> https://yingsnotebook.wordpress.com/2017/03/20/mac-address-learning-problem-in-juniper-ex-switch/
>
> Aaron Gould <mailto:aaron1 at gvtc.com>
> August 11, 2017 at 8:14 PM
> Sorry to hear that.
>
> I may have mentioned previously to someone else on the list... my 
> EX4550's are rock solid... over 4 years uptime... I run our 
> mirrored/redundant data centers behind them (hp 3par, hypervisors, 
> etc) and our internet cdn caches behind them too...(Akamai, netflix, 
> and that other one)
>
> A pair of 4550's in top of racks virtual chassis'ed together
>
> Here they are...
>
> root at sabn-dcvc-4550> show system uptime | grep "up|fpc"
> fpc0:
> 7:07PM up 1516 days, 5:51, 0 users, load averages: 0.09, 0.11, 0.08
> fpc1:
> 7:07PM up 1516 days, 5:51, 2 users, load averages: 0.22, 0.17, 0.16
>
>
> root at stlr-dcvc-4550> show system uptime | grep "up|fpc"
> fpc0:
> 7:09PM up 1520 days, 4:21, 0 users, load averages: 0.11, 0.15, 0.14
> fpc1:
> 7:09PM up 1520 days, 4:42, 1 user, load averages: 0.25, 0.22, 0.18
>
>
> root at stlr-dcvc-4550> show version | grep "fpc|model|boot"
> fpc0:
> --------------------------------------------------------------------------
> Model: ex4550-32f
> JUNOS Base OS boot [12.2R4.5]
> fpc1:
> --------------------------------------------------------------------------
> Model: ex4550-32f
> JUNOS Base OS boot [12.2R4.5]
>
> ....run a bunch of vlans/stp...
>
> {master:1}
> root at stlr-dcvc-4550> show spanning-tree bridge brief | grep "protocol|Vlan"
> Enabled protocol : RSTP
>
> STP bridge parameters for VLAN 12
> Enabled protocol : RSTP
>
> STP bridge parameters for VLAN 1000
> Enabled protocol : RSTP
>
> STP bridge parameters for VLAN 10
> Enabled protocol : RSTP
>
> STP bridge parameters for VLAN 11
> Enabled protocol : RSTP
>
> STP bridge parameters for VLAN 210
> Enabled protocol : RSTP
>
> STP bridge parameters for VLAN 204
> Enabled protocol : RSTP
>
> STP bridge parameters for VLAN 202
> Enabled protocol : RSTP
>
> STP bridge parameters for VLAN 201
> Enabled protocol : RSTP
>
> STP bridge parameters for VLAN 14
> Enabled protocol : RSTP
>
> STP bridge parameters for VLAN 13
> Enabled protocol : RSTP
>
> STP bridge parameters for VLAN 1006
> Enabled protocol : RSTP
>
> STP bridge parameters for VLAN 101
> Enabled protocol : RSTP
>
> STP bridge parameters for VLAN 921
> Enabled protocol : RSTP
>
> STP bridge parameters for VLAN 2
> Enabled protocol : RSTP
>
> STP bridge parameters for VLAN 2202
> Enabled protocol : RSTP
>
> STP bridge parameters for VLAN 3
> Enabled protocol : RSTP
>
> STP bridge parameters for VLAN 4
>
> Tim <mailto:derherrwagner+jnsp at gmail.com>
> August 11, 2017 at 5:59 AM
> Hi everybody,
>
> For the last half year (since we extended some monitoring/logging) we
> have observed storm control triggering across the network multiple
> times a day. We traced it back and found that it happens regularly when
> heavy streams are initiated (database copy jobs, vMotion, etc.). It
> seems that the EX4550 switches in the data path need up to 100 ms for
> MAC learning before they stop the (un-)known unicast flooding.
> Because of the heavy nature of the traffic streams, we are talking
> about 7-8 MB of flooded traffic during those 100 ms.
>
> The topology is simple and straightforward.
>
> SRV1 - EX4550 - QFX5100 - EX4550 - SRV2
>
> A sniffing server is attached to a normal port on the EX4550 switch
> near SRV1. "Normal port" means that it is deliberately not a mirror
> port, so that it only receives the flooded traffic.
>
> We've opened a case with our service partner, but received an at best
> disputable answer: "Hey, works as designed. Normal behavior, like Cisco,
> etc."
>
> In my 15 years of personal experience with Cisco, Extreme/Enterasys
> Networks and Nortel Networks, I've never seen this magnitude of MAC
> learning latency / unicast flooding.
> If this behavior were normal, it would mean that just a dozen
> concurrent heavy streams would fuck up all other ports, and even a 40G
> uplink would have no chance and would get heavily overloaded in this
> time frame.
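
For a sense of scale, the figures above can be turned into bit rates. This is a back-of-the-envelope sketch only: it assumes decimal megabytes, that the 7-8 MB is spread evenly over the 100 ms window, and that a dozen streams hit their learning windows at the same moment. The function name `flood_rate_bps` is made up for illustration.

```python
MB = 1_000_000      # assumption: decimal megabytes
WINDOW_S = 0.100    # learning window reported above (100 ms)
STREAMS = 12        # "a dozen concurrent heavy streams"


def flood_rate_bps(flooded_bytes: float, window_s: float) -> float:
    """Average bit rate of the flooded traffic over the learning window."""
    return flooded_bytes * 8 / window_s


for megabytes in (7, 8):
    per_stream = flood_rate_bps(megabytes * MB, WINDOW_S)
    aggregate = STREAMS * per_stream
    print(
        f"{megabytes} MB in 100 ms: {per_stream / 1e6:.0f} Mbit/s per stream, "
        f"{aggregate / 1e9:.2f} Gbit/s for {STREAMS} overlapping streams"
    )
```

Under these assumptions each stream floods at roughly 560 to 640 Mbit/s for the duration of the window, and that traffic goes out of every port in the VLAN. How badly a given port suffers depends on its speed and on how many learning windows overlap.
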
>
> Which brings me to my question: does anyone here have any experience
> with "typical" MAC learning latencies or similar problems with Juniper
> / $vendor like this?
>
> Best regards,
> Tim

-- 
Regards,
--
Brant I. Stevens, Principal & Consulting Architect
branto at argentiumsolutions.com
d:212.931.8566, x101. m:917.673.6536. f:917.525.4759.
http://argentiumsolutions.com


