[e-nsp] Packet loss/drop investigation (Resolved)

Youssef Ghorbal youssef.ghorbal at gmail.com
Wed Aug 22 11:27:38 EDT 2012


Hello,

 Extreme Support guys just found the root cause of this issue.
 It comes from an interaction between Netlogin, Spanning Tree and the
auto-bind feature.

 Netlogin is configured to use "Default" vlan as its default vlan :
 configure netlogin vlan Default

 It happens that by default "auto-bind" is enabled on the "Default" vlan :
 enable stpd s0 auto-bind vlan default

 Whenever an authenticated mac address ages out, the port gets moved
back to the Default vlan (Netlogin working as expected here). This
event triggers a topology change (due to the auto-bind on the Default
vlan).

 Here we have two unexpected behaviours :
 1 - The auto-bind should not be active when the s0 instance is
disabled (which was the case)
 2 - When a topology change is triggered, the switch is supposed to go
to a "temporary flood" mode but it does not. It drops the packets
instead of flooding them (until stp converges again, which takes ~10s)
 => Engineering is working on those two.

 Workaround : disable auto-bind on Default vlan, or use another vlan
for Netlogin.
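
 As a rough sketch of both options (assuming the stpd is s0 as above;
"nl_default" is only a placeholder vlan name, and changing the Netlogin
vlan may require Netlogin to be disabled first) :
 disable stpd s0 auto-bind vlan Default
 or
 create vlan nl_default
 configure netlogin vlan nl_default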

Youssef
----------------------
On Fri, Aug 10, 2012 at 7:20 PM, Youssef Ghorbal
<youssef.ghorbal at gmail.com> wrote:
> More on the subject.
> I mirrored the port directly connected to the phone; the traffic
> dump clearly shows egress packet loss for 10s.
> I then mirrored the uplink port of the stack, and that traffic dump
> does not show any ingress packet loss.
>
> => packets do arrive correctly from the upstream switch and get
> dropped somewhere between the ingress port (the uplink) and the
> egress port (connected to the phone)
>
> The traffic dump also shows that during the 10s of packet loss a
> high rate of arp requests hits the switch (~2000/s)
>
> This suggests that some DoS protection feature is being triggered on
> the stack. The problem is that I have no DoS/flood protection enabled
> in the configuration. Also, all the DoS/flood related counters I'm
> aware of are zero (show dos-protect, show ip-security, show ports
> rate-limit, show ports congestion)
>
> I'll try to generate fake arp requests at a high rate and see if the
> problem triggers. If the problem is reproducible at will, it will be
> easier to push the diagnosis further.
>
> Youssef
> --------------------
> On Thu, Aug 9, 2012 at 6:15 PM, Youssef Ghorbal
> <youssef.ghorbal at gmail.com> wrote:
>>> Well, the only thing I can propose is to mirror the port where the
>>> phone is and dump the traffic. Then analyze.
>>
>> I'm pretty sure that there is packet loss. The traffic dump will
>> confirm that but I'm not sure I'll be able to see where it's
>> happening.
>>
>> I'll do it anyway to confirm the packet loss and see if there are
>> any 802.3x pause frames flying around. After that I'll give Erik's
>> suggestion a shot.
>>
>> I'll let you know as soon as I find the root cause of this.
>>
>> Youssef

