[c-nsp] Cisco Wireless VOIP IP phone issue

Beck, Andre cisco-nsp at ibh.net
Wed Apr 10 17:27:59 EDT 2013


On Thu, Mar 28, 2013 at 01:09:37PM -0400, Zach Hill wrote:
> Does anyone know if there is a way to make an IP phone update its
> access-point more often? We're having an issue where traveling between two
> access points while on a call doesn't seem to poll for the strongest
> wireless signal often enough. The symptoms include bi-directional voice
> issues due to the low signal strength.

While the 7925 is in fact not roaming perfectly, it isn't that bad,
provided it gets enough information from the wireless side (for instance
AP transmit power so it can behave accordingly and adapt its own transmit
power to avoid one-way issues).

> It seems the phones attempt to stay with the access-point they were on when
> the call was made until it cannot reach it at all even at huge signal loss.

That's usually not what I see. I had a year-long TAC case (or rather a
bunch of TAC cases) in which we learned quite a bit about the roaming
algorithm in those phones, and what remains unsolved is:

1) Scanning too slowly. The 7925 in 2.4GHz will do an off-channel
   active scan every 2s for exactly one channel; in 5GHz it will do
   that every 1s. That's not really agile enough, specifically in 5GHz
   with a lot of (bonded) channels. It optimizes for hot channels, but
   in general, it easily loses track when you run around a corner in
   a really bad building. Typical effect is a voice gap of 2s and up
   to 6s when it goes full rescan.

   Other vendors' phones scan at a higher rate, or scan at a similar
   rate but when going off-channel, they scan for multiple channels
   at once (there's quite some time between two RTP frames).

2) Running blindly into the abyss. The 7925, when losing the current
   associated AP (meaning RTP packet loss [up or down] exceeds a certain
   threshold, which is not very high) will try to roam to the next
   AP it considers good. Sadly, this "goodness" is stale historic
   data (because of (1) it may be 6s old in 2.4GHz and even older
   in 5GHz). There is absolutely no guarantee this AP is still
   reachable, yet the 7925 starts .11 Assoc with it blindly, exposing
   itself to standard .11 management frame timeouts. If *that*
   goes wrong, be prepared for another 2s gap, or a full rescan with
   a 6s or such gap.

   I proposed to Cisco that they probe the AP before actually
   roaming to it. This would be just one unicast ProbeRequest (maybe
   even broadcast on the correct channel, would help to refresh stale
   data at the same time) and would avoid the jump into the void. I
   don't think it has been implemented.

   Other vendors do pre-roam probes - imagine my relief when I found
   that a certain device with better roaming behavior than the 7925
   does exactly what I had proposed to Cisco a year before (I
   unexpectedly saw it probing before roaming while analyzing a
   wireless sniffer trace).
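
To put rough numbers on (1) and to illustrate the probe-before-roam idea
from (2), here is a minimal Python sketch. It is not the 7925's actual
algorithm. It assumes a plain round-robin scan of one channel per
off-channel visit (the phone reportedly favors hot channels, which this
ignores), a 2.4GHz scan list of three channels, and a 5GHz list of 20
channels. The names Candidate, pick_roam_target and the probe callback
are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


def worst_case_age_s(scan_interval_s: float, channels: int) -> float:
    """Worst-case age of a scan entry when one channel is scanned per
    interval in round-robin order (a simplifying assumption)."""
    return scan_interval_s * channels


@dataclass
class Candidate:
    bssid: str
    channel: int
    rssi_dbm: int   # last RSSI seen for this AP, possibly stale
    age_s: float    # seconds since this entry was last refreshed


def pick_roam_target(
    candidates: Iterable[Candidate],
    probe: Callable[[Candidate], bool],
    max_age_s: float = 1.0,
) -> Optional[Candidate]:
    """Walk candidates from strongest to weakest and return the first one
    that is either fresh enough or answers a probe. Returns None if no
    candidate qualifies, in which case the caller would have to rescan."""
    for cand in sorted(candidates, key=lambda c: c.rssi_dbm, reverse=True):
        if cand.age_s <= max_age_s or probe(cand):
            return cand
    return None


# Assumed scan lists: 3 channels in 2.4GHz, 20 channels in 5GHz.
print(worst_case_age_s(2.0, 3))    # 2.4GHz: 2s per channel
print(worst_case_age_s(1.0, 20))   # 5GHz: 1s per channel
```

With those assumed channel counts the 2.4GHz figure matches the 6s
mentioned above, and the 5GHz figure grows with every channel enabled.
The point of the probe callback is that a strong but stale entry gets
checked with one cheap frame instead of a full association attempt.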

What you describe, however, doesn't really sound like 7925s - they
usually roam with far more agility than normal STAs. They only fail when
the field changes too fast for them to follow, which sadly can happen
at walking speed in bad buildings. Cisco will then tell you it's your
fault because you didn't follow the deployment guides. Turns out you
simply can't implement them in steel-concrete mazes of little passages,
all alike - passing from one shadow zone (new APs not yet visible) to
another shadow zone (old APs no longer visible) is deadly with a certain
probability, and then blindly roaming to all the other ex-good APs
(all meanwhile shadowed as well) until finally giving up and rescanning
is the typical scenario for a 6(+)s gap.
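
Gaps like these are easy to pick out once you have RTP arrival times for
one direction of a call. A minimal sketch, assuming you have already
exported the arrival timestamps in milliseconds from a capture (the
export step is not shown) and that the codec sends a packet every 20 ms;
find_gaps and the 200 ms threshold are illustrative choices, not
anything the phone or the WLC reports:

```python
from typing import List, Sequence, Tuple


def find_gaps(arrival_times_ms: Sequence[int],
              min_gap_ms: int = 200) -> List[Tuple[int, int]]:
    """Return (start_ms, length_ms) for every pause between consecutive
    RTP packets that lasts at least min_gap_ms."""
    times = sorted(arrival_times_ms)
    gaps = []
    for prev, cur in zip(times, times[1:]):
        if cur - prev >= min_gap_ms:
            gaps.append((prev, cur - prev))
    return gaps


# Steady 20 ms stream with one 2 s hole after the packet at t=40 ms.
print(find_gaps([0, 20, 40, 2040, 2060]))
```

Running this per direction shows whether the loss is up, down, or both,
which matters because the phone reacts to RTP loss in either direction.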

If you see worse behavior than that, maybe:

* Your 7925s don't see enough of your APs
* Your 7925s see too many of your APs (AFAIR >32 is really bad)
* You use all 5GHz channels and the 7925s scan too slowly to keep up
* You have a strong mismatch between powers AP->Phone and Phone->AP due
  to missing IEs from the infrastructure
* 7925s are missing other hints from the infrastructure (though in our
  endeavours, we had the impression they aren't using anything the Cisco
  WLCs provide like neighbor AP hints or roaming thresholds, so that's
  probably not too much of an issue).

The only thing I can advise: Start wireless sniffing and run around
with 7925s, USB-connected to a laptop, running syslog to the laptop
and elevated debugging to get a roaming log from the phone. Elevated
debugging is really bad for voice quality, so don't use it on anything
but test phones and don't read too much into the voice gaps you WILL
have when running with debugging. The interesting result is the roam
log where you can get a feeling for when the phone roams, why it does
(or doesn't) roam, and what happens next. Together with a wireless
sniffer trace (or rather N traces, one per channel) it's only a matter
of months to find the real culprit.
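
If you don't have a syslog daemon handy on the laptop, a few lines of
Python will do for collecting the messages. This is a minimal sketch,
assuming the phone sends ordinary UDP syslog with a leading <PRI> field
to the laptop's address; the port 5514 and the file name roamlog.txt are
arbitrary choices here, and nothing in it is specific to the 7925's log
format. Each line is stamped with the laptop's clock so it can be lined
up against the sniffer traces.

```python
import re
import socket
import time

PRI_RE = re.compile(r"^<(\d{1,3})>")


def parse_pri(message: str):
    """Split a syslog message into (facility, severity, rest).
    PRI encodes facility * 8 + severity; if there is no <PRI> prefix,
    facility and severity are None and the message is returned as-is."""
    m = PRI_RE.match(message)
    if not m:
        return None, None, message
    pri = int(m.group(1))
    return pri // 8, pri % 8, message[m.end():]


def listen(host: str = "0.0.0.0", port: int = 5514,
           logfile: str = "roamlog.txt") -> None:
    """Receive UDP syslog datagrams forever and append them to logfile,
    prefixed with the local receive time in epoch seconds."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock, \
            open(logfile, "a") as out:
        sock.bind((host, port))
        while True:
            data, (src, _port) = sock.recvfrom(4096)
            _fac, sev, text = parse_pri(data.decode("utf-8", "replace"))
            out.write(f"{time.time():.3f} {src} sev={sev} {text.rstrip()}\n")
            out.flush()
```

Call listen() to start collecting; binding to the standard syslog port
514 instead usually needs elevated privileges.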

Welcome to VoWLAN...

                    Cool .signatures are so 90s...

-> Andre Beck    +++ ABP-RIPE +++      IBH IT-Service GmbH, Dresden <-
