[cisco-voip] Phone Keepalives
CarlosOrtiz at bayviewfinancial.com
Tue Feb 5 14:50:32 EST 2008
Wes,
Thanks for the extra info - makes sense. The phones I have in the UK are
7960s and my phone is a 7961. This actually occurred between myself and
someone in the UK earlier today. Here are the rough stats of me pinging
this phone for about 15 minutes.
Ping statistics for 10.X.X.X:
Packets: Sent = 1434, Received = 1429, Lost = 5 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 133ms, Maximum = 780ms, Average = 166ms
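The "0% loss" in the summary is a whole-number percentage, so it hides a small but nonzero loss rate. A quick check in Python, using only the two counters from the output above (the variable names are illustrative):

```python
# Counters copied from the ping summary above.
sent = 1434
received = 1429

lost = sent - received
loss_pct = 100.0 * lost / sent

print(f"lost={lost} loss={loss_pct:.2f}%")  # lost=5 loss=0.35%
```

That works out to roughly one lost packet in every 290, which is easy to miss in the summary line but still enough to hit a keepalive now and then.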
If I were to get a packet drop on one of these SCCP acks, would that
basically kill the connection since the other device would not continue
the conversation? Hence leading me to the "CM Down, Features disabled"
message?
My only recourse at this point is to tell them we need a more reliable
connection, correct?
Wes Sisk <wsisk at cisco.com>
02/05/2008 02:15 PM
To
CarlosOrtiz at bayviewfinancial.com
cc
Robert Kulagowski <rkulagow at gmail.com>, cisco-voip at puck.nether.net
Subject
Re: [cisco-voip] Phone Keepalives
Carlos,
'Registration' and timing appear to be the topic du jour; this is the 4th
conversation about it already today.
There are at least 2 factors at play when it comes to registration. Let
me expand on those:
1. TCP session errors - SCCP/Skinny works over TCP/IP. When TCP transmits
a segment that segment must be acknowledged by the peer. In the case of
an SCCP keepalive exchange:
phone                cm
  -> sccp ka
  <- tcp ack
  <- sccp ka ack
  -> tcp ack
Normal TCP retransmit rules apply. Normal TCP session management also
applies. TCP FIN/RST can abort the session. ICMP messages such as host
unreachable, net unreachable, port unreachable, may also apply. Otherwise
the phone/CM will retransmit until TCP MaxRetransmits. On the 7940/60 TCP
will retransmit up to 5 times for a maximum of 15 seconds (this was the
last value I have documented; it may have changed). On the 3rd gen phones
7941,61,70,71,42,62, etc the maximum retransmit time is much shorter. I've
seen reports as short as 4 retransmits each after 300 ms (less than 2
seconds total). I do not have hard numbers handy on those.
If you have an outage of 15 seconds at the exact instant when the phone
needs to send an SCCP keepalive, then the phone is going to unregister and
report "CM down features disabled". The TCP/IP network must be stable and
working.
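To make the retransmit windows above concrete, here is a minimal Python sketch. It assumes the worst case described above (the outage starts just as the phone sends its keepalive) and treats the session as lost once the retransmit window has elapsed. The class and field names are hypothetical, the 3rd gen window assumes no backoff between the 300 ms retransmits, and the numbers are only the approximate figures quoted above.

```python
from dataclasses import dataclass


@dataclass
class RetransmitProfile:
    name: str
    max_retransmits: int
    give_up_after_s: float  # total time before TCP abandons the segment

    def survives(self, outage_s: float) -> bool:
        """True if an outage of this length ends before TCP gives up."""
        return outage_s < self.give_up_after_s


# 7940/60: up to 5 retransmits over a maximum of 15 seconds (figure from the text).
GEN2 = RetransmitProfile("7940/60", max_retransmits=5, give_up_after_s=15.0)
# 3rd gen: 4 retransmits at 300 ms each, assuming no backoff (about 1.2 s).
GEN3 = RetransmitProfile("7941/61/70/71", max_retransmits=4, give_up_after_s=4 * 0.3)

for outage in (0.5, 2.0, 10.0, 20.0):
    for profile in (GEN2, GEN3):
        verdict = "stays registered" if profile.survives(outage) else "unregisters"
        print(f"{outage:>5.1f}s outage, {profile.name}: {verdict}")
```

Under these assumptions a 2 second outage is harmless to a 7960 but enough to drop a 3rd gen phone, which would fit a 7961 seeing the message over a lossy WAN link.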
2. keepalive errors - This is completely implemented at the SCCP level, so
above TCP. CM allows missing 2x keepalives from the phone; most SCCP
endpoints support missing 1 SCCP KeepAliveAck from CM. These are not
universally supported, as seen in CSCef31887.
The vast majority of the time we see:
Phone believes it failed because of TCP timeout, TCP reset, or TCP fin.
This is normal since the phone is responsible for initiating SCCP KA. It
has to send data over the network and has to receive a response.
CM believes the phone failed because of "device initiated reset" or
"keepalive timeout". "device initiated reset" is a misnomer, see
CSCsa66536. CM is sitting waiting to receive SCCP KA from the phone. When
the phone does not send then CM aborts the session. Note CM institutes
timeout at the SCCP level (~90 seconds) while the phone institutes timeout
at the TCP level (~15 seconds).
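The difference between the two timeouts can be sketched in a few lines of Python. The 30 second keepalive interval is an assumption (the thread only says the interval is a cluster-wide parameter); it is chosen because, with CM tolerating 2 missed keepalives, it reproduces the ~90 second figure above. The function names are hypothetical, and the model simply compares the two windows rather than tracking when each timer starts.

```python
KEEPALIVE_INTERVAL_S = 30.0   # assumption: not stated in the thread
CM_MISSED_ALLOWED = 2         # from the text: CM tolerates 2 missed keepalives
PHONE_TCP_GIVE_UP_S = 15.0    # from the text: ~15 s on the 7940/60


def cm_timeout_s(interval_s=KEEPALIVE_INTERVAL_S, missed_allowed=CM_MISSED_ALLOWED):
    """CM aborts when one keepalive beyond the allowed misses fails to arrive."""
    return interval_s * (missed_allowed + 1)


def first_to_give_up(phone_tcp_s=PHONE_TCP_GIVE_UP_S):
    """Return which side declares the session dead first, and after how long."""
    cm = cm_timeout_s()
    return ("phone", phone_tcp_s) if phone_tcp_s < cm else ("cm", cm)


print(cm_timeout_s())       # 90.0
print(first_to_give_up())   # ('phone', 15.0)
```

This is why the phone usually reports the failure first: its TCP-level window is several times shorter than CM's SCCP-level window.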
/Wes
CarlosOrtiz at bayviewfinancial.com wrote:
Not the case here, as this Subscriber has many other phones registered in
the US with no problems. As Wes said, I suspect a network issue, but I was
hoping to change the keepalive timer for those phones to decrease the
chance that a single missed keepalive would cause the message to appear
and invoke a failover. This way when someone hangs up the phone a
failover would not be invoked automatically. That's my understanding of
the process anyway...
Carlos
Robert Kulagowski <rkulagow at gmail.com>
Sent by: cisco-voip-bounces at puck.nether.net
02/05/2008 11:50 AM
To
cisco-voip at puck.nether.net
cc
Subject
Re: [cisco-voip] Phone Keepalives
Wes Sisk wrote:
> sccp keepalive interval is a cluster wide parameter.
>
> Sounds like you definitely have spotty network connectivity. Have to
> stabilize that.
But couldn't it also be a runaway process that's hogging CPU? I just
ran into a situation where a javaw process was spiking to 100% often
enough that phones connected to that subscriber were showing "CM Down".
_______________________________________________
cisco-voip mailing list
cisco-voip at puck.nether.net
https://puck.nether.net/mailman/listinfo/cisco-voip