[cisco-voip] Phone Keepalives

CarlosOrtiz at bayviewfinancial.com CarlosOrtiz at bayviewfinancial.com
Tue Feb 5 14:50:32 EST 2008


Wes,

Thanks for the extra info - makes sense.  The phones I have in the UK are 
7960s and my phone is a 7961.  This actually occurred between myself and 
someone in the UK earlier today.  Here are the rough stats from me pinging 
this phone for about 15 minutes.

Ping statistics for 10.X.X.X:
    Packets: Sent = 1434, Received = 1429, Lost = 5 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 133ms, Maximum = 780ms, Average = 166ms
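
For what it's worth, the "0% loss" in that summary looks like a rounding 
artifact, since 5 lost out of 1434 is not quite zero.  A quick sketch of 
the arithmetic in Python (the function name loss_percent is just made up 
for illustration):

```python
def loss_percent(sent: int, received: int) -> float:
    """Return packet loss as a percentage of packets sent."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return 100.0 * (sent - received) / sent


# Figures taken from the ping summary above.
pct = loss_percent(1434, 1429)
print(f"{pct:.2f}% loss")  # a fraction of one percent, shown as 0% by ping
```

So the link is dropping roughly one packet in every few hundred, on top of 
the latency spikes shown in the maximum round trip time.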


If I were to get a packet drop on one of these SCCP acks, would that 
basically kill the connection since the other device would not continue 
the conversation? Hence leading me to the "CM Down, Features disabled" 
message?

My only recourse at this point is to tell them we need a more reliable 
connection, correct? 







Wes Sisk <wsisk at cisco.com> 
02/05/2008 02:15 PM

To
CarlosOrtiz at bayviewfinancial.com
cc
Robert Kulagowski <rkulagow at gmail.com>, cisco-voip at puck.nether.net
Subject
Re: [cisco-voip] Phone Keepalives






Carlos,

'Registration' and timing appear to be the topic du jour; this is the 4th 
conversation about it already today. 

There are at least 2 factors at play when it comes to registration.  Let 
me expand on those:
1. TCP session errors - SCCP/Skinny works over TCP/IP. When TCP transmits 
a segment that segment must be acknowledged by the peer.  In the case of 
an SCCP keepalive exchange:
phone                    cm
-> sccp ka 
<- tcp ack
<- sccp ka ack
-> tcp ack

Normal TCP retransmit rules apply.  Normal TCP session management also 
applies.  TCP FIN/RST can abort the session.  ICMP messages such as host 
unreachable, net unreachable, or port unreachable may also apply.  
Otherwise the phone/CM will retransmit until TCP MaxRetransmits.  On the 
7940/60 TCP will retransmit up to 5 times for a maximum of 15 seconds 
(this was the last value I have documented; it may have changed).  On the 
3rd gen phones (7941, 61, 70, 71, 42, 62, etc.) the maximum retransmit 
time is much shorter.  I've seen reports as short as 4 retransmits each 
after 300 ms (less than 2 seconds total).  I do not have hard numbers 
handy on those.
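
To put rough numbers on the retransmit behavior described above, here is a 
small Python sketch.  It assumes a fixed retransmit interval with no 
backoff (which may not match the actual phone firmware) and independent 
packet loss (real outages are usually bursty).  Both function names are 
hypothetical, and the 0.0035 loss rate is only an illustration, roughly 
the 5-in-1434 seen in the ping statistics earlier in this thread.

```python
def retransmit_window_s(retries: int, interval_s: float) -> float:
    """Total time spent retransmitting, assuming a fixed interval (no backoff)."""
    return retries * interval_s


def all_attempts_lost(loss_rate: float, retries: int) -> float:
    """Probability that the original segment and every retransmit are lost,
    assuming each loss is independent of the others."""
    return loss_rate ** (retries + 1)


# 3rd gen report quoted above: 4 retransmits, 300 ms apart.
window = retransmit_window_s(4, 0.3)
print(f"retransmit window: about {window:.1f} s")

# Hypothetical random loss rate of 0.35% per packet.
print(f"chance all 5 attempts are lost: {all_attempts_lost(0.0035, 4):.2e}")
```

Under those assumptions, isolated random loss at that rate should almost 
never exhaust the retransmits.  A sustained outage longer than the 
retransmit window is the more likely way to hit MaxRetransmits.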

If you have an outage of 15 seconds at the exact instant when the phone 
needs to send an SCCP keepalive, then the phone is going to unregister and 
report "CM down features disabled".  The TCP/IP network must be stable 
and working.

2. keepalive errors - This is completely implemented at the SCCP level, 
so above TCP.  CM allows missing 2x keepalives from the phone; most SCCP 
endpoints support missing 1 SCCP KeepAliveAck from CM.  These are not 
universally supported, as seen in CSCef31887.

The vast majority of the time we see:
Phone believes it failed because of TCP timeout, TCP reset, or TCP FIN. 
This is normal since the phone is responsible for initiating SCCP KA.  It 
has to send data over the network and has to receive a response.
CM believes the phone failed because of "device initiated reset" or 
"keepalive timeout".  "device initiated reset" is a misnomer, see 
CSCsa66536.  CM is sitting waiting to receive SCCP KA from the phone.  
When the phone does not send, CM aborts the session.  Note CM institutes 
its timeout at the SCCP level (~90 seconds) while the phone institutes 
its timeout at the TCP level (~15 seconds).
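
As a rough illustration of why the phone usually gives up before CM does, 
here is a minimal Python sketch of the phone side.  It assumes a 30 second 
keepalive interval (my assumption for illustration; the thread only says 
the interval is a cluster wide parameter) and the ~15 second TCP timeout 
mentioned above.  The function phone_unregisters is hypothetical and 
ignores everything except keepalive traffic.

```python
import math


def phone_unregisters(outage_start_s: float, outage_len_s: float,
                      ka_interval_s: float = 30.0,   # assumed, not from the thread
                      tcp_timeout_s: float = 15.0) -> bool:
    """Return True if an outage covers a keepalive send time and then
    outlasts the phone's TCP retransmit window.

    Times are measured from the last successful keepalive at t = 0.
    """
    outage_end_s = outage_start_s + outage_len_s
    # First keepalive the phone sends at or after the start of the outage.
    k = max(1, math.ceil(outage_start_s / ka_interval_s))
    ka_sent_s = k * ka_interval_s
    # The phone fails only if the outage is still in progress when its
    # TCP retransmits run out.
    return ka_sent_s + tcp_timeout_s <= outage_end_s


print(phone_unregisters(20, 30))  # outage spans a keepalive plus 15 s
print(phone_unregisters(31, 20))  # outage falls between two keepalives
print(phone_unregisters(25, 10))  # outage ends before retransmits run out
```

In this model the phone's TCP timeout always fires well before the ~90 
second SCCP timeout on CM, which matches the pattern described above: the 
phone reports a TCP failure and CM later reports a keepalive timeout.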

/Wes

CarlosOrtiz at bayviewfinancial.com wrote: 

Not the case here, as this Subscriber has many other phones registered in 
the US with no problems.  As Wes said, I suspect a network issue, but I was 
hoping to change the keepalive timer for those phones to decrease the 
chance that a single missed keepalive would cause the message to appear 
and invoke a failover.  This way when someone hangs up the phone a 
failover would not be invoked automatically.  That's my understanding of 
the process anyway... 

Carlos 


Robert Kulagowski <rkulagow at gmail.com> 
Sent by: cisco-voip-bounces at puck.nether.net 
02/05/2008 11:50 AM 


To
cisco-voip at puck.nether.net 
cc

Subject
Re: [cisco-voip] Phone Keepalives








Wes Sisk wrote:
> sccp keepalive interval is a cluster wide parameter.
> 
> Sounds like you definitely have spotty network connectivity.  Have to 
> stabilize that.

But couldn't it also be a runaway process that's hogging CPU?  I just 
ran into a situation where a javaw process was spiking to 100% often 
enough that phones connected to that subscriber were showing "CM Down".
_______________________________________________
cisco-voip mailing list
cisco-voip at puck.nether.net
https://puck.nether.net/mailman/listinfo/cisco-voip


