<br><font size=2 face="sans-serif">Wes,</font>
<br>
<br><font size=2 face="sans-serif">Thanks for the extra info - makes sense.
The phones I have in the UK are 7960s and my phone is a 7961.
This actually occurred between myself and someone in the UK earlier
today. Here are the rough stats of me pinging this phone for about
15 minutes.</font>
<br>
<br><font size=2 face="Courier New">Ping statistics for 10.X.X.X:</font>
<br><font size=2 face="Courier New"> <b>Packets: Sent = 1434,
Received = 1429, Lost = 5</b> (0% loss),</font>
<br><font size=2 face="Courier New">Approximate round trip times in milli-seconds:</font>
<br><font size=2 face="Courier New"> Minimum = 133ms, Maximum
= <b>780ms</b>, <b>Average = 166ms</b></font>
<br>
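<br><font size=2 face="sans-serif">For what it's worth, the "0% loss" figure hides the 5 dropped packets, presumably because ping reports a whole-number percentage. A quick sketch of the arithmetic (the function name is just illustrative):</font>
<pre>
```python
def loss_percent(sent: int, received: int) -> float:
    """Return packet loss as a percentage of packets sent."""
    lost = sent - received
    return 100.0 * lost / sent


# Figures taken from the ping run above.
loss = loss_percent(1434, 1429)
print(f"lost {1434 - 1429} of 1434 packets, {loss:.2f}% loss")
```
</pre>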
<br>
<br><font size=2 face="sans-serif">If I were to get a packet drop on one
of these SCCP acks, would that basically kill the connection since the
other device would not continue the conversation? Hence leading me to the
"CM Down, Features disabled" message?</font>
<br>
<br><font size=2 face="sans-serif">My only recourse at this point is to
tell them we need a more reliable connection, correct? </font>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<table width=100%>
<tr valign=top>
<td width=40%><font size=1 face="sans-serif"><b>Wes Sisk <wsisk@cisco.com></b>
</font>
<p><font size=1 face="sans-serif">02/05/2008 02:15 PM</font>
<td width=59%>
<table width=100%>
<tr valign=top>
<td>
<div align=right><font size=1 face="sans-serif">To</font></div>
<td><font size=1 face="sans-serif">CarlosOrtiz@bayviewfinancial.com</font>
<tr valign=top>
<td>
<div align=right><font size=1 face="sans-serif">cc</font></div>
<td><font size=1 face="sans-serif">Robert Kulagowski <rkulagow@gmail.com>,
cisco-voip@puck.nether.net</font>
<tr valign=top>
<td>
<div align=right><font size=1 face="sans-serif">Subject</font></div>
<td><font size=1 face="sans-serif">Re: [cisco-voip] Phone Keepalives</font></table>
<br>
<table>
<tr valign=top>
<td>
<td></table>
<br></table>
<br>
<br>
<br><font size=3>Carlos,<br>
<br>
'Registration' and timing appear to be the topic du jour; this is the 4th
conversation about it already today. <br>
<br>
There are at least 2 factors at play when it comes to registration. Let
me expand on those:<br>
1. TCP session errors - SCCP/Skinny works over TCP/IP. When TCP transmits
a segment that segment must be acknowledged by the peer. In the case
of an SCCP keepalive exchange:<br>
phone cm<br>
-> sccp ka <br>
<- tcp ack<br>
<- sccp ka ack<br>
-> tcp ack<br>
<br>
Normal TCP retransmit rules apply. Normal TCP session management rules
also apply. TCP FIN/RST can abort the session. ICMP messages such
as host unreachable, net unreachable, or port unreachable may also apply.
Otherwise the phone/CM will retransmit until TCP MaxRetransmits.
On the 7940/60 TCP will retransmit up to 5 times for a maximum of
15 seconds (this was the last value I have documented; it may have changed).
On the 3rd gen phones 7941,61,70,71,42,62, etc the maximum retransmit
time is much shorter. I've seen reports as short as 4 retransmits each
after 300 ms (less than 2 seconds total). I do not have hard numbers
handy on those.<br>
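<br>
To put rough numbers on the retransmit windows above, here is a minimal sketch. It assumes a fixed retransmit interval, which is a simplification (real TCP stacks usually back off between retries), and the 3-second spacing for the 7940/60 is only inferred from "5 retransmits, 15 seconds maximum". The function and constant names are made up for illustration.<br>
<pre>
```python
def retransmit_window_ms(retransmits: int, interval_ms: int) -> int:
    """Total time TCP keeps retrying, assuming a fixed retransmit interval."""
    return retransmits * interval_ms


def rides_out(outage_ms: int, retransmits: int, interval_ms: int) -> bool:
    """True if the retry window is longer than the outage."""
    return retransmit_window_ms(retransmits, interval_ms) > outage_ms


# (retransmits, interval in ms); both tuples are assumptions, not documented values.
OLDER_PHONE = (5, 3000)  # 7940/60: 5 retransmits, assumed even spacing, 15 s total
NEWER_PHONE = (4, 300)   # 3rd gen: reported 4 retransmits at 300 ms, 1.2 s total

for outage_ms in (1000, 5000, 20000):
    older = rides_out(outage_ms, *OLDER_PHONE)
    newer = rides_out(outage_ms, *NEWER_PHONE)
    print(f"{outage_ms} ms outage: 7940/60 survives={older}, 3rd gen survives={newer}")
```
</pre>
Under these assumptions a 5-second outage would be ridden out by a 7940/60 but not by a 3rd gen phone, which fits the shorter retransmit time reported for the newer models.<br>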
<br>
If you have an outage of 15 seconds at the exact instant when the phone needs
to send an SCCP keepalive, then the phone is going to unregister and report
"CM down features disabled". The TCP/IP network must be
stable and working.<br>
<br>
2. keepalive errors - This is completely implemented at the SCCP level, so
above TCP. CM allows missing 2x keepalives from the phone; most SCCP
endpoints support missing 1 SCCP KeepAliveAck from CM. These are
not universally supported, as seen in CSCef31887.<br>
<br>
The vast majority of the time we see:<br>
Phone believes it failed because of TCP timeout, TCP reset, or TCP FIN.
This is normal since the phone is responsible for initiating the SCCP
KA. It has to send data over the network and has to receive a response.<br>
CM believes the phone failed because of "device initiated reset"
or "keepalive timeout". "device initiated reset"
is a misnomer, see CSCsa66536. CM is sitting waiting to receive SCCP
KA from the phone. When the phone does not send then CM aborts the
session. Note CM institutes timeout at the SCCP level (~90 seconds)
while the phone institutes timeout at the TCP level (~15 seconds).<br>
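<br>
As a rough illustration of why the phone usually notices first, here is a small sketch comparing the two timers. The 30-second keepalive interval is an assumption, picked because 30 s x (2 missed + 1) lines up with the ~90 seconds above; check the actual cluster-wide setting before relying on it. It also assumes both timers start at the same moment, whereas in practice the phone's TCP timer only starts once it tries to send. All names are hypothetical.<br>
<pre>
```python
# Assumed values: 30 s is a hypothetical keepalive interval chosen so the
# result matches the ~90 s quoted above; 15 s is the 7940/60 TCP figure.
KEEPALIVE_INTERVAL_S = 30
ALLOWED_MISSED_KEEPALIVES = 2
PHONE_TCP_TIMEOUT_S = 15


def cm_timeout_s(interval_s: int, allowed_missed: int) -> int:
    """CM gives up once one more keepalive than it tolerates has gone missing."""
    return interval_s * (allowed_missed + 1)


def first_to_detect(phone_timeout_s: int, cm_timeout: int) -> str:
    """Name the side whose timer expires first after traffic stops."""
    timers = {"phone": phone_timeout_s, "cm": cm_timeout}
    return min(timers, key=timers.get)


cm_limit = cm_timeout_s(KEEPALIVE_INTERVAL_S, ALLOWED_MISSED_KEEPALIVES)
print(f"CM times out after about {cm_limit} s")
print(f"first to notice: {first_to_detect(PHONE_TCP_TIMEOUT_S, cm_limit)}")
```
</pre>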
<br>
/Wes<br>
</font><font size=3 color=blue><u><br>
</u></font><a href=mailto:CarlosOrtiz@bayviewfinancial.com><font size=3 color=blue><u>CarlosOrtiz@bayviewfinancial.com</u></font></a><font size=3>
wrote: </font>
<br><font size=2 face="sans-serif"><br>
Not the case here, as this Subscriber has many other phones registered in
the US with no problems. As Wes said, I suspect a network issue, but
I was hoping to change the keepalive timer for those phones to decrease
the chance that a single missed keepalive would cause the message to appear
and invoke a failover. This way, when someone hangs up the phone,
a failover would not be invoked automatically. That's my understanding
of the process anyway......</font><font size=3> <br>
</font><font size=2 face="sans-serif"><br>
Carlos</font><font size=3> <br>
<br>
</font>
<table width=100%>
<tr valign=top>
<td width=54%><font size=1 face="sans-serif"><b>Robert Kulagowski </b></font><a href=mailto:rkulagow@gmail.com><font size=1 color=blue face="sans-serif"><b><u><rkulagow@gmail.com></u></b></font></a><font size=1 face="sans-serif">
<br>
Sent by: </font><a href="mailto:cisco-voip-bounces@puck.nether.net"><font size=1 color=blue face="sans-serif"><u>cisco-voip-bounces@puck.nether.net</u></font></a><font size=3>
</font>
<p><font size=1 face="sans-serif">02/05/2008 11:50 AM</font><font size=3>
</font>
<td width=45%>
<br>
<table width=100%>
<tr valign=top>
<td width=19%>
<div align=right><font size=1 face="sans-serif">To</font></div>
<td width=80%><a href="mailto:cisco-voip@puck.nether.net"><font size=1 color=blue face="sans-serif"><u>cisco-voip@puck.nether.net</u></font></a><font size=3>
</font>
<tr valign=top>
<td>
<div align=right><font size=1 face="sans-serif">cc</font></div>
<td>
<tr valign=top>
<td>
<div align=right><font size=1 face="sans-serif">Subject</font></div>
<td><font size=1 face="sans-serif">Re: [cisco-voip] Phone Keepalives</font></table>
<br>
<br>
<table width=100%>
<tr valign=top>
<td width=50%>
<td width=50%></table>
<br></table>
<br><font size=3><br>
<br>
</font><font size=2><tt><br>
Wes Sisk wrote:<br>
> sccp keepalive interval is a cluster wide parameter.<br>
> <br>
> Sounds like you definitely have spotty network connectivity. Have
to <br>
> stabilize that.<br>
<br>
But couldn't it also be a runaway process that's hogging CPU? I just
<br>
ran into a situation where a javaw process was spiking to 100% often <br>
enough that phones connected to that subscriber were showing "CM Down".<br>
_______________________________________________<br>
cisco-voip mailing list</tt></font><font size=2 color=blue><tt><u><br>
</u></tt></font><a href="mailto:cisco-voip@puck.nether.net"><font size=2 color=blue><tt><u>cisco-voip@puck.nether.net</u></tt></font></a><font size=2 color=blue><tt><u><br>
</u></tt></font><a href="https://puck.nether.net/mailman/listinfo/cisco-voip"><font size=2 color=blue><tt><u>https://puck.nether.net/mailman/listinfo/cisco-voip</u></tt></font></a><font size=3><br>
</font>
<br><font size=3><tt><br>
</tt></font>
<br>