[cisco-voip] SRST and standby SUB?

Jason Burns burns.jason at gmail.com
Mon May 11 22:51:44 EDT 2009


Scott,

Posting the "Debug Display" and "Status Messages" sections of the phone web
page might be helpful. Usually when a phone fails over it logs some error
messages there.

On Mon, May 11, 2009 at 3:51 PM, Wes Sisk <wsisk at cisco.com> wrote:

>  spot on.
>
> net result is phones with geometric TCP may (will likely?) failover faster
> than any other device.  failover depends on TCP:2000 (or tcp:2443)
> connectivity.
>
>
> On Monday, May 11, 2009 3:36:55 PM, Justin Steinberg
> <jsteinberg at gmail.com> wrote:
>
> ok, thanks guys for clearing that up.  Looking at a packet capture from an
> idle phone, all I see is the SCCP KA and the SCCP KA ACK every 30 seconds
> (the default), so Geometric TCP depends on the SCCP KA to generate the TCP
> traffic it needs in order to work.
>
> So theoretically, with your example of 2 ms RTT, if the CM server were to
> lose power at the beginning of the 30 second KA cycle then the IP phone
> would fail over after approximately 30.030 seconds with Geometric TCP enabled
> instead of 61.8 seconds with Geometric TCP disabled.
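
Justin's worst case can be reproduced with a few lines of Python. This is only a sketch of the arithmetic in this thread: it assumes the server dies right after a keepalive exchange, that the idle phone sends nothing until the next keepalive, and it reuses the hypothetical retransmit numbers quoted further down. The function and constant names are made up for illustration.

```python
def worst_case_failover_ms(keepalive_interval_ms, tcp_give_up_ms):
    """Idle wait until the next keepalive is sent, plus the time TCP
    keeps retransmitting that keepalive before giving up."""
    return keepalive_interval_ms + tcp_give_up_ms

KEEPALIVE_MS = 30_000                     # default SCCP KA interval from the capture
GEOMETRIC_GIVE_UP_MS = 2 + 4 + 8 + 16     # hypothetical, 2 msec RTT
SLOW_GIVE_UP_MS = 250 + 400 + 750 + 1400 + 2000 + 4000 + 8000 + 15000

if __name__ == "__main__":
    print(worst_case_failover_ms(KEEPALIVE_MS, GEOMETRIC_GIVE_UP_MS) / 1000, "s")
    print(worst_case_failover_ms(KEEPALIVE_MS, SLOW_GIVE_UP_MS) / 1000, "s")
```

Working in whole milliseconds keeps the sums exact; the division to seconds is only for display.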
>
> Sorry Scott, didn't mean to take your thread in a different direction, I just
> found the Geometric TCP interesting.
>
> Justin
>
> On Mon, May 11, 2009 at 2:50 PM, Wes Sisk <wsisk at cisco.com> wrote:
>
>>  you're catching on.  have another espresso and you'll be there.
>>
>> TCP is almost completely independent of the SCCP keepalives.
>>
>> this tends to help folks:
>> SCCP keepalives verify the CM process is still active and processing
>> traffic.  think "application layer"
>> TCP works at the transport/session layer to verify data makes it over the
>> network.
>>
>> this only becomes confusing because the 'network' doesn't really do
>> anything by itself.  the 'application' must initiate data.  at that point
>> the network must reliably deliver the data.  data can be signaling for an
>> inbound call, signaling for an outbound call, signaling for an MWI update,
>> signaling for a shared line update, or data for an SCCP keepalive.
>>
>> purely hypothetical numbers here:
>> Geometric TCP:
>> tcp retransmits very rapidly at multiples of round trip time.  If normal
>> round trip time is 2msec the phone may retransmit at 2, 4, 8, and 16 msec.
>> If phone does not receive a TCP ACK within 2+4+8+16msec = 30msec the phone
>> gives up on the TCP session and fails over.
>>
>> "Slow Failover" or non-Geometric:
>> TCP retransmits at 250, 400, 750, 1400, 2000, 4000, 8000, 15000 msec
>> (CSCed01179).  Sum = 31.8 seconds.  Phone can wait 31.8 seconds for a TCP
>> ACK from CM before giving up on TCP session and failing over.
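
The two sums above are easy to check in Python. The sketch below uses Wes's "purely hypothetical" numbers as given; the function names, and the assumption that the geometric case is exactly four waits that each double, are mine for illustration and are not taken from phone firmware.

```python
def geometric_schedule_ms(rtt_ms, retransmits=4):
    """Waits of rtt, 2*rtt, 4*rtt, ... (assumed: each wait doubles)."""
    return [rtt_ms * 2 ** i for i in range(retransmits)]

# Non-geometric ("slow failover") waits quoted in the thread, in msec.
SLOW_SCHEDULE_MS = [250, 400, 750, 1400, 2000, 4000, 8000, 15000]

def give_up_after_ms(schedule_ms):
    """Total time without a TCP ACK before the session is abandoned."""
    return sum(schedule_ms)

if __name__ == "__main__":
    geo = geometric_schedule_ms(2)
    print(geo, give_up_after_ms(geo), "msec")
    print(give_up_after_ms(SLOW_SCHEDULE_MS) / 1000, "seconds")
```

With a 2 msec RTT the geometric schedule gives up roughly a thousand times sooner than the fixed one, which is why a brief outage is enough to trigger failover.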
>>
>> Geometric TCP is perceived particularly poorly if you have a fast but
>> unreliable network.
>>
>> /wes
>>
>> On Monday, May 11, 2009 2:24:46 PM, Justin Steinberg
>> <jsteinberg at gmail.com> wrote:
>>
>> never mind, what i wrote doesn't make any sense.
>>
>> I assume it must work something more like, the phones follow the SCCP
>> keepalive timers defined in the ccmadmin service parameters but when the KA
>> ACKs begin to fail the phone must use some 'geometric TCP' logic to force a
>> quicker failover.
>>
>> On Mon, May 11, 2009 at 2:21 PM, Justin Steinberg <jsteinberg at gmail.com> wrote:
>>
>>> interesting.  I must have missed this new addition to 7.2(1).   So, this
>>> means that with Geometric TCP enabled the phones don't really use the SCCP
>>> keepalive timers configured in the CCMADMIN service parameters section?
>>> They basically come up with a baseline TCP RTT and failover after that
>>> interval passes three times without any response?
>>>
>>> On Mon, May 11, 2009 at 1:18 PM, Wes Sisk <wsisk at cisco.com> wrote:
>>>
>>>> One more variable I do not see in Ryan's response - Geometric TCP.  With
>>>> Geometric TCP enabled the phones fail over MUCH more aggressively.  With it
>>>> enabled, a phone may determine CM is down after only a 1.5 second network
>>>> outage.  With Geometric TCP disabled, a phone may wait up to 45 seconds to
>>>> classify CM as "down" and attempt failover.
>>>>
>>>> CSCsm81227.
>>>>
>>>> /Wes
>>>>
>>>> On Monday, May 11, 2009 11:55:08 AM, Ryan Ratliff <rratliff at cisco.com>
>>>> wrote:
>>>>
>>>>> So coming back from SRST there are 3 things that have to happen before
>>>>> the phone will re-register.
>>>>> 1. Establish TCP connection to one of the CMs
>>>>> 2. Wait out the "Connection Monitor Duration" timer so it knows the
>>>>> connection is stable (Enterprise Params, default 120 secs)
>>>>> 3. Request and receive a registration token from the server indicating
>>>>> that it can proceed with registration.
>>>>>
>>>>> Both connection monitor duration and the registration token mechanisms
>>>>> are in place to make sure the failback process is as fast and problem-free
>>>>> as possible.
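
Ryan's three failback conditions can be written down as a small check. This is a sketch of the logic as described in this message only; the class, field, and function names are hypothetical, and the 120 second default is the "Connection Monitor Duration" value he quotes.

```python
from dataclasses import dataclass

@dataclass
class FailbackState:
    tcp_connected: bool    # step 1: TCP session to one of the CMs is up
    stable_for_s: float    # how long that TCP session has stayed up
    token_granted: bool    # step 3: registration token received

def failback_blocker(state, connection_monitor_duration_s=120):
    """Return the first unmet condition, or None if the phone may re-register."""
    if not state.tcp_connected:
        return "no TCP connection to a CM"
    if state.stable_for_s < connection_monitor_duration_s:
        return "connection monitor duration not yet elapsed"
    if not state.token_granted:
        return "no registration token received"
    return None

if __name__ == "__main__":
    # A phone showing the sub as "Standby": TCP is up, but not registered.
    print(failback_blocker(FailbackState(True, 45.0, False)))
```

A phone that stays in "Standby" has passed step 1, so the thing to look for is whatever keeps resetting step 2 or withholding step 3.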
>>>>>
>>>>> In your case is there anything between the sites that could be proxying
>>>>> the tcp connection from phone to CCM?  All "Standby" means is that there's a
>>>>> TCP session established but it's not actually registered.
>>>>>
>>>>> Your best bet is going to be to reset the phone and get a packet
>>>>> capture.  look at the traffic on tcp 2000 to the CCM servers to see what's
>>>>> going on.
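
Before (not instead of) the packet capture, a quick TCP connect test from a PC on the phone VLAN can show whether port 2000 is reachable at all, since a successful ping says nothing about TCP. The sketch below uses only the Python standard library; the function name is made up, and the address in the usage line is a documentation placeholder, not a real CCM server.

```python
import socket

def tcp_reachable(host, port=2000, timeout_s=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False

if __name__ == "__main__":
    # 192.0.2.10 is a placeholder; substitute the sub's IP address.
    print(tcp_reachable("192.0.2.10", 2000))
```

A successful connect only proves that something answered on the port. If a device in the path is proxying the session, the capture is still needed to see it.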
>>>>>
>>>>> -Ryan
>>>>>
>>>>> On May 11, 2009, at 11:43 AM, Scott Voll wrote:
>>>>>
>>>>> no
>>>>> sub
>>>>> pub
>>>>> srst
>>>>>
>>>>>
>>>>> all via IP addresses
>>>>>
>>>>> sub -- Standby
>>>>> pub
>>>>> SRST -- Active
>>>>>
>>>>> Scott
>>>>>
>>>>> On Mon, May 11, 2009 at 8:41 AM, Ryan Ratliff <rratliff at cisco.com>
>>>>> wrote:
>>>>> On the phone what is the list of Communications Managers?  Does it
>>>>> somehow think the SRST is higher in priority than the sub?
>>>>>
>>>>> -Ryan
>>>>>
>>>>>
>>>>> On May 11, 2009, at 11:29 AM, Scott Voll wrote:
>>>>>
>>>>> I'm banging my head on a wall.  What am I missing?
>>>>>
>>>>> have a remote site over IPSEC VPN connection back to central CM
>>>>> cluster.
>>>>>
>>>>> all has been working fine. but today I find that all three phones are
>>>>> in SRST mode.  the VGW is still registered to the cluster. but the web
>>>>> interface of the phones shows reg'd with SRST and standby with SUB.  How
>>>>> does that work?
>>>>>
>>>>> I don't really understand the logs on the 7942.
>>>>>
>>>>> I can ping from the sub and pub to the phones.  any ideas?
>>>>>
>>>>> I can attach logs from the phone if that helps.
>>>>>
>>>>> Thanks
>>>>>
>>>>> Scott
>>>>> _______________________________________________
>>>>> cisco-voip mailing list
>>>>> cisco-voip at puck.nether.net
>>>>> https://puck.nether.net/mailman/listinfo/cisco-voip
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
>
>
>