[cisco-voip] CUBE SIP TCP connection timeout

Fri Feb 6 23:37:15 EST 2015

After re-reading my email, perhaps my words were a bit harsh on that section of the SRND :) but I do think the timer statement might need to be revisited and possibly rewritten.

Good weekend to all!

- Dan

On Feb 6, 2015, at 5:38 PM, Daniel Pagan <dpagan at fidelus.com> wrote:

Thanks Anthony :)

That’s interesting that it’s suggested to use ICMP for monitoring CUCM up/down state from the perspective of a SIP gateway. I’m surprised and was honestly expecting the doc to be old and certainly not a 10.x SRND since this method doesn’t provide the SIP UA with application-layer-related state information for UCM.

Also… whoever wrote “up to 3 seconds” with default settings should run that in a lab scenario with a stopwatch.

- Dan

From: Anthony Holloway [mailto:avholloway+cisco-voip at gmail.com]
Sent: Friday, February 06, 2015 1:26 AM
To: Daniel Pagan; cisco-voip at puck.nether.net
Subject: Re: [cisco-voip] CUBE SIP TCP connection timeout

Great writeup Daniel!

I found it interesting that the CUCM 10.x SRND states:

SIP Gateway
Redundancy with Cisco IOS SIP gateways can be achieved similarly to H.323. If the SIP gateway cannot establish a connection to the primary Unified CM, it tries a second Unified CM defined under another dial-peer statement with a higher preference.

By default the Cisco IOS SIP gateway transmits the SIP INVITE request 6 times to the Unified CM IP address configured under the dial-peer. If the SIP gateway does not receive a response from that Unified CM, it will try to contact the Unified CM configured under the other dial-peer with a higher preference.

Cisco IOS SIP gateways wait for the SIP 100 response to an INVITE for a period of 500 ms. By default, it can take up to 3 seconds for the Cisco IOS SIP gateway to reach the backup Unified CM. You can change the SIP INVITE retry attempts under the sip-ua configuration by using the command retry invite <number>. You can also change the period that the Cisco IOS SIP gateway waits for a SIP 100 response to a SIP INVITE request by using the command timers trying <time> under the sip-ua configuration.

One other way to speed up the failover to the backup Unified CM is to configure the command monitor probe icmp-ping under the dial-peer statement. If Unified CM does not respond to an Internet Control Message Protocol (ICMP) echo message (ping), the dial-peer will be shut down. This command is useful only when the Unified CM is not reachable. ICMP echo messages are sent every 10 seconds.

Source: http://www.cisco.com/c/en/us/td/docs/voice_ip_comm/cucm/srnd/collab10/collab10/gateways.html#pgfId-1044200

I found this interesting, because it says, "By default, it can take up to 3 seconds for the Cisco IOS SIP gateway to reach the backup Unified CM."  Maybe they meant to drop a "2" in there for 32 seconds?  Well, in the paragraph preceding that statement, they mention the default INVITE retry value of 6; and 6 * 500ms = 3 seconds.  I think they forgot the interval doubling.
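
To make the arithmetic concrete, here is a minimal Python sketch comparing the two readings. The function names are my own, and it assumes the interval simply doubles after every unanswered INVITE, as described in Daniel's write-up further down:

```python
def linear_estimate_ms(t1_ms: int, attempts: int) -> int:
    """Total wait if every interval stayed at T1 (no doubling)."""
    return t1_ms * attempts


def doubling_total_ms(t1_ms: int, attempts: int) -> int:
    """Total wait when the interval doubles after each unanswered INVITE.

    T1 * (1 + 2 + 4 + ... + 2**(attempts - 1)) = T1 * (2**attempts - 1)
    """
    return t1_ms * (2 ** attempts - 1)


print(linear_estimate_ms(500, 6))  # 3000 ms, the "up to 3 seconds" figure
print(doubling_total_ms(500, 6))   # 31500 ms, roughly 32 seconds
```

So the SRND figure matches the no-doubling reading, while the doubled intervals land at about 32 seconds.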

Actually, I found it interesting for a second reason as well: they recommend ICMP to monitor the peer, instead of SIP OPTIONS.  I'm not certain if this is a best practice recommendation, an error, or them simply listing one of several possible ways to monitor a SIP peer.

On Thu Feb 05 2015 at 3:54:56 PM Daniel Pagan <dpagan at fidelus.com> wrote:
Since we're on the topic of SIP timers, timeouts, etc., I figured why not share some additional information with the list. Below is a write-up of SIP timers T1 and Timer-B I wrote some time ago. Hopefully someone will find this useful at some point.

This isn’t mentioned in the CUCM service parameter descriptions, but the SIP TRYING timer really represents what’s called the SIP T1 timer. T1 is a baseline value used by another timer called Timer-A, which controls the intervals at which our INVITEs are re-sent when no response is received. There's also Timer-B, which determines how long to wait, in total, before the request itself should expire, and is calculated as 64 * T1. So if our T1 (TRYING) timer is 500ms, then our Timer-B value is 32 seconds. Note this does not impact SIP over TCP when the TCP socket itself fails to establish, since a three-way handshake must complete before any request is sent.

The important part about the SIP T1 timer and INVITE requests is that T1 does not exactly define how long to wait between re-transmissions. Instead, we multiply T1 to determine how long to wait between INVITE retry attempts. Here’s an excerpt from RFC 3261 describing this:

"If an unreliable transport is being used, the client transaction MUST start timer A with a value of T1. If a reliable transport is being used, the client transaction SHOULD NOT start timer A (Timer A controls request retransmissions).  For any transport, the client transaction MUST start timer B with a value of 64*T1 seconds (Timer B controls transaction timeouts). When timer A fires [expires], the client transaction MUST retransmit the request by passing it to the transport layer, and MUST reset the timer with a value of 2*T1.  …….

When timer A fires [expires] 2 x T1 seconds later, the request MUST be retransmitted again. This process MUST continue so that the request is retransmitted with intervals that double after each transmission. These retransmissions SHOULD only be done while the client transaction is in the "calling" state."

The default value for T1 is 500 ms.  T1 is an estimate of the RTT between the client and server transactions.

What does this mean? Let’s use the default of 500 msecs TRYING and retry INVITE value of 6 as an example:

Time: 0 msecs passed | Start Timer-A (or… Trying timer) – value 500 msecs
INVITE -->
No response received and Trying timer expires.

Time: 500 msecs passed | Start Trying timer – new value 1000 msecs
INVITE -->
No response received and Trying timer expires.

Time: 1500 msecs passed | Start Trying timer – new value 2000 msecs
INVITE -->
No response received and Trying timer expires.

Time: 3500 msecs passed | Start Trying timer – new value 4000 msecs
INVITE -->
No response received and Trying timer expires.

Time: 7500 msecs passed | Start Trying timer – new value 8000 msecs
INVITE -->
No response received and Trying timer expires.

Time: 15500 msecs passed | Start Trying timer – new value 16000 msecs
INVITE -->
No response received and Trying timer expires.
Time required for time-out of our INVITE attempts = 31500 msecs

In other words, the Trying timer will double again and again until either the INVITE retry value is reached or Timer-B expires. With a default of six INVITE retries and a T1/Trying timer of 500 msecs, our total time required until our INVITE retries are exhausted is ~32 seconds. It’s at this point that RouteListControl attempts to route the call through our next Route Group member or Route Group (assuming we're talking about CUCM). If CUBE, then the next preference dial-peer.
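
The timeline above can be reproduced with a short Python sketch. This is only a model of the behavior described here, not anything taken from CUCM or IOS: the function names are hypothetical, and it assumes the retry value counts the INVITEs shown in the timeline and that Timer-B (64 * T1) caps the total wait.

```python
from typing import Iterator, Tuple


def invite_schedule(t1_ms: int = 500, attempts: int = 6) -> Iterator[Tuple[int, int]]:
    """Yield (send_time_ms, timer_ms) for each INVITE transmission.

    Stops when the attempts are used up or once Timer-B (64 * T1) has passed.
    """
    timer_b_ms = 64 * t1_ms
    elapsed = 0
    timer = t1_ms
    for _ in range(attempts):
        if elapsed >= timer_b_ms:
            break
        yield elapsed, timer
        elapsed += timer  # wait for the current timer to expire
        timer *= 2        # the interval doubles after each transmission


def time_to_failover_ms(t1_ms: int = 500, attempts: int = 6) -> int:
    """Total time before giving up, capped by Timer-B."""
    total = sum(timer for _, timer in invite_schedule(t1_ms, attempts))
    return min(total, 64 * t1_ms)


for sent_at, timer in invite_schedule():
    print(f"Time: {sent_at} msecs passed | timer value {timer} msecs")
print(time_to_failover_ms())  # 31500
```

Under the same assumptions, a lower T1 or retry count shrinks the total quickly; for example, time_to_failover_ms(100, 2) gives 300 msecs.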

Note that SIP over TCP still uses these timers, but with a reliable transport the three-way handshake must complete before the request is handed to the transport layer. CUCM still runs these timers for SIP requests over TCP even though RFC 3261 says Timer-A should not be started for reliable transports. That’s okay: the RFC says SHOULD NOT rather than MUST NOT, so it's fair game.

You can read more about this in sections 17.1.1.1 and 17.1.1.2 in RFC 3261 here:
http://www.ietf.org/rfc/rfc3261.txt

In addition to the RFC, there’s an easy-to-follow article that also describes this process in detail: http://andrewjprokop.wordpress.com/2013/07/02/understanding-sip-timers-part-i/

Hope this helps.

- Dan

-----Original Message-----
From: cisco-voip [mailto:cisco-voip-bounces at puck.nether.net] On Behalf Of Daniel Pagan
Sent: Thursday, February 05, 2015 2:33 PM
To: gentoo at ucpenguin.com; cisco-voip at puck.nether.net
Subject: Re: [cisco-voip] CUBE SIP TCP connection timeout

If we're talking about transport level timeout, it looks like the command is available in CUBE SP Edition:

"In addition to the SIP protocol-level timers, Cisco Unified Border Element (SP Edition) also allows modification of transport-related timer commands: tcp-connect-timeout (how long TCP SYN will wait for the reply) and tcp-idle-timeout (how long TCP connection should stay active while idle). Although these timers are transport-level values, Cisco IOS XE Release 2.4 supports these timers in SIP only, but not in H.323, nor H.248"

For the tcp-connect-timeout command: "Configures the time (in milliseconds) that SBC waits for a SIP TCP connection to a remote peer to complete before failing that connection. The default timeout interval is 1000 milliseconds."

In practice, specifically at UCM, I wouldn't expect to see a transmitted INVITE in situations where the TCP socket cannot be established, so I'm not sure how well the sip-ua timers below would help.

If it's an option, perhaps use UDP and modify the retry invite value and TRYING timer as ucpenguin mentioned below. I'm sure you know of the OPTIONS keepalive method. One last thing worth mentioning: the TRYING timer doubles for each INVITE sent without a response until the retry INVITE value is reached (I've seen this cause up to 32 seconds of delay with default values).

Hope this helps.

- Dan

-----Original Message-----
From: cisco-voip [mailto:cisco-voip-bounces at puck.nether.net] On Behalf Of gentoo at ucpenguin.com
Sent: Thursday, February 05, 2015 1:49 PM
To: cisco-voip at puck.nether.net
Subject: Re: [cisco-voip] CUBE SIP TCP connection timeout

Not sure why this didn't hit the list the first time I sent it; maybe it's just slow.

Anyways:

sip-ua
  retry invite 2
  timers trying 100

On 2015-02-05 12:32, Brian Meade wrote:
> Hey all,
>
> Does anyone know a SIP equivalent of "h225 timeout tcp establish"?
>
> The default SIP TCP timeout is 5 seconds:
>
> 001306: Feb  4 20:44:34.164: %VOICE_IEC-3-GW: SIP: Internal Error
> (Socket error): IEC=1.1.186.7.7.4 on callID 3254
> GUID=5BBD7EFBAC0F11E4997499045654EBE2
> 001307: Feb  4 20:44:39.167: %VOICE_IEC-3-GW: SIP: Internal Error
> (Socket error): IEC=1.1.186.7.7.4 on callID 3255
> GUID=5BBD7EFBAC0F11E4997499045654EBE2
>
> This results in failover to a 3rd dial-peer taking 10 seconds when our
> main DC is down.
>
> There's nothing under the "sip-ua" configuration that will change the
> TCP timeout.  It looks like this may not be configurable.
>
> Anyone have any ideas?
>
> Thanks,
> Brian Meade
> _______________________________________________
> cisco-voip mailing list
> cisco-voip at puck.nether.net
> https://puck.nether.net/mailman/listinfo/cisco-voip
_______________________________________________
cisco-voip mailing list
cisco-voip at puck.nether.net
https://puck.nether.net/mailman/listinfo/cisco-voip