[cisco-voip] MGCP - Fallback to SRST very often even though connectivity to CUCM is fine

Tue Nov 3 09:52:51 EST 2009

Blah! I forgot to mention the service parameter.  The CM service parameter 
"MGCP Retry Timeout Handling" configures the behavior when a timeout is 
observed.  It allows marking the endpoint OOS (out of service), resetting 
just the port, or unregistering the entire gateway.  Unregistering the 
entire gateway is the default value.

/wes

On Tuesday, November 03, 2009 9:29:24 AM, Wes Sisk <wsisk at cisco.com> wrote:
> timely question.
>
> MGCP gateway can be viewed as:
>
> MGCP Gateway
>     mgcp/udp based registration and keepalives
>     analog endpoints
>        mgcp/udp based registration and transactions
>     digital endpoints
>        backhaul/tcp based
>
> On CM, if you see the alarm MGCPGatewayLostComm, then the top-level
> MGCP process stopped communicating with CM.  Usually the GW sends
> keepalives to CM similar to:
> 12/27/2005 10:16:40.173 CCM|MGCPHandler received msg from: 10.10.33.250
> NTFY 333382 *@HQ-VG224-3rdFlr MGCP 0.1
> X: 0
> O:
> |<CLID::MFCU-CM-1-Cluster><NID::10.10.200.11><CT::2,100,66,1.23017474><IP::10.10.33.250><DEV::> 
>
>
> If CM does not receive a keepalive from the gateway, CM will attempt to
> query the GW with this message:
> 12/27/2005 10:17:07.002 CCM|MGCPHandler send msg SUCCESSFULLY to: 
> 10.10.31.250
> AUEP 13561613 AALN/S2/0@HQ-VG224-1stFlr MGCP 0.1
> F: X
> |<CLID::MFCU-CM-1-Cluster><NID::10.10.200.11><CT::2,100,66,1.23017448><IP::10.10.31.250><DEV::> 
>
>
> This AUEP is not 'normal'.
> F = RequestedInfo
> X = RequestIdentifier
> Normal AUEP requests much more information. This is a special "hello, 
> are you there" type exchange.
>
> the gateway should respond:
> 12/27/2005 10:17:07.002 CCM|MGCPHandler received msg from: 10.10.31.250
> 200 13561613
> X: 2
> |<CLID::MFCU-CM-1-Cluster><NID::10.10.200.11><CT::2,100,66,1.23017648><IP::10.10.31.250><DEV::> 
>
>
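
The keepalive exchange above can be checked mechanically by pairing a command with its response on the transaction id. The following is a minimal sketch, assuming the first line of each message has the shape seen in the traces (verb, transaction id, endpoint, "MGCP", version for a command; return code and transaction id for a response). The function names are made up for illustration and are not part of any Cisco tool.

```python
import re

# Command line as seen in the traces: VERB TRANSACTION-ID ENDPOINT MGCP VERSION
COMMAND_RE = re.compile(r"^([A-Z]{4}) (\d+) (\S+) MGCP (\d+\.\d+)$")
# Response line: 3-digit return code, then the transaction id being answered
RESPONSE_RE = re.compile(r"^(\d{3}) (\d+)\b")


def parse_first_line(line):
    """Classify the first line of an MGCP message as a command or a response."""
    # Mail archives sometimes rewrite '@' as ' at '; undo that before matching.
    line = line.strip().replace(" at ", "@")
    m = COMMAND_RE.match(line)
    if m:
        verb, trans_id, endpoint, version = m.groups()
        return {"kind": "command", "verb": verb, "trans_id": int(trans_id),
                "endpoint": endpoint, "version": version}
    m = RESPONSE_RE.match(line)
    if m:
        return {"kind": "response", "code": int(m.group(1)),
                "trans_id": int(m.group(2))}
    return None


def is_answered(command_line, response_line):
    """True if the response is a 2xx reply to the given command."""
    cmd = parse_first_line(command_line)
    rsp = parse_first_line(response_line)
    return bool(cmd and rsp
                and cmd["kind"] == "command" and rsp["kind"] == "response"
                and cmd["trans_id"] == rsp["trans_id"]
                and 200 <= rsp["code"] < 300)
```

For the sample above, the AUEP with transaction id 13561613 and the "200 13561613" reply pair up, which is what a healthy "are you there" exchange looks like.
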
>
> This is getting very close to unregistration.  Another way to look at 
> this is to look for indications of lost messages to the gateway.  Each 
> MGCP transaction is retransmitted up to 3 times if not ack'd.  You can 
> see retries in the CM SDI traces:
> 01/13/2005 10:34:33.603 CCM|MGCPHandler TransId: 1097943 Timeout. Retry#1
>
> If you see frequent retries then you are intermittently dropping or 
> excessively delaying the UDP packets carrying the MGCP payload.
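
To get a feel for how frequent the retries are, the trace lines can be tallied with a short script. This is only a sketch, assuming the retry lines look like the single sample above ("MGCPHandler TransId: N Timeout. Retry#K"). The function names and the default of 3 retries (taken from the description above) are illustrative assumptions.

```python
import re

# Matches lines such as:
# 01/13/2005 10:34:33.603 CCM|MGCPHandler TransId: 1097943 Timeout. Retry#1
RETRY_RE = re.compile(r"MGCPHandler TransId: (\d+) Timeout\. Retry#(\d+)")


def summarize_retries(lines):
    """Return (total retry lines, highest retry number seen per transaction id)."""
    total = 0
    worst = {}
    for line in lines:
        m = RETRY_RE.search(line)
        if not m:
            continue
        total += 1
        trans_id, attempt = int(m.group(1)), int(m.group(2))
        worst[trans_id] = max(worst.get(trans_id, 0), attempt)
    return total, worst


def exhausted(worst, max_retries=3):
    """Transaction ids that reached the assumed retry limit."""
    return sorted(t for t, n in worst.items() if n >= max_retries)
```

Feeding it the lines of a trace file (for example `summarize_retries(open(path))`, where `path` is wherever the SDI traces were saved) gives a count to compare across time windows. Transactions that reach the limit are the ones most likely to precede an unregistration.
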
>
>
> There is also an issue where endpoints may stop responding to CM.  CM 
> will retry the transaction 3 times and then unregister the gateway.  
> This looks similar to the retries tracked above.  The main difference 
> is that you will see valid exchanges with other endpoints on the 
> gateway or you will see successful keepalives with the top level 
> gateway MGCP process.  This was historically caused by CSCsf26617 and 
> similar.  The signature of this failure is repeated retransmits of the 
> DLCX, RQNT, or CRCX messages from CM to the gateway while other 
> endpoints are responding.  If this is happening then the gateway is 
> having an internal error such as a resource allocation failure or a
> DSP hang.
>
> HTH.
>
> /Wes
>
>
>
> On Tuesday, November 03, 2009 3:39:24 AM, Wilson Hew 
> <wilsonhew at gmail.com> wrote:
>> Bob/Ryan, appreciate your feedback. Thanks.
>>
>> Guess I need to look at the connection between my MGCP gateway and 
>> CUCM. Any idea what else I may need to check? I am looking at the SDI 
>> traces, but have no idea what to look at.
>>
>> Thanks,
>> Wil
>>
>> On Tue, Nov 3, 2009 at 2:35 AM, Bob Fronk <bob at btrfronk.com> wrote:
>>
>>     I had this happening and found out it was an MPLS circuit going
>>     down.  Due to the location of this particular site, our 12 Mbps
>>     MPLS circuit is supplied by multiple T1s bonded with MLPPP.
>>
>>      
>>
>>     One of the T1s was going up/down several times a day (telco
>>     problem) and each time, the MLPPP would reset for a couple
>>     seconds.   The MGCP gateway responded by going into SRST and the
>>     PRI would go down for a moment.
>>
>>      
>>
>>     Just something to check
>>
>>      
>>
>>     *From:* cisco-voip-bounces at puck.nether.net
>>     [mailto:cisco-voip-bounces at puck.nether.net] *On Behalf Of*
>>     Wilson Hew
>>     *Sent:* Monday, November 02, 2009 11:47 AM
>>     *To:* cisco-voip at puck.nether.net
>>     *Subject:* [cisco-voip] MGCP - Fallback to SRST very often even
>>     though connectivity to CUCM is fine
>>
>>      
>>
>>     Hello there,
>>
>>     Greetings. I am having a problem with my MGCP gateway, and I need
>>     a little help and advice. My MGCP gateway also runs SRST, and
>>     it falls back to SRST very often (twice a day), then returns to
>>     normal operation shortly afterwards. The connectivity from my
>>     MGCP gateway (remote site) to CUCM is fine.
>>
>>     I noticed my E1 goes down every time it falls back to
>>     SRST - is that considered normal?
>>
>>     My gateway is running 12.4(24)T1 and CUCM version 7.0.2.
>>
>>     In 'sh ccm-manager', I have the below:
>>     --------------------------------------------------------------
>>     TFTP retry count to shut Ports: 2
>>
>>     Statistics:
>>         Packets recvd:   857
>>         Recv failures:   1
>>         Packets xmitted: 852
>>         Xmit failures:   0
>>     --------------------------------------------------------------
>>     In 'sh mgcp stats':
>>
>>      UDP pkts rx 557379, tx 558783
>>      Unrecognized rx pkts 0, MGCP message parsing errors 0
>>      Duplicate MGCP ack tx 9, Invalid versions count 0
>>      CreateConn rx 36256, successful 36249, failed 7
>>      DeleteConn rx 36274, successful 36101, failed 173
>>      ModifyConn rx 66178, successful 66126, failed 52
>>      DeleteConn tx 154, successful 154, failed 0
>>      NotifyRequest rx 54652, successful 54516, failed 136
>>      AuditConnection rx 3, successful 3, failed 0
>>      AuditEndpoint rx 14887, successful 8080, failed 6807
>>      RestartInProgress tx 6248, successful 6248, failed 0
>>      Notify tx 342779, successful 342779, failed 0
>>      ACK tx 201075, NACK tx 7191
>>      ACK rx 349100, NACK rx 0
>>      Collisions: Passive 0, Active 0
>>     --------------------------------------------------------------
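
The counters above are easier to compare as failure ratios. The sketch below assumes the "name rx|tx N, successful N, failed N" layout shown in this output. The function name is made up for illustration, and the script says nothing about why a counter is failing.

```python
import re

# Matches lines such as: "AuditEndpoint rx 14887, successful 8080, failed 6807"
STAT_RE = re.compile(r"(\w+) (rx|tx) (\d+), successful (\d+), failed (\d+)")


def failure_rates(text):
    """Map (message name, direction) to failed/total for each counter line."""
    rates = {}
    for m in STAT_RE.finditer(text):
        name, direction = m.group(1), m.group(2)
        total, failed = int(m.group(3)), int(m.group(5))
        rates[(name, direction)] = failed / total if total else 0.0
    return rates


# A few lines copied from the 'sh mgcp stats' output above.
SAMPLE = """\
 CreateConn rx 36256, successful 36249, failed 7
 DeleteConn rx 36274, successful 36101, failed 173
 AuditEndpoint rx 14887, successful 8080, failed 6807
 DeleteConn tx 154, successful 154, failed 0
"""

# Print the counters with the worst failure ratio first.
for (name, direction), rate in sorted(failure_rates(SAMPLE).items(),
                                      key=lambda kv: -kv[1]):
    print(f"{name} {direction}: {rate:.1%}")
```

On these numbers AuditEndpoint stands out: 6807 of 14887 received audits failed, roughly 46%, while the other counters fail well under 1% of the time.
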
>>
>>     Can anyone tell what is wrong from the above? Apart from that, I
>>     see the slip count in 'sh controllers e1' increasing, and I have
>>     network-clock-participate configured.
>>
>>     Would appreciate if you could give me your feedback about this.
>>     Any feedback is very much appreciated.
>>
>>     Thanks,
>>     Wil
>>
>>
>> ------------------------------------------------------------------------
>>
>> _______________________________________________
>> cisco-voip mailing list
>> cisco-voip at puck.nether.net
>> https://puck.nether.net/mailman/listinfo/cisco-voip
>>   
