[cisco-voip] MGCP - Fallback to SRST very often even though connectivity to CUCM is fine

Philip Walenta pwalenta at wi.rr.com
Wed Nov 4 06:07:01 EST 2009


By any chance are there multiple paths to your CUCM systems?

 

From: cisco-voip-bounces at puck.nether.net
[mailto:cisco-voip-bounces at puck.nether.net] On Behalf Of Wilson Hew
Sent: Tuesday, November 03, 2009 10:56 AM
To: Wes Sisk
Cc: cisco-voip at puck.nether.net
Subject: Re: [cisco-voip] MGCP - Fallback to SRST very often even though
connectivity to CUCM is fine

 

One more thing, I saw "CCM|MGCPHandler TransId: 1097943 Timeout. Retry#1" in
the SDI traces, but I can't seem to find the #2 or #3 retry that would cause
the MGCP gateway reset.

Thanks,
Wil

On Wed, Nov 4, 2009 at 12:47 AM, Wilson Hew <wilsonhew at gmail.com> wrote:

Hello Wes,

Thank you so much for the information. It really benefits me!

Btw, when you say the below AUEP is not 'normal', can you please help to
elaborate?

----------------------------------------------------------------------------------
AUEP 76267 AALN/S2/SU0/0@MLP-VG-01 MGCP 0.1
F: X
|<CLID::StandAloneCluster><NID::X.X.X.X><CT::1,100,132,1.204039><IP::X.X.X.X><DEV::><LVL::Significant><MASK::2000>
----------------------------------------------------------------------------------

I found that in my SDI traces, and I can see the AUEP ACK was received.
However, I was shocked to see this (more than 10 messages received together
within one second):

----------------------------------------------------------------------------------
NTFY 129359098 aaln/S2/SU0/3@MLP-VG-01 MGCP 0.1
N: ca@172.22.7.1:2427
X: 69
O: L/hd
|<CLID::StandAloneCluster><NID::172.22.7.1><CT::1,100,132,1.204114><IP::172.23.8.251><DEV::><LVL::Significant><MASK::2000>
11/03/2009 15:25:38.443 CCM|<CLID::StandAloneCluster><NID::172.22.7.1><CT::1,100,132,1.204114><MN::MGCPEndPoint><MV::aaln/S2/SU0/3@MLP-VG-01><DEV::><LVL::All><MASK::ffff>
11/03/2009 15:25:38.443 CCM|MGCPHandler received msg from: 172.23.8.251

NTFY 129359097 *@MLP-VG-01 MGCP 0.1
X: 0
O:
|<CLID::StandAloneCluster><NID::172.22.7.1><CT::1,100,132,1.204115><IP::172.23.8.251><DEV::><LVL::Significant><MASK::2000>
11/03/2009 15:25:38.443 CCM|<CLID::StandAloneCluster><NID::172.22.7.1><CT::1,100,132,1.204115><MN::MGCPEndPoint><MV::*@MLP-VG-01><DEV::><LVL::All><MASK::ffff>
11/03/2009 15:25:38.443 CCM|MGCPHandler received msg from: 172.23.8.251
----------------------------------------------------------------------------------

Followed by this (seeing phones keep alive timeout):

----------------------------------------------------------------------------------
11/03/2009 15:25:38.445 CCM|StationInit:   TCPPid=[ 1.100.9.210] Keep alive timeout.|<CLID::StandAloneCluster><NID::

and the below (does this indicate that the MGCP gateway is restarting?):

11/03/2009 15:25:38.485 CCM|MGCPInit - //// RSIP <restart> from *@MLP-VG-01|<CLID::StandAloneCluster><NID::

11/03/2009 15:25:38.490 CCM|MGCPManager received DUPLICATE message with TransId: 129359097|<CLID::StandAloneCluster><NID::
----------------------------------------------------------------------------------

Lastly, CUCM is sending messages to the MGCP gateway (more than 10 messages
together within one second):

----------------------------------------------------------------------------------
|<CLID::StandAloneCluster><NID::172.22.7.1><CT::1,100,132,1.204110><IP::172.23.8.251><DEV::><LVL::Significant><MASK::2000>
11/03/2009 15:25:38.756 CCM|MGCPHandler send msg SUCCESSFULLY to: 172.23.8.251
200 129359097

|<CLID::StandAloneCluster><NID::172.22.7.1><CT::1,100,132,1.204111><IP::172.23.8.251><DEV::><LVL::Significant><MASK::2000>
11/03/2009 15:25:38.756 CCM|MGCPHandler send msg SUCCESSFULLY to: 172.23.8.251
200 129359097
----------------------------------------------------------------------------------

It looks to me like the network was unstable or inconsistent during that period.
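
One rough way to check for bursts like the ones described above is to count MGCPHandler send/receive lines per second in the SDI trace. The sketch below assumes the line format seen in the excerpts quoted in this message; the function name, the threshold of 10, and the repeated sample lines are assumptions for illustration only.

```python
import re
from collections import Counter

# Matches the timestamped MGCPHandler lines quoted in this thread.
# Milliseconds are dropped so events are grouped per second.
EVENT_RE = re.compile(
    r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\.\d{3} "
    r"CCM\|MGCPHandler (received msg from|send msg SUCCESSFULLY to):"
)


def find_bursts(lines, threshold=10):
    """Return {(second, direction): count} for seconds with more than
    `threshold` MGCP messages in one direction ("rx" or "tx")."""
    per_second = Counter()
    for line in lines:
        m = EVENT_RE.match(line.strip())
        if m:
            direction = "rx" if m.group(2).startswith("received") else "tx"
            per_second[(m.group(1), direction)] += 1
    return {key: n for key, n in per_second.items() if n > threshold}


# Sample input: the two line shapes from the traces above, repeated
# artificially to simulate a burst (the repetition is not real data).
rx_line = "11/03/2009 15:25:38.443 CCM|MGCPHandler received msg from: 172.23.8.251"
tx_line = "11/03/2009 15:25:38.756 CCM|MGCPHandler send msg SUCCESSFULLY to: 172.23.8.251"
sample = [rx_line] * 12 + [tx_line] * 2

burst_report = find_bursts(sample)
print(burst_report)
```

A second with many messages in one direction is only a hint that packets were queued somewhere and delivered together; it does not say where the delay occurred.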

Any feedback from the gurus out there is very much appreciated!

Thanks,
Wil





On Tue, Nov 3, 2009 at 10:52 PM, Wes Sisk <wsisk at cisco.com> wrote:

blah! I forgot to mention the service parameter.  The CM service parameter
"MGCP Retry Timeout Handling" configures the behavior when a timeout is
observed. It allows marking the endpoint OOS, resetting just the port, or
unregistering the entire gateway.  Unregistering the entire gateway is the
default value.

/wes



On Tuesday, November 03, 2009 9:29:24 AM, Wes Sisk <wsisk at cisco.com> wrote:



timely question.

MGCP gateway can be viewed as:

MGCP Gateway
    mgcp/udp based registration and keepalives
    analog endpoints
       mgcp/udp based registration and transactions
    digital endpoints
       backhaul/tcp based

on CM if you see the alarm:
MGCPGatewayLostComm then the top level mgcp process stopped communicating
with CM.  Usually the GW sends keepalives to CM similar to:
12/27/2005 10:16:40.173 CCM|MGCPHandler received msg from: 10.10.33.250
NTFY 333382 *@HQ-VG224-3rdFlr MGCP 0.1
X: 0
O:
|<CLID::MFCU-CM-1-Cluster><NID::10.10.200.11><CT::2,100,66,1.23017474><IP::10.10.33.250><DEV::>
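
As a small illustration of what the keepalive above looks like to a parser, the sketch below treats an NTFY as a gateway-level keepalive when it is addressed to the wildcard endpoint (*@gateway), carries X: 0, and has an empty O: line. That rule is inferred only from the sample shown here, and the function name is made up.

```python
def is_gateway_keepalive(message: str) -> bool:
    """Return True if `message` looks like the wildcard NTFY keepalive."""
    lines = [ln.strip() for ln in message.strip().splitlines() if ln.strip()]
    if not lines:
        return False
    # Command line layout: verb, transaction id, endpoint, protocol, version
    parts = lines[0].split()
    if len(parts) < 3 or parts[0].upper() != "NTFY":
        return False
    if not parts[2].startswith("*@"):
        return False
    # Collect single-letter MGCP parameters; trace decoration such as
    # "|<CLID::...>" has a longer name before the first colon and is skipped.
    params = {}
    for ln in lines[1:]:
        name, sep, value = ln.partition(":")
        if sep and len(name.strip()) == 1:
            params[name.strip().upper()] = value.strip()
    return params.get("X") == "0" and params.get("O") == ""


keepalive = "NTFY 333382 *@HQ-VG224-3rdFlr MGCP 0.1\nX: 0\nO:\n"
off_hook = (
    "NTFY 129359098 aaln/S2/SU0/3@MLP-VG-01 MGCP 0.1\n"
    "N: ca@172.22.7.1:2427\nX: 69\nO: L/hd\n"
)
print(is_gateway_keepalive(keepalive), is_gateway_keepalive(off_hook))
```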

If CM does not receive keepalive from gateway CM will attempt to query the
gw with this message:
12/27/2005 10:17:07.002 CCM|MGCPHandler send msg SUCCESSFULLY to: 10.10.31.250
AUEP 13561613 AALN/S2/0@HQ-VG224-1stFlr MGCP 0.1
F: X
|<CLID::MFCU-CM-1-Cluster><NID::10.10.200.11><CT::2,100,66,1.23017448><IP::10.10.31.250><DEV::>

This AUEP is not 'normal'.
F = RequestedInfo
X = RequestIdentifier
Normal AUEP requests much more information. This is a special "hello, are
you there" type exchange.
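
To make the difference concrete, here is a minimal sketch that flags an AUEP whose RequestedInfo (F:) line asks only for X, which is the "hello" form described above. The function name is hypothetical, and the longer F: list in the second sample is just an illustration of a fuller audit request, not taken from these traces.

```python
def is_hello_auep(message: str) -> bool:
    """Return True if `message` is an AUEP whose F: line requests only X."""
    lines = [ln.strip() for ln in message.strip().splitlines() if ln.strip()]
    if not lines or not lines[0].upper().startswith("AUEP "):
        return False
    for ln in lines[1:]:
        name, sep, value = ln.partition(":")
        if sep and name.strip().upper() == "F":
            # RequestedInfo is a comma-separated list of parameter codes.
            requested = [c.strip().upper() for c in value.split(",") if c.strip()]
            return requested == ["X"]
    return False


probe = "AUEP 13561613 AALN/S2/0@HQ-VG224-1stFlr MGCP 0.1\nF: X\n"
# Illustrative fuller audit: several parameter codes requested at once.
full_audit = "AUEP 1201 aaln/1@gw.example MGCP 0.1\nF: A, R,D,S,X,N,I,T,O\n"

print(is_hello_auep(probe))       # True
print(is_hello_auep(full_audit))  # False
```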

the gateway should respond:
12/27/2005 10:17:07.002 CCM|MGCPHandler received msg from: 10.10.31.250
200 13561613
X: 2
|<CLID::MFCU-CM-1-Cluster><NID::10.10.200.11><CT::2,100,66,1.23017648><IP::10.10.31.250><DEV::>


This is getting very close to unregistration.  Another way to look at this
is to look for indications of lost messages to the gateway.  Each MGCP
transaction is retransmitted up to 3 times if not ack'd.  You can see
retries in the CM SDI traces:
01/13/2005 10:34:33.603 CCM|MGCPHandler TransId: 1097943 Timeout. Retry#1

If you see frequent retries then you are intermittently dropping or
excessively delaying the UDP packets carrying the MGCP payload.
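
For anyone who wants to scan a trace file for these retries, a rough sketch follows. It assumes the retry line format quoted above and records the highest Retry# seen per TransId; the function name is made up, and the Retry#2 sample line is hypothetical, modelled on the real Retry#1 line.

```python
import re
from collections import defaultdict

# Based on the single retry line quoted above; anything after "Retry#N"
# (for example trailing "|<CLID::...>" decoration) is ignored.
RETRY_RE = re.compile(
    r"^(?P<ts>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"CCM\|MGCPHandler TransId: (?P<tid>\d+) Timeout\. Retry#(?P<n>\d+)"
)


def highest_retry_per_transaction(lines):
    """Map each TransId to the highest Retry# observed for it."""
    highest = defaultdict(int)
    for line in lines:
        m = RETRY_RE.match(line.strip())
        if m:
            tid = m.group("tid")
            highest[tid] = max(highest[tid], int(m.group("n")))
    return dict(highest)


sample = [
    # Real line from the trace excerpt above:
    "01/13/2005 10:34:33.603 CCM|MGCPHandler TransId: 1097943 Timeout. Retry#1",
    # Hypothetical follow-up retry, invented for this example:
    "01/13/2005 10:34:34.603 CCM|MGCPHandler TransId: 1097943 Timeout. Retry#2",
    # Unrelated line that should be ignored:
    "12/27/2005 10:17:07.002 CCM|MGCPHandler received msg from: 10.10.31.250",
]

retries = highest_retry_per_transaction(sample)
print(retries)
```

In practice you would pass in the lines of an SDI trace file, for example `highest_retry_per_transaction(open(path))`, where `path` is wherever your traces were collected.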


There is also an issue where endpoints may stop responding to CM.  CM will
retry the transaction 3 times and then unregister the gateway.  This looks
similar to the retries tracked above.  The main difference is that you will
see valid exchanges with other endpoints on the gateway or you will see
successful keepalives with the top level gateway MGCP process.  This was
historically caused by CSCsf26617 and similar.  The signature of this
failure is repeated retransmits of the DLCX, RQNT, or CRCX messages from CM
to the gateway while other endpoints are responding.  If this is happening
then the gateway is having an internal error such as resource allocation or
dsp hang.
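
The distinction drawn above can be written down as a simple rule of thumb. The sketch below is only that: it assumes you have already extracted, per transaction, the endpoint, the MGCP verb, and whether a response was seen. The Transaction type, the triage function, and the wording of the results are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    endpoint: str   # e.g. "aaln/S2/SU0/3", or "*" for the gateway-level process
    verb: str       # MGCP verb such as CRCX, DLCX, RQNT, AUEP, NTFY
    answered: bool  # True if the exchange completed without timing out


# Verbs named above as the signature of the endpoint-level failure.
CALL_VERBS = {"CRCX", "DLCX", "RQNT"}


def triage(transactions):
    """Guess whether timeouts point at the network or at the gateway itself."""
    unanswered = [t for t in transactions if not t.answered]
    if not unanswered:
        return "no unanswered transactions"
    failing = {t.endpoint for t in unanswered}
    stuck = {t.endpoint for t in unanswered if t.verb in CALL_VERBS}
    # Endpoints (or the "*" gateway process) that only show good exchanges.
    healthy = {t.endpoint for t in transactions if t.answered} - failing
    if stuck and healthy:
        return "suspect gateway-internal fault on: " + ", ".join(sorted(stuck))
    return "suspect packet loss or delay between CM and gateway"


one_endpoint_stuck = [
    Transaction("aaln/S2/SU0/3", "CRCX", False),
    Transaction("aaln/S2/SU0/0", "RQNT", True),
    Transaction("*", "NTFY", True),
]
everything_lost = [
    Transaction("aaln/S2/SU0/3", "CRCX", False),
    Transaction("aaln/S2/SU0/0", "AUEP", False),
]
print(triage(one_endpoint_stuck))
print(triage(everything_lost))
```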

HTH.

/Wes



On Tuesday, November 03, 2009 3:39:24 AM, Wilson Hew <wilsonhew at gmail.com> wrote:



Bob/Ryan, appreciate your feedback. Thanks.

Guess I need to look at the connection between my MGCP gateway and CUCM. Any
idea what else I may need to check? I am looking at the SDI traces, but have
no idea what to look at.

Thanks,
Wil

On Tue, Nov 3, 2009 at 2:35 AM, Bob Fronk <bob at btrfronk.com> wrote:

I had this happening and found out it was an MPLS circuit going down.  Due
to the location of this particular site, our 12 Mbps MPLS circuit is supplied
by multiple T1s bonded with MLPPP.

 

One of the T1s was going up/down several times a day (telco problem) and
each time, the MLPPP would reset for a couple seconds.   The MGCP gateway
responded by going into SRST and the PRI would go down for a moment.

 

Just something to check

 

From: cisco-voip-bounces at puck.nether.net
[mailto:cisco-voip-bounces at puck.nether.net] On Behalf Of Wilson Hew
Sent: Monday, November 02, 2009 11:47 AM
To: cisco-voip at puck.nether.net
Subject: [cisco-voip] MGCP - Fallback to SRST very often even though
connectivity to CUCM is fine

 

Hello there,

Greetings. I am having a problem with my MGCP gateway and need a little help
and advice. The gateway is also configured for SRST, and it falls back to
SRST very often (twice a day), then returns to normal operation shortly
afterwards. The connectivity from my MGCP gateway (remote site) to CUCM is
fine.

I noticed my E1 goes down every time it falls back to SRST. Is that
considered normal?

My gateway is running 12.4(24)T1 and CUCM version 7.0.2.

In 'sh ccm-manager', I have the below:
--------------------------------------------------------------
TFTP retry count to shut Ports: 2

Statistics:
    Packets recvd:   857
    Recv failures:   1
    Packets xmitted: 852
    Xmit failures:   0
--------------------------------------------------------------
In 'sh mgcp stats':

 UDP pkts rx 557379, tx 558783
 Unrecognized rx pkts 0, MGCP message parsing errors 0
 Duplicate MGCP ack tx 9, Invalid versions count 0
 CreateConn rx 36256, successful 36249, failed 7
 DeleteConn rx 36274, successful 36101, failed 173
 ModifyConn rx 66178, successful 66126, failed 52
 DeleteConn tx 154, successful 154, failed 0
 NotifyRequest rx 54652, successful 54516, failed 136
 AuditConnection rx 3, successful 3, failed 0
 AuditEndpoint rx 14887, successful 8080, failed 6807
 RestartInProgress tx 6248, successful 6248, failed 0
 Notify tx 342779, successful 342779, failed 0
 ACK tx 201075, NACK tx 7191
 ACK rx 349100, NACK rx 0
 Collisions: Passive 0, Active 0
--------------------------------------------------------------
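
One way to read counters like these is to turn each "rx/tx N, successful S, failed F" line into a failure rate so the outliers stand out. The sketch below assumes that line shape exactly as pasted above; the function name is made up, and only a few of the posted lines are reused as sample input.

```python
import re

# Matches lines such as " AuditEndpoint rx 14887, successful 8080, failed 6807"
STAT_RE = re.compile(
    r"^\s*(\w+) (rx|tx) (\d+), successful (\d+), failed (\d+)\s*$"
)


def failure_rates(text):
    """Return {(counter, direction): failed / total} for matching lines."""
    rates = {}
    for line in text.splitlines():
        m = STAT_RE.match(line)
        if m:
            name, direction = m.group(1), m.group(2)
            total, failed = int(m.group(3)), int(m.group(5))
            rates[(name, direction)] = failed / total if total else 0.0
    return rates


sample = """\
 CreateConn rx 36256, successful 36249, failed 7
 DeleteConn rx 36274, successful 36101, failed 173
 DeleteConn tx 154, successful 154, failed 0
 AuditEndpoint rx 14887, successful 8080, failed 6807
 Notify tx 342779, successful 342779, failed 0
"""

rates = failure_rates(sample)
# Print counters from worst to best failure rate.
for (name, direction), rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name} {direction}: {rate:.2%}")
```

On the figures posted here, AuditEndpoint is the clear outlier: 6807 of 14887 requests failed, roughly 46%, while the connection counters all fail well under 1% of the time.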

Can anyone tell what is wrong from the above? Apart from that, I see the
number of slips on the E1 controller increasing, and I have
network-clock-participate configured.

Would appreciate if you could give me your feedback about this. Any feedback
is very much appreciated.

Thanks,
Wil

 




  _____  



 
_______________________________________________
cisco-voip mailing list
cisco-voip at puck.nether.net
https://puck.nether.net/mailman/listinfo/cisco-voip
  

 



