[cisco-voip] MGCP - Fallback to SRST very often even though connectivity to CUCM is fine
Wes Sisk
wsisk at cisco.com
Tue Nov 3 11:56:46 EST 2009
AUEP 13561613 AALN/S2/0 at HQ-VG224-1stFlr MGCP 0.1
F: X
This AUEP is not 'normal'.
F = RequestedInfo
X = RequestIdentifier
Normal AUEP requests much more information.
CM only sends this type of AUEP after 2 missed keepalives from the MGCP
gateway.
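
As a rough illustration of the difference (a sketch only, not CM's actual logic), the check below treats an AUEP whose only RequestedInfo code is X as the "are you there" probe. The function names and the FULL_AUDIT sample are hypothetical; PROBE is the AUEP shown above.

```python
import re

# First line of an MGCP request: verb, transaction id, endpoint, version.
# The list archive rewrites "@" as " at ", so both spellings are accepted.
CMD_RE = re.compile(
    r"^(?P<verb>[A-Z]{4})\s+(?P<txid>\d+)\s+"
    r"(?P<endpoint>\S+(?:\s+at\s+\S+)?)\s+MGCP\s+(?P<version>\S+)$"
)


def parse_mgcp_request(text):
    """Split one MGCP request into its command fields and parameter lines."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    match = CMD_RE.match(lines[0])
    if match is None:
        raise ValueError("not an MGCP request line: %r" % lines[0])
    params = {}
    for ln in lines[1:]:
        if ln.startswith("|"):  # SDI trace trailer, not part of the message
            break
        key, sep, value = ln.partition(":")
        if sep:
            params[key.strip()] = value.strip()
    return match.groupdict(), params


def is_keepalive_probe(text):
    """True when the message is an AUEP that requests only X."""
    cmd, params = parse_mgcp_request(text)
    requested = {c.strip().upper() for c in params.get("F", "").split(",") if c.strip()}
    return cmd["verb"] == "AUEP" and requested == {"X"}


# The AUEP from the trace above.
PROBE = """AUEP 13561613 AALN/S2/0 at HQ-VG224-1stFlr MGCP 0.1
F: X
"""

# Hypothetical AUEP requesting more information (illustrative only;
# the gateway name and transaction id are made up).
FULL_AUDIT = """AUEP 1200 aaln/S2/0@example-gw MGCP 0.1
F: A, R, D, S, X, N, I, T, O
"""

print(is_keepalive_probe(PROBE))
print(is_keepalive_probe(FULL_AUDIT))
```

Run against SDI trace excerpts, a check like this can help separate the probe AUEPs from ordinary audits when counting how often CM had to ask the gateway whether it was still there.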
/Wes
On Tuesday, November 03, 2009 11:47:16 AM, Wilson Hew
<wilsonhew at gmail.com> wrote:
> Hello Wes,
>
> Thank you so much for the information. It really benefits me!
>
> Btw, when you say the below AUEP is not 'normal', can you please help
> to elaborate?
>
> ----------------------------------------------------------------------------------
> AUEP 76267 AALN/S2/SU0/0 at MLP-VG-01 MGCP 0.1
> F: X
> |<CLID::StandAloneCluster><NID::X.X.X.X><CT::1,100,132,1.204039><IP::X.X.X.X><DEV::><LVL::Significant><MASK::2000>
> ----------------------------------------------------------------------------------
>
> I found that in my SDI traces and I can see AUEP ACK received.
> However, I got a shock when I saw this (more than 10 msgs received
> within a second, all together):
>
> ----------------------------------------------------------------------------------
> NTFY 129359098 aaln/S2/SU0/3 at MLP-VG-01 MGCP 0.1
> N: ca at 172.22.7.1:2427
> X: 69
> O: L/hd
> |<CLID::StandAloneCluster><NID::172.22.7.1><CT::1,100,132,1.204114><IP::172.23.8.251><DEV::><LVL::Significant><MASK::2000>
> 11/03/2009 15:25:38.443
> CCM|<CLID::StandAloneCluster><NID::172.22.7.1><CT::1,100,132,1.204114><MN::MGCPEndPoint><MV::aaln/S2/SU0/3 at MLP-VG-01><DEV::><LVL::All><MASK::ffff>
> 11/03/2009 15:25:38.443 CCM|MGCPHandler received msg from: 172.23.8.251
>
> NTFY 129359097 *@MLP-VG-01 MGCP 0.1
> X: 0
> O:
> |<CLID::StandAloneCluster><NID::172.22.7.1><CT::1,100,132,1.204115><IP::172.23.8.251><DEV::><LVL::Significant><MASK::2000>
> 11/03/2009 15:25:38.443
> CCM|<CLID::StandAloneCluster><NID::172.22.7.1><CT::1,100,132,1.204115><MN::MGCPEndPoint><MV::*@MLP-VG-01><DEV::><LVL::All><MASK::ffff>
> 11/03/2009 15:25:38.443 CCM|MGCPHandler received msg from: 172.23.8.251
> ----------------------------------------------------------------------------------
>
> This was followed by keepalive timeouts from the phones:
>
> ----------------------------------------------------------------------------------
> 11/03/2009 15:25:38.445 CCM|StationInit: TCPPid=[ 1.100.9.210] Keep
> alive timeout.|<CLID::StandAloneCluster><NID::
>
> and the below (does this mean the MGCP gateway is restarting?):
>
> 11/03/2009 15:25:38.485 CCM|MGCPInit - //// RSIP <restart> from
> *@MLP-VG-01|<CLID::StandAloneCluster><NID::
>
> 11/03/2009 15:25:38.490 CCM|MGCPManager received DUPLICATE message
> with TransId: 129359097|<CLID::StandAloneCluster><NID::
> ----------------------------------------------------------------------------------
>
> Lastly, CUCM is sending messages to the MGCP gateway (more than 10 msgs
> within a second, all together):
>
> ----------------------------------------------------------------------------------
> |<CLID::StandAloneCluster><NID::172.22.7.1><CT::1,100,132,1.204110><IP::172.23.8.251><DEV::><LVL::Significant><MASK::2000>
> 11/03/2009 15:25:38.756 CCM|MGCPHandler send msg SUCCESSFULLY to:
> 172.23.8.251
> 200 129359097
>
> |<CLID::StandAloneCluster><NID::172.22.7.1><CT::1,100,132,1.204111><IP::172.23.8.251><DEV::><LVL::Significant><MASK::2000>
> 11/03/2009 15:25:38.756 CCM|MGCPHandler send msg SUCCESSFULLY to:
> 172.23.8.251
> 200 129359097
> ----------------------------------------------------------------------------------
>
> It looks to me like the network was not stable or consistent during
> that period.
>
> Any feedback from the gurus out there is very much appreciated!
>
> Thanks,
> Wil
>
>
> On Tue, Nov 3, 2009 at 10:52 PM, Wes Sisk <wsisk at cisco.com> wrote:
>
> blah! I forgot to mention the service parameter. The CM service
> parameter "MGCP Retry Timeout Handling" configures the behavior
> when a timeout is observed. The options are marking the endpoint
> out of service, resetting just the port, or unregistering the
> entire gateway. Unregistering the entire gateway is the default value.
>
> /wes
>
>
> On Tuesday, November 03, 2009 9:29:24 AM, Wes Sisk
> <wsisk at cisco.com> wrote:
>> timely question.
>>
>> MGCP gateway can be viewed as:
>>
>> MGCP Gateway
>> mgcp/udp based registration and keepalives
>> analog endpoints
>> mgcp/udp based registration and transactions
>> digital endpoints
>> backhaul/tcp based
>>
>> On CM, if you see the alarm MGCPGatewayLostComm, then the top-level
>> MGCP process stopped communicating with CM. Usually the GW sends
>> keepalives to CM similar to:
>> 12/27/2005 10:16:40.173 CCM|MGCPHandler received msg from:
>> 10.10.33.250
>> NTFY 333382 *@HQ-VG224-3rdFlr MGCP 0.1
>> X: 0
>> O:
>> |<CLID::MFCU-CM-1-Cluster><NID::10.10.200.11><CT::2,100,66,1.23017474><IP::10.10.33.250><DEV::>
>>
>>
>> If CM does not receive a keepalive from the gateway, CM will attempt
>> to query the gw with this message:
>> 12/27/2005 10:17:07.002 CCM|MGCPHandler send msg SUCCESSFULLY to:
>> 10.10.31.250
>> AUEP 13561613 AALN/S2/0 at HQ-VG224-1stFlr MGCP 0.1
>> F: X
>> |<CLID::MFCU-CM-1-Cluster><NID::10.10.200.11><CT::2,100,66,1.23017448><IP::10.10.31.250><DEV::>
>>
>>
>> This AUEP is not 'normal'.
>> F = RequestedInfo
>> X = RequestIdentifier
>> Normal AUEP requests much more information. This is a special
>> "hello, are you there" type exchange.
>>
>> the gateway should respond:
>> 12/27/2005 10:17:07.002 CCM|MGCPHandler received msg from:
>> 10.10.31.250
>> 200 13561613
>> X: 2
>> |<CLID::MFCU-CM-1-Cluster><NID::10.10.200.11><CT::2,100,66,1.23017648><IP::10.10.31.250><DEV::>
>>
>>
>>
>> This is getting very close to unregistration. Another way to
>> look at this is to look for indications of lost messages to the
>> gateway. Each MGCP transaction is retransmitted up to 3 times if
>> not ack'd. You can see retries in the CM SDI traces:
>> 01/13/2005 10:34:33.603 CCM|MGCPHandler TransId: 1097943 Timeout.
>> Retry#1
>>
>> If you see frequent retries then you are intermittently dropping
>> or excessively delaying the UDP packets carrying the MGCP payload.
>>
>>
>> There is also an issue where endpoints may stop responding to
>> CM. CM will retry the transaction 3 times and then unregister
>> the gateway. This looks similar to the retries tracked above.
>> The main difference is that you will see valid exchanges with
>> other endpoints on the gateway or you will see successful
>> keepalives with the top level gateway MGCP process. This was
>> historically caused by CSCsf26617 and similar. The signature of
>> this failure is repeated retransmits of the DLCX, RQNT, or CRCX
>> messages from CM to the gateway while other endpoints are
>> responding. If this is happening then the gateway is having an
>> internal error such as resource allocation or dsp hang.
>>
>> HTH.
>>
>> /Wes
>>
>>
>>
>> On Tuesday, November 03, 2009 3:39:24 AM, Wilson Hew
>> <wilsonhew at gmail.com> wrote:
>>> Bob/Ryan, appreciate your feedback. Thanks.
>>>
>>> Guess I need to look at the connection between my MGCP gateway
>>> and CUCM. Any idea what else I may need to check? I am looking
>>> at the SDI traces, but have no idea what to look at.
>>>
>>> Thanks,
>>> Wil
>>>
>>> On Tue, Nov 3, 2009 at 2:35 AM, Bob Fronk <bob at btrfronk.com> wrote:
>>>
>>> I had this happening and found out it was an MPLS circuit
>>> going down. Due to location of this particular site, our
>>> 12mbps MPLS circuit is supplied by multiple T1s bonded with
>>> MLPPP.
>>>
>>>
>>>
>>> One of the T1s was going up/down several times a day (telco
>>> problem) and each time, the MLPPP would reset for a couple
>>> seconds. The MGCP gateway responded by going into SRST and
>>> the PRI would go down for a moment.
>>>
>>>
>>>
>>> Just something to check
>>>
>>>
>>>
>>> *From:* cisco-voip-bounces at puck.nether.net *On Behalf Of* Wilson Hew
>>> *Sent:* Monday, November 02, 2009 11:47 AM
>>> *To:* cisco-voip at puck.nether.net
>>> *Subject:* [cisco-voip] MGCP - Fallback to SRST very often
>>> even though connectivity to CUCM is fine
>>>
>>>
>>>
>>> Hello there,
>>>
>>> Greetings. I am having a problem with my MGCP gateway, and I
>>> need a little help and advice. My MGCP gateway is also running
>>> SRST, and it falls back to SRST very often (twice a day). It
>>> returns to normal operation from fallback shortly after that.
>>> The connectivity from my MGCP gateway (remote site) to CUCM is
>>> fine.
>>>
>>> I noticed my E1 goes down every time the gateway falls back
>>> to SRST - is that considered normal?
>>>
>>> My gateway is running 12.4(24)T1 and CUCM version 7.0.2.
>>>
>>> In 'sh ccm-manager', I have the below:
>>> --------------------------------------------------------------
>>> TFTP retry count to shut Ports: 2
>>>
>>> Statistics:
>>> Packets recvd: 857
>>> Recv failures: 1
>>> Packets xmitted: 852
>>> Xmit failures: 0
>>> --------------------------------------------------------------
>>> In 'sh mgcp stats':
>>>
>>> UDP pkts rx 557379, tx 558783
>>> Unrecognized rx pkts 0, MGCP message parsing errors 0
>>> Duplicate MGCP ack tx 9, Invalid versions count 0
>>> CreateConn rx 36256, successful 36249, failed 7
>>> DeleteConn rx 36274, successful 36101, failed 173
>>> ModifyConn rx 66178, successful 66126, failed 52
>>> DeleteConn tx 154, successful 154, failed 0
>>> NotifyRequest rx 54652, successful 54516, failed 136
>>> AuditConnection rx 3, successful 3, failed 0
>>> AuditEndpoint rx 14887, successful 8080, failed 6807
>>> RestartInProgress tx 6248, successful 6248, failed 0
>>> Notify tx 342779, successful 342779, failed 0
>>> ACK tx 201075, NACK tx 7191
>>> ACK rx 349100, NACK rx 0
>>> Collisions: Passive 0, Active 0
>>> --------------------------------------------------------------
>>>
>>> Can anyone tell what is wrong from the above? Apart from that, I
>>> see the number of slips on the E1 controller increasing, and I
>>> have network-clock-participate configured.
>>>
>>> Would appreciate if you could give me your feedback about
>>> this. Any feedback is very much appreciated.
>>>
>>> Thanks,
>>> Wil
>>>
>>>
>>> ------------------------------------------------------------------------
>>>
>>> _______________________________________________
>>> cisco-voip mailing list
>>> cisco-voip at puck.nether.net
>>> https://puck.nether.net/mailman/listinfo/cisco-voip
>>>
>
>
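
A minimal sketch of the retry check described in the quoted message (each MGCP transaction is retransmitted up to 3 times if not ack'd): scan SDI trace text for "Timeout. Retry#" lines and report the highest retry number seen per transaction. The function names are hypothetical, and only the first sample line comes from the thread; the 1097950 lines are made up for illustration.

```python
import re

# Matches the retry line shown in the thread, e.g.
# "CCM|MGCPHandler TransId: 1097943 Timeout. Retry#1".
# \s* tolerates the line wrap that email clients add before "Retry#".
RETRY_RE = re.compile(r"MGCPHandler TransId:\s*(\d+)\s+Timeout\.\s*Retry#(\d+)")


def highest_retry_per_transaction(trace_text):
    """Map each transaction id to the highest retry number seen for it."""
    worst = {}
    for txid, retry in RETRY_RE.findall(trace_text):
        worst[txid] = max(worst.get(txid, 0), int(retry))
    return worst


def exhausted_transactions(worst, limit=3):
    """Transaction ids that reached the retry limit (3 per the thread)."""
    return sorted(txid for txid, retry in worst.items() if retry >= limit)


# First line is from the thread; the 1097950 lines are hypothetical.
SAMPLE_TRACE = """\
01/13/2005 10:34:33.603 CCM|MGCPHandler TransId: 1097943 Timeout. Retry#1
01/13/2005 10:34:40.100 CCM|MGCPHandler TransId: 1097950 Timeout. Retry#1
01/13/2005 10:34:41.100 CCM|MGCPHandler TransId: 1097950 Timeout. Retry#2
01/13/2005 10:34:43.100 CCM|MGCPHandler TransId: 1097950 Timeout. Retry#3
"""

worst = highest_retry_per_transaction(SAMPLE_TRACE)
print(worst)
print(exhausted_transactions(worst))
```

Occasional Retry#1 entries point at intermittent UDP loss or delay; transactions that reach the limit are the ones that lead to the behavior set by "MGCP Retry Timeout Handling".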