[cisco-voip] New to me RTMT Alert: CoreDumpFileFound

Ryan Ratliff rratliff at cisco.com
Wed Aug 19 13:16:29 EDT 2009


The core was dumped by the ccm (Cisco CallManager) service.  The SDL
OOS events came from the other two nodes reporting that they had lost
their connection to the node that crashed.  The media resource events
were those devices re-homing from the server where the crash occurred
to their backup server.

If this is the first time you've seen a crash then you need to decide  
if you want to get root cause on it or not.

If you do want to pursue finding out what caused the crash, the first
thing you need to do is collect the CCM traces from all nodes covering
the 15 minutes leading up to the crash.  Keep these zipped up somewhere
safe in case they are needed.  Also get the event viewer logs (system
and application) from the server where the crash occurred.  You'll
also want the RisDC perfmon files from the day of the crash.
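
As a rough aid for picking that 15-minute trace window, here is a small
Python sketch that reads the crash time out of the core file name.  It
assumes the name is laid out as core.<pid>.<signal>.<process>.<epoch
seconds>; that layout is inferred from the file name in the alert
quoted below, not taken from Cisco documentation, and trace_window is
a made-up helper name.

```python
from datetime import datetime, timedelta, timezone


def trace_window(core_name: str, minutes: int = 15):
    """Return (start, end) in UTC for the traces leading up to the crash."""
    # Assumed layout: core.<pid>.<signal>.<process>.<epoch seconds>
    prefix, pid, signal, process, epoch = core_name.split(".")
    if prefix != "core":
        raise ValueError(f"unexpected core file name: {core_name}")
    # The last field is treated as the Unix time at which the core was written.
    crashed_at = datetime.fromtimestamp(int(epoch), tz=timezone.utc)
    return crashed_at - timedelta(minutes=minutes), crashed_at


start, end = trace_window("core.10667.6.ccm.1250698260")
print(start.isoformat(), end.isoformat())
```

For the core in this thread the end of the window comes out as
16:11:00 UTC, which is 11:11:00 CDT and lines up with the alarm time of
11:11:11 CDT.  Convert the window to each node's local time zone before
selecting traces.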

You can analyze the core yourself, and if a bug search doesn't turn
anything up, open a TAC SR and provide the dump analysis along with the
files listed above.

To analyze the core, first run 'utils core list' from the CLI of the
server, then pass the core file name it reports to 'utils core
analyze'.  If that server is a primary for phone registration, I'd
advise waiting until after hours to run the analysis.  If it's a
backup and not heavily utilized, you should be safe.

-Ryan

On Aug 19, 2009, at 1:02 PM, Jeff Ruttman wrote:

Greetings,

I received the CoreDumpFileFound message below, closely followed by 2
of these:
SDLLinkOutOfService event generated. Current outstanding sdl oos  
alarms: SDLLinkOOS LocalNodeId : 3 LocalApplicationID : 100  
RemoteIPAddress : 10.10.3.51 RemoteNodeID : 4 RemoteApplicationID :  
100 LinkID : 3:100:4:100 NodeID : ma3-ccm02 TimeStamp : Wed Aug 19  
11:11:12 CDT 2009 The alert is generated on Wed Aug 19 11:11:41 CDT  
2009 on node 10.14.3.50.

Followed by RegisteredMediaDevices decrease and increase RTMT messages.

Can anyone tell me what I'm seeing here?  Something like this? (Don't  
laugh...I'm trying I'm trying! :))

The Cisco Log Partition Monitoring Tool on node dr-ccm03 had some sort  
of problem causing the dumpfile to be generated.  The problem also  
caused this SDLLink out of service--some sort of connectivity problem  
between the problem node and our Pub and other Sub.  Then the media  
devices increase/decrease messages suggest that the system has  
recovered from the initial problem?

Maybe the only worthwhile questions are:  Should I be worried about  
this message and what might I do about it?

Thanks
jeff

From: RTMT_Admin at ec2802.elderc.org [mailto:RTMT_Admin at ec2802.elderc.org]
Sent: Wednesday, August 19, 2009 11:12 AM
To: Jeff
Subject: [RTMT-ALERT-StandAloneCluster] CoreDumpFileFound

CoreDumpFileFound TotalCoresFound : 1 CoreDetails : The following  
lists up to 6 cores dumped by corresponding applications. Core1 :  
Cisco CallManager (core.10667.6.ccm.1250698260) AppID : Cisco Log  
Partition Monitoring Tool ClusterID : NodeID : dr-ccm03 . The alarm is  
generated on Wed Aug 19 11:11:11 CDT 2009.
