[cisco-voip] Heartbeat Failure & SNRD

Daniel Pagan dpagan at fidelus.com
Wed May 21 10:38:42 EDT 2014


To be more specific, SDL traces show MMManInit sending a DeviceInitStart to DeviceManager, and then, immediately after, DeviceManager creating SNRD. All process creation and startup signals die at this point. This is observable and reproducible in two different environments running the same ES 8.6.2.24122-1.

- Daniel

From: Daniel Pagan
Sent: Wednesday, May 21, 2014 9:35 AM
To: cisco-voip at puck.nether.net
Subject: Heartbeat Failure & SNRD

Folks:

CUCM ES 8.6.2.24122-1 appears to introduce an issue where the CallManager heartbeat fails to increment after startup. The condition that triggers it is very specific. On a problematic node, SDL traces show the following error exactly one hour after the CCM service starts:

AppError  ||||||Local send blocked: SignalName: Start, DestPID: SNRD[1:100:61:1]

This error is followed by an SDL trace entry stating that CallManager exceeded the permitted time for initialization and will restart the application. The CCM application restarts, and additional SDL traces show the standard creation of critical processes. One hour later, the same "Local send blocked" error is printed for the SNRD process.
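
For anyone who wants to check their own traces for the same symptom, here is a minimal sketch in Python. It assumes the error always has the shape of the single line quoted above; the regular expression and the function name find_blocked_sends are illustrative only and not part of any Cisco tool, and real SDL trace lines carry additional fields that this ignores.

```python
import re

# Pattern derived only from the one trace line quoted in this post.
# Assumption: other "Local send blocked" entries follow the same layout.
BLOCKED_RE = re.compile(
    r"Local send blocked: SignalName: (?P<signal>\w+), "
    r"DestPID: (?P<process>\w+)\[(?P<pid>[\d:]+)\]"
)


def find_blocked_sends(lines):
    """Return (signal, process, pid) for each 'Local send blocked' line."""
    hits = []
    for line in lines:
        match = BLOCKED_RE.search(line)
        if match:
            hits.append(
                (match.group("signal"), match.group("process"), match.group("pid"))
            )
    return hits


sample = [
    "AppError  ||||||Local send blocked: SignalName: Start, DestPID: SNRD[1:100:61:1]",
]
print(find_blocked_sends(sample))  # [('Start', 'SNRD', '1:100:61:1')]
```

Feeding it the lines of a trace file (for example, an open file object) would list every blocked signal and its destination process, which makes it easy to see whether SNRD is the only process affected.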

After seeing the DestPID: SNRD error, I went to a completely different, non-problematic lab environment where 8.6.2.24122-1 is installed, created a single Remote Destination Profile, and restarted the standalone node to force the creation of SNRD. CallManager heartbeats are now failing to increment in that environment as well, and I found another "Local send blocked" error regarding SNRD. Removing the single Remote Destination Profile from the standalone environment and rebooting the node resolves the problem. Re-inserting it and rebooting again recreates the problem, making SNRD the obvious culprit here.

I currently have a TAC case open, and they are attempting to recreate the problem. It seems no public-facing defect has been created for this. Just wanted to give you folks a heads up.

Related to this, can someone tell me if this document, specifically the section describing MMManInit and process creation, is still accurate? If so, then what I fail to see in SDL traces is an InitDone signal from SNRD to MMManInit during the 60 minutes between CCM startup and initialization timeout.

- Daniel


