[cisco-voip] Unity Connection 10.5.2 Split Brain Recovery
Thomas LeMay
thomaslemay at comcast.net
Tue Oct 27 18:58:38 EDT 2015
Hi, Ryan,
Thank you very much for this detailed write up!
Tom
From: Ryan Huff [mailto:ryanhuff at outlook.com]
Sent: Tuesday, October 27, 2015 3:32 PM
To: Thomas LeMay; 'Jason Aarons (AM)'; 'Aaron Banks';
cisco-voip at puck.nether.net
Subject: Re: [cisco-voip] Unity Connection 10.5.2 Split Brain Recovery
Tom,
Thin Provisioning:
Here is a nice article from VMware on thin provisioning:
https://blogs.vmware.com/vsphere/2012/03/thin-provisioning-whats-the-scoop.html
Essentially, it is a technique that lets the VM consume only the storage
space it actually needs, as it needs it, up to the provisioned size of the
disk. With respect to Cisco UC, in many cases (DAS TRC) it is not supported,
and where it is allowed (specs-based) you must make sure the space is
available at all times (no over-subscription). You can read more on the
subject at
http://docwiki.cisco.com/wiki/UC_Virtualization_Supported_Hardware
In professional practice I never use thin provisioning for Cisco UC servers.
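To make the "no over-subscription" rule concrete, here is a minimal sketch in Python. The function name and the capacity figures are hypothetical, chosen only for illustration; the idea is simply that the sum of the provisioned (maximum) sizes of all thin disks on a datastore should never exceed the datastore's capacity.

```python
def is_oversubscribed(datastore_capacity_gb, provisioned_disk_sizes_gb):
    """Return True if the thin disks could, at full growth, exceed the datastore.

    Both arguments are illustrative inputs: the datastore capacity in GB and
    the provisioned (maximum) size in GB of every virtual disk placed on it.
    """
    return sum(provisioned_disk_sizes_gb) > datastore_capacity_gb


# Hypothetical figures: a 1000 GB datastore.
print(is_oversubscribed(1000, [300, 300, 200]))  # 800 GB provisioned
print(is_oversubscribed(1000, [500, 400, 300]))  # 1200 GB provisioned
```

If the check comes back True, the thin disks can run the datastore out of space as they grow, which is the situation to avoid.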
Unity Connection Split Brain:
Assuming that you HAD a healthy, well-functioning Unity Connection cluster,
what typically causes the split brain issue is the primary and HA nodes
being rebooted or losing power too close to one another while transactions
are taking place at just the right (rather, wrong) time. Essentially, both
nodes end up with different database states and the clustering services
cannot accurately determine which database (node) should be primary.
You end up in a state where each node is fighting with the other node in an
epic "king of the hill" battle. Typically, taking the HA node offline and
rebooting the primary node (and signaling the IVR at least once) lets the
clustering services restore balance to the cluster; after that it is
usually safe to power the HA node back up.
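As a rough illustration of the condition described above, the sketch below classifies a two-node cluster from the role each node believes it holds. This is a toy model in Python, not how the clustering services are implemented; the function name and the role strings are assumptions made for the example.

```python
def cluster_state(role_a, role_b):
    """Classify a two-node cluster from each node's self-reported role.

    The role strings ("primary", "secondary") are illustrative labels only.
    """
    roles = [role_a.lower(), role_b.lower()]
    primaries = roles.count("primary")
    if primaries == 2:
        # Both nodes claim the hill: the split brain case.
        return "split brain"
    if primaries == 1 and "secondary" in roles:
        return "healthy"
    return "no primary"


print(cluster_state("Primary", "Secondary"))
print(cluster_state("Primary", "Primary"))
```

The second call models the state from this thread, where each server thinks it is the primary.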
Hope this helps,
-Ryan
_____
From: Thomas LeMay <thomaslemay at comcast.net>
Sent: Tuesday, October 27, 2015 2:33 PM
To: 'Jason Aarons (AM)'; 'Ryan Huff'; 'Aaron Banks';
cisco-voip at puck.nether.net
Subject: RE: [cisco-voip] Unity Connection 10.5.2 Split Brain Recovery
Hi, Aaron,
Can you elaborate on what is meant by thin provisioning? We experienced
split brain a few weeks ago, whereby each UC server thought it was the
primary. They became locked and I rebooted the servers. When they came
back up, they went into split brain recovery mode.
We applied the 20K OVA template.
Tom
From: cisco-voip [mailto:cisco-voip-bounces at puck.nether.net] On Behalf Of
Jason Aarons (AM)
Sent: Tuesday, October 27, 2015 2:17 PM
To: Ryan Huff; Aaron Banks; cisco-voip at puck.nether.net
Subject: Re: [cisco-voip] Unity Connection 10.5.2 Split Brain Recovery
The only time I've seen split brain in UC was when the customer built the VM
with thin provisioning. It seems their VM team did all sorts of things wrong.
From: cisco-voip [mailto:cisco-voip-bounces at puck.nether.net] On Behalf Of
Ryan Huff
Sent: Tuesday, October 27, 2015 1:26 PM
To: Aaron Banks <amichaelbanks at hotmail.com>; cisco-voip at puck.nether.net
Subject: Re: [cisco-voip] Unity Connection 10.5.2 Split Brain Recovery
1.) Shut down the HA node.
2.) Reboot the primary node.
3.) Once the primary node is up, place a call into voicemail.
4.) Power the HA node back on
5.) Once HA is up, verify HA status.
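The ordering of these steps matters, so here is a small Python sketch that models the sequence and refuses to run a step out of order. It does not talk to any server; the class and method names are hypothetical and exist only to show the dependencies between steps. For the final verification, Unity Connection's `show cuc cluster status` CLI command reports each server's state.

```python
from dataclasses import dataclass


@dataclass
class Recovery:
    """Toy model of the recovery order; it only tracks state in memory."""
    ha_powered: bool = True
    primary_rebooted: bool = False
    test_call_placed: bool = False

    def shut_down_ha(self):
        # Step 1: take the HA node out of the fight.
        self.ha_powered = False

    def reboot_primary(self):
        # Step 2: only valid while the HA node is off.
        if self.ha_powered:
            raise RuntimeError("shut down the HA node first")
        self.primary_rebooted = True

    def place_test_call(self):
        # Step 3: call into voicemail once the primary is back up.
        if not self.primary_rebooted:
            raise RuntimeError("reboot the primary node first")
        self.test_call_placed = True

    def power_on_ha(self):
        # Step 4: bring the HA node back only after the test call.
        if not self.test_call_placed:
            raise RuntimeError("place a call into voicemail first")
        self.ha_powered = True


r = Recovery()
r.shut_down_ha()
r.reboot_primary()
r.place_test_call()
r.power_on_ha()
print("HA node back on; now verify HA status (step 5).")
```

Calling `reboot_primary()` on a fresh `Recovery()` raises an error, which mirrors the point of step 1: the HA node has to be down before the primary is rebooted.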
-------- Original message --------
From: Aaron Banks
Date: 10/27/2015 12:35 PM (GMT-05:00)
To: cisco-voip at puck.nether.net
Subject: [cisco-voip] Unity Connection 10.5.2 Split Brain Recovery
Has anyone seen/resolved a split brain recovery in Unity Connection 10.5.2?
The primary and secondary keep swapping back and forth every few minutes. I
can ping and trace to each server. I restarted the primary, but that did not
resolve the issue. In the RTMT system logs, the secondary sends an NTP
query to the primary, and the response is that the primary is inaccessible
or down. I'm stumped.