[cisco-voip] Not supported I'm sure..... but what do you think?

Sat Oct 29 17:03:10 EDT 2016

Hey guys. At a company I was working at with a megacluster, we took backups
of the production cluster and restored them to a newly installed cluster on
a "dark" net. I NAT'd all the VMs so they could run, be upgraded
separately, and be rebooted and tested with a few phones on a private
network. Then I used some VMware PowerCLI scripts to cut over each
datacenter in a few seconds. ECATS was involved as well. There is a little
bit of a precedent, though I recognize that we tailored that specifically
for the customer's environment, and it might not work as well in some
others.
It should also be noted that the NEW (and subsequently upgraded, and
switched into production) VMs were built using the actual TAC-supported
install process, as opposed to being cloned. The fact that the megacluster
had 21 servers in it basically equated to "not enough time to roll back
before Monday" for a number of reasons, an issue this upgrade & cutover
method alleviated. I think it was an 8.6 to 9 upgrade. Because of some
block-level stuff, since VMware wasn't supported (IIRC) in the starting
version, after the upgrade of the 2nd set (referred to as "new" earlier) we
took another backup and restored it to *another* cluster, a set of newly
installed 9.x VMs in *another* "dark" net.

BE CAREFUL IF YOU DO THIS
I did 5 production clusters this way. Having two to three sets of
identical servers with identical IPs WILL get confusing, I don't care how
smart you are.
Use login banners saying which cluster each server is in. It's annoying,
but apply them to every node so you are forced to see, and can confirm,
whether you're logging in to ccmadmin on cluster 1, 2 or 3. Having the
cluster built before cutover lets you do things like fix an issue with a
sub, or rebuild a node as many times as you want when/if an install fails.
But when you've got enough servers, multiple sets of identical things, and
have to admin one cluster while building another, it's easy to get
confused and blast a sub on the production cluster by accident at 3 pm.
Applying 100 banners is less annoying than the consequences of you or your
customer screwing this up.
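
A minimal sketch of generating those banners so every node in every set
gets one, assuming a made-up list of cluster labels and hostnames (nothing
here talks to a real server; how the text gets applied to each node is left
out on purpose):

```python
from typing import Dict, List, Tuple


def banner_text(cluster_label: str, purpose: str, hostname: str) -> str:
    """Return a boxed, hard-to-miss banner naming the cluster for one node."""
    line = f"*** {cluster_label.upper()} ({purpose}) - {hostname} ***"
    border = "*" * len(line)
    return "\n".join([border, line, border])


def banners_for_all(sets: Dict[str, str],
                    hostnames: List[str]) -> Dict[Tuple[str, str], str]:
    """Build one banner per (cluster, hostname) pair.

    Every set shares the same hostnames, so the cluster label is the only
    thing that tells the copies apart.
    """
    return {
        (label, host): banner_text(label, purpose, host)
        for label, purpose in sets.items()
        for host in hostnames
    }


# Hypothetical labels and hostnames, for illustration only.
example = banners_for_all(
    {"cluster 1": "PRODUCTION", "cluster 2": "dark net build"},
    ["cucm-pub", "cucm-sub1"],
)
print(example[("cluster 2", "cucm-sub1")])
```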
When you need to restart/zap a UC VM that there are multiples of, require
two people to look at it. Have your team lead or customer say "yes, that's
the right one" before you click that button.
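
If the restart is scripted, the same two-person rule can be put in front of
the script. A rough sketch, assuming each person types the VM name back
rather than just answering yes (the Approval type and the names in it are
invented for this example):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Approval:
    person: str      # who is confirming
    typed_name: str  # the VM name they typed back, not a yes/no


def approved_by_two(target_vm: str, first: Approval, second: Approval) -> bool:
    """True only if two different people each typed the exact target VM name."""
    if first.person.strip().lower() == second.person.strip().lower():
        # The same person twice is not a second pair of eyes.
        return False
    return first.typed_name == target_vm and second.typed_name == target_vm


# Hypothetical usage: only call the restart code when this returns True.
ok = approved_by_two(
    "b2-sub3",
    Approval("peter", "b2-sub3"),
    Approval("team-lead", "b2-sub3"),
)
print(ok)
```

Typing the name back is a deliberate choice: a "yes" can be given without
reading, a retyped name cannot.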
Be very careful with naming things so you can script things safely in a
low-risk fashion (e.g., so you can't reboot the wrong set of VMs while
working on build #2).
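
One way naming can make scripting safer, sketched under an assumed
convention of <build tag>-<role><number> (for example "b2-sub3"). The
convention, the pattern and the function are hypothetical, not what we
actually used:

```python
import re
from typing import Iterable, List

# Hypothetical convention: build tag, dash, role, optional number.
NAME_PATTERN = re.compile(r"(?P<build>b[0-9]+)-(?P<role>pub|sub|tftp)(?P<num>[0-9]*)")


def select_targets(vm_names: Iterable[str], build: str) -> List[str]:
    """Return only the VMs tagged with `build`.

    Refuses to continue if any name does not follow the convention.
    """
    selected = []
    for name in vm_names:
        match = NAME_PATTERN.fullmatch(name)
        if match is None:
            # An unrecognised name could belong to any set, so stop
            # rather than guess.
            raise ValueError(f"VM name does not follow the convention: {name!r}")
        if match.group("build") == build:
            selected.append(name)
    return selected


# Hypothetical inventory containing two sets of otherwise identical servers.
inventory = ["b1-pub", "b1-sub1", "b2-pub", "b2-sub1", "b2-tftp1"]
print(select_targets(inventory, "b2"))
```

The list this returns is the only thing a reboot or cutover script should
be allowed to touch; failing on any odd name is meant to catch a stray copy
before it gets picked up by accident.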

-Peter

On Sat, Oct 29, 2016 at 1:28 PM, Ryan Huff <ryanhuff at outlook.com> wrote:

> As it always has, IMO, it comes down to the nature and context of your
> client. For a true 8x5 SMB, large after-hours maintenance windows
> generally aren't a problem as long as some sort of treatment is applied to
> the main ingress.
>
>
> However, the PS engagements I generally work with are medical/healthcare,
> specialized government and 24/7 private industries. If I told any of my
> clients:
>
>
> "Starting at 9 PM, what we're gonna do is reboot the phone system and watch
> paint dry. In the case of an RU, we're gonna do that twice or thrice. Your
> phones will do a registration dance for anywhere from 30 to 240 minutes,"
> I would get escorted out of their building.
>
>
> So, as was mentioned earlier, we've all developed our 'ways' of shrinking
> the actual 'cut over window' to a matter of minutes. Those developments are
> rooted in the needs of our individual clients and spurred by the
> capabilities (some say limitations) of the software's upgrade functionality.
>
>
> Ideally, the inactive partition would have an externally addressable
> method by which a unique/independent software version could be installed
> and run in tandem, and the active partition would have a 'publish' feature
> that, when activated, would publish the active Informix schema to the
> inactive Informix schema (using a Babel-fish-style translator to
> compensate for the differences in schema versions). Then your cut over is
> a matter of 'publish' and swap the partitions.
>
>
> All of that is possible, but would require a serious re-tool at the RedHat
> level.
>
>
>
> = Ryan =
>
>
>
>
> ------------------------------
> *From:* cisco-voip <cisco-voip-bounces at puck.nether.net> on behalf of
> Erick Bergquist <erickbee at gmail.com>
> *Sent:* Saturday, October 29, 2016 3:31 PM
> *To:* Lelio Fulgenzi
>
> *Cc:* cisco voip
> *Subject:* Re: [cisco-voip] Not supported I'm sure..... but what do you
> think?
>
> Just chiming in on this.
>
> I've done plenty of SU updates using the inactive partition method
> without any issues for the most part. I've also done RU updates, which
> are fine because updates are planned and done during mtce windows. Most
> of the work I've done is in place.
>
> Sometimes the mtce window has to be extended due to a longer than
> expected time to install the update. This is what is nice about the SU
> update: you can install it ahead of time. If I have a short mtce
> window, I push these a day ahead sometimes just to make sure the SU is
> done installing on all the servers before the reboot / mtce window.
> However, a window for an RU update that needs reboots during the
> upgrade is sometimes harder to get, especially if it is a 24x7 shop
> with minimum downtime.
>
> I really haven't had any real issues with either method (SU/RU) except
> one time while switching versions, when the servers were on mismatched
> versions: call processing on one subscriber got hung up, and calls to
> phones registered on that node were busy. My fix was to stop the call
> manager service and CTI manager service on that node immediately,
> which solved the problem until that node was switched to the new
> version.
>
> The other time I've had issues was with the 10.5.2 SU3 flat version
> and the switch version bug on Unity, but due to the way Unity works the
> other server was up and running so no real impact to calls.
>
> However, clients are wanting these patches done more and more with no
> impact to calls or very little downtime for 24x7 organizations with
> call centers that operate 24x7. In those upgrade scenarios you need
> this SU/RU update method and patch process to work flawlessly with
> as little interruption as possible and hopefully no glitches.
>
>
> On Thu, Oct 27, 2016 at 3:16 PM, Lelio Fulgenzi <lelio at uoguelph.ca> wrote:
> >
> > This is exactly the reason I did my upgrades off line and swapped out the
> > hardware during the maintenance window.
> >
> >
> > That being said, it still took some time to do all the swapping. Just
> > looking at my notes from the last upgrade and it was about 2-3 hours.
> > Much of that was due to the amount of time the servers take to shut
> > down and restart.
> >
> >
> > ---
> > Lelio Fulgenzi, B.A.
> > Senior Analyst, Network Infrastructure
> > Computing and Communications Services (CCS)
> > University of Guelph
> >
> > 519-824-4120 Ext 56354
> > lelio at uoguelph.ca
> > www.uoguelph.ca/ccs
> > Room 037, Animal Science and Nutrition Building
> > Guelph, Ontario, N1G 2W1
> >
> >
> >
> > ________________________________
> > From: cisco-voip <cisco-voip-bounces at puck.nether.net> on behalf of Scott
> > Voll <svoll.voip at gmail.com>
> > Sent: Thursday, October 27, 2016 4:50 PM
> > To: Stephen Welsh; Ryan Ratliff
> > Cc: cisco voip
> >
> > Subject: Re: [cisco-voip] Not supported I'm sure..... but what do you
> > think?
> >
> > I'm going to start with, we don't have a complex deployment.
> >
> > 2 CM
> > 1 UC
> > 1 UCCX
> > 1 CER
> > 1 call recording server
> > ~2000 phones over ~8 sites
> >
> > Our last upgrade we tried PCD (joke) and spent 4 hours on it before just
> > doing it manually.  Will be very hard pressed to ever use PCD again.
> >
> > Then it was an additional 12-16 hours to upgrade.  This was just an 8 to
> > 10 upgrade.
> >
> > We don't have that kind of time, and personally, I like my personal
> > time a lot.  So the more I can do during the week leading up to the
> > switch, and the smaller I can make the switch, the better. That is
> > what I'm looking for.
> >
> > Scott
> >
> >
> > On Thu, Oct 27, 2016 at 1:35 PM, Stephen Welsh
> > <stephen.welsh at unifiedfx.com> wrote:
> >>
> >> I’ve not done CUCM project work in quite a while, so may be completely
> >> off, but what about making this scenario supportable:
> >>
> >> Complex cluster say, 1 Pub, 6 Sub, 2 TFTP
> >>
> >> Install new software to inactive partition on all nodes, once complete
> >> reboot part of the cluster:
> >>
> >> 1 Pub - new version
> >> 3 Sub - new version (primary subs)
> >> 1 TFTP - new version (primary TFTP)
> >> 3 Sub - old version (secondary subs)
> >> 1 TFTP - old version (secondary TFTP)
> >>
> >> Phones register to the upgraded primary subs; once everything is
> >> working/stable/tested, flip the remaining (secondary) nodes.
> >>
> >> Maybe too complex for this split version to be workable, or not really
> >> much different than flipping all nodes, but may allow the phones to stay
> >> online with minimal disruption as long as all external elements strictly
> >> follow the primary/secondary node configuration.
> >>
> >> Thanks
> >>
> >> Stephen Welsh
> >> CTO
> >> UnifiedFX
> >>
> >>
> >> On 27 Oct 2016, at 21:23, Ryan Huff <ryanhuff at outlook.com> wrote:
> >>
> >>
> >> You are right, Anthony: this is a complex solution to avoid the reboot
> >> (and rolling the dice that nothing breaks in the first boot of the new
> >> version) in a switch-version. However, if that is your goal ... as you
> >> state.
> >>
> >> -R
> >>
> >> ________________________________
> >> From: avholloway at gmail.com <avholloway at gmail.com> on behalf of Anthony
> >> Holloway <avholloway+cisco-voip at gmail.com>
> >> Sent: Thursday, October 27, 2016 12:02 PM
> >> To: Ryan Huff
> >> Cc: Matthew Loraditch; Tommy Schlotterer; Scott Voll;
> >> cisco-voip at puck.nether.net
> >> Subject: Re: [cisco-voip] Not supported I'm sure..... but what do you
> >> think?
> >>
> >> If only there was an upgrade process wherein you install the new version
> >> to an inactive partition, and then could switch to the new version when
> >> you're ready.  /sarcasm
> >>
> >> But seriously though, everyone in this thread is essentially coming up
> >> with their own clever way of replicating the promise Cisco failed to
> >> deliver on, which is performing your upgrades during production on the
> >> inactive partition and then switching versions in a maintenance window.
> >> If they would have only held themselves to a higher standard, we
> >> wouldn't need this complex of an alternate solution.
> >>
> >> On Tue, Oct 25, 2016 at 2:45 PM, Ryan Huff <ryanhuff at outlook.com>
> >> wrote:
> >>>
> >>> Matthew is correct, copying is listed as "Supported with Caveats" at:
> >>> http://docwiki.cisco.com/wiki/Unified_Communications_VMware_Requirements
> >>> The caveat being found at
> >>> http://docwiki.cisco.com/wiki/Unified_Communications_VMware_Requirements#Copy_Virtual_Machine
> >>>
> >>> The VM needs to be powered down first and the resulting VM will have a
> >>> different MAC address (unless it was originally manually specified); so
> >>> you'll need to rehost the PLM if it is co-resident with any VM that you
> >>> copy.
> >>>
> >>> Where I have seen folks get into trouble with this is where a
> >>> subscriber is copied, and the user mistakenly thinks that by changing
> >>> the IP and hostname it becomes unique and can be added to the cluster as
> >>> a new subscriber. I have also seen users make a copy of a publisher and
> >>> change the network details of the copy, thinking it makes a unique
> >>> cluster, and then wonder why things like ILS won't work between the two
> >>> clusters (and it isn't just because the cluster IDs are the same).
> >>>
> >>> Having said all of that, I would NEVER do this in production ... maybe
> >>> that is just me being cautious or old school, but that is just me. Even
> >>> without changing network details on the copy, I have seen this cause
> >>> issues with Affinity. At the very least, if you travel this path I would
> >>> make sure that the copy runs on the same host and even in the same
> >>> datastore.
> >>>
> >>> === An alternative path ===
> >>>
> >>> Admittedly, this path is longer and there is a little more work
> >>> involved, but it is the safer path, IMO, and is what I would trust for
> >>> a production scenario.
> >>>
> >>> 1.) Create a private port group on the host. If the cluster is on
> >>> multiple hosts, span the port group through a connecting network to the
> >>> other hosts but DO NOT create an SVI anywhere in the topology for that
> >>> DOT1Q tag (remembering to add the DOT1Q tag on any networking devices
> >>> between the two hosts and to allow it on any trunks between the two
> >>> hosts).
> >>>
> >>> 2.) Upload Cisco's CSR1000V to the host. If you're not familiar with
> >>> the product, it is, at its core and unlicensed, a virtual router with
> >>> three interfaces by default. Out of the box, it is more than enough to
> >>> replicate DNS/NTP on your private network, which is all you'll need.
> >>> Assign the private port group to the network adapters and configure DNS
> >>> and NTP (master 2) on this virtual router.
> >>>
> >>> 3.) Build out a replica of your production UC cluster on the private
> >>> network.
> >>>
> >>> 4.) Take a DRS of the production UC apps and then put your SFTP server
> >>> on the private network and do a DRS restore to the private UC apps.
> >>>
> >>> 5.) Upgrade the private UC apps and switch your port group labels on
> >>> the production/private UC apps during a maintenance window.
> >>>
> >>> Thanks,
> >>>
> >>> Ryan
> >>>
> >>>
> >>>
> >>> ________________________________
> >>> From: cisco-voip <cisco-voip-bounces at puck.nether.net> on behalf of
> >>> Matthew Loraditch <MLoraditch at heliontechnologies.com>
> >>> Sent: Tuesday, October 25, 2016 3:01 PM
> >>> To: Tommy Schlotterer; Scott Voll; cisco-voip at puck.nether.net
> >>>
> >>> Subject: Re: [cisco-voip] Not supported I'm sure..... but what do you
> >>> think?
> >>>
> >>> I can’t see any reason it wouldn’t be supported honestly. Offline
> >>> Cloning is allowed for migration/backup purposes. I actually did the NAT
> >>> thing to do my BE5k to 6K conversions. Kept both systems online.
> >>>
> >>>
> >>>
> >>> The only thing I can think of to watch for is ITLs: does an upgrade do
> >>> anything that would force you to reset phones to go back to the old
> >>> servers if there are issues? I don’t think so, but not certain.
> >>>
> >>>
> >>>
> >>> Matthew G. Loraditch – CCNP-Voice, CCNA-R&S, CCDA
> >>> Network Engineer
> >>> Direct Voice: 443.541.1518
> >>>
> >>>
> >>>
> >>>
> >>> From: cisco-voip [mailto:cisco-voip-bounces at puck.nether.net] On Behalf Of
> >>> Tommy Schlotterer
> >>> Sent: Tuesday, October 25, 2016 2:49 PM
> >>> To: Scott Voll <svoll.voip at gmail.com>; cisco-voip at puck.nether.net
> >>> Subject: Re: [cisco-voip] Not supported I'm sure..... but what do you
> >>> think?
> >>>
> >>>
> >>>
> >>> I do a similar, but supported process. I take DRS backups and then
> >>> restore on servers in a sandbox VLAN. Works well. Make sure you check
> >>> your phone firmware and upgrade to the current version before the
> >>> cutover or all your phones will have to upgrade on cutover.
> >>>
> >>>
> >>>
> >>> Also make sure you don’t change hostname/IP addresses in the sandbox as
> >>> that will cause your ITL to regenerate and cause issues with phone
> >>> configuration changes after cutover.
> >>>
> >>>
> >>>
> >>> Thanks
> >>>
> >>> Tommy
> >>>
> >>>
> >>>
> >>> Tommy Schlotterer | Systems Engineer
> >>> Presidio | www.presidio.com
> >>> 20 N. Saint Clair, 3rd Floor, Toledo, OH 43604
> >>> D: 419.214.1415 | C: 419.706.0259 | tschlotterer at presidio.com
> >>>
> >>>
> >>>
> >>> From: cisco-voip [mailto:cisco-voip-bounces at puck.nether.net] On Behalf Of
> >>> Scott Voll
> >>> Sent: Tuesday, October 25, 2016 2:43 PM
> >>> To: cisco-voip at puck.nether.net
> >>> Subject: [cisco-voip] Not supported I'm sure..... but what do you
> >>> think?
> >>>
> >>>
> >>>
> >>> So my co-worker and I are thinking about upgrades.  we are currently on
> >>> 10.5 train and thinking about the 11.5 train.
> >>>
> >>>
> >>>
> >>> What would be your thoughts about taking a clone of every VM (CM, UC,
> >>> UCCX, CER, PLM),
> >>>
> >>>
> >>>
> >>> placing it on another VLAN with the same IPs, and NATing it as it goes
> >>> onto your network so it has access to NTP, DNS, AD, etc.?
> >>>
> >>>
> >>>
> >>> do your upgrade on the clones.
> >>>
> >>>
> >>>
> >>> Then in VMware shut down the originals, and change the VLAN (on the
> >>> clones) back to the production VLAN for your voice cluster.
> >>>
> >>>
> >>>
> >>> it would be like a telco slash cut.  10 minute outage as you move from
> >>> one version to the other.
> >>>
> >>>
> >>>
> >>> Thoughts?
> >>>
> >>>
> >>>
> >>> Scott
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> _______________________________________________
> >>> cisco-voip mailing list
> >>> cisco-voip at puck.nether.net
> >>> https://puck.nether.net/mailman/listinfo/cisco-voip
> >>>