[cisco-voip] [External] Re: MRA failover doesn't work, Cisco TAC agrees, says it's a documentation defect

Kent Roberts dvxkid at gmail.com
Tue Jun 21 21:56:18 EDT 2022


14 should be ESXi 7 compatible. And it’s a Tandberg product adapted by Cisco.
So there’s that.

Kent

> On Jun 21, 2022, at 18:44, Lelio Fulgenzi <lelio at uoguelph.ca> wrote:
> 
> 
> Expressway is (one of) the only Collab products that doesn’t follow the “what we list is the minimum version of ESXi, maintenance releases and updates are good to go” rule.  
> 
> Which is fine, but then they also don’t make an effort to test subsequent ESXi updates either. 
> 
> They currently only support ESXi 6.5U2 which means I had to stick to the patch just before U3 in order for the version information to show U2.  
> 
> And this is where the conversation skewed to “you should have Expressways on their own ESXi boxes.”
> 
> Ugh.  
> 
> Sent from my iPhone
> 
>> On Jun 21, 2022, at 7:51 PM, Kent Roberts <dvxkid at gmail.com> wrote:
>> 
>> CAUTION: This email originated from outside of the University of Guelph. Do not click links or open attachments unless you recognize the sender and know the content is safe. If in doubt, forward suspicious emails to IThelp at uoguelph.ca
>> 
>> 
>> Expressway is a very temperamental child. You sneeze and it will act up. We are in design reviews again with a Cisco Expressway expert, and that’s all he does. They want things set their way and they are watching the entire build.
>> 
>> The redundancy has its own set of problems. Ultimately we are now deploying 6 boxes in data center 1 and 6 boxes in data center 2 as entirely different clusters, as traffic across the WAN was causing its own set of replication problems with Expressway. Bottom line is Cisco got the product and it’s been added onto so many times that it has its own set of issues.
>> We have multiple 10 gig links between the centers, and multiple 10 gig links to the internet. Low voice traffic on Expressway, but we have faced lots of fun over the last 2 years.
>> 
>>>> On Jun 21, 2022, at 14:55, Matthew Huff <mhuff at ox.com> wrote:
>>> 
>>> Yes, they want 6 boxes for redundancy. Two for expressway-e, two for CUCM, two for expressway-c. /Boggle
>>> 
>>> Even then, that doesn't provide 100% redundancy. We want to place our CUCM and Expressways at different datacenters connected by a 10 Gb WAN. If we were to lose the WAN, we would still fail with MRA since we would lose both ESXi hosts.
>>> 
>>> 
>>> -----Original Message-----
>>> From: Lelio Fulgenzi <lelio at uoguelph.ca> 
>>> Sent: Tuesday, June 21, 2022 4:30 PM
>>> To: Matthew Huff <mhuff at ox.com>; Adam Pawlowski <ajp26 at buffalo.edu>; cisco-voip voyp list <cisco-voip at puck.nether.net>
>>> Subject: RE: [cisco-voip] [External] Re: MRA failover doesn't work, Cisco TAC agrees, says it's a documentation defect
>>> 
>>> I was miffed when they said no real MRA redundancy until you upgrade to v14. But now, hearing this, man, what a disappointment. 
>>> 
>>> I had a similar discussion with the Expressway folks and ESXi compatibility/testing and they were like, yeah, you should probably have separate UCS boxes for Expressways different than your CUCMs. 
>>> 
>>> And I was all, "wait, what?" 
>>> 
>>> They want us to run a completely separate ESXi box with only an E or a C on it to get full MRA redundancy? 
>>> 
>>> What a let down. ☹
>>> 
>>> -----Original Message-----
>>> From: cisco-voip <cisco-voip-bounces at puck.nether.net> On Behalf Of Matthew Huff
>>> Sent: Tuesday, June 21, 2022 1:37 PM
>>> To: Adam Pawlowski <ajp26 at buffalo.edu>; cisco-voip voyp list <cisco-voip at puck.nether.net>
>>> Subject: Re: [cisco-voip] [External] Re: MRA failover doesn't work, Cisco TAC agrees, says it's a documentation defect
>>> 
>>> 
>>> Yes, that sounds almost exactly like what we are experiencing. I think it's a design defect in the MRA architecture, with the end device not downloading/retrying with the full environment.
>>> 
>>> It's the same issue we have with partial registrations. We have a number of shared SIP lines (think SALES line) that can silently fail on the phone. It will try for a few minutes, but give up after that. The user doesn't know that the shared SIP line is disconnected, they just don't get calls on it. We had to add complex SNMP monitoring so that we can be alerted when this happens and remotely reset the phones. Cisco TAC is aware of this issue and also told us it's "working as intended". We had a sales trader lose about $10k of commission because he missed a call, and he was not a happy camper.
>>> 
>>> 
>>> 
>>> -----Original Message-----
>>> From: cisco-voip <cisco-voip-bounces at puck.nether.net> On Behalf Of Adam Pawlowski
>>> Sent: Tuesday, June 21, 2022 1:04 PM
>>> To: cisco-voip voyp list <cisco-voip at puck.nether.net>
>>> Subject: Re: [cisco-voip] [External] Re: MRA failover doesn't work, Cisco TAC agrees, says it's a documentation defect
>>> 
>>> I'm a bit miffed on the need for the extra Expressway-C. We have very few MRA phones, but haven't had this type of problem, where an Expressway is somehow busted and not accepting registrations. Did they offer any explanation as to why that piece is needed?
>>> 
>>> The only thing I'd go to look up is how the CM list is being populated, and whether changing the CM group to bump the shut-down subscriber down the list (assuming reg order is sub -> pub) makes a difference, just since that'd come up before. Expressway doesn't seem to be configured to be aware that a UCM has gone away, despite the zone going down, at least for UDS and discovery. I'm sure TAC looked at that though. I have a conversation going with them about this and Jabber SSO for a similar reason: the device's configuration isn't dynamic to represent the state of the infrastructure, and sometimes they get stuck trying something that won't work and fail despite other components being available to serve them. That probably doesn't help with anything other than to say we're in a similar boat, just with Jabber and MRA.
>>> 
>>> Adam Pawlowski
>>> Network Engineer | Network and Communication Services University at Buffalo Information Technology (UBIT)
>>> 243 Computing Center, Buffalo, NY 14260 
>>> 
>>> 
>>>> -----Original Message-----
>>>> From: cisco-voip <cisco-voip-bounces at puck.nether.net> On Behalf Of 
>>>> Matthew Huff
>>>> Sent: Tuesday, June 21, 2022 12:54 PM
>>>> To: Hunter Fuller <hf0002 at uah.edu>
>>>> Cc: cisco-voip voyp list <cisco-voip at puck.nether.net>
>>>> Subject: Re: [cisco-voip] [External] Re: MRA failover doesn't work, 
>>>> Cisco TAC agrees, says it's a documentation defect
>>>> 
>>>> We have no interest in setting up a Jabber environment in order to 
>>>> debug Cisco's issue.
>>>> 
>>>> Yes, every Expressway-E knows about all Expressway-Cs, and every Expressway-C 
>>>> knows about CUCM. Cisco TAC has verified the configuration, logs, and 
>>>> diagnostics. I've been working with them for 2 months and it's been 
>>>> escalated to backline engineering. They looked at the Cisco phone PRT 
>>>> logs and confirmed that it's a known limitation, and there is no solution.
>>>> 
>>>> Maybe it's an issue with later versions of CUCM and/or expressway? We 
>>>> are running the latest including latest phone firmware.
>>>> 
>>>> Failover works great except in one scenario, where the CUCM 
>>>> subscriber and the Expressway-C that reside on the same machine are both shut down.
>>>> Bring either one up, and the phone registers.
>>>> 
>>>> 
>>>> -----Original Message-----
>>>> From: Hunter Fuller <hf0002 at uah.edu>
>>>> Sent: Tuesday, June 21, 2022 12:41 PM
>>>> To: Matthew Huff <mhuff at ox.com>
>>>> Cc: Kent Roberts <dvxkid at gmail.com>; cisco-voip voyp list <cisco- 
>>>> voip at puck.nether.net>
>>>> Subject: Re: [External] Re: [cisco-voip] MRA failover doesn't work, 
>>>> Cisco TAC agrees, says it's a documentation defect
>>>> 
>>>> It might be worth setting up a Jabber test endpoint just to see.
>>>> 
>>>> Some questions though:
>>>> - Does every Expressway-E know about every Expressway-C?
>>>> - Does every Expressway-C know about every CUCM?
>>>> 
>>>> I'm trying to figure out what the desired architecture is, and/or how 
>>>> this problem would happen.
>>>> In our environment, the above are both true. So the loss of any number 
>>>> of anything, should not result in failover issues - and that is the 
>>>> behavior we have seen (we have shut down entire sites due to 
>>>> maintenance, power failure, etc. and failover worked).
>>>> In fact, we have found MRA phones to be great at failover in this way 
>>>> (our MRA phones are all 8851s). Jabber has been the problem child.
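
The two questions above amount to a full-mesh check between tiers. A minimal sketch of that check follows, assuming a hand-maintained inventory; the node names and the `E_KNOWS_C` / `C_KNOWS_CUCM` mappings are hypothetical and are not pulled from any Expressway or CUCM API.

```python
def missing_links(sources, targets, known):
    """Return (source, target) pairs where the source has no configured link to the target."""
    return sorted(
        (s, t) for s in sources for t in targets if t not in known.get(s, set())
    )


# Hypothetical inventory, for illustration only.
EXP_E = ["Expressway-E-1", "Expressway-E-2"]
EXP_C = ["Expressway-C-1", "Expressway-C-2"]
CUCM = ["CUCM Publisher", "CUCM Subscriber"]

# Which peers each node is configured to know about (assumed data).
E_KNOWS_C = {e: set(EXP_C) for e in EXP_E}
C_KNOWS_CUCM = {c: set(CUCM) for c in EXP_C}

if __name__ == "__main__":
    gaps = missing_links(EXP_E, EXP_C, E_KNOWS_C) + missing_links(
        EXP_C, CUCM, C_KNOWS_CUCM
    )
    if gaps:
        for src, dst in gaps:
            print(f"{src} does not know about {dst}")
    else:
        print("every E knows every C, and every C knows every CUCM")
```

An empty result only says the configuration is a full mesh on paper; it says nothing about how an endpoint behaves when nodes actually go down.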
>>>> 
>>>> --
>>>> Hunter Fuller (they)
>>>> Router Jockey
>>>> VBH M-1C
>>>> +1 256 824 5331
>>>> 
>>>> Office of Information Technology
>>>> The University of Alabama in Huntsville Network Engineering
>>>> 
>>>>>> On Tue, Jun 21, 2022 at 9:13 AM Matthew Huff <mhuff at ox.com> wrote:
>>>>>> 
>>>>>> We don’t use Jabber or Webex.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> The Cisco TAC case has been escalated and they have been working on this for over 2 months. I have sent repeated Expressway and PRT logs from the phone. After working with Cisco engineering, they claim it is “working as intended” and plan on updating the documentation to reflect the limitation that if you lose both the subscriber and redundant Expressway-C server, failover won’t happen.
>>>>> 
>>>>> 
>>>>> 
>>>>> I’d love to be proven wrong since we may have to completely replace our solution.
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> From: Kent Roberts <dvxkid at gmail.com>
>>>>> Sent: Tuesday, June 21, 2022 10:09 AM
>>>>> To: Matthew Huff <mhuff at ox.com>
>>>>> Cc: cisco-voip voyp list <cisco-voip at puck.nether.net>
>>>>> Subject: Re: [cisco-voip] MRA failover doesn't work, Cisco TAC 
>>>>> agrees, says it's a documentation defect
>>>>> 
>>>>> 
>>>>> 
>>>>> This sounds more like a config issue…
>>>>> 
>>>>> 
>>>>> 
>>>>> I have run into issues where Expressways go stupid when boxes go offline.
>>>>> 
>>>>> As for it being the 88xx phones: does the same happen with Jabber or Webex? If it does, I’d requeue the case….
>>>>> 
>>>>> 
>>>>> 
>>>>> Kent
>>>>> 
>>>>> 
>>>>> 
>>>>>> On Jun 21, 2022, at 07:47, Matthew Huff <mhuff at ox.com> wrote:
>>>>> 
>>>>> 
>>>>> 
>>>>> We have a fairly common and standard deployment for our MRA solution.
>>>>> All are running CUCM 14+, latest Expressway, etc…
>>>>> 
>>>>> 
>>>>> 
>>>>> VMware Server 1 (in DMZ)
>>>>>     Expressway-E-1
>>>>> 
>>>>> VMware Server 2 (in DMZ)
>>>>>     Expressway-E-2
>>>>> 
>>>>> VMware Server 3 (in Core)
>>>>>     CUCM Publisher
>>>>>     Expressway-C-1
>>>>> 
>>>>> VMware Server 4 (in Core)
>>>>>     CUCM Subscriber
>>>>>     Expressway-C-2
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> If either Expressway-E VM fails, redundancy works fine.
>>>>> If either CUCM fails, redundancy works fine.
>>>>> If either Expressway-C VM fails, redundancy works fine.
>>>>> If VMware Server 4 fails (say during patching, hardware maintenance or hardware failure), redundancy fails. Remote phones un-register and never register no matter what is done. If either CUCM Subscriber or Expressway-C-2 is brought back online, phones register.
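
The layout and failure cases above can be checked mechanically. The sketch below is a hypothetical model, not a Cisco tool: the host keys and the `shared_failure_domains` helper are made up for illustration. It flags any host whose loss takes down a CUCM node and an Expressway-C at the same time. The thread only reports the failure for VMware Server 4; whether losing Server 3 behaves the same way is not stated, so the output marks shared failure domains rather than confirmed outages.

```python
# Hypothetical model of the four-host layout described in the message.
HOSTS = {
    "vmware-1": {"Expressway-E-1"},
    "vmware-2": {"Expressway-E-2"},
    "vmware-3": {"CUCM Publisher", "Expressway-C-1"},
    "vmware-4": {"CUCM Subscriber", "Expressway-C-2"},
}


def shared_failure_domains(hosts):
    """Return hosts whose loss removes a CUCM node and an Expressway-C together."""
    risky = []
    for host, vms in hosts.items():
        has_cucm = any(vm.startswith("CUCM") for vm in vms)
        has_exp_c = any(vm.startswith("Expressway-C") for vm in vms)
        if has_cucm and has_exp_c:
            risky.append(host)
    return sorted(risky)


if __name__ == "__main__":
    for host in shared_failure_domains(HOSTS):
        print(f"{host}: a CUCM node and an Expressway-C share this host")
```

Splitting CUCM and Expressway-C onto separate hosts, as TAC suggested, would make the function return an empty list for this model.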
>>>>> 
>>>>> 
>>>>> 
>>>>> Cisco TAC claims that this is a limitation of our Cisco 88xx SIP MRA phones and is not solvable unless we purchase two new VMware servers and split the CUCM and Expressway-C into separate servers so they both won’t go down at once. Since VMware Servers 3 & 4 are at different locations, vMotion isn’t an option since there is no shared storage.
>>>>> 
>>>>> 
>>>>> 
>>>>> Anyone run into this or have any suggestions? We have engaged our VAR and Cisco rep and may have to replace our phone system since we are all working from home and MRA support including redundancy is critical to us.
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> _______________________________________________
>>>>> cisco-voip mailing list
>>>>> cisco-voip at puck.nether.net
>>>>> https://puck.nether.net/mailman/listinfo/cisco-voip
>>>>> 

