[VoiceOps] Vitelity Inbound SMS Outage March 27, 2022 0150 to 1720 UTC

Peter Beckman beckman at angryox.com
Sun Mar 27 19:08:01 EDT 2022


Vitelity hasn't posted anything about this, so I will for those that care.

Short Version: Inbound SMS to all Vitelity DIDs failed between 0150 UTC and
1720 UTC on March 27, 2022.

Long Version:

At 0150 UTC our Inbound SMS Monitoring detected that one of our DIDs did
not receive a sent SMS.

This continued until 0230 UTC, when our system automatically cut a ticket
(via email) to Vitelity reporting 30+ minutes of zero inbound SMS messages
to a random sample of our DIDs.
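
For anyone who wants to build a similar check, here is a minimal sketch of
the alert rule described above. It is not our actual monitoring code: the
function name, the probe tuple format, and the way probes are collected are
all assumptions made for illustration. Only the 30-minute threshold comes
from this post.

```python
from datetime import datetime, timedelta, timezone

# Threshold taken from the post: alert after 30+ minutes of failed inbound SMS.
ALERT_AFTER = timedelta(minutes=30)


def should_open_ticket(probes, now, alert_after=ALERT_AFTER):
    """Decide whether to open a ticket (hypothetical helper).

    probes: list of (sent_at, delivered) tuples, one per test SMS sent to a
            sampled DID; delivered is True if the message arrived.
    now:    current time as a timezone-aware datetime.
    """
    successes = [sent for sent, delivered in probes if delivered]
    failures = [sent for sent, delivered in probes if not delivered]
    last_ok = max(successes) if successes else None

    # Only count failures that happened after the most recent success.
    streak = [t for t in failures if last_ok is None or t > last_ok]
    if not streak:
        return False

    # Alert once the unbroken failure streak is at least alert_after old.
    return now - min(streak) >= alert_after


def utc(hour, minute):
    return datetime(2022, 3, 27, hour, minute, tzinfo=timezone.utc)


# Example shaped like this incident: last good probe at 0140, then failures.
probes = [
    (utc(1, 40), True),
    (utc(1, 50), False),
    (utc(2, 0), False),
    (utc(2, 10), False),
    (utc(2, 20), False),
]
print(should_open_ticket(probes, now=utc(2, 10)))  # streak is 20 minutes old
print(should_open_ticket(probes, now=utc(2, 30)))  # streak is 40 minutes old
```

Measuring from the first failure after the last success means a single
delivered message resets the clock, so one slow or lost SMS does not open a
ticket by itself.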

At 0305 UTC I opened an emergency ticket about the issue.

At 0341 UTC the ticket was responded to.

At 0503 UTC I was able to talk to someone at Vitelity who confirmed that
they had also sent SMS messages to their inbound DIDs, that the messages
were not received, and that they were escalating. During that conversation,
I also provided DLRs for all the sent messages, showing that they were
received by Onvoy, Iristel, Neutral Tandem, etc.

Vitelity confirmed that they were escalating and "sounding the alarm."

At 0810 UTC Vitelity updated the ticket and said they were still
escalating. No change, 100% of SMS messages sent to Vitelity DIDs failed to
reach Vitelity or us.

At 1634 UTC I re-re-re-escalated the issue to the Inteliquent NOC and my
Vitelity contact. I was informed that a tech had found that "a certificate
had expired" and was causing all inbound SMS delivery to fail.

Incidentally, this ALSO prevented emails to Vitelity Support from being
delivered to the portal, so those emails did not open tickets.

At 1720 UTC a test finally succeeded and we received our first SMS in over
15h 30m. Several tests after that across another sample of Vitelity DIDs
were successful.
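
As a quick check of the outage length, here is a small sketch that computes
the duration from the two timestamps given in this post. The variable names
are mine and purely illustrative.

```python
from datetime import datetime, timezone

# First failed inbound SMS and first successful test, both from this post.
start = datetime(2022, 3, 27, 1, 50, tzinfo=timezone.utc)
end = datetime(2022, 3, 27, 17, 20, tzinfo=timezone.utc)

outage = end - start
hours, remainder = divmod(int(outage.total_seconds()), 3600)
minutes = remainder // 60
print(f"{hours}h {minutes}m")  # prints "15h 30m"
```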

At no point was status.vitelity.com actually updated with any sort of
incident, and despite the outage being acknowledged to me, Vitelity has
still not posted any sign of it.


Why I'm Posting

This is the 48th SMS outage since August 2015 that Vitelity has
experienced, and in most cases, I have been the first person to alert them
to a problem. They do not monitor inbound SMS, they just wait for a
customer complaint, and even then, I've been told "nobody else has opened a
ticket," as if multiple customers need to complain before they take an
outage seriously.

In this case, it took 2 hours from opening an emergency ticket for Vitelity
to actually test inbound SMS and confirm an issue, followed by another 12
hours to actually find the root cause and fix it.

Beckman
---------------------------------------------------------------------------
Peter Beckman                                                  Internet Guy
beckman at angryox.com                                https://www.angryox.com/
---------------------------------------------------------------------------

