[Outages-discussion] CenturyLink Outages this morning

Keith Medcalf kmedcalf at dessus.com
Thu Dec 27 22:58:27 EST 2018


One wonders why they do not simply "undo" whatever change they made between when things were "working" and when they "broke" ...


---
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.


>-----Original Message-----
>From: Outages-discussion [mailto:outages-discussion-
>bounces at outages.org] On Behalf Of Erik Wooding
>Sent: Thursday, 27 December, 2018 20:06
>To: Biddle, Josh
>Cc: outages-discussion at outages.org
>Subject: Re: [Outages-discussion] CenturyLink Outages this morning
>
>They're definitely lost. Last troubleshoot was a card issue in Denver
>which didn't change anything.
>
>These are the updates from the ticket we have with them if anyone
>isn't getting these.
>
>---------------------------------------------------------------------
>---------------------------------------------------
>
>2018-12-28 02:02:14 GMT - Once the card was removed in Denver, CO it
>was confirmed that there was no significant improvement. Additional
>packet captures, and logs will be pulled from the device with the
>card removed to further isolate the root cause. The Equipment vendor
>continues to work with CenturyLink Field Operations at multiple sites
>to remove the secondary communication channel tunnel across the
>network until full visibility can be restored. The equipment vendor
>has identified a number of additional nodes that visibility has been
>restored to, and their engineers are currently working to apply the
>necessary filter to each of the reachable nodes.
>
>
>2018-12-28 01:03:11 GMT - Following the review of the logs and packet
>captures, the Equipment Vendor's Tier IV Support team has identified
>a suspected card issue in Denver, CO. Field Operations has arrived
>on site and are working in cooperation with the Equipment Vendor to
>remove the card.
>
>
>2018-12-28 00:01:53 GMT - The Equipment Vendor is currently reviewing
>the logs and packet captures from devices that have been completed,
>while logs and packet captures continue to be pulled from additional
>devices. The necessary teams continue to remove a secondary
>communication channel tunnel across the network until visibility can
>be restored. All technical teams continue to diligently work to
>review the information obtained in an effort to isolate the root
>cause.
>
>
>2018-12-27 22:58:43 GMT - Multiple teams continue work to pull
>additional logs and packet captures on devices that have had
>visibility restored, which will be scrutinized during root cause
>analysis. The Tier IV Equipment Vendor Technical Support team in
>conjunction with Field Operations are working to remove a secondary
>communication channel tunnel across the network until visibility can
>be restored. The Equipment Vendor Support team has dispatched their
>Field Operations team to the site in Chicago, IL and has been
>obtaining data directly from the equipment.
>
>
>2018-12-27 21:36:58 GMT - It has been advised that visibility has
>been restored to both the Chicago, IL and Atlanta, GA sites.
>Engineering and Tier IV Equipment Vendor Technical Support are
>currently working to obtain additional logs from devices across
>multiple sites including Chicago and Atlanta to further isolate the
>root cause.
>
>
>2018-12-27 20:19:36 GMT - Tier IV Equipment Vendor Technical Support
>continues to work with CenturyLink Field Operations and Engineering
>to restore visibility and apply the filter to devices in Atlanta, GA
>and Chicago, IL. While those efforts are ongoing additional logs have
>been pulled from the devices in Kansas City, MO and New Orleans, LA
>following the restoral of visibility and the necessary filter
>application to obtain additional pertinent information now that the
>device is remotely accessible.
>
>
>2018-12-27 19:30:01 GMT - Spoke with Sean and advised we were seeing
>the system restored. He verified and advised that his other circuit is
>showing restored at the same time as this circuit. Related to the
>nationwide outage; adding to parent ticket. Sean requests we leave
>this ticket open until they are hands-off. Please, when you are
>available, could you send us your traceroutes? Thank you.
>
>
>2018-12-27 19:17:01 GMT - Efforts to regain visibility to sites in
>Atlanta, GA and Chicago, IL remain ongoing. Once visibility has been
>restored the filter will be applied to limit communication traffic
>between sites which was causing CPU spikes that in turn prevented the
>devices from functioning properly.
>
>
>2018-12-27 18:13:33 GMT - The equipment vendor's Tier IV Technical
>Support team continued to investigate the equipment logs to further
>assist with isolation. Field Operations were dispatched to multiple
>sites to investigate equipment in Kansas City, MO, New Orleans, LA,
>as well as Atlanta, GA. It has been advised that a controller card
>has been stabilized in New Orleans, LA restoring visibility to the
>equipment to allow additional investigations to continue. A filter
>was then applied to equipment in Kansas City, MO that further
>alleviated the impact observed. Investigations remain ongoing to
>further restore network services.
>
>
>2018-12-27 15:46:56 GMT - Following the isolation of a node in San
>Antonio, TX that alleviated some of the impact the necessary teams
>have shifted troubleshooting focus to additional nodes experiencing
>issues. A node in Atlanta, GA as well as a site in New Orleans, LA
>are currently being investigated.
>
>
>2018-12-27 14:44:42 GMT - Field Operations dispatched to various
>locations to troubleshoot cooperatively with the equipment vendor.
>During cooperative troubleshooting, a device in San Antonio, TX
>appeared to be broadcasting traffic, consuming capacity and impacting
>other nodes in the network. The node was isolated from the network, which
>appears to have alleviated some impact; however, troubleshooting
>efforts continue to restore all impacted services. We understand how
>important these services are to our clients and the issue has been
>escalated to the highest levels within CenturyLink Service Assurance
>Leadership.
>
>
>2018-12-27 13:50:05 GMT - On December 27, 2018 at 02:40 GMT,
>CenturyLink identified a service impact in various locations. The NOC
>engaged to begin investigations. The NOC engaged Tier IV Vendor
>Support to assist in troubleshooting and fault isolation efforts. The
>NOC engaged Field Operations to cooperatively troubleshoot. Field
>Operations arrived on site in Denver, CO. Field Operations dispatched
>additional technicians to Kansas City, MO and Omaha, NE. The NOC is
>continuing to cooperatively troubleshoot with Tier IV Vendor Support
>for a site in San Antonio, TX.
>
>
>2018-12-27 13:14:12 GMT - Field Operations has arrived on site at the
>Denver location.
>
>
>2018-12-27 12:49:55 GMT - The NOC has engaged Tier IV Vendor Support
>to assist in troubleshooting efforts.
>
>
>2018-12-27 12:31:34 GMT - Field Operations has dispatched a second
>technician to another site to assist with troubleshooting and fault
>isolation efforts. An ETA of 13:30 GMT has been provided.
>
>
>2018-12-27 12:21:46 GMT - The NOC has engaged Field Operations to
>assist in troubleshooting and fault isolation efforts. An ETA of
>13:15 GMT has been provided.
>
>
>2018-12-27 11:57:40 GMT - On December 27, 2018 at 02:40 GMT,
>CenturyLink identified a service impact in New Orleans, LA. The NOC
>is engaged and investigating in order to isolate the cause. Please be
>advised that updates for this event will be relayed at least hourly
>unless otherwise noted. The information conveyed hereafter is
>associated with live troubleshooting efforts, and as the discovery
>process evolves through to service resolution, ticket closure, or
>post incident review, details may evolve.
>
>
>On Thu, Dec 27, 2018 at 6:59 PM Biddle, Josh <JBiddle at ntst.com>
>wrote:
>
>
>	Thanks.  This is one for the books I guess.  I just got in to
>work a few hours ago so I am playing catch up.  It sounds to me like
>they still don’t know what happened.  In that thread I see multiple
>reports of circuits going into loopback.  I’m guessing this is CL’s
>way of trying to segregate parts of their core infrastructure in an
>attempt to kill whatever is broadcasting and reconverge their
>network.  It sounds like they still have not figured out the root
>cause.
>
>
>
>	Anyone have any thoughts or updates?
>
>
>
>
>
>	From: frnkblk at iname.com <frnkblk at iname.com>
>	Sent: Thursday, December 27, 2018 9:10 PM
>	To: Biddle, Josh <JBiddle at ntst.com>
>	Subject: RE: [Outages-discussion] CenturyLink Outages this
>morning
>
>
>
>	This Reddit post is probably the best/closest to what you’re
>asking for:
>https://old.reddit.com/r/networking/comments/a9z6tb/centurylink_outage_west_coast/ecnsbct/
>
>
>
>	Frank
>
>
>
>	From: Outages-discussion <outages-discussion-
>bounces at outages.org> On Behalf Of Biddle, Josh
>	Sent: Thursday, December 27, 2018 6:34 PM
>	To: outages-discussion at outages.org
>	Subject: Re: [Outages-discussion] CenturyLink Outages this
>morning
>
>
>
>	Anyone have root cause of CenturyLink outage and ETTR?
>
>
>
>
>
>	From: Outages-discussion <outages-discussion-
>bounces at outages.org> On Behalf Of Frank Bulk
>	Sent: Thursday, December 27, 2018 6:42 PM
>	To: outages-discussion at outages.org
>	Subject: Re: [Outages-discussion] CenturyLink Outages this
>morning
>
>
>
>	My IPv6 access to www.qwest.com and www.centurylink.com has been
>stable since 3:29 pm U.S. Central.
>
>
>
>	downdetector.com has flatlined for a while, but still not close
>to zero, and based on what I see in the reddit thread, the issue is
>not resolved.
>
>
>
>	Frank
>
>
>
>	From: Outages-discussion <outages-discussion-
>bounces at outages.org> On Behalf Of Frank Bulk
>	Sent: Thursday, December 27, 2018 9:02 AM
>	To: outages-discussion at outages.org
>	Subject: Re: [Outages-discussion] CenturyLink Outages this
>morning
>
>
>
>	Thanks for sharing.  IPv6 access to www.qwest.com and
>www.centurylink.com has been intermittent since 5:00 am U.S. Central.
>
>
>
>	Frank
>
>
>
>	From: Outages <outages-bounces at outages.org> On Behalf Of Erik
>Sundberg via Outages
>	Sent: Thursday, December 27, 2018 8:45 AM
>	To: outages at outages.org
>	Subject: [outages] CenturyLink Outages this morning
>
>
>
>	We are seeing a lot of centurylink circuits and services down
>this morning.
>
>
>
>	CenturyLink Control Center portal comes up, but auth is failing;
>unable to log in.
>
>
>
>	Waves down or bouncing
>
>	Denver - Seattle (Bouncing)
>
>	Denver - Chicago (Bouncing)
>
>	New York - Los Angeles (Down)
>
>	Chicago - Atlanta (Down)
>
>
>	ENNI Down in Denver (Down)
>
>
>
>	Still trying to open tickets with them; their portal is not
>working and we are stuck on hold right now.
>
>
>
>
>
>________________________________
>
>
>	CONFIDENTIALITY NOTICE: This e-mail transmission, and any
>documents, files or previous e-mail messages attached to it may
>contain confidential information that is legally privileged. If you
>are not the intended recipient, or a person responsible for
>delivering it to the intended recipient, you are hereby notified that
>any disclosure, copying, distribution or use of any of the
>information contained in or attached to this transmission is STRICTLY
>PROHIBITED. If you have received this transmission in error please
>notify the sender immediately by replying to this e-mail. You must
>destroy the original transmission and its attachments without reading
>or saving in any manner. Thank you.
>
>	This email and its attachments may contain privileged and
>confidential information and/or protected health information (PHI)
>intended solely for the use of Netsmart Technologies and the
>recipient(s) named above. If you are not the recipient, or the
>employee or agent responsible for delivering this message to the
>intended recipient, you are hereby notified that any review,
>dissemination, distribution, printing or copying of this email
>message and/or any attachments is strictly prohibited. If you have
>received this transmission in error, please email compliance at NTST.com
><mailto:compliance at NTST.com>  immediately and permanently delete this
>email and any attachments.
>
>	_______________________________________________
>	Outages-discussion mailing list
>	Outages-discussion at outages.org
>	https://puck.nether.net/mailman/listinfo/outages-discussion
>
>
>
>
>--
>
>
>WeWork | Erik Wooding
>Manager of Network Engineering, US & Canada West, Latin America
>O: 917-810-9345
>wework.com <http://www.wework.com>
>
>Create Your Life's Work
>
>Get rewarded for good ideas and good people!
>Apply for funding at creatorawards.wework.com
><http://creatorawards.wework.com/>  or
>help grow the community at refer.wework.com
