[Outages-discussion] CenturyLink Outages this morning

frnkblk at iname.com frnkblk at iname.com
Thu Dec 27 22:48:20 EST 2018


I wonder why they want to remove the secondary communication channel – is it because they believe it’s helping facilitate the broadcast traffic? Or is it to simplify the application of filters? Or to simplify troubleshooting of the issue?

 

I don’t know which Infinera platforms are involved in this issue, but Figure 3-24 of this (very old) document shows a Control Path B/secondary control path:

https://fccid.io/ANATEL/01163-09-05381/Manual-02/072CE5D9-762C-49CE-B9F0-FEC86BF9077B/PDF

 

Frank 

 

From: Erik Wooding <erik.wooding at wework.com> 
Sent: Thursday, December 27, 2018 9:06 PM
To: Biddle, Josh <JBiddle at ntst.com>
Cc: frnkblk at iname.com; outages-discussion at outages.org
Subject: Re: [Outages-discussion] CenturyLink Outages this morning

 

They're definitely lost. The last troubleshooting step addressed a suspected card issue in Denver, which didn't change anything.

 

These are the updates from the ticket we have with them, in case anyone isn't getting them.

 

------------------------------------------------------------------------------------------------------------------------

 

2018-12-28 02:02:14 GMT - Once the card was removed in Denver, CO it was confirmed that there was no significant improvement. Additional packet captures, and logs will be pulled from the device with the card removed to further isolate the root cause. The Equipment vendor continues to work with CenturyLink Field Operations at multiple sites to remove the secondary communication channel tunnel across the network until full visibility can be restored. The equipment vendor has identified a number of additional nodes that visibility has been restored to, and their engineers are currently working to apply the necessary filter to each of the reachable nodes. 

 

2018-12-28 01:03:11 GMT - Following the review of the logs and packet captures, the Equipment Vendor's Tier IV Support team has identified a suspected card issue in Denver, CO. Field Operations has arrived on site and are working in cooperation with the Equipment Vendor to remove the card. 

 

2018-12-28 00:01:53 GMT - The Equipment Vendor is currently reviewing the logs and packet captures from devices that have been completed, while logs and packet captures continue to be pulled from additional devices. The necessary teams continue to remove a secondary communication channel tunnel across the network until visibility can be restored. All technical teams continue to diligently work to review the information obtained in an effort to isolate the root cause. 

 

2018-12-27 22:58:43 GMT - Multiple teams continue work to pull additional logs and packet captures on devices that have had visibility restored, which will be scrutinized during root cause analysis. The Tier IV Equipment Vendor Technical Support team in conjunction with Field Operations are working to remove a secondary communication channel tunnel across the network until visibility can be restored. The Equipment Vendor Support team has dispatched their Field Operations team to the site in Chicago, IL and has been obtaining data directly from the equipment. 

 

2018-12-27 21:36:58 GMT - It has been advised that visibility has been restored to both the Chicago, IL and Atlanta, GA sites. Engineering and Tier IV Equipment Vendor Technical Support are currently working to obtain additional logs from devices across multiple sites including Chicago and Atlanta to further isolate the root cause. 

 

2018-12-27 20:19:36 GMT - Tier IV Equipment Vendor Technical Support continues to work with CenturyLink Field Operations and Engineering to restore visibility and apply the filter to devices in Atlanta, GA and Chicago, IL. While those efforts are ongoing additional logs have been pulled from the devices in Kansas City, MO and New Orleans, LA following the restoral of visibility and the necessary filter application to obtain additional pertinent information now that the device is remotely accessible. 

 

2018-12-27 19:30:01 GMT - Spoke with Sean; advised we were seeing the system restored. He verified and advised that his other circuit is showing restored at the same time as this circuit. Related to nationwide outage; adding to parent ticket. Sean requests we leave this ticket open until they are hands-off. Please, when you are available, could you send us your traceroutes? Thank you. 

 

2018-12-27 19:17:01 GMT - Efforts to regain visibility to sites in Atlanta, GA and Chicago, IL remain ongoing. Once visibility has been restored the filter will be applied to limit communication traffic between sites which was causing CPU spikes that in turn prevented the devices from functioning properly. 

 

2018-12-27 18:13:33 GMT - The equipment vendor's Tier IV Technical Support team continued to investigate the equipment logs to further assist with isolation. Field Operations were dispatched to multiple sites to investigate equipment in Kansas City, MO, New Orleans, LA, as well as Atlanta, GA. It has been advised that a controller card has been stabilized in New Orleans, LA restoring visibility to the equipment to allow additional investigations to continue. A filter was then applied to equipment in Kansas City, MO that further alleviated the impact observed. Investigations remain ongoing to further restore network services. 

 

2018-12-27 15:46:56 GMT - Following the isolation of a node in San Antonio, TX that alleviated some of the impact the necessary teams have shifted troubleshooting focus to additional nodes experiencing issues. A node in Atlanta, GA as well as a site in New Orleans, LA are currently being investigated. 

 

2018-12-27 14:44:42 GMT - Field Operations dispatched to various locations to troubleshoot cooperatively with the equipment vendor. During cooperative troubleshooting, a device in San Antonio, TX was seeming to broadcast traffic consuming capacity and impacting other nodes in the network. The node was isolated from the network, which appears to have alleviated some impact; however, troubleshooting efforts continue to restore all impacted services. We understand how important these services are to our clients and the issue has been escalated to the highest levels within CenturyLink Service Assurance Leadership. 

 

2018-12-27 13:50:05 GMT - On December 27, 2018 at 02:40 GMT, CenturyLink identified a service impact in various locations. The NOC engaged to begin investigations. The NOC engaged Tier IV Vendor Support to assist in troubleshooting and fault isolation efforts. The NOC engaged Field Operations to cooperatively troubleshoot. Field Operations arrived on site in Denver, CO. Field Operations dispatched additional technicians to Kansas City, MO and Omaha, NE. The NOC is continuing to cooperatively troubleshoot with Tier IV Vendor Support for a site in San Antonio, TX. 

 

2018-12-27 13:14:12 GMT - Field Operations has arrived on site at the Denver location. 

 

2018-12-27 12:49:55 GMT - The NOC has engaged Tier IV Vendor Support to assist in troubleshooting efforts. 

 

2018-12-27 12:31:34 GMT - Field Operations has dispatched a second technician to another site to assist with troubleshooting and fault isolation efforts. An ETA of 13:30 GMT has been provided. 

 

2018-12-27 12:21:46 GMT - The NOC has engaged Field Operations to assist in troubleshooting and fault isolation efforts. An ETA of 13:15 GMT has been provided.

 

2018-12-27 11:57:40 GMT - On December 27, 2018 at 02:40 GMT, CenturyLink identified a service impact in New Orleans, LA. The NOC is engaged and investigating in order to isolate the cause. Please be advised that updates for this event will be relayed at a minimum of hourly unless otherwise noted. The information conveyed hereafter is associated to live troubleshooting effort and as the discovery process evolves through to service resolution, ticket closure, or post incident review, details may evolve.

 

On Thu, Dec 27, 2018 at 6:59 PM Biddle, Josh <JBiddle at ntst.com> wrote:

Thanks.  This is one for the books I guess.  I just got in to work a few hours ago so I am playing catch up.  It sounds to me like they still don’t know what happened.  In that thread I see multiple reports of circuits going into loopback.  I’m guessing this is CL’s way of trying to segregate parts of their core infrastructure in an attempt to kill whatever is broadcasting and reconverge their network.  It sounds like they still have not figured out the root cause. 

 

Does anyone have any thoughts or updates?

 

 

From: frnkblk at iname.com <frnkblk at iname.com>
Sent: Thursday, December 27, 2018 9:10 PM
To: Biddle, Josh <JBiddle at ntst.com>
Subject: RE: [Outages-discussion] CenturyLink Outages this morning

 

This Reddit post is probably the best/closest to what you’re asking for: https://old.reddit.com/r/networking/comments/a9z6tb/centurylink_outage_west_coast/ecnsbct/

 

Frank 

 

From: Outages-discussion <outages-discussion-bounces at outages.org> On Behalf Of Biddle, Josh
Sent: Thursday, December 27, 2018 6:34 PM
To: outages-discussion at outages.org
Subject: Re: [Outages-discussion] CenturyLink Outages this morning

 

Does anyone have the root cause of the CenturyLink outage and an ETTR?

 

 

From: Outages-discussion <outages-discussion-bounces at outages.org> On Behalf Of Frank Bulk
Sent: Thursday, December 27, 2018 6:42 PM
To: outages-discussion at outages.org
Subject: Re: [Outages-discussion] CenturyLink Outages this morning

 

My IPv6 access to www.qwest.com and www.centurylink.com has been stable since 3:29 pm U.S. Central.

 

downdetector.com has flatlined for a while, but is still not close to zero, and based on what I see in the Reddit thread, the issue is not resolved.

 

Frank 

 

From: Outages-discussion <outages-discussion-bounces at outages.org> On Behalf Of Frank Bulk
Sent: Thursday, December 27, 2018 9:02 AM
To: outages-discussion at outages.org
Subject: Re: [Outages-discussion] CenturyLink Outages this morning

 

Thanks for sharing.  IPv6 access to www.qwest.com and www.centurylink.com has been intermittent since 5:00 am U.S. Central.

 

Frank

 

From: Outages <outages-bounces at outages.org> On Behalf Of Erik Sundberg via Outages
Sent: Thursday, December 27, 2018 8:45 AM
To: outages at outages.org
Subject: [outages] CenturyLink Outages this morning

 

We are seeing a lot of CenturyLink circuits and services down this morning.

 

The CenturyLink Control Center portal comes up, but authentication is failing and we are unable to log in.

 

Waves down or bouncing:

Denver - Seattle (Bouncing)

Denver - Chicago (Bouncing)

New York - Los Angeles (Down)

Chicago - Atlanta (Down)


ENNI in Denver (Down)

 

Still trying to open tickets with them; their portal is not working and we are stuck on hold right now.

 

 

  _____  


CONFIDENTIALITY NOTICE: This e-mail transmission, and any documents, files or previous e-mail messages attached to it may contain confidential information that is legally privileged. If you are not the intended recipient, or a person responsible for delivering it to the intended recipient, you are hereby notified that any disclosure, copying, distribution or use of any of the information contained in or attached to this transmission is STRICTLY PROHIBITED. If you have received this transmission in error please notify the sender immediately by replying to this e-mail. You must destroy the original transmission and its attachments without reading or saving in any manner. Thank you.

This email and its attachments may contain privileged and confidential information and/or protected health information (PHI) intended solely for the use of Netsmart Technologies and the recipient(s) named above. If you are not the recipient, or the employee or agent responsible for delivering this message to the intended recipient, you are hereby notified that any review, dissemination, distribution, printing or copying of this email message and/or any attachments is strictly prohibited. If you have received this transmission in error, please email compliance at NTST.com immediately and permanently delete this email and any attachments. 

_______________________________________________
Outages-discussion mailing list
Outages-discussion at outages.org
https://puck.nether.net/mailman/listinfo/outages-discussion




 

-- 

	

WeWork | Erik Wooding 
Manager of Network Engineering, US & Canada West, Latin America 
O: 917-810-9345 
wework.com 


 
