[Outages-discussion] [outages] Outages Message List Scope

John Starta john at starta.org
Thu Mar 9 13:54:27 EST 2017


As you might expect, there are a variety of opinions on what’s important enough to merit a message to the outages mailing list. I personally don’t have an issue with smaller or website-specific outages being reported; I view them as potential canaries in the coal mine. For instance, an issue with Slack can signal an AWS problem that hasn’t yet been registered on Amazon’s dashboard. (Given what resides on AWS, Azure, and Google Cloud these days, I personally think they qualify as major [communications] infrastructure.)

Everyone on this mailing list should have the technical skill to use the filtering capabilities of their mail client and/or server. If you don’t like hearing about Slack outages, for instance, then create filters that pass through only the mailing list messages containing keywords important to you. I would recommend using server-side filters so that both your mobile[1] and desktop clients benefit. If server-side filters aren’t available to you, then consider getting a Gmail account, which does offer them, and subscribing to the outages list from there.
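
As a rough illustration of the keyword filtering described above, here is a minimal Python sketch that could sit behind a delivery-time hook. The keyword list, the function name is_relevant, and the idea of feeding it the raw message text are all assumptions made for illustration; they are not a feature of the list software or of any particular mail server.

```python
from email import message_from_string

# Hypothetical keyword list: replace with the terms that matter to you.
KEYWORDS = ("bgp", "fiber cut", "packet loss", "911")


def is_relevant(raw_message, keywords=KEYWORDS):
    """Return True if the Subject or any text/plain part mentions a keyword."""
    msg = message_from_string(raw_message)
    chunks = [str(msg.get("Subject", ""))]
    # walk() yields the message itself when it is not multipart,
    # so this loop handles both simple and multipart messages.
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True)
            if payload:
                chunks.append(payload.decode("utf-8", errors="replace"))
    haystack = "\n".join(chunks).lower()
    return any(keyword.lower() in haystack for keyword in keywords)


if __name__ == "__main__":
    sample = "Subject: [outages] BGP flaps at example IX\n\nSeeing route churn.\n"
    print("keep" if is_relevant(sample) else "skip")
```

A plain case-insensitive substring match is used here to keep the sketch short; a real filter would more likely be expressed in the mail server's own rule language (Sieve, procmail, or Gmail filters), with the same keyword idea.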

Simply put: why must everyone on the list be deprived of potentially useful information just because some don’t or won’t learn to use their tools for their own benefit?

John Starta

[1] A frequent complaint is that outage X shouldn’t be reported to this list. To paraphrase: “stop filling the inbox on my mobile with outages I find unimportant.”


> On Mar 9, 2017, at 10:58 AM, Peter Beckman <beckman at angryox.com> wrote:
> 
> Agreed, and thanks for saying something.
> 
> Though, from some standpoints, Slack being down could be considered a
> communications failure... but I agree, clarification should be given.
> 
> Appropriate posts (based on my reading of the "Mission Statement"):
> 
>    * Network link down or packet loss (show your work)
>    * Telecommunications issue (voice, video, SMS)
>    * BGP flaps
>    * DDOS affecting network latency
>    * generally packet delivery and receipt related
> 
> Maybe appropriate:
> 
>    * Large cloud/service provider outage (some communication may be
>        dependent); e.g recent AWS S3 US-East issue, CloudFlare Security issue
>    * AT&T 911 Outage
>    * Mobile Network outage/issue
>    * MikroTik Zero Day
> 
> Not appropriate:
> 
>    * Web service is down (endpoint related) e.g. Slack, Twitter, Amazon
>        retail, Facebook
>    * "I'm seeing a problem, are you?" posts -- either know or don't post
>    * "Me too" posts unless you are adding to the discussion with
>       additional, not before seen detail
>    * After-the-fact posts "Yeah, I saw that happen"
> 
> I do not represent the Outages list, this is my personal take on what
> should or shouldn't be here.
> 
> Oh, and this should go to -discussion.
> 
> On Thu, 9 Mar 2017, Jason Grider via Outages wrote:
> 
>> Is a 504 – Gateway Error something that falls into this mailing list’s
>> scope? I’m asking because I’ve only been subscribed for a few days and
>> between a 10 minute outage at Slack generating ~25 messages over the
>> course of 5 hours and an (most likely internal to Invidia) HTTP server
>> error I’m wondering how much filtering I may need to put in place to get
>> the pieces of information I need from the chaff of events that have a
>> limited blast radius. There is no disrespect intended to anybody about
>> what has been sent to the list. I’m just trying to set my expectations in
>> line with the data I’ve asked for. :)
>> 
>> From the signup page: https://puck.nether.net/mailman/listinfo/outages (bold is my emphasis)
>> "The primary goal of this mailing list ("outages") is for
>> outages-reporting that would apply to failures of major communications
>> infrastructure components having significant traffic-carrying capacity,
>> similar to what FCC provided prior to 9/11 days but they seem to have
>> pulled back due to terrorism concerns. Some also believe that LEC's and
>> IXC's also like this model as they no longer have to air their dirty
>> laundry. Then again, this mailing list is not about making anyone look
>> bad, its all about information sharing and keeping network operators &
>> end users abreast on the situation as close to real-time information as
>> possible in order to assess and respond to major outage such as routing
>> voice/data via different carriers which may directly or indirectly impact
>> us and our customers. A reliable communications network is essential in
>> times of crisis.
>> 
>> The purpose of this list is to have a central place to lookup and report
>> so that end users & network operators know why their services (e-mail,
>> phones, etc) went down eliminating the need to open tons of trouble
>> tickets during a major event. One master ticket - such as fiber cut
>> affect xxx OC48's would suffice. We hope this would empower users and
>> network operators to post such events so that everyone could benefit from
>> it."
>> 
>> Then again “OC48” may date that statement a little bit. :)
>> 
>> “Thank you and have a great day!” I say from the confines of my shiny
>> silver flame suit.
>> Jason Grider
>> 
>> 
> 
> ---------------------------------------------------------------------------
> Peter Beckman                                                  Internet Guy
> beckman at angryox.com                                 http://www.angryox.com/
> ---------------------------------------------------------------------------


