[outages] Google outage after-action

Conrad Heiney conrad at fringehead.org
Sat Jan 25 19:43:18 EST 2014


Centralized configuration is necessary. It is also the new SPOF.

On Saturday, January 25, 2014, Lori Barfield <itdirector at gmail.com> wrote:

> On Jan 25, 2014 11:05 AM, "Jay Ashworth" <jra at baylink.com> wrote:
> >
> > Ironically, Google's Site Reliability Engineering team *was doing an AMA
> > when the outage hit*.
> > Highlights:
>
> "Nearly all of our problems are caused by changes to our systems (either
> human or automated), so the first step is playing the 'what is different
> game.'"
>
> so here i think we have the definition of "Big," in Internet terms:  the
> only real danger is from internal threats.
>
> ...lori
>


-- 
Sent from Mobile

