[Outages-discussion] Amazon AWS Monday Outage preliminary postmortem

George Herbert george.herbert at gmail.com
Tue Oct 23 23:32:44 EDT 2012

On Oct 23, 2012, at 8:12 PM, Jay Ashworth <jra at baylink.com> wrote:

> While they haven't yet posted an RCA, that I've seen, it is interesting
> to see that even though past outages had a "data plane issue leaks over 
> into control plane" cascade problem, and they specifically targeted that 
> to fix, it appears to have happened again in this outage.

Well, maybe.

There are more ways to break a lot of EBS than the April 2011 sequence: network failure, unexpected failover to the management plane, overload, and an unbounded failure-mode response in the software architecture.

It was not nearly as catastrophic on the inside, going by what a current client saw.  Some things were down, but not most.  That was still enough to overwhelm normal single-site DR in many cases, though.

RCA eagerly awaited.


George William Herbert
Sent from my iPhone

