[Outages-discussion] [outages] So, when Twitter goes down
Larry Sheldon
LarrySheldon at cox.net
Mon Aug 24 16:53:19 EDT 2009
Jay R. Ashworth wrote:
> where do you announce it? :-)
One of the great conundra (tm) of our times.
When I was active in the field I lost more arguments with management
over this kind of issue:
Notify critical people via pagers when:
The power fails.
The supervising server fails.
The campus telephone system fails.
Any of a number of other things fail, where the system might actually
work.
The solution to this one? A system that is not subject to the same failures
as the object system, whose job is to report loss of heart-beat data from the
object system. That leaves only the question of a failure big enough to take
out both systems.[1]
Eliminate paper records by maintaining the indices to the back-up media
on the machine being backed up.[2]
There are others.
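For the pager case, the heart-beat idea can be sketched roughly as follows. This is only an illustration, not anything that was actually deployed: the class name HeartbeatWatcher, the timeout_s threshold, and the alert callback (standing in for whatever pages people) are all hypothetical. It assumes the watcher runs on a machine that shares no power, server, or telephone dependency with the object system.

```python
import time


class HeartbeatWatcher:
    """Runs on a separate system; alerts when the object system goes quiet.

    All names here are hypothetical and chosen for illustration only.
    """

    def __init__(self, timeout_s, alert, clock=time.monotonic):
        self.timeout_s = timeout_s   # assumed silence threshold, in seconds
        self.alert = alert           # assumed hook that notifies people
        self.clock = clock           # injectable so the logic can be tested
        self.last_beat = clock()     # treat start-up as the first heartbeat
        self.alerted = False

    def beat(self):
        """Record a heartbeat received from the object system."""
        self.last_beat = self.clock()
        self.alerted = False

    def check(self):
        """Alert once per outage if heartbeats have stopped; return alert state."""
        silent_for = self.clock() - self.last_beat
        if silent_for > self.timeout_s and not self.alerted:
            self.alert(f"no heartbeat for {silent_for:.0f}s")
            self.alerted = True
        return self.alerted
```

The watcher alerts on silence rather than waiting for the object system to report its own failure, which is the point: a dead system cannot send a page. It does nothing about a failure big enough to take out both systems.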
[1] Carry the second system around with you, you say? Think about that
for a while.
[2] I know of answers to this, but they involve spending money that
generates no income, so of course they are not interesting. (Preventing
the loss of money is interesting to me--probably why I never became a
big manager.)
--
Requiescas in pace o email
Ex turpi causa non oritur actio
Eppure si rinfresca

Two identifying characteristics of System Administrators:
Infallibility, and the ability to learn from their mistakes.

ICBM Targeting Information:
http://tinyurl.com/4sqczs
http://tinyurl.com/7tp8ml