[Outages-discussion] [outages] So, when Twitter goes down
    Larry Sheldon 
    LarrySheldon at cox.net
       
    Mon Aug 24 16:53:19 EDT 2009
    
    
  
Jay R. Ashworth wrote:
> where do you announce it?  :-)
One of the great conundra (tm) of our times.
When I was active in the field I lost more arguments with management 
over this kind of issue:
Notify critical people via pagers when:
   The power fails.
   The supervising server fails.
   The campus telephone system fails.
   Any of a number of other things fail, where the system might actually
    work.
The solution to this one?  A system not subject to the failures of the 
object system should report loss of heart-beat data from the object 
system.  That leaves only the question of a failure big enough to take out 
both systems.[1]
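
A minimal sketch of that heart-beat idea, assuming the watchdog runs on a
separate machine with its own power and network path, and that the object
system calls beat() periodically.  The class name, the 30-second timeout,
and the injectable clock are hypothetical, chosen only for illustration;
the actual paging step is left out.

```python
import time


class HeartbeatWatchdog:
    """Lives on an independent system; notices when heart-beats stop.

    Hypothetical illustration: the names and the timeout are assumptions.
    """

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s      # silence tolerated before alarming
        self.clock = clock              # injectable so it can be tested
        self.last_beat = clock()        # treat start-up as the first beat

    def beat(self):
        """Record a heart-beat received from the object system."""
        self.last_beat = self.clock()

    def is_overdue(self):
        """True once the object system has been silent longer than timeout_s."""
        return self.clock() - self.last_beat > self.timeout_s


if __name__ == "__main__":
    watchdog = HeartbeatWatchdog(timeout_s=30)
    watchdog.beat()
    if watchdog.is_overdue():
        print("object system silent: page someone")
```

The watchdog only helps if it shares none of the object system's failure
modes, which is exactly the question footnote [1] raises.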
Eliminate paper records by maintaining the indices to the back-up media 
on the machine being backed up.[2]
There are others.
[1] Carry the second system around with you, you say?  Think about that 
for a while.
[2] I know of answers to this, but they involve spending money that 
generates no income, so of course they are not interesting.  (Preventing 
the loss of money is interesting to me--probably why I never became a 
big manager.)
-- 
Requiescas in pace o email              Two identifying characteristics
                                              of System Administrators:
Ex turpi causa non oritur actio        Infallibility, and the ability to
                                              learn from their mistakes.
Eppure si rinfresca
ICBM Targeting Information:
	http://tinyurl.com/4sqczs
	http://tinyurl.com/7tp8ml
	
    
    