Cisco is being a little coy about it, to my mind (although I can understand why from a business perspective). As an example, late last year we upgraded the IOS version on a 7609 which has two ES20 cards in it. After the reboot, one of the cards failed to come back up, with what seemed to be a crash on boot due to memory issues. The failed ES20 card had the same FW & HW versions as the working one in the slot beside it. TAC said that it probably had some small memory problem, and that the diagnostics were stricter in the new IOS version and so were failing the card instead of letting it boot. Looking at this memory issue, I now wonder whether that was the real cause. The card was RMA'd and so we have a working one again, but I'm left wondering whether the same thing might happen on an upgrade of any of our numerous other 7609s out there (or any other Cisco hardware).

Some information on failure rates would at least be more helpful for planning (i.e. is it 1/10000, where you just got unlucky, or 1/10, where you definitely need to plan for a hardware failure each time you reboot the box?). It kinda makes upgrading for bug/security fixes a little more fraught with danger, as the box may never come back after the reboot...
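
To put rough numbers on why the rate matters, here is a quick back-of-envelope sketch. It assumes each card fails independently with the same per-reboot probability, which may well not hold for a batch-related RAM fault, and the fleet size of 40 cards is a made-up figure, not our real count. The two rates are just the illustrative 1/10000 and 1/10 from above, not anything Cisco has published.

```python
def p_at_least_one_failure(per_reboot_rate: float, cards: int) -> float:
    """Chance that at least one of `cards` fails on a reboot.

    Assumes failures are independent and equally likely per card
    (an assumption for illustration only).
    """
    return 1.0 - (1.0 - per_reboot_rate) ** cards


def expected_failures(per_reboot_rate: float, cards: int) -> float:
    """Average number of failed cards per fleet-wide reboot."""
    return per_reboot_rate * cards


FLEET_CARDS = 40  # hypothetical fleet size, not a real inventory count

for rate in (1 / 10000, 1 / 10):
    print(
        f"rate={rate:g}: "
        f"P(at least one failure)={p_at_least_one_failure(rate, FLEET_CARDS):.3%}, "
        f"expected failures={expected_failures(rate, FLEET_CARDS):.3f}"
    )
```

Under those assumptions, 1/10000 gives roughly a 0.4% chance of losing any card across the whole fleet, while 1/10 makes at least one failure close to certain (about 98.5%), so the two cases call for very different amounts of spares on hand.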


I'm sure most people have seen this, but for those who haven't:


tl;dr - faulty RAM in a bunch of Cisco (and, it is implied, other
vendors') kit from ca. 2005-2010 is causing sudden death on power cycle,
across many product ranges.

They downplay it somewhat in the FAQ - let's hope it really is only a 
minor thing.
