[j-nsp] SSD in M7i

Richard A Steenbergen ras at e-gerbil.net
Tue Feb 9 15:58:39 EST 2010


On Tue, Feb 09, 2010 at 12:30:49PM -0600, TCIS List Acct wrote:
> Not had any issues yet, but we're doing a round of h/w upgrades on our
> M7i's and figured this was the time to do it if we were going to do it
> at all.

I've probably had my hands on about 500 RE-2.0s over the years, many of
which were in constant use for 5-10 years at a stretch, and in my
experience the failure rate was actually much higher on the CF media
than it was on the spinning HD. Obviously the old CF media is nowhere
near the same quality as a modern SSD (it had no mechanisms for wear
leveling or gracefully handling bad sectors), but the point is that the
knee-jerk reaction of "omg spinning media in my routers is going to doom
us all" actually caused far more downtime than it prevented in the long
run. Regular old hard drives work just fine, and of all the hw failures
I've seen on these routers over the years the HDs were actually one of
the least common ones.

Now with regard to the problems on the M7i/M10i REs, whatever happened
there seemed to be specific to those RE-4.0/5.0s. One theory is that
they chose to save $20 by using "laptop grade" (only rated for 4 hours
of use a day) rather than "blade server grade" (rated for 24/7 use), and
it came back to bite them. The drives in the older RE-2.0s were all
IBM/Hitachi Travelstars from before they had such a concept as "blade
server rated drives", and they didn't have anything like the failure
rates on the M7i/M10i. Maybe the HD manufacturers started using cheaper,
lower quality parts in the "laptop grade" units after that; I don't know
enough about the manufacturing process to say anything intelligent on
the subject. All we really know is that at the end of the day Juniper
refused to admit there was a problem with the HDs (despite massive end
user complaints about the failure rates), made up some Cisco "your IOS
crashes are all caused by cosmic rays hitting the ram" grade bullshit
about "excessive logging" causing problems (even though most routers
didn't log that much, and nobody ever had problems logging that much on
other REs), and simply RMA'd the REs as they broke. Thankfully nobody
ever died over a bad RE HD, and there are no accelerator pedals on the
routers, so a little cover-up was probably to be expected and I doubt 
anybody will ever be called to testify in front of congress. :)

-- 
Richard A Steenbergen <ras at e-gerbil.net>       http://www.e-gerbil.net/ras
GPG Key ID: 0xF8B12CBC (7535 7F59 8204 ED1F CC1C 53AF 4C41 5ECA F8B1 2CBC)

