[c-nsp] Parity Errors and Cosmic Rays

Robert Holtz robert.d.holtz at gmail.com
Thu May 5 12:38:58 EDT 2005


There's one other possibility: several Taiwanese memory manufacturers have
been shipping untested DRAM chips straight from production.

http://hardware.itmanagersjournal.com/hardware/05/04/18/194256.shtml?tid=104&tid=78

The story at the URL above describes the practice.

On 5/5/05, Chris Roberts <croberts at bongle.co.uk> wrote:
> > Is this actually a common problem? Or at least common enough
> > that I should expect to see it every other month or so? It
> > seems strange that this router has run for years and we've
> > never seen a memory parity error and now we've seen three in
> > three months.
> >
> 
> Sometime last year, we started seeing memory parity errors on our 7507s.
> At first this affected one card. Over the course of around a month it
> gradually spread to 3 cards in the same platform, the first two of which
> were replaced. It then spread to another chassis in the same rack, which
> started losing cards at the same rate over the course of a month. (See my
> mails to this list from around that time, with much the same content as
> yours.) I'd run 7505s at other ISPs for ~5 or more years and never seen
> anything like this. Cisco simply wanted to replace each of the offending
> items of hardware, but this did not stop the spread. We then lost a PA-GE
> with parity errors in one of our 7206s in another rack in the same suite.
> 
> After much sobbing we took the 7507s out and upgraded our 6509s to Sup720s,
> which so far have been rock solid, besides some installation issues and
> teething problems. I realise this isn't a possibility for everyone though.
> 
> Some things that were suggested at the time:
> * Cosmic rays
> * Static protection in your data centres
> * Metal filings from people chopping floor tiles and such getting into the
> aircon and from there into kit
> * Failing PSUs
> 
> Also, our offending 7507s were getting old (3-4 years apparently), but had
> always been rock solid. I suspect it may have just been age that killed them
> in the end. We never did find any trace of any of the above, although
> obviously static and cosmic rays are hard to prove. At the time it was also
> suggested that the TAC would be able to test the returned cards, give you
> some kind of breakdown of the failure mode of each card and tell you which
> components they had to replace, but that they would be loath to do this.
> Sure enough, we asked the TAC to do this and they were loath to do it. We
> never followed this up, as we still have most of the dead cards and didn't
> RMA them, but I guess that might be something you may want to do.
> 
> > Any thoughts?
> >
> > Thanks,
> > John
> 
> Cheers,
> Chris.


