[c-nsp] RAM thing
Phil Mayers
p.mayers at imperial.ac.uk
Mon Feb 17 19:47:53 EST 2014
On 17/02/2014 20:53, Saku Ytti wrote:
> On (2014-02-17 19:13 +0000), Phil Mayers wrote:
>
>> As someone else has pointed out, the Cisco description is of a
>> sudden hard failure triggered by a power cycle, not some kind of
>> progressive degradation AFAICT.
>>
>> Do you have information to the contrary?
>
> No. It was more of a general question of what type of memory failure is
> acceptable and will we pay premium for product which has more graceful
> failure-modes.
Honestly, I think we're a long way from hardware failure modes being the
main issue with modern networking devices... most of them can't even do
the job they're advertised for until months or years after release, once
the software stabilises.
Personally I think the blatant inability to deliver reliable software is
more of a threat than hardware failure right now. But then I'm feeling
particularly grumpy as I have 9 support cases open with 3 vendors right
now...
(At this precise moment in time, I'd settle for an edge switch which can
do decent DHCP/IPv6 security without costing over £4k, while being <90cm
deep, a core router with MPLS and working netflow, and a firewall that
doesn't crash when you type "show session". The quality of the RAM
inside is so far down my list of complaints it's not even funny...)
But yes, in theory, I don't mind paying more for ECC RAM and things like
GOLD, ability to degrade to a subset of fabric channels, and so on. How
much more is hard to quantify - they're not optional extras on the
platforms that have them, and we buy those platforms for other,
feature-driven reasons.
> I recently ran into (likely) memory issue which caused sporadic corruption in
> IP header, had it occurred anywhere else than IP headers or were we IPv6 only,
> it would have been invisible to me. Is there 98.7% probability that some kit
> in my network is currently corrupting packets outside header?
Well, a 98.7% probability per-packet is obviously catastrophic.
0.1% is pretty terrible too.
10^-12, OTOH, is negligible.
So clearly it's not binary yes/no. I suspect for most operators there's
a sweet spot that is a function of what upper layers (and thus
customers) will tolerate, price, and what else you lose in the
tradeoff.
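
To put rough numbers on the three per-packet rates above, here is a
back-of-envelope sketch. The 100 kpps load is a made-up figure, not
something from this thread, and the smallest rate is taken as 1e-12,
which is an assumption about the intended notation.

```python
# Hypothetical sustained load on one device; not a figure from the thread.
PACKETS_PER_SECOND = 100_000
SECONDS_PER_DAY = 86_400


def corrupted_per_day(rate, pps=PACKETS_PER_SECOND):
    """Expected number of corrupted packets per day at a per-packet rate."""
    return rate * pps * SECONDS_PER_DAY


# Per-packet corruption probabilities discussed above.
# 1e-12 is an assumed reading of the smallest rate.
RATES = {"98.7%": 0.987, "0.1%": 0.001, "1e-12": 1e-12}

for label, rate in RATES.items():
    print(f"{label:>6}: {corrupted_per_day(rate):.3g} corrupted packets/day")
```

At that assumed load, 0.1% is still millions of bad packets a day,
while 1e-12 works out to well under one per day (roughly one every
four months), which is why the answer is a matter of degree rather
than yes/no.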