[c-nsp] N7K, SUP1, M1/M2/F2E, 6.2(10)
Phil Mayers
p.mayers at imperial.ac.uk
Tue Dec 2 07:45:25 EST 2014
On 02/12/14 08:45, Saku Ytti wrote:
> For what problematic old fashioned architecture/design JunOS has, I've only
> ever seen similar programming issues due to ISSU in JunOS.
>
> I also don't see these issues in other Cisco kits, CRS1, ASR1k, ASR9k. I
> wonder if CSCO has recognized the same, or are these issues just treated as
> independent bugs rather than indication of some larger problem.
> Or am I seeing pattern where none exists?
I think the pattern is real. Like you, I've seen a surprisingly high
number of FIB misprogramming incidents on this platform over the years.
I'm assuming EARL7/8 have some specific characteristics that trigger
these, for example strict timing or ordering requirements when
programming, and that perhaps the IOS architecture - cooperative,
yielding multitasking - makes bugs in this area likely.
No idea why it happens on NX-OS, but I assume they're re-using the HAL
and the bugs probably live in there.
What I find most frustrating is that you can't "clear [mls|hardware]
..." when these occur. There seems to be no way of resetting it to a
known-good state and reprogramming from scratch short of a reload; I
would rather have a 10-second outage whilst the PFC is cleared and
reprogrammed than a 180-second one whilst the box is reloaded :o/
FWIW I have seen FIB misprogramming on Juniper SRX high-end boxes, where
the tnp messages only propagate to 3 of 4 FPCs, which causes odd
problems. They are typically easier to clear though.
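To illustrate the 3-of-4 case, here is a minimal sketch of how one might
spot the slot whose forwarding table has diverged from the others. It
assumes you have already scraped each FPC's routes into prefix -> next-hop
dicts; neither platform hands you the data in this form, and the function
name, slot names and addresses are all made up for the example. Majority
vote is only a heuristic: it tells you which slot is the odd one out, not
which copy is actually correct.

```python
from collections import Counter


def find_divergent_slots(tables):
    """Return {slot: {prefix: (seen, expected)}} for slots that disagree
    with the majority view of a prefix.

    tables: {slot_name: {prefix: next_hop}} (hypothetical, pre-parsed input).
    A prefix missing from a slot is treated as next hop None.
    """
    # Union of every prefix seen on any slot.
    prefixes = set()
    for routes in tables.values():
        prefixes.update(routes)

    divergent = {}
    for prefix in sorted(prefixes):
        # Most common next hop across slots is taken as "expected".
        votes = Counter(routes.get(prefix) for routes in tables.values())
        expected, _ = votes.most_common(1)[0]
        for slot, routes in tables.items():
            seen = routes.get(prefix)
            if seen != expected:
                divergent.setdefault(slot, {})[prefix] = (seen, expected)
    return divergent


# Made-up example: the update for 10.1.0.0/16 never reached fpc3.
good = {"10.0.0.0/8": "192.0.2.1", "10.1.0.0/16": "192.0.2.2"}
tables = {
    "fpc0": dict(good),
    "fpc1": dict(good),
    "fpc2": dict(good),
    "fpc3": {"10.0.0.0/8": "192.0.2.1"},
}
print(find_divergent_slots(tables))
```

Something like this only narrows down where to look; the fix is still
whatever the platform lets you clear or restart.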