[c-nsp] NCS-5001 - MPLS L3VPN Issue

James Bensley jwbensley at gmail.com
Tue Mar 1 04:10:16 EST 2016


> On Tue, Mar 01, 2016 at 12:35:23AM +0000, Tom Hill wrote:
>> How does that affect 'network state'?

A box running version Y that was upgraded from version X is not in the
same state as a box that was erased and installed straight onto version
Y (unless the vendor is going to jump through hoops and show me their
inner workings to prove it is in fact the same and I am wrong).

Basically what Gert said...

On 1 March 2016 at 08:22, Gert Doering <gert at greenie.muc.de> wrote:
> Well, I could see the fear that "not everything is really on the same
> level" (like, some features are missing bugfixes, etc.) and much later,
> you run into weird "it works on this device, but fails on that one, and
> both claim to run the same level of IOS XR" issues.
>
> It's an expression of distrust to the software upgrade process...

Hell yes there is distrust...

> On Tue, Mar 01, 2016 at 12:35:23AM +0000, Tom Hill wrote:
>> Either it works after upgrade, or it does not?

When have things ever been that simple? I can test a version in the
lab until the cows come home, deploy it and straight away a new bug is
triggered and it falls off the network. Over the past two years my
little faith in Cisco's ability to test their code has dropped to
absolute zero, because of IOS-XR.

I'm not really sure what I can and can't say, seeing as I work for the
Borg and they are listening everywhere all the time. Loosely speaking,
here is one issue I will share with you as an example: two DCs, dual
ASR9000s as PEs in each DC (PE1 and PE2 in DC1, PE3 and PE4 in DC2).
Customer 1 is dual homed to both PEs in each DC.

Something in Customer 1's traffic triggered a bug on the line card on
PE1. This crashed the line card, which in turn triggered another bug
on the PE in which it is unable to reboot a stalled line card, so the
whole chassis reboots. So Customer 1, having binned PE1, automatically
failed over to PE2 and binned that off, then failed over to PE3
in DC2 and took that out too. Thanks to BFD that all happened in about
1 second.

For no known reason (by us or TAC) PE4 held up and kept the 2 DCs
online. All PEs were the exact same make and model routers (ASR9000s),
same set of line cards in the same slots, with the same firmware
versions, same IOS-XR versions and basically the same configurations
save for interface IPs etc. So it’s easy to see why the first three
PEs were machine-gunned down, but why not the fourth?

Long story short, no trust/faith whatsoever from me.

Cheers,
Jams.

