[j-nsp] How to pick JUNOS Version

Thu Sep 3 14:26:13 EDT 2020

On 2/Sep/20 09:37, Saku Ytti wrote:

> How do people measure this? Vendors spend tens or hundreds of millions
> annually on testing, and every vendor still delivers an absolute trash
> NOS. I have observed no change in quality in 20+ years. Basic things
> are broken, and everyone finds new fundamental bugs all the time.
>
> I think NOS are shit because a shit NOS is a good business case and a
> good NOS is a bad business case. I know it sounds outrageous, but let
> me explain. Vendor revenue comes from support contracts, not HW sales.
> A lot of us don't need help with configuring or troubleshooting; a lot
> of us have access to a community which outperforms TAC on how to get
> that box working. But none of us has access to the code, so we can't
> commit and push a fix. If the NOS worked like Windows, macOS or Linux,
> where you rarely find bugs, a lot of us would opt out of support
> contracts and would just buy spare HW, destroying the vendor's
> business case.
>
> I don't think vendors sit in scary skull towers and plan for a shit
> NOS; I think it's emergent behaviour from how the market is modelled.
> And there are ways I think the market could change, but I'm already
> venturing too far from the question to explore that topic.
>
>
>
> Now when it comes to testing, many claim it is important and it
> matters. I'm not convinced. I don't think people are looking at this
> with any formality; it's more like religion, and its utility is to
> improve comfort-to-deploy in the organisation. It doesn't do much
> towards probability-of-success in my mind. I've worked for companies
> who test not at all, companies who boot it in the lab and companies
> who had a team doing just testing, and I can't say I've seen different
> amounts of TAC cases on software issues.
>
> To people who invest a lot in testing, and are not comfortable with
> the idea that the value is just 'comfort-to-deploy' (which may be a
> sufficiently important value), I recommend looking at the TAC cases
> you had which actually did cause a customer outage, then asking 'was
> this reasonable to catch in the lab?', and trying to be honest.
> The problem I see is that while NOS quality is shit, it's not so shit
> that it's always broken. The problems that manifest usually require
> more than one condition, and if you start to do back-of-the-envelope
> math on testing everything with every permutation, you will notice no
> amount of money can fix the fact that you're limited by
> heat-death-of-universe on the wall clock. So now you're venturing into
> an area where you gotta choose what to test and what not to test, and
> you don't have nearly enough outages and faults to apply statistical
> analysis to it, so you're actually just guessing.
> It's religion, which has some utility, but not the utility we think it has.
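
The back-of-the-envelope math above can be put into rough numbers. This
is only a minimal sketch, assuming each feature is an independent on/off
knob and every combination takes a fixed time to test. The function
name, the feature counts, the per-test time and the number of lab rigs
are all made-up illustrative values, not measurements from any real lab.

```python
# Hypothetical illustration: how long would exhaustive testing of every
# on/off feature combination take? All inputs are assumptions.

SECONDS_PER_YEAR = 365 * 24 * 3600
AGE_OF_UNIVERSE_YEARS = 1.38e10  # approximate, for scale only


def years_to_test_all(num_features: int,
                      seconds_per_test: float,
                      parallel_rigs: int = 1) -> float:
    """Years needed to run one test per combination of binary features."""
    combinations = 2 ** num_features  # each feature is either on or off
    total_seconds = combinations * seconds_per_test / parallel_rigs
    return total_seconds / SECONDS_PER_YEAR


if __name__ == "__main__":
    # Small case: 10 features, one minute per test, one rig.
    print(f"10 features:  {years_to_test_all(10, 60):.4f} years")

    # Larger case: 100 features, one second per test, a million rigs.
    big = years_to_test_all(100, 1, 1_000_000)
    print(f"100 features: {big:.2e} years "
          f"({big / AGE_OF_UNIVERSE_YEARS:.1e} x age of universe)")
```

Real features are not binary and not independent, so the true space is
larger still; the point is only that the count doubles with every added
condition, while the lab budget does not.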
>
>
> Note I'm not saying testing wholesale is useless, I'm more saying it
> has an exponentially or worse diminishing return. I would say push 1
> packet through all your products in the lab, and you're done, you're
> as far as you're reasonably gonna get.
> And start thinking in terms of 'the NOS is shit and I exercise no
> power over it': what actions work in that world? A staging PoP with
> real but outage-insensitive subscriptions?

100% agreed with everything Saku said here.

As we age, we need to pick our battles, since what's real and important
begins to reveal itself.

Mark.