[j-nsp] How to pick JUNOS Version

adamv0025 at netconsultings.com
Mon Sep 7 09:48:25 EDT 2020


> Saku Ytti
> Sent: Wednesday, September 2, 2020 8:37 AM
> 
> On Wed, 2 Sep 2020 at 10:23, Andrew Alston
> <Andrew.Alston at liquidtelecom.com> wrote:
> 
> >   2.  Start looking at the new features - decide what may be useful -
> > if anything - and start testing that to death - again preferably
> > before release so that the fixes can be in when it is released
> 
> How do people measure this? Vendors spend tens or hundreds of millions
> annually on testing, and still deliver absolute trash NOS, every vendor,
> and there is no change in quality that I can observe in 20+ years. Basic
> things are broken, and everyone finds new fundamental bugs all the time.
> 
> I think NOS are shit, because a shit NOS is a good business case and a
> good NOS is a bad business case. I know it sounds outrageous, but let me
> explain. Vendor revenue is the support contract, not HW sales. And a lot
> of us don't need help on configuring or troubleshooting; a lot of us have
> access to a community which outperforms TAC on how to get that box
> working. But none of us has access to the code, we can't commit and push
> a fix. If the NOS worked like Windows, macOS or Linux, where you rarely
> find bugs, a lot of us would opt out of support contracts and would just
> buy spare HW, destroying the vendor's business case.
> 
> I don't think vendors sit in scary skull towers and plan for a shit NOS;
> I think it's emergent behaviour from how the market is modelled.
> And there are ways I think the market could change, but I'm already
> venturing too far from the question to explore that topic.
> 
> 
> 
> Now when it comes to testing, many claim it is important and it matters.
> I'm not convinced. And I don't think people are looking at this with any
> formality; it's more like religion, and its utility is to improve
> comfort-to-deploy in the organisation. It doesn't do much towards
> probability-of-success in my mind.
> I've worked for companies that don't test at all, companies that boot it
> in the lab, and companies that had a team doing just testing, and I can't
> say I've seen different amounts of TAC cases on software issues.
> 
> For people who invest a lot in testing and are not comfortable with the
> idea that the value is just 'comfort-to-deploy' (which may be a
> sufficiently important value), I recommend looking at the TAC cases you
> had which actually did cause a customer outage, then trying to evaluate
> 'was this reasonable to catch in the lab' - and try to be honest.
> The problem I see is that while the whole NOS quality is shit, it's not
> so shit that it's always broken; the problems that manifest usually
> require more than one condition. Then, if you start to do
> back-of-the-envelope math on testing everything with every permutation,
> you will notice that no amount of money can fix the fact that you're
> limited by the heat death of the universe on the wall clock. So now
> you're venturing into an area where you have to choose what to test and
> what not to test, and you don't have nearly enough outages and faults to
> apply statistical analysis to, so you're actually just guessing.
> It's religion, which has some utility, but not the utility we think it
> has.
> 
> 
> Note I'm not saying testing is wholesale useless; I'm more saying it has
> an exponentially or worse diminishing return. I would say push 1 packet
> through all your products in the lab and you're done - you're as far as
> you're reasonably going to get.
> And start thinking in terms of 'the NOS is shit and I exercise no power
> over it': what actions work in that world? A staging PoP with real but
> outage-insensitive subscriptions?
> 
> 
Food for thought indeed.
With regards to the infinite number of failure cases and the inability to
test for all of them:
Actually the name of the game is "what is the minimum number of features
you can get away with while realizing all your business cases".
So while the search space is infinite, I'd say the graph starts with a big
and narrow peak followed by an infinitely long tail (x axis = all possible
failure cases, y axis = probability).
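
A quick back-of-the-envelope illustration (in Python, with purely
hypothetical feature counts) of why shrinking the feature set matters so
much: if problems typically need two or three features interacting to
manifest, the interaction space shrinks much faster than the feature count
itself.

    from math import comb

    # How many distinct feature combinations would need exercising if
    # failures typically require 2 or 3 features interacting.
    for n_features in (40, 20, 10):     # hypothetical feature-set sizes
        pairs = comb(n_features, 2)     # two-feature interactions
        triples = comb(n_features, 3)   # three-feature interactions
        print(n_features, "features:", pairs, "pairs,", triples, "triples")

    # roughly: 40 features -> 780 pairs / 9880 triples
    #          20 features -> 190 pairs / 1140 triples
    #          10 features ->  45 pairs /  120 triples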

Testing can also be divided into functional, performance and scaling
testing.
So you take the minimum number of features you can get away with and
perform the following
(I advise running these as multidimensional tests - all features combined):

Functional testing
-to see if the features actually work as intended and work when combined
(a small sketch of this combination coverage follows this list)
-this is where you'll find some bugs and can adjust your feature-set
accordingly (yes, the search space is still infinite, albeit a smaller
infinity than the complete search space containing all features)
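
A minimal sketch of that combined-features coverage (the feature names are
hypothetical placeholders, not a recommended set): single-feature sanity
cases plus every pairwise interaction of the features you actually plan to
run together.

    from itertools import combinations

    # Hypothetical minimum feature set for one box role.
    features = ["isis", "ldp", "l3vpn", "l2circuit", "bfd", "qos-shaping"]

    # Single-feature cases plus every pairwise interaction.
    cases = [(f,) for f in features] + list(combinations(features, 2))

    for case in cases:
        # In a real lab this would push config templates, run traffic,
        # and verify control-plane and forwarding state for the combo.
        print("functional test:", " + ".join(case))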

Performance testing
-only related to HW/HW-related features
-to figure out your pps/bps rate headroom - weird stuff happens if you
cross certain thresholds (this shields you from potential known and
unknown failure cases and bugs); a rough headroom calculation is sketched
after this list
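
As an illustration of the headroom arithmetic (all numbers here are made
up; the point is only how the margin is derived from lab findings):

    # Hypothetical lab-measured safe forwarding rate vs. production peak.
    lab_safe_pps = 30_000_000   # rate where anomalies started in the lab
    peak_prod_pps = 18_000_000  # busiest forecast production load

    headroom = (lab_safe_pps - peak_prod_pps) / lab_safe_pps
    print(f"pps headroom: {headroom:.0%}")  # 40% in this made-up case

    # Flag well before the threshold where the weird stuff started.
    assert peak_prod_pps < 0.8 * lab_safe_pps, "under 20% pps headroom"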

Scaling testing
-related to HW and SW resources of the box
-to figure out how your features scale and to derive your headroom - weird
stuff happens if you cross certain thresholds (this shields you from
potential known and unknown failure cases and bugs); see the scale-headroom
sketch after this list
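
The same idea applied to scale figures (again purely illustrative values;
the ceilings would come from your own scaling tests, not the datasheet):

    # Hypothetical lab-derived scale ceilings vs. current production usage.
    lab_ceiling = {"fib_routes": 1_000_000, "vrfs": 2_000, "bgp": 4_000}
    production = {"fib_routes": 620_000, "vrfs": 1_100, "bgp": 1_900}

    for resource, limit in lab_ceiling.items():
        used = production[resource]
        print(f"{resource}: {used}/{limit} ({used / limit:.0%} of ceiling)")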


adam



