[j-nsp] 10.0 or 10.4?

Derick Winkworth dwinkworth at att.net
Tue Mar 15 13:03:40 EDT 2011


We are running 10.0S9 right now.  10.0S10 introduced a bug that leaves the CPU 
running at 100% on our M-series, and this bug is resolved in 10.0S13 which I 
think is out already.

We haven't put 10.0S13 in production yet, but I suspect that this will be as 
close as we will get to a bug-free release for the time being...




________________________________
From: Richard A Steenbergen <ras at e-gerbil.net>
To: Chris Kawchuk <juniperdude at gmail.com>
Cc: juniper-nsp <juniper-nsp at puck.nether.net>
Sent: Tue, March 15, 2011 11:14:06 AM
Subject: Re: [j-nsp] 10.0 or 10.4?

On Tue, Mar 15, 2011 at 07:43:25PM +1100, Chris Kawchuk wrote:
> Just installed 14 x MX960s for a large Aussie Mobile company - The 
> release train we've decided on is 10.4R2 for now, due to EEOL support; 
> and the fact that 10.0 didn't support a few of the cards we added. 
> (16x10GE Trio for example didn't come till 10.2).

I hear people make this argument a lot, but in many cases it seems to be 
more of a knee-jerk reaction than a logical decision. The EEOL branches 
are definitely interesting once you get into the post-R4 timeframe, but 
I question the sensibility of trying to deploy it in the R2 timeframe 
just because it is the EEOL train.

Honestly, in many cases the code doesn't even begin to get stable until 
it reaches R4 and EOL status. The problem we run into is that we almost 
always discover at least one serious bug in every R4, no matter how 
well-intentioned the development efforts, but because R4 marks the end 
of engineering status we're constantly chasing the next branch to get a 
bugfix for things introduced in the previous branch. Of course what that 
really means is we discover all new brokenness in the new branch, and the 
cycle starts all over again. EEOL releases can end up being a lot more 
stable since you aren't introducing any new features (though anyone who 
tells you they don't introduce a ton of new bugs just doing service 
releases is completely full of it :P), so they're a good thing.

But, what is the real benefit to deploying 10.4R2 now, as opposed to say 
10.3R3? Either way you'll have to do an upgrade later on, so until you 
get to 10.4R4 there is no difference in 10.4 being the EEOL branch. We 
recently spent a fair bit of time trying to decide between 10.3R3 and 
10.4R2 for a lot of MX960 and EX8200 upgrades, and came to the 
conclusion that 10.4R2 was significantly buggier. Why JTAC is 
recommending it I can't even begin to guess, I really think they have 
the recommended version page hooked up to a random number generator some 
days, but in our testing it wasn't even close.
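
[Editor's sketch, not part of the original message.] The rule of thumb 
above (the EEOL label only starts to matter once the branch reaches R4) 
can be written down as a small check. This is a minimal sketch in Python; 
the function names parse_release and has_reached_r4 are hypothetical, the 
threshold of 4 is the poster's heuristic rather than any vendor policy, 
and only the plain "major.minorR<n>" form used in this thread is handled.

```python
import re

# Matches only the "major.minorR<n>" form used in this thread, e.g. "10.4R2".
# Other release-string forms are deliberately rejected rather than guessed at.
_RELEASE_RE = re.compile(r"^(\d+)\.(\d+)R(\d+)$")


def parse_release(name):
    """Return (major, minor, r_number) for a string like '10.4R2'."""
    match = _RELEASE_RE.match(name.strip())
    if match is None:
        raise ValueError(f"unrecognized release string: {name!r}")
    return tuple(int(group) for group in match.groups())


def has_reached_r4(name, threshold=4):
    """True once the branch is at R<threshold> or later (heuristic from the thread)."""
    return parse_release(name)[2] >= threshold


if __name__ == "__main__":
    for release in ("10.3R3", "10.4R2", "10.4R4"):
        print(release, parse_release(release), has_reached_r4(release))
```

By this check neither 10.3R3 nor 10.4R2 has reached R4, which is the 
point being made: at that stage the choice between them comes down to 
observed stability, not to which branch carries the EEOL label.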

Which isn't to say 10.3R3 is perfect, but it's definitely on the "less 
broken than ever" side of things. So far we haven't had any issues with 
Trio hardware or snmp problems that we saw in 10.2R3 or 10.3R2, and if 
you carry a large number of BGP routes with communities you'll see some 
significant performance gains in policy evaluation which can improve 
convergence times quite a bit. Off the top of my head some issues we've 
seen with 10.3R3 so far are:

* Syslogging of BGP messages seems quite broken, in many cases not 
logging state changes correctly at all.
* ISIS packets inside an l2circuit are eaten by MXs when vlan-mapping 
is configured on the endpoint vlan-ccc.
* EX8200 power supplies will think they're running in 1200W 110V input 
power mode if you reinsert them after a reboot, even if fed with 220V 
power which should run them in 2000W mode. This will cause cards to 
power down if the chassis thinks there is insufficient power, so you may 
not have proper power supply redundancy.
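
[Editor's sketch, not part of the original message.] The power supply 
issue in the last item is easy to see with some arithmetic. The sketch 
below is in Python and uses made-up numbers: the six-supply chassis and 
the 7000W load are hypothetical, and power_headroom is an illustrative 
helper, not anything the chassis software exposes. Only the 1200W and 
2000W per-supply figures come from the message.

```python
def power_headroom(num_supplies, watts_per_supply, load_watts):
    """Compare a chassis load against total and N+1 power capacity."""
    total = num_supplies * watts_per_supply
    # N+1 capacity: what is left if any single supply fails.
    n_plus_1 = (num_supplies - 1) * watts_per_supply
    return {
        "total_watts": total,
        "n_plus_1_watts": n_plus_1,
        "enough_power": total >= load_watts,
        "n_plus_1_redundant": n_plus_1 >= load_watts,
    }


if __name__ == "__main__":
    SUPPLIES = 6        # hypothetical chassis with six supplies
    LOAD_WATTS = 7000   # hypothetical load, for illustration only

    for mode_watts in (2000, 1200):
        print(mode_watts, power_headroom(SUPPLIES, mode_watts, LOAD_WATTS))
```

With these assumed numbers the chassis has N+1 redundancy when the 
supplies are counted at 2000W each (10000W available with one supply 
failed), but not when they are counted at 1200W each (6000W with one 
supply failed), even though the physical input power has not changed.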

No doubt there are plenty more too, but at least in a core service 
provider role it's been a lot less bad (let's just say it's nice to not 
have to hard clear bgp neighbors to make policy changes take effect :P).

> I have also heard that 10.4 included a mass 
> re-write/re-development of a lot of the JunOS code; trying to bring it 
> back within a manageable framework. (Note how it went from 10.2R3 to 
> 10.4, skipping a 10.3 release for some platforms). Hence, 10.4 is the 
> "new" code base. I don't know if this is a good thing or a bad thing 
> initially, but should only improve with time.

Actually it's the opposite, 10.3 and 10.4 were both "nobody touch 
anything that isn't essential" no-feature releases, to try and bring the 
development framework into a more manageable state. I'll confirm that 
they're less broken than 10.2, but that certainly doesn't take much. :)

-- 
Richard A Steenbergen <ras at e-gerbil.net>      http://www.e-gerbil.net/ras
GPG Key ID: 0xF8B12CBC (7535 7F59 8204 ED1F CC1C 53AF 4C41 5ECA F8B1 2CBC)
_______________________________________________
juniper-nsp mailing list juniper-nsp at puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp

