[j-nsp] RE2/RE3 and unwarranted reboots

Hannes Gredler hannes at juniper.net
Fri Jul 1 11:57:01 EDT 2005


On Fri, Jul 01, 2005 at 02:10:13PM +0300, Pekka Savola wrote:
| Thanks for your constructive mail, Hannes.
| 
| On Fri, 1 Jul 2005, Hannes Gredler wrote:
| >| Based on the technical specs, 2.0 is for most purposes equivalent to
| >| 3.0.  Users don't expect to see a difference, and frankly, I don't
| >| think they should.
| >
| >that is a bit too much of an assumption:
| >supporting two REs with different clock speeds opens up two dozen
| >test cases [ != "some testing"] just for e.g. graceful RE switchover.
| > it is not _just_ the change in CPU speed ...
| > you change a variable in a complex system interaction -
| > so you either have to fully test it or don't support it;
| 
| I'm not sure if I understand this.  Basically what might change is the 
| amount of memory, CPU speed, HDD space, and similar factors.  Both 
| would still be the same hardware, externally visible the same way, and 
| run the same kernel and software.
| 
| In the next mail, you mentioned issues wrt. kernel nexthops etc. -- as 
| above, I can't figure out how these should be a factor here.  The 
|        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| master RE reboots immediately, even before the slave has been halfway 
| to booting up (I recall it wasn't even done with BIOS yet).  This 
| makes me *really* wonder what the logical interface between the REs 
| is..

the amount of state bound to an RE and the protocols to replicate
said state to another RE are non-trivial [do not just think of an M10i
with dual REs - think of a full-blown TX routing matrix with 10 REs in
the cluster and try to imagine the state replication fireworks upon
switchover];

again, changing an important property in the equation [CPU speed of the
peer RE] might unveil cases where parts of the replication process
break;

i do not want to provide specifics, but troubleshooting replication
problems is hard enough, and with every problem we find we have
to realize that this is an issue we never really thought
could happen, but eventually it did happen - so if your assertion is
that unlike REs "should not make a difference", empirical evidence
says the contrary;

| [ ... ]
|
| > i am deeply convinced that an honest "this is not supported" decision
| > is better than broken, half-hearted regressed software or
| > extra $$$ on the pricetag to support PC hardware that is
| > not available on the open market anymore;
| 
| I wonder if there has been a consideration of getting older RE models, 
| but with a higher pricetag -- for those who think they really need them, 
| so they could evaluate the impact of the pricetag themselves.

choice is always good - however, sometimes just offering a choice
that nobody picks also imposes a long-term cost:

it is a market reality that as soon as you have the faster/bigger gizmo on offer,
sales figures tend to drop sharply for the prev. generation, and after some time
[e.g. when sales go down to a few units per month] it is just legitimate
to discontinue the older generation ...

/hannes

