[j-nsp] BGP route-reflection question

Guy Davies Guy.Davies at telindus.co.uk
Thu May 29 23:04:53 EDT 2003


 
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Hannes,

> -----Original Message-----
> 
> On Thu, May 29, 2003 at 09:14:18AM -0600, Danny McPherson wrote:
> | On 5/29/03 2:18 AM, "Hannes Gredler" <hannes at juniper.net> wrote:
> | 
> | > most of the BGP scaling properties are bound to memory size, you are right:
> | > 
> | > however, what i fail to see is why path diversity is negatively impacting
> | > convergence;
> | > what i have seen so far is contrary: a healthy path diversity speeds
> | > up convergence; 
> | 
> | I think you're confusing the issue, what new diversity do you get that
> | simple loopback peering wouldn't provide?   The transmission substrate
> | doesn't change, you're simply adding lots of overhead unnecessarily,
> | subsequently affecting convergence, memory consumption and CPU utilization,
> | in _any_ router.
> 
> no doubt simple loopback peering inside the cluster does do the trick;
> however from an administration point of view, SPs in my theatre here,
> try to avoid the intracluster full mesh and better
> go with diverse cluster IDs in the same RR level; 

I may be missing something, but what do a full mesh within the cluster and
unique cluster IDs gain you over client-to-client reflection, with each
client peering with both RRs' loopbacks?  I cannot even contrive a network
design where this would improve convergence, stability or functionality.
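
To make the comparison concrete, here is a minimal Python sketch of the
CLUSTER_LIST loop check as I understand it from the route-reflection spec:
an RR prepends its own cluster ID when it reflects a route, and ignores a
received route whose CLUSTER_LIST already contains its own cluster ID.  The
function names, the two-RR topology and the example cluster IDs are all
hypothetical, chosen only for illustration; this is not any vendor's
implementation.

```python
# Toy model (assumed topology): client C1 peers with RR1 and RR2, and
# RR1 and RR2 peer with each other. C1 advertises a single prefix.
# Question: how many paths for that prefix does RR2 keep?

def accepts(local_cluster_id, cluster_list):
    """Loop check: ignore a route whose CLUSTER_LIST holds our own cluster ID."""
    return local_cluster_id not in cluster_list

def paths_kept_by_rr2(rr1_cluster_id, rr2_cluster_id):
    direct_from_c1 = []                  # learned straight from the client
    reflected_by_rr1 = [rr1_cluster_id]  # RR1 prepended its ID when reflecting
    candidates = (direct_from_c1, reflected_by_rr1)
    return sum(1 for cl in candidates if accepts(rr2_cluster_id, cl))

# Hypothetical cluster IDs, for illustration only.
same = paths_kept_by_rr2("10.0.0.1", "10.0.0.1")
diverse = paths_kept_by_rr2("10.0.0.1", "10.0.0.2")
print(same, diverse)
```

With a shared cluster ID, RR2 keeps only the path learned directly from the
client.  With diverse cluster IDs it also keeps the copy reflected by RR1,
which is where the extra memory goes.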

> | > of course many paths do cost memory; so the main challenge
> | > is to convince "the other vendor" to ship proper memory with their boxes and
> | > not to tweak the design of the routing mesh to the limitations of a single
> | > implementation;
> | 
> | It's not just about memory, or any particular vendor.  Although memory is
> | one factor (in which case by following your recommendation you'll :-) use
> | more as well, no?), it's also about the protocol capabilities.  If you send
> | 2x or 3x the amount of updates because they can't be packed as efficiently
> | that affects the entire routing system -- not just a single box -- although
> | every individual box is affected as well.  If receivers now have to pack and
> | process twice as many updates, or aggregate Adj-RIBs-In are much larger,
> | that eventually affects the characteristics of the entire routing system.
> 
> indeed it is; you are assuming that it is double the amount of processing load,
> however that entirely depends on the implementation; i.e. how the system internally
> maintains its path and prefix structures; 

I don't think Danny said that there would be twice the processing load.  He
said there would be twice as many updates.  As you say, how linear the
processing load is relative to the number of updates received is entirely
implementation dependent.

> | > typically the design lives much longer than a single box's lifespan ;-)
> | 
> | Indeed, and that's why you, as well as any other vendor, should be concerned
> | with the effects that recommendations you make have on the larger routing
> | system.   This isn't about Cisco, Juniper or insert_vendor_here, don't make
> | it.  It's about clean network architecture, something that will affect not
> | only the local routing system, but distant networks as well.
> 
> don't get me wrong;
> i did not want to ride the "let's burn memory, coz we have lots of it" wave
> it's simply that too often i have seen network architectures being built around the
> limitations of a 20K$ box; the extra 10% of memory that diverse cluster IDs do 
> cost IMHO outweigh the administration cost of maintaining the intracluster
> full-mesh; if a box does not stand that extra few MBs then it should not belong
> in the core;

An intracluster full-mesh is nasty.  I'd avoid that if at all possible.
However, I still don't understand why you can't get the necessary resilience
and performance with a client-to-client reflection system where the RRs use
the same cluster ID.  Of course, the peerings must be between lo0 addresses.
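
As a rough illustration of why the intracluster full mesh is the painful
option administratively, here is a small Python sketch that counts iBGP
sessions.  It assumes a cluster of n clients and two RRs; the function names
and the client counts are made up for the example.

```python
def full_mesh_sessions(n_clients):
    """Sessions needed to fully mesh the clients with each other."""
    return n_clients * (n_clients - 1) // 2

def client_to_rr_sessions(n_clients, n_rrs=2):
    """Sessions needed when every client peers only with each RR."""
    return n_clients * n_rrs

# Hypothetical cluster sizes, for illustration only.
for n in (5, 10, 20):
    print(n, full_mesh_sessions(n), client_to_rr_sessions(n))
```

For 10 clients that is 45 sessions for the client mesh against 20 client-to-RR
sessions, and the meshed cluster still needs its client-to-RR sessions on top.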

Regards,

Guy

-----BEGIN PGP SIGNATURE-----
Version: PGP 8.0

iQA/AwUBPtZ1uo3dwu/Ss2PCEQJCVwCaA9V9zavqB3vZHVR4p6wO85pUAK0AoNcw
WjaG7/67lTVzNuPxdaVpCYWh
=k/I4
-----END PGP SIGNATURE-----

