[j-nsp] BGP route-reflection question
Clayton Fiske
clay at bloomcounty.org
Thu May 29 20:20:45 EDT 2003
Wouldn't the solution be to use Lo0 to Lo0 peering, as you said?
Then you don't have to worry about the cluster-id problem in the
first place...
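
On the client side that's just the following (borrowing the addresses from
Dmitri's example below; a sketch, not tested):

router bgp 11111
 neighbor 10.0.0.1 remote-as 11111
 neighbor 10.0.0.1 update-source Loopback0
 neighbor 10.0.0.2 remote-as 11111
 neighbor 10.0.0.2 update-source Loopback0

As long as the loopbacks stay reachable over some path, a single link
failure doesn't tear down either session.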
-c
On Fri, May 30, 2003 at 11:45:48AM +1000, 'Dmitri Kalintsev' wrote:
> Hi Martin,
>
> I guess now I should go back to the issue I had that prompted me to point
> out the good side of being able to disable the use of cluster-ids.
> Consider the following configuration:
>
> RR1---RR2
>   \   /
>    \ /
>     C1----Important LAN---
>
> The RR1 - RR2 link is POS, both links to C1 are GigE, and C1 is an L3
> switch. Both RR1 and RR2 provide independent links to the rest of the
> network.
>
> Now to the configs (sorry, but all configs in the example will be Cisco):
>
> RR1:
> ---
> int lo0
> ip address 10.0.0.1 255.255.255.255
> !
> int pos1/0
> ip address 1.1.1.1 255.255.255.252
> !
> int Gig2/0
> ip address 2.2.2.1 255.255.255.0 <= NOTE THE /24 mask
> no ip proxy-arp
> !
> router ospf 1
> network 1.1.1.0 0.0.0.255 area 0
> network 10.0.0.1 0.0.0.0 area 0 <= added: Lo0 must be in OSPF for the Lo0-Lo0 session to come up
> !
> router bgp 11111
> bgp cluster-id 10
> neighbor 10.0.0.2 remote-as 11111
> neighbor 10.0.0.2 update-source Loopback0
>
> RR2:
> ---
> int lo0
> ip address 10.0.0.2 255.255.255.255
> !
> int pos1/0
> ip address 1.1.1.2 255.255.255.252
> !
> int Gig2/0
> ip address 2.2.2.2 255.255.255.0 <= NOTE THE /24 mask
> no ip proxy-arp
> !
> router ospf 1
> network 1.1.1.0 0.0.0.255 area 0
> network 10.0.0.2 0.0.0.0 area 0 <= added: Lo0 must be in OSPF for the Lo0-Lo0 session to come up
> !
> router bgp 11111
> bgp cluster-id 10
> neighbor 10.0.0.1 remote-as 11111
> neighbor 10.0.0.1 update-source Loopback0
>
>
> C1:
> ---
> int lo0
> ip address 10.0.0.3 255.255.255.255
> !
> int VLAN10
> ip address 2.2.2.3 255.255.255.0 <= NOTE THE /24 mask
> !
> int VLANx
> description Important VLAN#1
> ip address <whatever>
>
> Requirement:
>
> C1 connects a few networks in quite an important LAN to the Internet via
> RR1/RR2. iBGP is a requirement between C1 and RR1/RR2 (there is another
> exit from this LAN, so static routing in from RR1/RR2 is not an option),
> and there is NO dynamic routing between RR1/RR2 and C1, nor would it be a
> good idea to configure any (so presume static routing ONLY, plus iBGP).
> C1 also talks to some other L3 switches in the Important_LAN via OSPF.
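>
> For the Lo0-to-Lo0 sessions to come up under these constraints, the
> loopback reachability between C1 and RR1/RR2 has to come from statics,
> roughly like this (a sketch, not tested):
>
> C1:
> ip route 10.0.0.1 255.255.255.255 2.2.2.1
> ip route 10.0.0.2 255.255.255.255 2.2.2.2
>
> RR1 (and the same on RR2):
> ip route 10.0.0.3 255.255.255.255 2.2.2.3
> !
> router bgp 11111
> neighbor 10.0.0.3 remote-as 11111
> neighbor 10.0.0.3 update-source Loopback0
> neighbor 10.0.0.3 route-reflector-client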
>
> Yes, I know that this situation is a nasty stockpile of recipes for
> disaster, but that's what I had to deal with.
>
> Um, one more complication: you can't change or delete the cluster-ids.
> I'll be very interested to see an elegant solution to this (although now
> obsolete, thank God!) situation.
>
> SY,
> --
> D.K.
>
> On Thu, May 29, 2003 at 08:58:48PM -0400, Martin, Christian wrote:
> > Dmitri,
> >
> > Two points that must be made clearer, in my view.
> >
> > 1) Cluster IDs are required to prevent looping in hierarchical RR
> > designs, regardless of whether or not the clients are
> > originator_id-aware. Since the RRs will not match the originator_ids,
> > they will not be able to tell that they have already reflected a route.
> > This would be analogous to saying "I wish Juniper didn't require my ASN
> > to be prepended to the AS_PATH." There are reasons - good ones - based
> > on well-known DV mechanisms that require enforcement of these rules.
> > Sure, if you hate Split Horizon you can disable it in RIP, but you'd
> > better have a love for loops! Since IBGP has no path state, is
> > recursive to the endpoints, and is largely unaware of the loopiness (or
> > lack thereof) of the underlying transport, certain limitations are
> > imposed to ensure NLRI loop-freedom. One is full mesh, or more
> > accurately, update non-transitivity. Break this rule, and there is
> > nothing in IBGP to tell a generic BGP speaker that propagated an
> > external route that it has just learned that same route back from an
> > internal peer. RRs "bend" this rule, but impose new ones. Break them
> > at your own risk!
> >
> > 2) Danny made a subtle but very important point - one that those of us
> > who were overambitious with our intra-cluster RR redundancy goals were
> > sad to learn for ourselves. The more reflection performed to the same
> > client, the more routes it must store/process/index/sort/match against
> > policy/etc. If you have three RRs, then the major current
> > implementations will store all three copies of the same thing (with
> > there being enough difference that it forces BGP to package them
> > differently, but without affecting the decision process all that much).
> > This could mean 330,000 routes for a singly-connected, full BGP feed.
> > With two full feeds, this number can double - and so on. At what point
> > does your router run out of gas? Do you want your router to have to
> > store, package, sort, index, scan, etc., 220,000 routes or 660,000?
> > Which do you think is easier? How much redundancy do you need?
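> >
> > (To spell out the arithmetic: a full table was then roughly 110,000
> > routes, so one feed held by three RRs is 3 x 110,000 = 330,000 stored
> > paths; two feeds is 2 x 110,000 = 220,000 unique routes, which three
> > reflectors turn into 3 x 220,000 = 660,000.)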
> >
> > I say these things because I have lived them. Direct
> > (interface-address) iBGP sessions have little utility compared to
> > Lo0-Lo0 peerings. If you have the latter, then 2 RRs with the same
> > cluster ID should be all you need for a router with degree 2 (two
> > uplinks). Anything more provides more pain than pleasure...
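> >
> > In config terms that recommendation is simply this on both RR1 and RR2
> > (a sketch, reusing the addresses from Dmitri's example):
> >
> > router bgp 11111
> > bgp cluster-id 10
> > neighbor 10.0.0.3 remote-as 11111
> > neighbor 10.0.0.3 update-source Loopback0
> > neighbor 10.0.0.3 route-reflector-client
> >
> > Because both RRs stamp the same cluster-id 10, a route reflected by RR1
> > carries 10 in its CLUSTER_LIST, so RR2 drops that copy on receipt
> > instead of storing a redundant path (and vice versa).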
> >
> > Just my .02
> >
> > -chris
> >
> > PS
> >
> > The guys who "invented" RR were pretty thorough in exploring most of
> > these issues. In fact, with the exception of persistent oscillation
> > (more a MED problem than an RR/confed one), there are no known issues
> > (outside of abstract, loosely applied theory and misconfig/buggy code
> > or load/processing pathologies) that cause loops or divergence in an
> > iBGP network. And it's been a few years since the first RR draft was
> > posted! ;)
> >
> >
> >
> > > -----Original Message-----
> > > From: Dmitri Kalintsev [mailto:dek at hades.uz]
> > > Sent: Thursday, May 29, 2003 8:12 PM
> > > To: juniper-nsp at puck.nether.net
> > > Subject: Re: [j-nsp] BGP route-reflection question
> > >
> > >
> > > Hmm, this has turned out to be a somewhat hotter-than-anticipated
> > > discussion, so I went to the source, as any good Luke would. RFC 2796
> > > says:
> > >
> > > "In a simple configuration the backbone could be divided into many
> > > clusters. Each RR would be configured with other RRs as
> > > Non-Client peers
> > > (thus all the RRs will be fully meshed.). The Clients will
> > > be configured
> > > to maintain IBGP session only with the RR in their
> > > cluster. Due to route
> > > reflection, all the IBGP speakers will receive reflected routing
> > > information."
> > >
> > > So, having a client talking to two RRs in different clusters
> > > contradicts this RFC. We're back to square one.
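> > >
> > > In config terms the RFC's "simple configuration" is, on each RR,
> > > something like this (a sketch; the names in angle brackets are
> > > placeholders):
> > >
> > > router bgp 11111
> > > bgp cluster-id 1
> > > neighbor <each-other-RR> remote-as 11111 <= Non-Client peers, RR full mesh
> > > neighbor <each-own-client> remote-as 11111
> > > neighbor <each-own-client> route-reflector-client
> > >
> > > with each client holding its single iBGP session to the one RR of its
> > > own cluster.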
> > >
> > > What I want to say is that in an ideal world I would have appreciated
> > > the ability NOT to set the cluster ID, reverting back to the
> > > originator-id loop detection mechanism. I think that the network
> > > designer should be given the right to choose his own poison, and I
> > > feel that the way Juniper's config imposes the use of cluster-ids
> > > when configuring an RR client is a weeny bit pushy. ;^P
> > >
> > > Just my 2c.
> > > --
> > > D.K.
> > >
> > > On Thu, May 29, 2003 at 09:25:48AM +0100, Guy Davies wrote:
> > > >
> > > > Hi Dmitri,
> > > >
> > > > I have to say that I don't necessarily *recommend* using different
> > > > cluster IDs in the same cluster. I merely said that it is a means
> > > > to achieving what you wanted. I know that Hannes specifically, and
> > > > possibly Juniper generally, recommends doing this, but I am with
> > > > Danny on this and personally recommend using the same cluster ID
> > > > and doing all iBGP from lo0 to lo0. IMHO, using different cluster
> > > > IDs wins you little in a well-structured network and can cost you a
> > > > lot (as described by Danny).
> > > >
> > > > No offence intended Hannes :-)
> > > >
> > > > Regards,
> > > >
> > > > Guy
> > > >
> > > > > -----Original Message-----
> > > > > From: Danny McPherson [mailto:danny at tcb.net]
> > > > > Sent: Thursday, May 29, 2003 1:05 AM
> > > > > To: juniper-nsp at puck.nether.net
> > > > > Subject: Re: [j-nsp] BGP route-reflection question
> > > > >
> > > > >
> > > > > On 5/28/03 5:23 PM, "'Dmitri Kalintsev'" <dek at hades.uz> wrote:
> > > > >
> > > > > > P.S. I noticed yesterday that the other vendor now also says
> > > > > > that having more than one RR in the same cluster is "not
> > > > > > recommended". *Sigh*, the world has changed, hasn't it? ;)
> > > > >
> > > > > Folks should be careful here, I'm not sure that this is truly a
> > > > > "recommended" design, per se, as it can affect lots of things
> > > > > significantly. For example, less optimal BGP update packing and,
> > > > > subsequently, slower convergence & much higher CPU resource
> > > > > utilization, etc... In addition, it increases Adj-RIB-In sizes
> > > > > [on many boxes] and can have a significant impact on steady-state
> > > > > memory utilization. Imagine multiple levels of reflection or
> > > > > more than two reflectors for a given cluster, etc. The impact of
> > > > > propagating and maintaining redundant paths with slightly
> > > > > different attribute pairings, especially in complex topologies,
> > > > > should be heavily weighed.
> > > > >
> > > > > What I'd _probably_ recommend is a common cluster_id for all RRs
> > > > > within a cluster, a full mesh of iBGP sessions between clients,
> > > > > and loopback iBGP peering everywhere, such that if the
> > > > > client<->RR1 link fails there's an alternative path for the BGP
> > > > > session via RR2 (after all, the connectivity is there anyway) and
> > > > > nothing's disrupted. There are lots of other variables to be
> > > > > considered as well, but IMO simply using different cluster_ids
> > > > > isn't a clean solution.
> > > > >
> > > > > -danny
> > > ---end quoted text---
> ---end quoted text---
>
> --
> D.K.
> _______________________________________________
> juniper-nsp mailing list juniper-nsp at puck.nether.net
> http://puck.nether.net/mailman/listinfo/juniper-nsp