[j-nsp] BGP route-reflection question
'Dmitri Kalintsev'
dek at hades.uz
Fri May 30 12:45:48 EDT 2003
Hi Martin,
I guess now I should go back to the issue that prompted me to see the good
side of being able to disable the use of cluster-ids. Consider the
following configuration:
RR1---RR2
  \   /
   C1
   |
   +----Important LAN---
The RR1 - RR2 link is POS, both links to C1 are GigE, and C1 is an L3 switch.
Both RR1 and RR2 provide independent links to the rest of the network.
Now to the configs (sorry, but all configs in the example will be Cisco):
RR1:
---
interface Loopback0
 ip address 10.0.0.1 255.255.255.255
!
interface POS1/0
 ip address 1.1.1.1 255.255.255.252
!
interface GigabitEthernet2/0
 ip address 2.2.2.1 255.255.255.0   <= NOTE THE /24 mask
 no ip proxy-arp
!
router ospf 1
 network 1.1.1.0 0.0.0.255 area 0
!
router bgp 11111
 bgp cluster-id 10
 neighbor 10.0.0.2 remote-as 11111
 neighbor 10.0.0.2 update-source Loopback0
RR2:
---
interface Loopback0
 ip address 10.0.0.2 255.255.255.255
!
interface POS1/0
 ip address 1.1.1.2 255.255.255.252
!
interface GigabitEthernet2/0
 ip address 2.2.2.2 255.255.255.0   <= NOTE THE /24 mask
 no ip proxy-arp
!
router ospf 1
 network 1.1.1.0 0.0.0.255 area 0
!
router bgp 11111
 bgp cluster-id 10
 neighbor 10.0.0.1 remote-as 11111
 neighbor 10.0.0.1 update-source Loopback0
C1:
---
interface Loopback0
 ip address 10.0.0.3 255.255.255.255
!
interface VLAN10
 ip address 2.2.2.3 255.255.255.0   <= NOTE THE /24 mask
!
interface VLANx
 description Important VLAN#1
 ip address <whatever>
Requirement:
C1 connects a few networks in quite an important LAN to the Internet via
RR1/RR2. iBGP is a requirement between C1 and RR1/RR2 (there is another
exit from this LAN, so static routing in from RR1/RR2 is not an option), and
there is NO dynamic routing between RR1/RR2 and C1, nor is it a good idea to
configure any (so presume static routing ONLY, plus iBGP). C1 also talks
to some other L3 switches in the Important_LAN via OSPF.
Yes, I know this situation is a nasty stockpile of recipes for
disaster, but that's what I had to deal with.
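(For illustration only, here's a rough sketch of what the C1 side could look
like under that constraint - static host routes to the RR loopbacks over the
shared /24, plus lo0-to-lo0 iBGP. The exact static routes and neighbor
statements here are my assumption, not taken from the real boxes:)

C1 (sketch):
---
! static host routes so the RR loopbacks are reachable without an IGP
ip route 10.0.0.1 255.255.255.255 2.2.2.1
ip route 10.0.0.2 255.255.255.255 2.2.2.2
!
router bgp 11111
 ! lo0-to-lo0 iBGP to both route reflectors
 neighbor 10.0.0.1 remote-as 11111
 neighbor 10.0.0.1 update-source Loopback0
 neighbor 10.0.0.2 remote-as 11111
 neighbor 10.0.0.2 update-source Loopback0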
Um, one more complication: you can't change the cluster-ids or delete them.
I'll be very interested to see an elegant solution to this (although now
obsolete, thank God!) situation.
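(And, for completeness, a rough sketch of the matching RR side: both RRs keep
the fixed cluster-id 10 and peer with C1's lo0, much along the lines Guy and
Danny suggest below. The static route to C1's loopback and the
route-reflector-client line are my assumptions; they were not in the configs
above:)

RR1 (sketch; RR2 would be analogous):
---
! static host route to C1's loopback, since there's no IGP towards C1
ip route 10.0.0.3 255.255.255.255 2.2.2.3
!
router bgp 11111
 bgp cluster-id 10
 neighbor 10.0.0.3 remote-as 11111
 neighbor 10.0.0.3 update-source Loopback0
 neighbor 10.0.0.3 route-reflector-client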
SY,
--
D.K.
On Thu, May 29, 2003 at 08:58:48PM -0400, Martin, Christian wrote:
> Dmitri,
>
> Two points that must be made clearer, in my view.
>
> 1) Cluster IDs are required to prevent looping in hierarchical RR designs,
> regardless of whether or not the clients are originator_id-aware. Since the
> RRs will not match on the originator_ids, they will not be able to tell that
> they have already reflected a route. This would be analogous to saying "I
> wish Juniper didn't require my ASN to be prepended to the AS PATH." There
> are reasons - good ones - based on well-known DV mechanisms that require
> enforcement of these rules. Sure, if you hate Split Horizon you can disable
> it in RIP, but you'd better have a love for loops! Since IBGP has no path
> state, is recursive to the endpoints, and is largely unaware of the
> loopiness (or lack thereof) of the underlying transport, certain limitations
> are imposed to ensure NLRI loop-freedom. One is full mesh, or more
> accurately, update non-transitivity. Break this rule, and there is nothing
> in IBGP to tell a generic BGP speaker that propagated an external route
> that it has just learned that same route back from an internal peer. RRs
> "bend" this rule, but impose new ones. Break them at your own risk!
>
> 2) Danny made a subtle, but very important point - one that those of us who
> were overambitious with our intra-cluster RR redundancy goals were sad to
> learn for ourselves. The more reflection performed to the same client, the
> more routes it must store/process/index/sort/match against policy/etc. If
> you have three RRs, then the major current implementations will store all
> three copies of the same thing (with there being enough difference that it
> forces BGP to package them differently, but doesn't affect the decision
> process all that much). This could mean 330,000 routes for a
> singly-connected, full BGP feed. With two full feeds, this number can
> double - and so on. At what point does your router run out of gas? Do you
> want your router to have to store, package, sort, index, scan, etc.,
> 220,000 routes or 660,000? Which do you think is easier? How much
> redundancy do you need?
>
> I say these things because I have lived them. Direct iBGP sessions have
> little utility compared to Lo0-Lo0 peerings. If you have the latter, then
> two RRs with the same cluster ID should be all you need for a router with
> degree 2 (two uplinks). Anything more provides more pain than pleasure...
>
> Just my .02
>
> -chris
>
> PS
>
> The guys who "invented" RR were pretty thorough in exploring most of these
> issues. In fact, with the exception of persistent oscillation (more a MED
> problem than an RR/confed one), there are no known issues (outside of
> abstract, loosely applied theory and misconfig/buggy code or
> load/processing pathologies) that cause loops or divergence of an iBGP
> network. And it's been a few years since the first RR draft was posted! ;)
>
>
>
> > -----Original Message-----
> > From: Dmitri Kalintsev [mailto:dek at hades.uz]
> > Sent: Thursday, May 29, 2003 8:12 PM
> > To: juniper-nsp at puck.nether.net
> > Subject: Re: [j-nsp] BGP route-reflection question
> >
> >
> > Hmm, this has turned out to be a somewhat
> > hotter-than-anticipated discussion, so I went to the source,
> > as any good Luke would. The RFC2796
> > says:
> >
> > "In a simple configuration the backbone could be divided into many
> > clusters. Each RR would be configured with other RRs as
> > Non-Client peers
> > (thus all the RRs will be fully meshed.). The Clients will
> > be configured
> > to maintain IBGP session only with the RR in their
> > cluster. Due to route
> > reflection, all the IBGP speakers will receive reflected routing
> > information."
> >
> > So, having a client talking to two RRs in different clusters
> > contradicts this RFC. We're back to square one.
> >
> > What I want to say is that in an ideal world I would have
> > appreciated the ability NOT to set the cluster ID, reverting
> > to the originator-id loop-detection mechanism. I think
> > that the network designer should be given the right to choose
> > his own poison, and I feel that the way Juniper's config
> > imposes the use of cluster-ids when configuring an RR client
> > is a weeny bit pushy. ;^P
> >
> > Just my 2c.
> > --
> > D.K.
> >
> > On Thu, May 29, 2003 at 09:25:48AM +0100, Guy Davies wrote:
> > >
> > > -----BEGIN PGP SIGNED MESSAGE-----
> > > Hash: SHA1
> > >
> > > Hi Dmitri,
> > >
> > > I have to say that I don't necessarily *recommend* using different
> > > cluster IDs in the same cluster. I merely said that it is a means of
> > > achieving what you wanted. I knew that Hannes specifically, and
> > > possibly Juniper generally, recommends doing this, but I am with Danny
> > > on this and personally recommend using the same cluster ID and doing
> > > all iBGP from lo0 to lo0. IMHO, using different cluster IDs wins you
> > > little in a well-structured network and can cost you a lot (as
> > > described by Danny).
> > >
> > > No offence intended Hannes :-)
> > >
> > > Regards,
> > >
> > > Guy
> > >
> > > > -----Original Message-----
> > > > From: Danny McPherson [mailto:danny at tcb.net]
> > > > Sent: Thursday, May 29, 2003 1:05 AM
> > > > To: juniper-nsp at puck.nether.net
> > > > Subject: Re: [j-nsp] BGP route-reflection question
> > > >
> > > >
> > > > On 5/28/03 5:23 PM, "'Dmitri Kalintsev'" <dek at hades.uz> wrote:
> > > >
> > > > > P.S. I noticed yesterday that the other vendor now also says
> > > > > that having more than one RR in the same cluster is "not
> > > > > recommended". *Sigh*, the world has changed, hasn't it? ;)
> > > >
> > > > Folks should be careful here, I'm not sure that this is truly a
> > > > "recommended" design, per se, as it can affect lots of things
> > > > significantly. For example, less optimal BGP update packing and,
> > > > subsequently, slower convergence & much higher CPU resource
> > > > utilization, etc... In addition, it increases Adj-RIB-In sizes [on
> > > > many boxes] and can have a significant impact on steady-state memory
> > > > utilization. Imagine multiple levels of reflection or more than two
> > > > reflectors for a given cluster, etc.. The impact of propagating and
> > > > maintaining redundant paths with slightly different attribute
> > > > pairings, especially in complex topologies, should be heavily
> > > > weighed.
> > > >
> > > > What I'd _probably_ recommend is a common cluster_id for all RRs
> > > > within a cluster, a full mesh of iBGP sessions between clients, and
> > > > loopback iBGP peering everywhere, such that if the client<->RR1 link
> > > > fails there's an alternative path for the BGP session via RR2 (after
> > > > all, the connectivity is there anyway) and nothing's disrupted.
> > > > There are lots of other variables to be considered as well, but IMO,
> > > > simply using different cluster_ids isn't a clean solution.
> > > >
> > > > -danny
> > ---end quoted text---
---end quoted text---
--
D.K.