Here's an updated version of my draft on single-router BGP
convergence. Does this belong more in IRTF-RR with the global
convergence work or in BMWG? The intention of this paper is to
complement the global BGP convergence work in IRTF-RR and the
router throughput methodology in BMWG.
Network Working Group                                      H. Berkowitz
Internet-Draft                                          Nortel Networks
                                                           January 2001
Benchmarking Methodology for Exterior Routing Convergence
Status of this Memo
This memo provides information for the Internet community. It does
not specify an Internet standard of any kind. Distribution of this
memo is unlimited.
Copyright Notice
Copyright (C) The Internet Society (2001). All Rights Reserved.
Abstract
This document defines a specific set of tests that vendors can use to
measure and report the convergence performance of BGP-4 processes. It does
not consider the forwarding performance of such routers once they have
converged. A separate document will define convergence in interior routing.
This memo will consider changes in forwarding performance while a router
is reconverging, but RFC 2544 remains the methodology document for
benchmarking forwarding performance.
1. Introduction
This document defines a specific set of tests that implementers can use to
measure and report the convergence performance of BGP routers. It does
not consider the forwarding performance of such routers once they have
converged, with the caveat that the effect of the reconvergence process on
forwarding performance can be considered.
Indeed, the techniques here are appropriate for pure route servers as
well as for devices that do both path determination and packet
forwarding. The
results of these tests will provide the user comparable
data from different vendors with which to evaluate these devices. RFC
2544 remains the methodology document for forwarding performance.
Labovitz, Ahuja, et al. have done, and are continuing to do, valuable
work on Internet-wide convergence. Their measurements, however,
reflect a wide range of factors affecting convergence, including
media speeds, propagation times, policies, etc. Whenever possible,
terminology in this document is consistent with Labovitz et al.
Their presentation [Ahuja 2000a] does not formalize the definition of
convergence, and, in any case, there appear to be several useful
meanings of "BGP convergence time." The lack of standard terminology
makes research results difficult to compare and generates FUD for
Internet operators and consumers.
Existing benchmarking documents, such as RFC 2544, focus on
forwarding performance rather than convergence.
2. Requirements
In this document, the words that are used to define the significance
of each particular requirement are capitalized. These words are:
* "MUST" This word, or the words "REQUIRED" and "SHALL" mean that
the item is an absolute requirement of the specification.
* "SHOULD" This word or the adjective "RECOMMENDED" means that
there may exist valid reasons in particular circumstances to
ignore this item, but the full implications should be
understood and the case carefully weighed before choosing a
different course.
* "MAY" This word or the adjective "OPTIONAL" means that this
item is truly optional. One vendor may choose to include the
item because a particular marketplace requires it or because it
enhances the product, for example; another vendor may omit the
same item.
An implementation is not compliant if it fails to satisfy one or more
of the MUST requirements for the protocols it implements. An
implementation that satisfies all the MUST and all the SHOULD
requirements for its protocols is said to be "unconditionally
compliant"; one that satisfies all the MUST requirements but not all
the SHOULD requirements for its protocols is said to be
"conditionally compliant".
3. Workloads and Scenarios
Providing useful convergence information for BGP routers depends
significantly on the intended use of the router, since workload,
principally the size of the full routing table and the number of BGP
peers, but also additional processing such as route filtering, flap
dampening, authentication, etc., will affect any router.
Not all BGP routers are intended for the same applications. This
section presents some representative scenarios, but, in practice, the
tester of a given router will need to develop workload parameters
that are appropriate for the intended purpose. The goal of this
specification is not to prescribe numeric values for these
parameters, but simply to identify the parameters and require them to
accompany a compliant test report.
A given test report must include:
* Number of routes to be in the device under test's (DUT) converged
  routing table
* Number of eBGP peers
* For each peer, the number of routes to be received and to be
  advertised
* Number of iBGP peers
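As a rough illustration of the parameters listed above, a test harness
could carry them in a small record and refuse to emit a report when one
is missing. The class and field names below (PeerRoutes, WorkloadReport,
problems) are hypothetical, and the route counts are placeholders, not
recommended values.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PeerRoutes:
    """Route counts exchanged with one eBGP peer (hypothetical record)."""
    received: int    # routes the DUT receives from this peer
    advertised: int  # routes the DUT advertises to this peer


@dataclass
class WorkloadReport:
    """Workload parameters that accompany a test report (sketch only)."""
    converged_routes: int  # routes in the DUT's converged routing table
    ebgp_peers: Dict[str, PeerRoutes] = field(default_factory=dict)
    ibgp_peer_count: int = 0

    def problems(self) -> List[str]:
        """Return the reasons the report is incomplete (empty if none)."""
        issues = []
        if self.converged_routes <= 0:
            issues.append("converged route count missing")
        if not self.ebgp_peers and self.ibgp_peer_count == 0:
            issues.append("no peers listed")
        for name, peer in self.ebgp_peers.items():
            if peer.received < 0 or peer.advertised < 0:
                issues.append(f"negative route count for {name}")
        return issues


# Placeholder numbers for the three test routers of the reference setup.
report = WorkloadReport(
    converged_routes=100_000,
    ebgp_peers={
        "TR1": PeerRoutes(received=100_000, advertised=0),
        "TR2": PeerRoutes(received=100_000, advertised=0),
        "TR3": PeerRoutes(received=0, advertised=100_000),
    },
)
print(report.problems())  # []
```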
The number of routes will vary with the proposed application.
Realistic numbers should be based on the size of a current
default-free routing table (exclusive of internal routes). This
table is referred to as DFRT and the number of routes it contains as
NDFRT. It is the Routing Information Base (RIB) of the DUT.
Depending on the router implementation, one or more Forwarding
Information Bases (FIB) may need to be generated from the RIB before
a router can advertise and forward at full speed.
Be aware that many service providers will have substantial numbers of
internal and non-aggregated customer routes, so the routing table of
a large provider's core router could very well contain 1.5 NDFRT or
more routes. Smaller RIBs may be used with routers explicitly
intended for edge use with defaults, as long as the assumptions are
cited.
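The sizing rule above amounts to scaling NDFRT by a factor chosen for
the intended application. A minimal sketch follows; the function name
target_rib_size and the NDFRT value of 100,000 are assumptions made for
the example, not figures from this memo.

```python
def target_rib_size(ndfrt: int, multiplier: float = 1.0) -> int:
    """Number of routes to load into the DUT, as a multiple of NDFRT."""
    if ndfrt <= 0 or multiplier <= 0:
        raise ValueError("ndfrt and multiplier must be positive")
    # round() avoids off-by-one results from floating-point noise.
    return round(ndfrt * multiplier)


ndfrt = 100_000  # placeholder for the current default-free table size
print(target_rib_size(ndfrt))       # 100000, plain default-free table
print(target_rib_size(ndfrt, 1.5))  # 150000, large provider core router
```

Whatever multiplier is chosen should be stated in the test report along
with the NDFRT value it was applied to.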
Appendix A presents some scenarios for typical BGP applications.
4. Types of Convergence
Two significantly different types of convergence time tend to be
lumped together in product specifications. The first is the time
needed for a BGP speaker to build a full table after initialization,
or for a particular peering session to rebuild its table after a hard
reset. The second is the time needed for a router to respond to a new
announcement or withdrawal.
4.1 Reference Configuration
For tests when the number of peers is not a performance parameter of
interest, use the configuration in Figure 1:
   TR1==========+---------+==========TR3
    |           |         |
    D1          |         |
    |           |   DUT   |
   TR2==========|         |
                +---------+

                 Figure 1
D1 is a prefix reachable by both TR1 and TR2. It is assumed that
neither TR1 nor TR2 is the originating AS for the announcement of D1.
More complex peering arrangements will involve up to n Test Routers,
as shown in Figure 2. It is recommended that the Figure 1
configuration always be tested as a baseline, and then additional
reports made that show the effect on performance of increasing the
number of peers.
   TR1==========+---------+==========TR3
    |           |         |
    D1          |         |
    |           |   DUT   |
   TR2==========|         |
                |         |
   ...          |         |
   TRn==========+---------+

                 Figure 2
Interface speeds will be specified as part of the test report. At
least 100 Mbps is recommended, so media delays are not a significant
component of the convergence time.
In the absence of other route selection criteria, TR1 shall have an
IP address that makes it most preferred.
4.3 Events in the Convergence Process
[Ahuja 2000a] defines the events:
Tup -- A new route is advertised
Tdown -- A route is withdrawn (i.e., single-homed failure)
Tshort -- Advertise a shorter/better ASPath (i.e., primary path repaired)
Tlong -- Advertise a longer/worse ASPath (i.e., primary path fails)
In this paper, the meanings of Tup and Tdown are preserved and
extended from [Ahuja 2000a]. The notation Tup(TRx) means a Tup event
advertised by TRx to the router being tested (i.e., the DUT).
The sense of the Tshort and Tlong events is also preserved, but the
basic criterion for selecting a "better" route is the final
tiebreaker defined in RFC 1771, the router ID. As a consequence, this
memorandum uses the events Tbetter, Tworse, and Tbest.
While ASPath is quite likely to be the most common tiebreaker in the
operational Internet, it is not actually part of the RFC-defined
route preference algorithm. AS path prepending is another widely used
but nonstandard factor for influencing route preference, but
questions have been raised regarding its scalability in an
ever-growing Internet.
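To make the router ID tiebreaker concrete, the sketch below picks the
preferred peer when every other attribute is equal. It assumes the
RFC 1771 rule that the lowest BGP Identifier wins; the function name
preferred_peer and the example identifiers are hypothetical.

```python
import ipaddress
from typing import Dict, Optional


def preferred_peer(router_ids: Dict[str, str]) -> Optional[str]:
    """Return the peer with the numerically lowest router ID, or None.

    Assumes all other route attributes are equal, so only the final
    tiebreaker decides. IDs are compared as 32-bit integers, because a
    string comparison would rank "10.0.0.10" below "10.0.0.9".
    """
    if not router_ids:
        return None
    return min(router_ids,
               key=lambda peer: int(ipaddress.IPv4Address(router_ids[peer])))


# Example identifiers only; TR1 is given the lower ID so it is preferred.
candidates = {"TR1": "10.0.0.1", "TR2": "10.0.0.2"}
print(preferred_peer(candidates))  # TR1

del candidates["TR1"]              # models Tdown(TR1)
print(preferred_peer(candidates))  # TR2
```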
5. Measurement
Measurements can be defined either as internal or external. Internal
measurements examine the RIB/FIB of the DUT. While they are more
accurate in principle, they require measurement hooks in the
implementation, as described in [Trotter].
External measurements start with a stimulus from one or more
"upstream" routers and end with a specific event causing an
advertisement to be sent to a "downstream" peer. In the reference
configuration above, external measurements are defined with respect
to TR3 as the downstream router.
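An external measurement of this kind reduces to subtracting two
timestamps taken outside the DUT. The following sketch assumes the test
routers log events against a shared clock; the Event record and the
function name external_convergence_time are hypothetical, and the
prefix is a documentation address.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Event:
    time: float  # seconds, on a clock shared by all test routers (assumed)
    router: str  # test router that sent (upstream) or received (downstream)
    prefix: str  # prefix announced or withdrawn


def external_convergence_time(events: Iterable[Event], prefix: str,
                              upstream: str,
                              downstream: str = "TR3") -> Optional[float]:
    """Seconds from the upstream stimulus to the first later message for
    the same prefix seen at the downstream router, or None if not seen."""
    events = list(events)
    start = min((e.time for e in events
                 if e.router == upstream and e.prefix == prefix),
                default=None)
    if start is None:
        return None
    end = min((e.time for e in events
               if e.router == downstream and e.prefix == prefix
               and e.time >= start),
              default=None)
    return None if end is None else end - start


log = [
    Event(10.0, "TR1", "192.0.2.0/24"),   # TR1 sends the stimulus to the DUT
    Event(10.25, "TR3", "192.0.2.0/24"),  # TR3 receives the DUT's advertisement
]
print(external_convergence_time(log, "192.0.2.0/24", upstream="TR1"))  # 0.25
```

The result includes media and propagation delay between the test
routers and the DUT, which is one reason to keep interface speeds high.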
6. eBGP tests
All routers in this configuration have a policy of ADVERTISE
ALL/ACCEPT ALL [RFC 2622]. Tests with prefix filtering, community-based
preferences, authentication, etc., as well as performance under flap
are TBD.
Not all eBGP applications are alike. While the tests in this section
are applicable to a wide range of configurations, testers may select
configurations that are most relevant to the intended product use.
Such configurations include:
1. Interprovider peering, characterized by an exchange of customer routes,
which, in the case of major providers, may be in the tens of thousands
of routes but smaller than the default-free table.
2. Transit services, where the transit customer advertises a relatively
small number of routes toward the provider, but variously may take
full default-free routes, customer routes, or default only from the
provider.
6.1 eBGP Initial Convergence
While this is relatively simple to measure, and often is the basis of
product specifications, it is operationally far less significant than
reconvergence after changes. A "carrier-grade" router should not
initialize often, and the soft reset option reduces the need to
rebuild views. The initialization time, therefore, can be amortized
over a long period of time and may disappear into the noise when
compared to reconvergence.
6.1.1 Initial Convergence Time
The test begins with OPEN requests sent from TR1 and TR2 to the DUT.
Each Test Router sends a standard routing table of TBD routes.
The test ends when the DUT begins to advertise the last route in the
routing table to TR3.
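One way to read the end condition is "the moment the last expected
prefix is first seen at TR3." The sketch below takes that reading,
which is an interpretation rather than something this memo spells out.
The function name initial_convergence_time and the sample prefixes and
times are hypothetical.

```python
from typing import Dict, Iterable, Optional, Set, Tuple


def initial_convergence_time(open_time: float,
                             adverts: Iterable[Tuple[float, str]],
                             expected: Set[str]) -> Optional[float]:
    """Seconds from the OPEN until every expected prefix has been seen
    at TR3 at least once; None if any expected prefix never arrives.

    adverts holds (time, prefix) pairs observed at TR3.
    """
    if not expected:
        return None
    first_seen: Dict[str, float] = {}
    for when, prefix in adverts:
        if prefix in expected and when >= open_time:
            first_seen[prefix] = min(when, first_seen.get(prefix, when))
    if set(first_seen) != set(expected):
        return None  # some expected route never reached TR3
    return max(first_seen.values()) - open_time


seen_at_tr3 = [
    (1.0, "192.0.2.0/24"),
    (2.5, "198.51.100.0/24"),
    (3.0, "192.0.2.0/24"),  # a repeat does not extend the measurement
]
table = {"192.0.2.0/24", "198.51.100.0/24"}
print(initial_convergence_time(0.0, seen_at_tr3, table))  # 2.5
```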
6.2 eBGP Reconvergence
For all of these measurements, report any route filters,
authentication, and reverse path verification used.
6.2.1 Time to Add Newly Advertised Route
The DUT has been initialized, with no path to D. Measurement time
begins when TR1 announces D to the DUT.
6.2.1.1 Time to Readvertise D
Measurement time stops when the DUT advertises D to TR3.
6.2.1.2 Time to Begin Forwarding to D
Prior to TR1 advertising D, TR2 attempts to forward to TR3 via the
DUT. Measurement time ends when TR3 receives a TR1-originated packet
via the DUT.
6.2.2 Time to Change to Alternate Path after Withdrawal
The DUT has been initialized and has paths to D via both TR1 and TR2.
TR1's path is preferred, but TR1 withdraws it with Tdown(TR1).
Reconvergence occurs when the TR2-advertised path becomes active.
6.2.2.1 Time to Readvertise D
Measurement time stops when the DUT advertises D to TR3.
6.2.2.2 Time to Begin Forwarding to D
Prior to TR1 advertising D, TR2 attempts to forward to TR3 via the
DUT. Measurement time ends when TR3 receives a TR1-originated packet
via the DUT.
6.2.3 Time to Reconverge after Sequential Withdraw and New Announcement
The DUT has been initialized and has a path to D1 via TR1, not TR2.
Simultaneously, TR1 sends Tdown(TR1) and TR2 announces the new route
with Tbest(TR2).
6.2.3.1 Time to Readvertise D
Measurement time stops when the DUT advertises D to TR3.
6.2.3.2 Time to Begin Forwarding to D
Prior to TR1 advertising D, TR2 attempts to forward to TR3 via the
DUT. Measurement time ends when TR3 receives a TR1-originated packet
via the DUT.
7. iBGP
7.1 Mesh tests
Repeat the topologies of step 5, but within the same AS. The test
report shall show the specific test configuration(s). It is highly
desirable that the result show the effect of increasing the number of
peers on routing performance.
7.2 Route Reflector tests
   TR1==========+---------+==========TR3
    |           |         |
    D1          |         |
    |           |   DUT   |
   TR2==========|         |
                |         |
   ...          |         |
   TRn==========+---------+
7.2.1 DUT as Route Reflector
The DUT acts as the cluster server in a single-server cluster. Let
TR1 and TR2 be clients of the DUT, and repeat the tests of step 5.
7.2.2 DUT Route Reflector in multiple reflector cluster
The DUT acts as one of the cluster servers in a multi-server
cluster. TRn will be the additional server. There will be iBGP
peering between TRn and DUT, between DUT and TR1, between TRn and
TR1, between DUT and TR2, and between TRn and TR2. Let TR1 and TR2 be
clients of the DUT, and repeat the tests of step 5.
7.2.3 DUT as Route Reflector Client
The DUT acts as a client in a single-server cluster. Let TR1 be the
cluster reflector. TR2, and additional routers as desired, serve as
clients. Test results shall state the number of clients.
7.2.4 DUT as Route Reflector Client in multiple reflector cluster
The DUT acts as one of the clients in a multi-server cluster. TRn
will be the additional server. There will be iBGP peering between
TR1 and TRn, between DUT and TR1, between DUT and TRn, between TR2
and TR1, and between TR2 and TRn.
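The peering lists in 7.2.2 and 7.2.4 follow one pattern: the cluster
servers peer with each other, and every client peers with every server.
A short sketch that generates the session list from that pattern is
shown here; the function name cluster_sessions is hypothetical.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, Set


def cluster_sessions(reflectors: Iterable[str],
                     clients: Iterable[str]) -> Set[FrozenSet[str]]:
    """iBGP sessions for one cluster: servers meshed with each other,
    and each client peered with each server."""
    reflectors = list(reflectors)
    sessions = {frozenset(pair) for pair in combinations(reflectors, 2)}
    sessions |= {frozenset((client, server))
                 for client in clients for server in reflectors}
    return sessions


# 7.2.4: DUT and TR2 are clients; TR1 and TRn are the cluster servers.
for session in sorted(sorted(s) for s in
                      cluster_sessions(["TR1", "TRn"], ["DUT", "TR2"])):
    print(" <-> ".join(session))
```

Calling cluster_sessions(["DUT", "TRn"], ["TR1", "TR2"]) gives the five
sessions of 7.2.2 in the same way.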
8. Modifiers
It might be useful to know the DUT performance under a number of
conditions; some of these conditions are noted below. The reported
results SHOULD include as many of these conditions as the test
equipment is able to generate. The suite of tests SHOULD be first
run without any modifying conditions and then repeated under each of
the conditions separately.
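The run order described above (baseline first, then each condition on
its own) can be written down as a simple plan. In this sketch the
function name build_test_plan is hypothetical, and the test and
modifier labels are just section references chosen for the example.

```python
from typing import List, Optional, Sequence, Tuple


def build_test_plan(tests: Sequence[str],
                    modifiers: Sequence[str]
                    ) -> List[Tuple[str, Optional[str]]]:
    """Baseline run of every test, then one full pass per modifier.

    Modifiers are never combined, so each pass isolates one condition.
    """
    plan: List[Tuple[str, Optional[str]]] = [(t, None) for t in tests]
    for modifier in modifiers:
        plan.extend((t, modifier) for t in tests)
    return plan


tests = ["6.1.1", "6.2.1", "6.2.2", "6.2.3"]
modifiers = ["8.1 filters", "8.2 route flap", "8.3 communities"]
plan = build_test_plan(tests, modifiers)
print(len(plan))  # 16
print(plan[0])    # ('6.1.1', None)
print(plan[4])    # ('6.1.1', '8.1 filters')
```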
8.1 Filters
8.1.1 Representative Customer Ingress Filtering
Following the principles of [RFC 2827], perform the eBGP tests with a
filter to accept a single prefix from TR1, while being sent a
10-route table and a full (TBD) table.
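The filter in this test can be pictured as an exact-match accept list
with one entry. The sketch below assumes exact prefix matching (no
more-specifics) and uses made-up prefixes; accept_only is a
hypothetical helper, not a router command.

```python
import ipaddress
from typing import Iterable, List


def accept_only(announced: Iterable[str], allowed: str) -> List[str]:
    """Return the announced prefixes that exactly match the one allowed
    prefix; everything else is dropped by the ingress filter."""
    allowed_net = ipaddress.ip_network(allowed)
    return [p for p in announced if ipaddress.ip_network(p) == allowed_net]


customer_prefix = "192.0.2.0/24"  # example prefix assigned to the customer
ten_route_table = [customer_prefix] + [f"10.{i}.0.0/16" for i in range(1, 10)]
print(len(ten_route_table))                           # 10
print(accept_only(ten_route_table, customer_prefix))  # ['192.0.2.0/24']
```

The same call applies to the full table case; only the size of the
announced list changes, which is the variable the test is meant to
expose.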
8.2 Bursty traffic/route flap
Let TRF be a router that will generate only flapping routes.
   TR1==========+---------+==========TR3
    |           |         |
    D1          |         |
    |           |   DUT   |
   TR2==========|         |
                |         |
   ...          |         |
   TRF==========+---------+
8.2.1 Flap Isolation Test
TRF will advertise a continuously flapping route. Repeat the eBGP
convergence tests.
8.2.2 Flap Rejection Tests
Repeat eBGP Reconvergence Tests while one route in the TR1 peering
flaps continuously.
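This memo does not fix a flap rate, so any generator has to choose
one and report it. The sketch below produces an alternating
announce/withdraw schedule for a single prefix; the function name
flap_schedule, the half-second period, and the prefix are all
assumptions made for the example.

```python
from typing import List, Tuple


def flap_schedule(prefix: str, period: float,
                  duration: float) -> List[Tuple[float, str, str]]:
    """(time offset, action, prefix) tuples that alternate announce and
    withdraw every `period` seconds, covering `duration` seconds."""
    if period <= 0 or duration < 0:
        raise ValueError("period must be positive and duration non-negative")
    steps = int(duration // period)
    return [(i * period, "announce" if i % 2 == 0 else "withdraw", prefix)
            for i in range(steps)]


for offset, action, prefix in flap_schedule("203.0.113.0/24", 0.5, 2.0):
    print(f"{offset:.1f}s {action} {prefix}")
```

For the isolation test the schedule would be played from TRF; for the
rejection test, from TR1 alongside its stable routes.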
8.3 Communities
8.3.1 Community-based Acceptance
Perform the eBGP tests with a filter to accept TBD prefixes tagged
with community XXX, sent as part of a full (TBD) table.
8.3.2 Community Advertising
Perform the eBGP advertising tests but adding a community YYY.
9. Security Considerations
Security issues are not addressed in this document.
10. Acknowledgements
Thanks to Francis Ovenden for review and Abha Ahuja for encouragement.
11. References
[Ahuja 2000a] "An Experimental Study of Delayed Internet Routing
Convergence." Abha Ahuja, Farnam Jahanian, Abhijit Bose, Craig
Labovitz, RIPE 37 - Routing WG.
[RFC 2439] "BGP Route Flap Damping." C. Villamizar, R. Chandra, R.
Govindan. November 1998.
[RFC 2544] "Benchmarking Methodology for Network Interconnect
Devices." S. Bradner, J. McQuaid. March 1999.
[RFC 2622] "Routing Policy Specification Language (RPSL)." C.
Alaettinoglu, C. Villamizar, E. Gerich, D. Kessens, D. Meyer, T.
Bates, D. Karrenberg, M. Terpstra. June 1999.
[RFC 2827] "Network Ingress Filtering: Defeating Denial of Service
Attacks which employ IP Source Address Spoofing." P. Ferguson, D.
Senie. May 2000.
[RFC 2918] "Route Refresh Capability for BGP-4." E. Chen. September 2000.
[Trotter] "Terminology for Forwarding Information Base (FIB)
based Router Performance Benchmarking", Work in Progress, IETF
draft-ietf-bmwg-fib-term-00.txt
12. Author's Address
Howard Berkowitz
Nortel Networks
5012 S. 25th St
PO Box 6897
Arlington VA 22206
Phone: +1 703 998-5819 (ESN 451-5819)
Fax: +1 703 998-5058
EMail: hberkowi@nortelnetworks.com
hcb@clark.net
Full Copyright Statement
Copyright (C) The Internet Society (2001). All Rights Reserved.
This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.
The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.
This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.