[Irtf-rr] Candidate sub-project - Dynamic Routing schemes and scale-free topologies

Thu Dec 4 02:00:15 EST 2003

A while back I sent this list a note saying I was going to work on 
establishing a few small sub-project to take a look at some of the open 
research issues in routing.  Since then, through conversations with 
many researchers and other interested folk, I have come up with a list 
of 13 topic possibilities.  The problem is now to narrow that list down 
to a few sub-projects that have sufficient traction in the research, 
and development, communities to form a decent research team.

The following write up, submitted by Dmitri Krioukov, seems a strong 
candidate as the seed of one such research sub-project devoted to 
research on dynamic routing schemes in scale-free graphs - e.g. the 
Internet.

My next step is to try and recruit some researchers, both senior and 
student, to tackle this and similar approachs.  I invite the 
participants on this list to read this write-up and to discuss it.  I 
also invite anyone who is interested in putting in some thought and 
work on this topic as a participant in such a sub-project to let me 
know.

thanks
a.

--------------

FDR Scalability: Introduction and Open Problems

1. Introduction

 From the distributed computation theory, it is known that the core of 
scalability concerns associated with routing on dynamic graphs can be 
rigorously reformulated in the form of the triangle of trade-offs 
between the routing table size, adaptation costs, and stretch. The term 
"trade-off" used here means that, for example, routing table size 
decrease comes with the price of increase of either the stretch or 
adaptation costs, or both.

Note that in distributed computing, the routing table size is sometimes 
referred to as the memory space, meaning the amount of memory required 
by a routing algorithm (also called scheme) either per node (so called, 
local space) or for the whole network (the total memory space).

Adaptation costs usually refer to the amount of messages generated by a 
routing scheme per topology change, or to the sizes of those messages, 
or to some similar parameters (like those measuring routing stability, 
etc.). For example, thanks to works of Tim Griffin et al., and to 
experimental observations, we know that the adaptation costs of today's 
BGP routing are not upper-bounded at all since there are scenarios when 
the routing path vector algorithm implemented by BGP does not converge 
but oscillates persistently.

It is widely known that the routing table size and adaptation costs are 
the real scalability problems with BGP routing in the Internet today. 
But what is the stretch?

The stretch is a multiplicative path length increase factor associated 
with a given routing scheme. To clarify, take a graph and a pair of 
nodes in it, and then calculate shortest path length between them, then 
take the routing scheme, run it on the graph and calculate the distance 
between the same pair of nodes but along the path produced by the 
routing scheme. This second path length can be greater than the 
shortest distance, obviously. Find the ratio between this greater path 
length and the shortest path length. The stretch factor is the maximum 
of this ratio over all node pairs over all the graphs to which the 
routing scheme applies, while the average stretch is obviously the 
corresponding average. Given this definition, stretch-1 routing is 
synonymous to shortest path routing.

Are there any concerns associated with the stretch in practice (as 
opposed to in theory) today? The answer is fortunately no, since BGP 
implements a stretch-1 routing algorithm (although there is some 
"external" stretch associated with policy routing).

Clearly, what makes the theoretical scalability trade-offs outlined 
above the real problems is the properties of the real network, that is, 
the Internet. The Internet is a massive dynamic network. More 
importantly, the Internet topology is scale-free, meaning that the node 
degree distribution of the Internet inter-AS graph (as well as the 
router-level graph) is a power law with a negative exponent and, hence, 
it lacks any characteristic scale as opposed to Poisson node degree 
distributions in classical Erdos-Reny random graphs. A particularly 
important fact is that scale-free networks possess the so called 
small-world property: they are so densely connected that the average 
shortest path length (or, simply, the average distance) is very low, 
and so is the width of the distance distribution, so that one can say 
that the small-world graphs have very small first two moments of the 
distance distribution. Recall, that the average AS-hop distance in the 
Internet is approximately 3.5 and the dispersion of its distance 
distribution is around 1. Approximately 86% of AS pairs are at the 
distance of 3 or 4 AS hops. This small-world property affects 
drastically the traditional views, approaches, and even the methodology 
of network-related research. Scale-free networks possess rather 
intricate properties that are quite opposite to the properties of many 
other network classes that have been traditionally considered, very 
unfortunately, as more or less perfect models for realistic networks. 
As known today, those models have very little to do with reality.

The immediate causes of the first major practical problem (from the two 
mentioned above) with BGP routing today (the table size growth) are 
well studied. These causes are multihoming, increased topology density 
because of more peering (the "smaller-world" effect), inbound traffic 
engineering, address allocation policies, etc. There are several 
efforts, proposals, and even new routing architectures that are aimed 
at addressing those immediate causes. The most notorious efforts are 
the work of the MULTI6 working group, Atomized routing, and ISLAY 
routing architecture. However, for any of those proposals, it is easy 
to see that, if deployed, they would provide only a more or less 
short-term solution just because they address not the root problem but 
its consequences.

What is the root problem then? The answer is obvious. The problem is 
ultimately caused by the fundamental inefficiencies of algorithms 
underlying today's BGP (recall that BGP was originally constructed 
without any references to theoretical considerations of its performance 
on realistic network topologies), and the path to a potential solution 
must go through finding routing algorithms with better theoretical 
memory space mean values and upper bounds on scale-free graphs.

In the networking community, the single idea on how to make the routing 
table size scale well seems to be aggregation, multiple levels of 
hierarchical network partitioning in the Kleinrock-Kamoun (KK) style 
(L. Kleinrock and F. Kamoun, "Hierarchical Routing for Large Networks: 
Performance Evaluation and Optimization", Computer Networks, 1977, 
1:155-174), where the idea of hierarchical network partitioning was 
introduced and the trade-off between memory space and stretch was 
analyzed for the first time. Unfortunately, the scheme introduced there 
did not provide any algorithm for construction of the required 
partitioning. Furthermore, assuming that a required partitioning 
exists, the authors showed that the stretch produced by their scheme 
would be reasonable low only for very sparsely connected networks, 
while realistic scale-free networks are, on the contrary, extremely 
densely connected. As was recently shown (D. Krioukov, K. Fall, and X. 
Yang, "Compact Routing on Internet-Like Graphs", INFOCOM 2004, 
submitted) (KFY), the stretch produced by the KK scheme on the Internet 
topology would be extremely high. This is another reason why the 
routing architectures based on the idea of hierarchical network 
partitioning (Nimrod, ISLAY, etc.) are not realistic (the primary 
reason being the lack of any algorithm to construct  required 
partitioning for a given network topology).

On the other hand, it would be quite hard to expect that no progress 
has been made for routing in distributed computing since 1977. So that, 
one can turn his attention to more modern, compact routing schemes that 
guarantee very small routing table sizes but with smaller (than in the 
KK case) increase of stretch. Note that the stretch has to be allowed 
to increase since generic (applicable to all graphs) shortest path 
routing was shown to be incompressible (the shortest path routing table 
size lower bound cannot be made reasonably small for some graphs).

However, it is straightforward to see that ability to do stretch-1 
routing is an essential requirement for any realistic Internet routing 
architecture. So that, the primary concern with those compact routing 
schemes is how far their average stretch on scale-free graphs is from 
1. In other words, the very specific first question on the path toward 
scalable Internet routing is this: is it possible to construct a 
routing scheme that would be characterized by simultaneously low memory 
space and stretch on the scale-free graphs with Internet-like 
topologies? This question has been answered positively in KFY. In other 
words, one of the three trade-offs mentioned above has been shown to 
have a positive resolution, assuming the adaptation costs are unbounded.

2. Next Steps

2.1 Routing

2.1.1 Other Schemes

The analysis similar to KFY can be done for some other static compact 
routing schemes. The average stretch of the scheme considered in KFY on 
the Internet interdomain graph has still been found to be somewhat high 
(~1.1). There are routing schemes with slightly higher memory space 
upper bounds, but those schemes are expected to result in much lower 
stretch averages on scale-free graphs. Thus, such schemes might be much 
more attractive from the practical standpoint.

2.1.2 Additive Stretch

Nothing constrains one to considering only multiplicative stretch 
discussed above. The additive stretch and constructs utilizing it have 
been also defined and analyzed to some degree. The additive stretch 
might be more applicable to small-world graphs since the multiplicative 
stretch seems to be too coarse for graphs where short distances prevail.

2.1.3 Shortest Path Routing

Nothing is know regarding stretch-1 bounds on scale-free graphs. On one 
hand, stretch-1 routing is a requirement for Internet routing. On the 
other  hand, stretch-1 routing is incompressible on generic graphs. 
However, if the class of graphs is narrowed to the scale-free graphs, 
can any nicer lower bounds be obtained then, and can any stretch-1 
routing schemes with nicer upper bounds be then constructed?

2.1.4 Stretch>1 Path Bounds

A closely related question is about the bounds for the amount of 
stretch>1 paths produced by stretch>1 routing schemes on scale-free 
graphs. If these bounds turn out to be not greater than the memory 
space upper bounds, than routing information for non-shortest paths can 
be augmented with the corresponding shortest path routing information 
without increasing the memory space upper bounds.

2.1.5 The Dynamic Case

Routing on dynamic graphs is, of course, a much more complicated 
problem. First of all, it admits several theoretical formulations, some 
of which are irrelevant. The progress on the problems that are relevant 
for Internet routing has been very slow. There are very few results 
obtained for dynamic routing in distributed computing over the past 20 
years. One can easily see that the problem is hard at the very 
fundamental level. The idea of representing a highly dynamic object, a 
real network, by an intrinsically static construct, a graph, seems to 
be a root cause of these difficulties. This apprehension is indirectly 
confirmed by several rather pessimistic lower bounds that are already 
known. For example, for any generic dynamic stretch-s routing scheme 
with unbounded memory space and message sizes, not less than n/s 
messages are required per some topology changes on some n-node graphs. 
The corresponding bounds for scale-free networks are unknown.

The first step towards the dynamic case is obviously the analysis of 
the name-independent routing schemes. The scheme considered in KFY is 
not name-independent: as soon as some node or link failure occurs, 
there are no bounds for the amount of nodes that need to be relabeled 
as a result of such topology change. Some significant progress has been 
recently made on the front of construction of low-stretch 
name-independent (but still static) compact routing schemes.

In general, the ultimate purpose is to construct a dynamic low-stretch 
compact routing scheme that would work well on dynamic scale-free 
graphs, but the analysis of the other two trade-offs (adaptations costs 
vs. memory space and adaptation costs vs. stretch) narrowed to 
scale-free graphs has not begun yet at all. This is a virtually 
untouched research area.

2.1.6 Routing Paradigm Shifts

It is quite plausible that the real cause of slow progress for dynamic 
routing in the framework of distributed computing is that the problem 
is fundamentally hard (or even unsolvable) there. So that, it may be 
reasonable to approach the problem from some other directions. One can 
see that this idea of horizon expansion in network research becomes 
more and more popular lately. A good example is some papers presented 
at SIGCOMM 2003.

As far as routing is concerned, what seems to be the most reasonable 
first step is to formulate the routing problem in a truly dynamic 
context. There are quite rigorous considerations (as opposed to just 
bare enthusiasm) showing that possibility of doing so looks fairly 
plausible (cf. the
Midnight Sun Routing Workshop, Lulea, Sweden, 2002).

2.2 Internet Topology and Evolution

2.2.1 Topology

Construction of efficient graph-theoretic routing for realistic 
networks is clearly impossible without thorough understanding of their 
topology and evolution. Unfortunately, very little is known about the 
properties of realistic scale-free networks from the graph-theoretic 
perspective. Much greater number of results on realistic networks have 
recently come from statistical mechanics than from computer science. 
For example, all known results on distance distributions in 
uncorrelated scale-free graphs were obtained by physicists, and there 
are no analytical results for distance distribution in strongly 
correlated scale-free graphs such as the Internet (all growing networks 
are strongly correlated).

2.2.2 Evolution

Since the discovery of power-laws in the Internet, the number of models 
trying to explain them has been growing very rapidly. However, there is 
still no Internet evolution model that would be satisfactory from both 
the physical and networking standpoints. As a result, the laws 
governing the Internet evolution remain unclear, which is very 
unfortunate since knowing such laws would potentially allow one to 
influence the Internet topology evolution to result in graphs that 
would be "nicer" with respect to routing.

The Barabasi-Albert (BA) model and its derivatives, popular among 
physicists, have seen a lot of criticism from the networking community 
for being too general, not incorporating any domain specifics, and, 
hence, failing to predict correctly many characteristics of the 
Internet topology and evolution. The same type of argument has been 
actively used against the BA model by biologists.

On the other hand, the models proposed by the networking community try 
to incorporate Internet evolution specifics by introducing a number of 
non-physical parameters allowing one to easily fit the output of a 
model to the observed data. It is easy to see that any model with 
sufficient number of external parameters can be forced to produce any 
required output by parameter manipulations. A model can be of some 
theoretical value only when all its parameters can be expressed via 
physical variables.

Therefore, the lack of appropriate and commonly accepted methodology is 
the central hindrance on the path towards construction of an Internet 
evolution= model that would be both sufficiently detailed and having 
physical meaning. While paying special attention to Internet business 
realities seems to be the right starting point on this path, using the 
right methodology is the key.

3. Proposed Research Topics for the RRG

3.1 Dynamic Routing Schemes

3.1.1	Specific problem:

Analysis (similar to KFY) of the latest name-independent compact 
routing scheme (M. Arias, L. Cowen, K. A. Laing, R. Rajaraman, and O. 
Taka, "Compact routing with name independence," in Proceedings of he 
15th Annual ACM Symposium on Parallel Algorithms and Architectures 
(SPAA), pp. 184-192. ACM, 2003).

3.1.2 General problem:

Lower bounds for dynamic routing schemes on scale-free graphs.

3.1.3 If the answer to 3.1.2 is "negative", then either

3.1.4a 	The paradigm-shifting research (physical routing, etc.)

3.1.4b	"Delay tolerant networking!" (for the future Internet composed 
of routable islands with just periodic connectivity between them (i.e. 
no global "routability"))

3.2	Internet Topology Evolution

3.2.1	Explanation of effects discovered in KFY.

3.2.2 After finding the laws of the Internet topology evolution, can we 
influence them to evolve to topologies that would be "nicer" with 
respect to routing ("multihomers-should-pay" is an (oversimplified) 
idea from this class)?