Monday, October 20, 2008

Power Law Relationships of the Internet Topology

This paper investigates aspects of internet topology, determining whether power laws of the form "x is proportional to y^a" for quantities x, y, and constant "a" exist in the graphs representing the Internet at the inter-AS and inter-router level. The motivation of this is that using averages/min/max distills the data too far: it gives very little information about the actual distribution of values.

As an alternative, this paper proposes essentially ordering the nodes (whether they are ASs or routers) by some metric, and then expressing the slope of the distribution on a log-log scale (which is "a" in the power law). Then, to evaluate the accuracy/effectiveness of the power law, the authors correlate the slope with the actual values (i.e. they measure the closeness of the fit). In addition, it is important to remember the paper's goal is to come up with some good metrics to measure the validity of internet graph model generators.

The first thing that jumps out to me about the paper is that the dataset used to determine the fits is quite small: three inter-AS topologies from 6-month intervals and a single router topology from a different, earlier time period. Along with the fact that the datasets are so small, there is no measure of accuracy for any of the data points; the fact that the router topology does not correspond to the inter-AS data also makes it difficult to tell whether the fits mean anything.

Admittedly, I'm out of my depth in trying to evaluate the soundness of their statistical methods, but the authors seem conclude that the close correlations are evidence enough that the power laws actually mean something. Further, it is difficult to tell exactly what the power laws mean in terms of how ASs change their interconnections over time according to the goals and restrictions of each. For example, their 1.45x number of nodes increase per year seems like an interesting number; although it may hold for ASs, I think for routers it is unlikely to hold (not that they could test it, given their one data point) since ASs may add routers or replace them.

1 comment:

Matei Zaharia said...

I agree about not understanding what the power laws mean. Some kind of "Moore's law" for internet growth makes sense, but some of the other laws, like node rank vs degree, or distribution of eigenvalues, are just strange. My question is also, how many of these would arise naturally in any graph you build by some kind of constructive process like the Internet? Maybe the laws don't mean anything, in the same way that saying that every set of numbers has an average doesn't really mean anything.