The diameter of a scale-free random graph.

*(English)*Zbl 1047.05038There has recently been substantial interest in models of random graphs where vertices are added to the graph succesively and are more likely to be joined to vertices which are already of high degrees. Such graphs are thought to be good models of various \` real-world\' graphs such as the world-wide web, where pages are likely to have links to already popular sites such as Google.

In A.-L. Barabási and R. Albert [Emergence of scaling in random networks, Science, 286, 509–512 (1999)] a particular model along these lines is (nearly) defined. They start with a small number \(m_{0}\) of vertices. Each vertex added subsequently is made adjacent to \(m\) different vertices already in the graph, and preferential attachment is ensured by saying the probability that the new vertex is adjacent to an existing vertex of degree \(d_{i}\) is \(d_{i}/\sum_{j}d_{j}\) (the sum over all vertices already in the graph). This will, after \(t\) steps, yield a graph with \(t+m_{0}\) vertices and \(mt\) edges. This is not an unambiguous description, but various papers have studied interpretations of this description and investigated the diameter (the maximum, over all pairs of vertices, of the distance between them in the graph). These papers suggest, on the basis of computer experiments, that the diameter should typically be about \(C\log(n)\) where \(C\) is a constant and \(n\) the number of vertices. This appears to be regarded, in many such papers, as surprisingly small: the authors of the paper under review point out some reasons why it is actually surprisingly large.

In the paper under review, the authors define a precise form of the Barabási-Albert model. We omit the details of how it is made precise, merely noting that at time \(n\) the random graph \(G^{n}_{m}\) produced has \(n\) vertices, that (for \(n\) large) most vertices do indeed have degree \(m\), and that \(G^{n}_{m}\) can be obtained from \(G_{1}^{mn}\) by identifying vertices \(1,2,\dots, m\) of a \(G_{1}^{mn}\) into one vertex, similarly vertices \(m+1,\dots, 2m\) of \(G_{1}^{mn}\) are identified into a second vertex of \(G_{m}^{n}\), and so on. (In short: we can do much of the work in the case \(m=1\).)

The authors’ main result is that, if \(m\geq 2\) and \(\varepsilon>0\) are fixed, then \[ \lim_{n\rightarrow\infty} P\left((1-\varepsilon)\frac{\log(n)}{\log\log(n)}\leq \text{diam}(G^{n}_{m})\leq (1+\varepsilon)\frac{\log(n)}{\log\log(n)}\right)=1 \] which is indeed smaller than the answer predicted by the computer experiments. (For \(m=1\), an earlier result from B. Pittel [Random Struct. Algorithms 5, 337–347 (1994; Zbl 0790.05077)] states essentially that the diameter is indeed of order \(\log(n)\).)

The lower bound on the diameter is obtained by comparing \(G_{1}^{N}\), where \(N=nm\), with a random graph where edge \(ij\) is present with probability \(C/\sqrt{ij}\) independently of all other edges. Summarising crudely, it turns out, via some analysis of the random variable \(g_{j}\) (the vertex to which vertex \(j\) sends an edge), that small graphs are not much more likely to occur in \(G_{1}^{N}\) than in this other random graph: in particular, there is a good chance of no short paths in the graphs. In fact the argument proves the noticeably stronger conclusion that, as \(n\) tends to infinity the probability that the distance between the two most recent vertices \(n\) and \(n-1\) is at least \(\log(n)/\log(3Cm^{2}\log(n))\) (for a certain constant \(C\)) tends to 1.

The upper bound is harder. Let \(v\) be a vertex, and \(N_{k}(v)\) be the set of vertices within distance \(k\) of \(v\). We assign a weight \(i^{-1/2}\) to the \(i\)th vertex: it then turns out that, given the weight \(w(N_{k-1})\) of \(N_{k-1}\), then \({\mathbf E}(w(N_{k}))=\log(n)w(N_{k-1})\) (provided this is not too large). This shows that in \(\leq \Theta \log(n)/\log\log(n)\) steps from \(v\), one can reach \` many\' vertices: one then completes the proof by showing that one can almost certainly reach the first vertex in one step from these \` many\' vertices. This simplistic overview hides many technical difficulties, which include the fact that one has first to condition on certain nice uniformity properties of the random graph (which are proved using an alternative description of it in terms of a random pairing) and the fact that \(w(N_{k})\) has a very skew distribution for smallish \(w(N_{k-1})\), necessitating the careful control of a \` blow-up\' factor.

The authors conclude with a discussion of possible extensions to their theeorem to random graphs with a power law distribution on their degrees and some indications that their results are robust to some vertices being destroyed.

In A.-L. Barabási and R. Albert [Emergence of scaling in random networks, Science, 286, 509–512 (1999)] a particular model along these lines is (nearly) defined. They start with a small number \(m_{0}\) of vertices. Each vertex added subsequently is made adjacent to \(m\) different vertices already in the graph, and preferential attachment is ensured by saying the probability that the new vertex is adjacent to an existing vertex of degree \(d_{i}\) is \(d_{i}/\sum_{j}d_{j}\) (the sum over all vertices already in the graph). This will, after \(t\) steps, yield a graph with \(t+m_{0}\) vertices and \(mt\) edges. This is not an unambiguous description, but various papers have studied interpretations of this description and investigated the diameter (the maximum, over all pairs of vertices, of the distance between them in the graph). These papers suggest, on the basis of computer experiments, that the diameter should typically be about \(C\log(n)\) where \(C\) is a constant and \(n\) the number of vertices. This appears to be regarded, in many such papers, as surprisingly small: the authors of the paper under review point out some reasons why it is actually surprisingly large.

In the paper under review, the authors define a precise form of the Barabási-Albert model. We omit the details of how it is made precise, merely noting that at time \(n\) the random graph \(G^{n}_{m}\) produced has \(n\) vertices, that (for \(n\) large) most vertices do indeed have degree \(m\), and that \(G^{n}_{m}\) can be obtained from \(G_{1}^{mn}\) by identifying vertices \(1,2,\dots, m\) of a \(G_{1}^{mn}\) into one vertex, similarly vertices \(m+1,\dots, 2m\) of \(G_{1}^{mn}\) are identified into a second vertex of \(G_{m}^{n}\), and so on. (In short: we can do much of the work in the case \(m=1\).)

The authors’ main result is that, if \(m\geq 2\) and \(\varepsilon>0\) are fixed, then \[ \lim_{n\rightarrow\infty} P\left((1-\varepsilon)\frac{\log(n)}{\log\log(n)}\leq \text{diam}(G^{n}_{m})\leq (1+\varepsilon)\frac{\log(n)}{\log\log(n)}\right)=1 \] which is indeed smaller than the answer predicted by the computer experiments. (For \(m=1\), an earlier result from B. Pittel [Random Struct. Algorithms 5, 337–347 (1994; Zbl 0790.05077)] states essentially that the diameter is indeed of order \(\log(n)\).)

The lower bound on the diameter is obtained by comparing \(G_{1}^{N}\), where \(N=nm\), with a random graph where edge \(ij\) is present with probability \(C/\sqrt{ij}\) independently of all other edges. Summarising crudely, it turns out, via some analysis of the random variable \(g_{j}\) (the vertex to which vertex \(j\) sends an edge), that small graphs are not much more likely to occur in \(G_{1}^{N}\) than in this other random graph: in particular, there is a good chance of no short paths in the graphs. In fact the argument proves the noticeably stronger conclusion that, as \(n\) tends to infinity the probability that the distance between the two most recent vertices \(n\) and \(n-1\) is at least \(\log(n)/\log(3Cm^{2}\log(n))\) (for a certain constant \(C\)) tends to 1.

The upper bound is harder. Let \(v\) be a vertex, and \(N_{k}(v)\) be the set of vertices within distance \(k\) of \(v\). We assign a weight \(i^{-1/2}\) to the \(i\)th vertex: it then turns out that, given the weight \(w(N_{k-1})\) of \(N_{k-1}\), then \({\mathbf E}(w(N_{k}))=\log(n)w(N_{k-1})\) (provided this is not too large). This shows that in \(\leq \Theta \log(n)/\log\log(n)\) steps from \(v\), one can reach \` many\' vertices: one then completes the proof by showing that one can almost certainly reach the first vertex in one step from these \` many\' vertices. This simplistic overview hides many technical difficulties, which include the fact that one has first to condition on certain nice uniformity properties of the random graph (which are proved using an alternative description of it in terms of a random pairing) and the fact that \(w(N_{k})\) has a very skew distribution for smallish \(w(N_{k-1})\), necessitating the careful control of a \` blow-up\' factor.

The authors conclude with a discussion of possible extensions to their theeorem to random graphs with a power law distribution on their degrees and some indications that their results are robust to some vertices being destroyed.

Reviewer: David Penman (Colchester)

##### MSC:

05C80 | Random graphs (graph-theoretic aspects) |