Evolution and Structure of the Internet: A Statistical Physics Approach

David Bawden (City University, London, UK)

Journal of Documentation

ISSN: 0022-0418

Article publication date: 1 June 2005



Citation

Bawden, D. (2005), "Evolution and Structure of the Internet: A Statistical Physics Approach", Journal of Documentation, Vol. 61 No. 3, pp. 442-443. https://doi.org/10.1108/00220410510598571

Publisher: Emerald Group Publishing Limited

Copyright © 2005, Emerald Group Publishing Limited


The internet is indisputably the information phenomenon of our times, and has been studied from numerous viewpoints. Pastor‐Satorras and Vespignani take an unusual approach, applying the methods of statistical physics, and of complex systems theory, to the growth and structure of the internet and of the World Wide Web. Their approach is a deliberately limited one, examining only the topological links, physical and virtual, which make up these networks, while largely ignoring semantic aspects. This makes their approach somewhat similar to traditional bibliometrics, though it is considerably more sophisticated than many of the methods used for studying the internet and web under the headings of “webometrics” or “webliometrics” (for a comprehensive review of these, see Thelwall, Vaughan and Björneborn, 2004).

After an opening discussion on the historical development and underlying technologies of the internet, the book proceeds rapidly to means for determining the size, structure and statistical regularities of, first, the internet, and, second, the web. The crucial difference between them for these purposes is that the web, unlike the physical internet, has the structure of a directed graph. While the connections into and out of an internet server may both be assessed by survey software, only the links out from a web site can be readily determined in this way.

With this proviso, the methods of analysis are similar for both: determine the characteristics of the network, and then attempt to model them by an appropriate distribution or function drawn from physical theory. The results for the two are substantially the same. Neither the internet nor the web can be adequately modelled by standard static network models; a proper account needs to deal with their dynamic nature, and with the emergent properties observed. Both are found to fall within the class of growing scale‐free networks, with the same patterns and distributions seen at different levels of “magnification” and detail. These similarities suggest that both follow the same principles found in “self‐organising” systems. Both exhibit strong clustering, with groups of tightly connected nodes, as seen in many complex social and technical networks. The web, in particular, can be seen to exhibit the “small world” phenomenon, by which any two nodes are joined by quite short routes.
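For readers unfamiliar with the idea of a growing scale‐free network, a minimal sketch in Python may help. It assumes one widely used growth rule, preferential attachment, in which each new node links to existing nodes with probability proportional to their current number of links; the review does not say which models the book itself uses, and the function names and parameter values below are purely illustrative.

```python
import random
from collections import Counter


def grow_network(n_nodes, links_per_node=2, seed=0):
    """Grow an undirected network by preferential attachment (illustrative only)."""
    rng = random.Random(seed)
    # Start from a small, fully connected core of nodes.
    core = links_per_node + 1
    edges = [(i, j) for i in range(core) for j in range(i)]
    # Each node appears in `stubs` once per link end, so a uniform draw
    # from this list picks a node with probability proportional to its degree.
    stubs = [node for edge in edges for node in edge]
    for new in range(core, n_nodes):
        targets = set()
        while len(targets) < links_per_node:
            targets.add(rng.choice(stubs))
        for target in sorted(targets):
            edges.append((new, target))
            stubs.extend((new, target))
    return edges


def degree_counts(edges):
    """Return a mapping from node to its number of links."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree


edges = grow_network(2000)
degree = degree_counts(edges)
histogram = Counter(degree.values())

# Print how many nodes have each of the ten smallest degrees,
# then the largest degree, to show how uneven the distribution is.
for k in sorted(histogram)[:10]:
    print(k, histogram[k])
print("max degree:", max(degree.values()))
```

In a network grown this way, most nodes keep only a few links while a small number of early nodes become heavily connected hubs, which is the kind of skewed distribution the term “scale‐free” refers to.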

Although, as noted above, these studies largely ignore content and semantics, there are, none the less, intriguing suggestions that the distributions found may vary according to the nature of the material – newspaper files, for example, as against scientific material – again showing an analogy with bibliometrics. The authors develop these thoughts into speculations about the way in which this kind of knowledge of internet/web structure may be applied in the design of improved search engines, and in creating antidotes to viruses and other harmful manifestations.

A general conclusion is that the large‐scale properties of the kind examined here are likely to remain largely invariant to changes of technology. This suggests that studies of this kind should be considered as valuable to those involved in the study of networked digital information, and in the practice of its provision, as bibliometrics has been to the traditional documentalist. This book is therefore to be welcomed. While it may prove rather hard going in parts to readers without a grounding in basic mathematics and statistics, it is generally clear and well written. It is unfortunate that its title, and the suggested audience of “researchers in statistical physics, computer science and mathematics”, are not likely to endear it to those in the softer information sciences who might find it of interest and value.

References

Thelwall, M., Vaughan, L. and Björneborn, L. (2004), “Webometrics”, Annual Review of Information Science and Technology, Vol. 39, Information Today, Medford, NJ, pp. 81-135.
