Webometry: Measuring the Complexity of the World Wide Web
Based on a talk in Vienna, at FIS96, 6.15.96
Abstract
The explosive growth of the WWW may be viewed as the neurogenesis phase in the embryogenesis of a new planetary civilization. To empower this emergent phenomenon with self-reflection, we propose strategies for the visualization of the complexity of the WWW, seen as a neural net. The pointwise fractal dimension of a massive matrix is the basis of our strategy.
Introduction
The World Wide Web (WWW) has grown explosively in five years from a novel idea of Tim Berners-Lee to the nervous system of a new planetary society. One wonders what to make of this, and perhaps the various opinions correspond to the historical paradigms. Here are four of them.
Connectionism
The mathematics of morphogenesis, complex dynamical systems theory, is the basis of our strategies for visualizing the Web. Thus we view the Web as a neural net, that is, a massive web of neurons or nodes. While neurons are not dumb, connectionism views the intelligence of the network as derived primarily from its connections rather than from its nodes. The number and sophistication of the nodes may increase during neurogenesis, but a maximum population is eventually attained. The network of connections, by contrast, develops during embryogenesis and then continues to develop indefinitely. This is the physiological basis of learning, for example.
In the simple models for neural nets provided by the mathematics of complex dynamical systems, the connections are represented by real numbers. Given two nodes, n(i) and n(j), the connection from the first to the second is represented by a single real number, g(i, j), denoting the strength of the connection. All of this data, the g(i, j), may be set out in a single tableau, a square matrix of size N, where N is the total number of nodes. After the evolving neural net attains maturity, N may be regarded as fixed, although perhaps enormously large. Further evolution, such as learning, is then manifest as changes in this large matrix of real numbers. It is this matrix that we wish to observe, in Operation Web Watch, and to present to the web-literate public, the cybercitizens of the future planetary society. The aim is to empower self-reflection on this morphogenetic process, so that we may consciously participate in the creation of the future.
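As a minimal sketch of the tableau of connection strengths g(i, j), the following Python builds a small connection matrix with NumPy. The node count, the particular strengths, and the update standing in for learning are illustrative assumptions, not data or methods from the talk.

```python
import numpy as np

N = 4  # hypothetical number of nodes; a real net would be far larger

# g[i, j] holds the strength of the connection from node i to node j.
g = np.zeros((N, N))
g[0, 1] = 0.8   # strong connection from node 0 to node 1
g[1, 0] = 0.1   # the reverse connection need not have the same strength
g[2, 3] = 0.5

# Learning appears only as a change in the matrix; N stays fixed.
# This strengthening step is an arbitrary stand-in for illustration.
g_later = g.copy()
g_later[0, 1] += 0.1

# The difference of two snapshots shows where the net has changed.
change = g_later - g
```

The matrix is square but in general not symmetric, since the connection from n(i) to n(j) is distinct from the connection from n(j) to n(i).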
Visualization of massive neural nets
Suppose we are given a massive neural net, that is, one whose size, N, may be on the order of tens or hundreds of thousands. How can we observe its instantaneous state, or a sequence of states, to understand its evolution? In this paper we present only one of many possible strategies, already inherent in the neural net approach: viewing the matrix of connection strengths as a two-dimensional image. This may be done in shades of gray, or through translation by a color lookup table. There are two serious problems with this approach. Nevertheless, we advocate it here, and plan to pursue it in further work.
The first problem is the massive size of the image. As computer screens and printed pages are generally limited to one thousand or so pixels on a side, the literal image of a matrix of size N as conceived here must cover many computer screens, or many pages of print. The obvious solution to this problem is an intentional reduction of resolution, by pixel averaging for example.
The second problem is the fictitious representation of the nodes in linear order, that is, as a one-dimensional geographic space, when in fact the ordering given by the index i is arbitrary, or logical, or anything but geographical. If there is a geometric or geographical map for the nodes of the neural net, its dimension is usually greater than one, and so the representation within a one-dimensional space is forced and artificial. (Note: Complex dynamical systems with geometric reference spaces have been discussed in the literature. For example, with a two-dimensional reference space, the connection matrix may be embedded in four dimensions, giving rise to a four-dimensional image.)
Worse yet, these two problems aggravate each other. Averaging neighboring pixels, when the proximity of nodes has no natural significance, may destroy all significance in the image, providing a very foggy (that is, fractal) visualization of the net.
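The two problems, and the way they aggravate each other, can be sketched in Python with NumPy. This is a toy illustration under stated assumptions: the helper names `pixel_average` and `to_gray`, the net size, the block size, and the single-cluster structure are all hypothetical choices, not part of the proposal itself.

```python
import numpy as np

def pixel_average(matrix, block):
    """Reduce resolution by averaging non-overlapping block x block tiles."""
    n = matrix.shape[0] - matrix.shape[0] % block  # trim so tiles fit exactly
    m = matrix.shape[1] - matrix.shape[1] % block
    tiles = matrix[:n, :m].reshape(n // block, block, m // block, block)
    return tiles.mean(axis=(1, 3))

def to_gray(matrix):
    """Map values linearly onto gray levels 0..255."""
    lo, hi = matrix.min(), matrix.max()
    if hi == lo:
        return np.zeros(matrix.shape, dtype=np.uint8)
    return np.round(255 * (matrix - lo) / (hi - lo)).astype(np.uint8)

rng = np.random.default_rng(0)
N = 1200                       # hypothetical net size
g = np.zeros((N, N))
g[:600, :600] = 1.0            # toy structure: one densely connected cluster

small = pixel_average(g, 100)  # 12 x 12 reduced image
image = to_gray(small)         # gray-scale version, ready for display

# Relabel the nodes arbitrarily: the same net, in a different index order.
perm = rng.permutation(N)
shuffled = g[np.ix_(perm, perm)]
small_shuffled = pixel_average(shuffled, 100)
```

In this toy case the cluster stays sharp in `small`, while `small_shuffled` washes out to a nearly uniform gray even though the net is unchanged. That is the fogging effect of averaging neighbors whose proximity has no significance.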
Nevertheless, we feel this approach has a certain promise, as fractal geometry provides tools for studying foggy (fractal) images. Here we propose just one of these tools: the pointwise fractal dimension. By computing the fractal dimension of the large matrix at each point, we obtain another matrix of the same size. This derived matrix may be viewed as a topography of complexity, a parameter of considerable significance in the context of morphogenesis, even for foggy images. Furthermore, the derived image of the complexity of the original image may be expected to behave well under pixel averaging, or other resolution-reducing transformations, since invariance under scaling is a characteristic of fractals. In summary, here is our proposal for viewing the morphogenetic process of a massive neural net:
Given a time series of connection matrices, compute the derived matrices D and E for each, and view the time series of matrices E as a time-lapse movie of the morphogenesis of the net.
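One common way to estimate a pointwise dimension on an image is to treat the entries as a mass density, measure the mass inside square windows of growing size around each cell, and take the slope of log mass against log window size. The Python sketch below follows that recipe. It is an assumption about how the derived matrix might be computed, not a method specified in the text: the function names, the window radii, and the use of |g(i, j)| as the density are all hypothetical choices.

```python
import numpy as np

def window_mass(weights, r):
    """Sum of weights in the (2r+1) x (2r+1) window around every cell.

    Windows are clipped at the matrix border.
    """
    n, m = weights.shape
    # Integral image with a leading row and column of zeros.
    s = np.zeros((n + 1, m + 1))
    s[1:, 1:] = weights.cumsum(axis=0).cumsum(axis=1)
    i, j = np.arange(n), np.arange(m)
    top = np.clip(i - r, 0, n)[:, None]
    bottom = np.clip(i + r + 1, 0, n)[:, None]
    left = np.clip(j - r, 0, m)[None, :]
    right = np.clip(j + r + 1, 0, m)[None, :]
    return s[bottom, right] - s[top, right] - s[bottom, left] + s[top, left]

def pointwise_dimension(matrix, radii=(1, 2, 4, 8)):
    """Estimate a local scaling exponent at every cell of the matrix.

    Cells whose smallest window holds no nonzero entry get NaN, since the
    exponent is only meaningful on the support of the density.
    """
    weights = np.abs(np.asarray(matrix, dtype=float))
    log_size = np.log([2 * r + 1 for r in radii])
    masses = np.stack([window_mass(weights, r) for r in radii])
    # Count nonzero entries exactly to decide where the density is present.
    support = window_mass((weights > 0).astype(float), radii[0]) > 0
    log_mass = np.log(np.where(masses > 0, masses, 1.0))
    # Least-squares slope of log mass against log window size, per cell.
    x = log_size - log_size.mean()
    slope = np.tensordot(x, log_mass, axes=(0, 0)) / (x @ x)
    return np.where(support, slope, np.nan)

# Sanity checks on shapes whose dimension is known.
filled = pointwise_dimension(np.ones((64, 64)))   # about 2 in the interior
line = pointwise_dimension(np.eye(64))            # about 1 along the diagonal

# A time series of connection matrices yields a sequence of frames.
series = [np.eye(64), np.ones((64, 64))]          # hypothetical snapshots
frames = [pointwise_dimension(c) for c in series]
```

Because windows are clipped at the border, estimates within the largest radius of an edge are biased low; a real implementation would need to decide how to treat those cells.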
Measuring the WWW
Our strategy for viewing the morphogenetic process of a massive neural net may be applied to the WWW. That is indeed the main point of this paper. But how to represent the Web as a Net? There are clearly two necessary steps: to define the nodes, and to measure the connection strengths. For each of these steps there are many possibilities. Here we describe only one approach to each.
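To make the two steps concrete, here is one possible choice, offered purely as an illustration and not necessarily the approach intended here: take each host name as a node, and take the number of hyperlinks from pages on one host to pages on another as the connection strength. The link list, the `host` helper, and the decision to ignore links within a host are all hypothetical.

```python
from urllib.parse import urlsplit

import numpy as np

# Hypothetical crawl output: (page containing the link, link target).
links = [
    ("http://www.example.org/a.html", "http://www.example.edu/"),
    ("http://www.example.org/b.html", "http://www.example.edu/x.html"),
    ("http://www.example.edu/", "http://www.example.org/a.html"),
    ("http://www.example.org/a.html", "http://www.example.org/b.html"),
]

def host(url):
    """Return the host name of a URL; hosts serve as nodes in this sketch."""
    return urlsplit(url).hostname

# Step 1: define the nodes and give each an index.
nodes = sorted({host(u) for pair in links for u in pair})
index = {name: k for k, name in enumerate(nodes)}

# Step 2: measure connection strengths as link counts between hosts.
g = np.zeros((len(nodes), len(nodes)))
for source, target in links:
    i, j = index[host(source)], index[host(target)]
    if i != j:  # assumption: links within a single host are ignored
        g[i, j] += 1.0
```

Other definitions of node (a page, a directory, a domain) and of strength (traffic, link density) would fit the same matrix framework.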
Conclusion
We have described a complete, step-by-step procedure for the visualization of the complexity and morphogenesis of the World Wide Web. The implementation of this procedure, our next goal, aims at the installation of a website in which, like a weather report, the current web image, and movies of earlier web images, are available for browsing. The stages of this implementation, in review, are:
1. obtain connection matrix data for domains *.org, *.edu from a web crawler
2. transform to a matrix of pointwise fractal dimension
3. reduce by pixel averaging
4. post as GIF images on the web
We see this as a relatively simple program, the first step being the most difficult. For this first step we see two options: one is to write our own web crawler, the other is to enter into partnership with one of the existing WWW-index services, such as Alta Vista, Yahoo, Excite, etc.
Acknowledgments
Thanks to my class, Webology, at the University of California at Santa Cruz, Spring 1996, for the opportunity of testing these ideas on an unsympathetic audience, and to Don Foresta of the University of Paris for suggesting this idea in the first place. In a joint research project currently under way, we hope to actually carry out the fractal dimension strategy, presenting our results on the WWW at http://www.vismath.org/webometry. Many thanks to the London School of Economics and the University of Paris for grants making this research possible, and to the Istituto di Scienze Economiche of the University of Urbino for hospitality during the writing of this paper.
Copyright: Ralph Abraham