Communications



Viewpoint: Self-similarity upsets data traffic assumptions

William Stallings

In 1993, the field of network performance modeling was rocked by a group of Bellcore and Boston University researchers who delivered a paper at that year's SIGCOMM (Special Interest Group on Data Communications) conference. "On the Self-Similar Nature of Ethernet Traffic," which appeared the following year in the IEEE/ACM Transactions on Networking [see To Probe Further, pp. 103-104], is arguably the most important networking paper of the decade. Although a number of researchers had observed over the years that network traffic did not always obey the Poisson assumptions used in queuing analysis, the paper's authors provided, for the first time, an explanation of this behavior and a systematic approach to modeling realistic data traffic patterns.

Simply put, network traffic is burstier and exhibits greater variability than previously suspected. The paper reported the results of a massive study of Ethernet traffic and demonstrated that it has a self-similar, or fractal, characteristic: the traffic shows similar statistical properties at a wide range of time scales, from milliseconds to seconds, minutes, hours, even days and weeks. This has several important consequences. One is that the traffic cannot be expected to "smooth out" over an extended period of time; instead, not only does the data cluster, but the clusters themselves cluster. Another is that merging traffic streams, as a statistical multiplexer or an asynchronous transfer mode (ATM) switch does, does not smooth the traffic; rather, multiplexing bursty data streams tends to produce a bursty aggregate stream.
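
To make the idea concrete, here is a small Python sketch, entirely my own and not drawn from the study itself. It superposes on/off sources whose on and off periods have heavy-tailed (Pareto) lengths, a standard construction known to produce self-similar-like traffic, and contrasts the result with Poisson traffic of the same average rate; all parameter values are arbitrary choices for illustration. As the per-slot packet counts are averaged over larger and larger blocks, the Poisson counts smooth out rapidly while the heavy-tailed aggregate does not.

    import numpy as np

    rng = np.random.default_rng(1)
    slots = 100_000        # number of 1-ms slots simulated
    n_sources = 50         # on/off sources to superpose
    alpha = 1.5            # Pareto shape; 1 < alpha < 2 gives heavy tails

    def onoff_source():
        # One source: alternate heavy-tailed ON (1 packet/slot) and OFF periods.
        out = np.zeros(slots)
        t, on = 0, bool(rng.integers(2))
        while t < slots:
            d = int(np.ceil(rng.pareto(alpha) + 1.0))   # period length in slots
            if on:
                out[t:t + d] = 1.0
            t += d
            on = not on
        return out

    bursty = sum(onoff_source() for _ in range(n_sources))
    poisson = rng.poisson(bursty.mean(), slots).astype(float)

    def normalized_variance(x, m):
        # Variance of the m-slot block averages, relative to the 1-slot variance.
        blocks = x[: (len(x) // m) * m].reshape(-1, m).mean(axis=1)
        return blocks.var() / x.var()

    for m in (1, 10, 100, 1000):
        print(f"m={m:5d}  Poisson: {normalized_variance(poisson, m):.4f}"
              f"  on/off: {normalized_variance(bursty, m):.4f}")
    # For Poisson traffic the ratio falls roughly as 1/m; for the heavy-tailed
    # superposition it falls far more slowly -- the bursts do not average out.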

One practical effect of self-similarity is that the buffers needed at switches and multiplexers must be bigger than those predicted by traditional queuing analysis and simulations. Further, these larger buffers create greater delays in individual streams than originally anticipated.

Self-similarity is not confined to Ethernet traffic or indeed to local-area network traffic in general. The 1994 paper sparked a surge of research in the United States, Europe, Australia, and elsewhere. The results are now in: self-similarity appears in ATM traffic, compressed digital video streams, Signaling System Seven (SS7) control traffic on networks based on the integrated-services digital network (ISDN), Web traffic between browsers and servers, and much more.

The discovery of the fractal nature of data traffic should not be surprising. Such self-similarity is quite common in both natural and man-made phenomena; it is seen in natural landscapes, in the distribution of earthquakes, in ocean waves, in turbulent flow, in the fluctuations of the stock market, as well as in the pattern of errors and data traffic on communication channels.

The implications of this new view of data traffic are startling, and they underline its importance. For example, the whole area of buffer design and management requires rethinking. In traditional network engineering, it is assumed that a linear increase in buffer size will produce a nearly exponential decrease in packet loss, and that an increase in buffer size will yield a proportional increase in the effective use of transmission capacity. With self-similar traffic, both assumptions are false: the decrease in loss with buffer size is far smaller than expected, and a modest increase in utilization requires a significant increase in buffer size.
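
The effect on buffering can be illustrated with another sketch of my own, again not from any published study and with arbitrary parameters: a discrete-time, single-server queue with a finite buffer, run at roughly 80 percent utilization, fed first by Poisson arrivals and then by the same kind of heavy-tailed on/off aggregate used above. With Poisson input the loss fraction collapses as the buffer grows; with the bursty input it declines only grudgingly.

    import numpy as np

    rng = np.random.default_rng(2)
    slots, n_sources, alpha = 100_000, 50, 1.5

    def onoff_aggregate():
        # Superposition of on/off sources with Pareto-length on and off periods.
        total = np.zeros(slots)
        for _ in range(n_sources):
            t, on = 0, bool(rng.integers(2))
            while t < slots:
                d = int(np.ceil(rng.pareto(alpha) + 1.0))
                if on:
                    total[t:t + d] += 1.0
                t += d
                on = not on
        return total

    bursty = onoff_aggregate()
    poisson = rng.poisson(bursty.mean(), slots).astype(float)

    def loss_fraction(arrivals, service_per_slot, buffer_size):
        # Drain-then-fill each slot; packets that do not fit are counted as lost.
        backlog = lost = 0.0
        for a in arrivals:
            backlog = max(backlog - service_per_slot, 0.0)
            admitted = min(a, buffer_size - backlog)
            lost += a - admitted
            backlog += admitted
        return lost / arrivals.sum()

    service = bursty.mean() / 0.8   # ~80% utilization for both inputs
    for B in (10, 100, 1000, 10_000):
        print(f"buffer={B:6d}  Poisson loss: {loss_fraction(poisson, service, B):.2e}"
              f"  on/off loss: {loss_fraction(bursty, service, B):.2e}")
    # Loss under Poisson input falls off steeply with buffer size; under the
    # heavy-tailed input it decreases far more slowly, so meeting the same loss
    # target takes a much larger buffer -- with correspondingly longer delays.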

Other aspects of network design are also affected. With self-similar traffic, a slight increase in the number of active connections through a switch can result in a large increase in packet loss. In general, the parameters of a network design are more sensitive to the actual traffic pattern than expected, and to cope with this sensitivity, designs need to be more conservative. Priority scheduling schemes also need to be reexamined. For example, if a switch manages multiple priority classes but does not enforce a bandwidth limitation on the highest-priority class, then a prolonged burst of traffic in that class could keep the other classes from using the network for an extended period of time.
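
The priority-scheduling concern can be seen in a few lines of Python. This is a hypothetical two-class scheduler of my own devising; the burst length, link rate, and cap are arbitrary. With strict priority and no bandwidth limit, a sustained high-priority burst shuts the lower class out entirely, while a simple per-slot cap on the high class leaves it room.

    def serve(high_arrivals, low_arrivals, link_rate, high_cap=None):
        # Strict-priority scheduler: the high class is served first each slot,
        # optionally limited to high_cap packets per slot.
        hq = lq = 0
        low_served = []
        for h, l in zip(high_arrivals, low_arrivals):
            hq += h
            lq += l
            budget = link_rate
            grant = min(hq, budget if high_cap is None else min(budget, high_cap))
            hq -= grant
            budget -= grant
            sent = min(lq, budget)
            lq -= sent
            low_served.append(sent)
        return low_served

    # High-priority class bursts at 3x the link rate for 200 slots, then idles.
    high = [30] * 200 + [0] * 200
    low = [4] * 400                    # steady low-priority load
    link = 10
    print(sum(serve(high, low, link)[:200]))              # 0: low class starved
    print(sum(serve(high, low, link, high_cap=6)[:200]))  # 800: cap leaves room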

The explanation for these strange results is that surges in traffic tend to occur in waves. A long period with very little traffic may be followed by an interval of heavy usage in which traffic peaks cluster, making it difficult for a switch or network to clear the backlog from one peak before the next arrives. A static congestion control strategy must therefore assume that such waves of multiple peak periods will occur. A dynamic congestion control strategy, on the other hand, is difficult to implement: because it is based on measurements of recent traffic, it can fail utterly to adapt to rapidly changing conditions.
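
A toy example, again my own and with arbitrary numbers, shows the lag. A common ingredient of measurement-based schemes is an exponentially weighted moving average of the recent arrival rate; when a sudden, sustained burst arrives, the estimate trails far behind the true rate, so any control decision keyed to it comes too late.

    def ewma_estimates(arrivals, weight=0.01):
        # Exponentially weighted moving average of the per-slot arrival rate.
        est, out = 0.0, []
        for a in arrivals:
            est = (1 - weight) * est + weight * a
            out.append(est)
        return out

    traffic = [10] * 500 + [100] * 50 + [10] * 450   # a 50-slot burst at 10x
    est = ewma_estimates(traffic)
    print(round(est[499], 1), round(est[549], 1))
    # Roughly 9.9 just before the burst, but still only about 45 at the burst's
    # end, against a true rate of 100 -- the estimator reacts long after the
    # backlog has already built up.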

Congestion prevention by appropriate sizing of switches and networks is difficult because data network traffic does not exhibit a predictable level of busy period traffic; patterns can change over a period of days, or weeks, or months. Congestion avoidance by monitoring traffic levels and adapting flow control and traffic routing policies is difficult because congestion can occur unexpectedly and with dramatic intensity. Finally, congestion recovery is complicated by the need to make sure that critical network control messages are not lost in the repeated waves of traffic hitting the network.

The reason this fundamental nature of data traffic went unnoticed until recently is that detecting and confirming the behavior requires processing a massive amount of data gathered over a long observation period. Yet the practical effects are all too obvious. ATM switch vendors, among others, have found that their products do not perform as advertised once in the field, because of inadequate buffering and a failure to take into account the delays caused by burstiness.
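
One widely used detection technique, sketched below in Python (my own illustration; counts stands for the per-interval packet counts of a long trace), is the aggregated-variance test: the variance of the m-interval block averages is examined as a function of m on a log-log scale, and the slope beta yields a Hurst-parameter estimate H = 1 + beta/2. A value near 0.5 means the bursts average out, Poisson-fashion; values approaching 1 signal self-similarity. Reliable slopes require block sizes spanning several orders of magnitude, which is exactly why such long traces are needed.

    import numpy as np

    def hurst_aggregated_variance(counts, block_sizes=(1, 4, 16, 64, 256, 1024)):
        # Variance-time (aggregated-variance) estimate of the Hurst parameter.
        variances = []
        for m in block_sizes:
            blocks = counts[: (len(counts) // m) * m].reshape(-1, m).mean(axis=1)
            variances.append(blocks.var())
        # Slope of log(variance) vs log(m) is roughly 2H - 2.
        beta = np.polyfit(np.log(block_sizes), np.log(variances), 1)[0]
        return 1.0 + beta / 2.0

    # Sanity check on synthetic Poisson counts: the estimate comes out near 0.5.
    rng = np.random.default_rng(3)
    print(hurst_aggregated_variance(rng.poisson(20, 1_000_000).astype(float)))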

The true nature of high-speed data traffic has now been revealed. As yet, a consensus on a valid and efficient set of mathematical tools for modeling and predicting such traffic has not emerged. This will be the next step in this important area of research.


William Stallings (SM) is a consultant, lecturer, and author of over a dozen books on data communications and computer networking. His Computer Organization and Architecture received the award for the best computer science textbook of 1996 from the Textbook and Academic Authors Association. His latest book is Data and Computer Communications, Fifth Edition (Prentice-Hall, Englewood Cliffs, N.J., 1997). He can be reached at ws@shore.net.

(c) Copyright 1997, The Institute of Electrical and Electronics Engineers, Inc.