In large-scale distributed computing systems whose computational elements are physically or virtually distant from each other, communication delays can significantly alter the expected performance of load-balancing policies that do not account for them. The problem is particularly significant in systems whose individual units are connected by a shared broadband communication medium (e.g., the Internet, ATM, wireless LAN, or wireless Internet). In such cases the delays, in addition to being large, fluctuate randomly, which makes an accurate one-time prediction of them impossible. In this work, the stochastic dynamics of a load-balancing algorithm in a cluster of computer nodes are modeled and used to predict the effects of random time delays on the algorithm’s performance. A discrete-time stochastic dynamical-equation model is presented that describes the evolution of the random queue size of each node. Monte Carlo simulation is also used to demonstrate how strongly the magnitude and uncertainty of the various time-delay elements alter the performance of load balancing. The study reveals that delay, whether deterministic or random, can significantly degrade the performance of a load-balancing policy. One remedy is to weaken the load-balancing mechanism so that the load transfer between nodes is appropriately scaled down (or discouraged).
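The qualitative effect described above can be explored with a small Monte Carlo experiment. The sketch below is not the model from the paper; it is a minimal two-node toy under stated assumptions: one balancing action at step 0, Bernoulli service with probability `p_service` per step, and an exponentially distributed transfer delay rounded to whole steps. The names `simulate`, `mean_completion_time`, `gain`, and `mean_delay` are hypothetical and chosen only for illustration. The `gain` parameter plays the role of the down-scaling factor: `gain = 1` transfers half of the queue difference, and `gain = 0` disables balancing.

```python
import random


def simulate(q0, q1, gain, mean_delay, p_service=0.5, seed=None,
             max_steps=1_000_000):
    """Return the number of steps until both queues and the link are empty.

    Toy model (all modeling choices are assumptions, not the paper's):
    the more heavily loaded node sends gain * (difference / 2) tasks once,
    at step 0, and they arrive after a random delay.
    """
    rng = random.Random(seed)
    queues = [q0, q1]
    in_transit = []  # entries are (arrival_step, destination, n_tasks)

    # One-shot balancing decision at step 0.
    excess = queues[0] - queues[1]
    sender = 0 if excess > 0 else 1
    n_send = int(gain * abs(excess) / 2)
    if n_send > 0:
        delay = 0
        if mean_delay > 0:
            # Assumed delay law: exponential, rounded to whole steps.
            delay = int(round(rng.expovariate(1.0 / mean_delay)))
        queues[sender] -= n_send
        in_transit.append((delay, 1 - sender, n_send))

    step = 0
    while sum(queues) > 0 or in_transit:
        if step >= max_steps:
            raise RuntimeError("simulation did not finish within max_steps")
        # Deliver any transfer whose delay has elapsed.
        pending = []
        for arrival, dest, n_tasks in in_transit:
            if arrival <= step:
                queues[dest] += n_tasks
            else:
                pending.append((arrival, dest, n_tasks))
        in_transit = pending
        # Each non-empty node completes one task with probability p_service.
        for i in (0, 1):
            if queues[i] > 0 and rng.random() < p_service:
                queues[i] -= 1
        step += 1
    return step


def mean_completion_time(q0, q1, gain, mean_delay, p_service=0.5,
                         runs=1000, seed=0):
    """Average completion time over independent Monte Carlo runs."""
    total = 0
    for r in range(runs):
        total += simulate(q0, q1, gain, mean_delay, p_service, seed=seed + r)
    return total / runs


if __name__ == "__main__":
    # Sweep the gain for a short and a long mean delay (illustrative values).
    for mean_delay in (0, 200):
        for gain in (0.0, 0.5, 1.0):
            t = mean_completion_time(100, 0, gain, mean_delay)
            print(f"mean_delay={mean_delay:4d} gain={gain:.1f} mean_time={t:.1f}")
```

Running the sweep for different values of `mean_delay` gives a rough way to see how the best-performing `gain` shifts as the delay grows. Because the toy omits repeated balancing, delayed queue-size information, and task arrivals, its numbers should not be read as reproducing the paper's results.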
Abdallah, Chaouki T.; Majeed M. Hayat; Sagar Dhakal; J. Douglas Birdwell; and John Chiasson. "Dynamic time delay models for load balancing, Part II: A stochastic analysis of the effect of delay uncertainty." (2012). http://digitalrepository.unm.edu/ece_fsp/16