Theoretical and empirical study of the dynamics of Web request streams

Posted on: 2008-06-20
Degree: Ph.D.
Type: Dissertation
University: Boston University
Candidate: Beriont, Walter Joseph
Full Text: PDF
GTID: 1445390005462396
Subject: Engineering
Abstract/Summary:
The complexity of the World Wide Web makes statistical analysis the method of choice for exploring the Web's behavior. Statistical analysis suggests insights for amending the underlying HTTP protocols, installing a hierarchy of network caches, adopting specific cache replacement policies, and including mechanisms to balance system workloads. However, after fifteen years of development, open research questions remain regarding the appropriate cache replacement policies, how best to organize network resources, and which protocols will allow the Web to scale effectively with its growing popularity.

The immense size and unprecedented growth of the Web limit the effectiveness of empirical analysis. Despite the large number of statistical studies on the characteristics of proxy workloads, there is little insight into their dynamics. To understand these workload dynamics, it is crucial to develop a theoretical model that explains the emergence of the observed empirical characteristics. Two of the more notable but poorly understood characteristics are Zipf-like distributions and one-timers. The consistency with which researchers have documented these and similar characteristics implies that they are inherent to the dynamics of the Web. This dissertation investigates the dynamics of Web proxy request streams.

The methodology reviews current conclusions from relevant research and conducts an independent statistical analysis of Web proxy workloads. A theoretical model drawn from the work of Levitin and Schapiro on the mechanism leading to the Zipf law is applied. This analysis leads to the discovery of an important phenomenon: the presence of a finite-depth memory in the process. This phenomenon requires modifying the model by introducing a second parameter. With this modification, the theory suggests that complex dynamic mechanisms are at play within a proxy workload.

The theory provides expressions for variables countable from empirical data and a method to extract the model's two parameters. Using Monte Carlo methods, the research compares the model to empirical proxy request streams. This comparison finds good correlation. The theoretical model provides insight into the fundamental but arcane relationships found in Web proxy request streams, which can be used to analyze and improve the operation of the Web.
Keywords/Search Tags:Web, Request streams, Empirical, Dynamics, Statistical analysis, Theoretical
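
The two empirical characteristics named in the abstract, Zipf-like popularity and one-timers (objects requested exactly once in a trace), can be illustrated with a small Monte Carlo sketch. This is not the dissertation's model: the Levitin-Schapiro mechanism and the finite-memory modification are not specified in this summary. The sketch simply assumes independent requests drawn from a fixed popularity law proportional to 1 / rank**alpha, and every name and parameter value in it (`simulate_stream`, `alpha`, the object and request counts) is hypothetical.

```python
import random
from collections import Counter


def zipf_like_weights(num_objects, alpha):
    """Unnormalized popularity weights 1 / rank**alpha for ranks 1..num_objects."""
    return [1.0 / (rank ** alpha) for rank in range(1, num_objects + 1)]


def simulate_stream(num_objects, num_requests, alpha, seed=0):
    """Draw independent requests from a Zipf-like law (an assumed toy model)."""
    rng = random.Random(seed)  # seeded so repeated runs give the same stream
    weights = zipf_like_weights(num_objects, alpha)
    return rng.choices(range(num_objects), weights=weights, k=num_requests)


def rank_frequency(stream):
    """Request counts per distinct object, sorted from most to least popular."""
    return sorted(Counter(stream).values(), reverse=True)


def one_timer_fraction(stream):
    """Fraction of distinct objects that were requested exactly once."""
    counts = Counter(stream)
    if not counts:
        return 0.0
    one_timers = sum(1 for c in counts.values() if c == 1)
    return one_timers / len(counts)


if __name__ == "__main__":
    # Illustrative sizes only; they are not taken from any measured proxy trace.
    stream = simulate_stream(num_objects=50_000, num_requests=100_000, alpha=0.8)
    freqs = rank_frequency(stream)
    print("distinct objects requested:", len(freqs))
    print("requests for the most popular object:", freqs[0])
    print("one-timer fraction:", round(one_timer_fraction(stream), 3))
```

Plotting `rank_frequency(stream)` on log-log axes would show the roughly linear Zipf-like shape. The same two functions could be applied to a real proxy log, once it is parsed into a list of object identifiers, to compare a trace against a candidate model.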