Research Of Internet Content Outsourcing Policy Based On Complex Network Clustering Algorithm

Posted on: 2011-03-06 | Degree: Master | Type: Thesis
Country: China | Candidate: Y Tian | Full Text: PDF
GTID: 2178360305454914 | Subject: Computer application technology
Abstract/Summary:
With the development of computer network technology, networks have penetrated every corner of the world. The Internet has changed rapidly from a simple information-sharing carrier offering only static text and images into a rich assortment of dynamic and interactive services. This explosive growth, however, has imposed a heavy demand on Web servers. When surfing the Internet, users often encounter slow responses, server errors, or unreachable servers, and they experience long and unpredictable delays when retrieving Web pages from remote sites. Network delay is inevitable, while users' demands on network performance, such as the response time and reliability offered by the Web, keep increasing. The obvious way to improve the quality of Web services is to increase bandwidth, but that raises the economic cost, and higher bandwidth solves the problem only temporarily. A traditional remedy is caching. Although caching offers several benefits, such as shorter response times and reduced network traffic, it also has drawbacks, for example low hit rates. To compensate, traditional caching is coupled with prefetching, which aims to predict future requests for Web objects and bring those objects into the cache in the background. In view of this, a content distribution network (CDN) establishes a comprehensive middle layer within the existing network and is one of the most popular network optimization applications. CDNs promise to resolve such problems by moving content to the edge of the Internet, close to the end user.
With the key content outsourced, the load on the origin server is reduced, and the connection to a local content delivery server is shorter than the connection between the user and the origin Web server, which reduces latency and the response time users see when accessing the Web site. Which content should be outsourced to a CDN's surrogate servers, and how to select it, has become a new problem: the selection of outsourced content is critical to the CDN's price and performance. Taking the dynamic nature of the network into account makes this a very complex and challenging problem. Cluster-based content outsourcing strategies have attracted the most attention in the research community. In such methods, clusters can be identified with traditional data clustering algorithms. However, Web documents and dynamic Web data do not follow a uniform model, so the efficiency of these methods is unsatisfactory. Moreover, most of these algorithms rely on administratively tuned parameters (maximum cluster diameter, maximum number of clusters, etc.) to determine the number of clusters generated, yet neither the number nor the diameter of the clusters can be known in advance.
In contrast to the above, we follow the content structure of the Web server and treat every cluster as a collection of related or similar Web pages. This view reflects the dynamic and non-uniform nature of the network and reveals the relationships among Web data. Web page clusters can be identified by a complex network clustering algorithm, and these communities are a natural unit of outsourcing. Community structure is one of the most fundamental and important topological properties of complex networks: links between nodes within a community are dense, while links between communities are sparse. One aim of complex network clustering algorithms is to identify this cluster structure. According to their basic solving strategy, complex network clustering algorithms can be divided into optimization-based methods and heuristic methods. Local-based methods and spectral methods are the two major kinds of optimization-based clustering methods for complex networks. We introduce three optimization-based clustering algorithms: the HITS algorithm, the maximum flow clustering algorithm, and the CPM algorithm. We then describe two heuristic clustering algorithms in detail, the C3i algorithm and the CiBC algorithm. These two algorithms identify the community structure of a network from the connections within it.
To identify the community structure of a Web site, we carefully studied a fast network clustering algorithm (FNCA). Its idea is to take a local point of view: each node is assigned an agent, and each agent maximizes its own local function f as far as possible, which in turn optimizes the function Q and replaces the approach of maximizing Q from a global view. The function Q measures the difference between the actual number of connections within a cluster and the expected number of connections within that cluster if connections were made at random, and thus expresses the quality of the network's cluster structure quantitatively. The algorithm converts the global view into a local one by rewriting Q as the sum of a function f computed for each node in the network. The algorithm first defines each agent as its own cluster; in each iteration, every agent updates its label according to its neighbor nodes so as to make the value of f as large as possible. The algorithm ends when the agents' labels have stabilized. Because any node either lies in the same cluster as some of its neighbors or forms a cluster by itself, each agent needs to consider only its neighbors' information when updating its label in each iteration. Experiments show that this heuristic algorithm achieves good clustering accuracy and speed.
To identify Web communities and outsource content in units of these communities, we apply the Fast Network Clustering Algorithm in CDNs. We compare FNCA experimentally with CiBC and CPM (Clique Percolation Method). Under different assortativity factors, we compare the number of communities identified by each of the three algorithms with the number of communities we generated; FNCA identifies the number of communities accurately. Using FNCA to identify Web communities and outsource them to the surrogate servers gives users a shorter mean response time. Compared with CiBC and CPM, FNCA is more suitable as a content selection algorithm. With FNCA as the outsourcing strategy, the hit rate of the surrogate servers is much higher than with CPM. Under different community densities, we compare the clustering speed of FNCA, CiBC, and CPM; the experiments show that FNCA is the fastest and far faster than CPM. We also evaluate the outsourcing algorithms by computing a similarity distance measure between the resulting communities (identified by FNCA, CiBC, and CPM) and the ones we generated, which shows that FNCA has high clustering accuracy.
From the experimental results we conclude that FNCA is a fast and accurate complex network clustering algorithm. Used as a CDN content distribution strategy, FNCA identifies the Web page clusters on a Web server better and faster for content distribution. From the CDN provider's perspective, using FNCA as the content distribution strategy requires only that some surrogate servers be set up across the network; compared with setting up full mirror servers, this greatly reduces cost while still improving network performance, which is of practical value. From the users' point of view, using FNCA as the content outsourcing strategy reduces the waiting time when they access the content they need and greatly reduces the average response time of their requests.
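The local label-update idea described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's FNCA implementation: it assumes Q is the standard Newman modularity, so that each node's share is f_i = (1/2m) * sum over j in the cluster of i of (A_ij - k_i*k_j/2m) and Q is the sum of all f_i. The function names, the adjacency-dictionary format, and the toy graph are hypothetical.

```python
# Sketch of local, per-node label updates that greedily increase modularity Q.
# Assumption: f is the per-node share of standard Newman modularity.

def local_f(node, label, labels, adj, degree, label_degree, two_m):
    """Value of f for `node` if it carried `label`."""
    # Actual links from the node into the candidate cluster.
    links_inside = sum(1 for nb in adj[node] if labels[nb] == label)
    # Total degree of the candidate cluster, counting the node itself.
    degree_inside = label_degree[label]
    if labels[node] != label:
        degree_inside += degree[node]
    # Actual links minus the number expected under random connections.
    return (links_inside - degree[node] * degree_inside / two_m) / two_m


def cluster_by_local_moves(adj, max_iter=100):
    """Return (labels, Q) for an undirected graph given as {node: set(neighbors)}."""
    degree = {n: len(nbs) for n, nbs in adj.items()}
    two_m = sum(degree.values())        # twice the number of edges
    labels = {n: n for n in adj}        # every node starts as its own cluster
    label_degree = dict(degree)         # total degree per label

    for _ in range(max_iter):
        changed = False
        for node in sorted(adj):
            best_label = labels[node]
            best_f = local_f(node, best_label, labels, adj,
                             degree, label_degree, two_m)
            # Only labels carried by neighbors are considered (local view).
            for cand in sorted({labels[nb] for nb in adj[node]}):
                f = local_f(node, cand, labels, adj,
                            degree, label_degree, two_m)
                if f > best_f:          # move only on strict improvement
                    best_label, best_f = cand, f
            if best_label != labels[node]:
                label_degree[labels[node]] -= degree[node]
                label_degree[best_label] += degree[node]
                labels[node] = best_label
                changed = True
        if not changed:                 # labels have stabilized
            break

    q = sum(local_f(n, labels[n], labels, adj, degree, label_degree, two_m)
            for n in adj)
    return labels, q


# Toy graph: two triangles joined by the single edge 2-3.
TWO_TRIANGLES = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}

labels, q = cluster_by_local_moves(TWO_TRIANGLES)
print(labels, round(q, 3))
```

On this toy graph the sketch separates the two triangles into two clusters, with Q = 5/14 (about 0.357). The per-cluster degree totals are kept in `label_degree` so that each update needs only the node's neighbors plus one lookup per candidate label; whether FNCA itself maintains such totals is not stated in the abstract.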
Keywords/Search Tags: Content Delivery Networks, Cluster, Complex Networks Clustering Algorithm, Community, FNCA