| To relieve the growing demand that big data places on storage resources, cloud storage is increasingly used as a storage solution in personal and enterprise applications. At the same time, users no longer require only that cloud storage can store and read data; they also place higher demands on the performance and overhead of cloud storage systems. How to obtain higher system performance at lower system overhead is therefore a research hotspot in the field of cloud storage. Creating replicas in a cloud storage environment can improve data reliability and system performance. This paper studies replica creation strategies. To address the failure of placement nodes and the degradation of system performance under high load, it proposes a PageRank-based replica creation algorithm and a replica creation algorithm based on node clustering, the latter designed and implemented in combination with the PageRank-based algorithm. The main research contents of this paper are as follows: (1) To address node failure in the cloud storage environment, a replica creation algorithm based on PageRank is proposed. First, the node status is judged from the node overheating similarity in order to decide whether to create a replica; then the file with the highest access heat is replicated; finally, the replica placement position is determined from the node status and the PageRank value. Experimental results show that the algorithm reduces the number of overheated nodes in the system and reduces data access delay. (2) To address the degradation of system performance when the cloud storage system is under high load, a replica creation algorithm based on node clustering is proposed. First, node clusters are obtained by clustering the nodes; then the status of a node cluster determines whether to create a replica; finally, a file dependency graph is constructed for the node cluster to obtain the set of files that need to be replicated, and this file set is placed on the requesting node. Mining the file relationships within a node cluster makes it possible to replicate groups of files with higher access heat. Experimental results show that the algorithm reduces the number of replicas in the system and reduces access delay under high load, thereby reducing system overhead. (3) A replica management system is designed and implemented. Ordinary users can store and manage data, create replicas, and run replica simulations; administrators can view system information and manage data and users. The functions and performance of the system are verified through tests. |
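
The three steps of the PageRank-based algorithm in (1) can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not define the node overheating similarity, so a plain load threshold (`overheat_threshold`) stands in for it here, and the function names, the link-graph representation, the damping factor and the access-count dictionary are all hypothetical choices made for illustration.

```python
def pagerank(links, damping=0.85, iters=50):
    """Power-iteration PageRank over a dict {node: [linked nodes]}.

    Assumption: every linked node also appears as a key of `links`.
    """
    nodes = list(links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        new = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            out = links[v]
            # A node without out-links spreads its rank over all nodes.
            targets = out if out else nodes
            share = damping * rank[v] / len(targets)
            for w in targets:
                new[w] += share
        rank = new
    return rank


def hottest_file(access_counts):
    """Return the file with the highest access heat (here: raw access count)."""
    return max(access_counts, key=access_counts.get)


def choose_replica_target(links, load, overheat_threshold, source):
    """Pick the non-overheated node with the highest PageRank value.

    Assumption: a node counts as overheated when its load reaches
    `overheat_threshold`; the paper uses an overheating-similarity
    measure that the abstract does not specify.
    """
    rank = pagerank(links)
    candidates = [v for v in links
                  if v != source and load[v] < overheat_threshold]
    if not candidates:
        return None  # no healthy node available, so no replica is placed
    return max(candidates, key=rank.get)


# Hypothetical four-node system in which node "a" is overheated.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
load = {"a": 0.95, "b": 0.20, "c": 0.40, "d": 0.10}
access_counts = {"f1": 12, "f2": 87, "f3": 40}

if load["a"] >= 0.8:
    file_to_replicate = hottest_file(access_counts)
    target = choose_replica_target(links, load, 0.8, source="a")
    print(file_to_replicate, "->", target)
```

In this sketch an overheated source node triggers replication of its hottest file, and the replica goes to the healthy node that ranks highest in the link graph. Replacing the threshold test with the paper's similarity measure would change only the candidate filter.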