| Distributed cloud storage systems are an important part of current Internet services, meeting clients’ requirements for high-availability and high-scalability services. Ensuring data consistency between replicas, however, is the key to providing high-availability services in distributed systems. Among consistency models, causal consistency has become a research hotspot because of its performance advantages. As the volume of network data grows, traditional causal consistency schemes based on the distributed cloud storage architecture can no longer meet clients’ needs for high-quality, high-performance services. It is therefore particularly important to change the causal consistency infrastructure so as to expand its application environment and make it more adaptable to complex and changing network conditions. In addition, the timestamps used to track causal relationships and the way updates are synchronized between replicas in the traditional causal consistency model lead to higher metadata management and transmission costs. To address these problems, this paper proposes the following two innovative schemes: (1) A causal consistency model for edge storage based on a hash ring and partial geo-replication. The model maps keys and servers onto the hash ring for grouping through two hashes, and stores subsets of the complete dataset in replicas located at the edge of the network, thus realizing partial geo-replication in the edge storage environment and reducing operation latency. At the same time, a combined timestamp is generated and maintained according to the update type to capture causal relationships, which reduces the overhead of managing metadata and improves system throughput. (2) A causal consistency model for edge-cloud collaboration based on a grouping protocol. The cloud data center is partitioned and the edge nodes are grouped through a distributed hash table, and subsets of the complete dataset are stored in replicas at the edge of the network, thereby realizing partial geo-replication in the edge-cloud collaboration environment. At the same time, we design a group synchronization algorithm called Imp_Paxos, so that an update only needs to be synchronized to the main group, which reduces the visibility latency of updates and decreases the data synchronization overhead. In addition, a sort timestamp is proposed, which generates different timestamps according to the type of update to track causality. The proposed model reduces the metadata overhead of system management and communication and improves system throughput. It can therefore better adapt to the edge storage environment and to applications that demand low latency. |
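
The grouping of keys and servers on a hash ring, as described in scheme (1), can be illustrated with a minimal consistent-hashing sketch. This is not the paper's implementation. It assumes that the "two hashes" mean hashing servers and hashing keys onto the same ring, and that a key belongs to the first server found clockwise from its position. The names `HashRing`, `ring_position`, `RING_SIZE`, and the `edge-*` server names are hypothetical.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 32  # hypothetical ring size, not taken from the paper


def ring_position(name: str) -> int:
    """Hash a server name or a key to a position on the ring."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % RING_SIZE


class HashRing:
    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        # First hash: place every server on the ring, sorted by position.
        self._points = sorted((ring_position(s), s) for s in servers)
        self._positions = [p for p, _ in self._points]

    def owner(self, key: str) -> str:
        # Second hash: place the key, then take the next server clockwise,
        # wrapping around to the first server at the end of the ring.
        idx = bisect.bisect_right(self._positions, ring_position(key))
        return self._points[idx % len(self._points)][1]

    def group_keys(self, keys):
        # Each server ends up with a subset of the keys, which is the
        # assumed basis for partial replication in this sketch.
        groups = {s: [] for _, s in self._points}
        for k in keys:
            groups[self.owner(k)].append(k)
        return groups


ring = HashRing(["edge-a", "edge-b", "edge-c"])
groups = ring.group_keys([f"user:{i}" for i in range(100)])
```

Under these assumptions each edge server holds only the keys that fall in its arc of the ring, and removing a server reassigns only the keys that server owned.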