| K-means clustering is one of the most powerful and widely used methods in data mining. It groups similar data items by computing the distances between them. With the acceleration of informatization, digitization, and networking, economic globalization has become an irreversible trend. Data sources for clustering algorithms are becoming increasingly diverse, and data security is becoming increasingly important. Because data can come from multiple parties, it may contain sensitive or private information about the participants, and if that information is shared among the parties, the privacy of the data cannot be guaranteed. Privacy-preserving collaborative data mining extracts useful knowledge by mining the joint databases of multiple parties while protecting both the users' data and the mining results. How to design a privacy-preserving collaborative data mining algorithm has therefore become an urgent problem. The semi-honest model matches real-world settings in many cases; data privacy in this model relies on the assumption that all parties follow the protocol. However, solutions in the semi-honest model are often infeasible in practice, because ensuring data privacy incurs high computational cost and communication overhead. Today, with the development of science and technology, more and more enterprises store their data in the cloud, and distributed cloud computing frameworks for processing large data sets provide powerful computing capacity. This paper uses the computing power of the cloud to improve the efficiency of the algorithm and thereby ensure its feasibility. To address the performance problems of privacy-preserving data mining, this paper studies existing privacy-preserving data mining algorithms and proposes an efficient privacy-preserving K-means algorithm. The algorithm supports both storage outsourcing and computation outsourcing in a setting with two data 
owners and a cloud platform. Data is stored in the cloud as ciphertext, and the cloud platform runs the K-means clustering algorithm on the joint data set by interacting with the two data owners. The proposed privacy-preserving collaborative data mining algorithm must solve three technical problems: ciphertext distance calculation, ciphertext comparison, and ciphertext division. We design three security protocols to solve them: a secure squared Euclidean distance protocol, a secure minimum-of-numbers protocol, and a secure circuit protocol. Applying these protocols within the clustering algorithm framework yields the privacy-preserving K-means clustering algorithm. The paper analyzes the time, space, and communication complexity of the protocol and proves the security of the algorithm in the semi-honest model. In addition, in the malicious model it guarantees that at least one party remains secure during the centroid recomputation phase. Finally, the paper measures the encryption time and each party's running time for one iteration, and verifies the feasibility of the protocol. |
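
To make the three sub-problems concrete, the following is a minimal plaintext sketch of one K-means iteration. It is not the paper's protocol: nothing here is encrypted, and the function names (`squared_euclidean`, `argmin`, `kmeans_iteration`) are hypothetical names chosen for illustration. It only marks where distance calculation, comparison, and division occur, which are the steps a privacy-preserving variant would have to carry out over ciphertexts.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def squared_euclidean(a: Point, b: Point) -> float:
    # Distance step: in a privacy-preserving variant this value
    # would be computed over ciphertexts.
    return sum((x - y) ** 2 for x, y in zip(a, b))


def argmin(values: Sequence[float]) -> int:
    # Comparison step: index of the smallest value (first one on ties).
    return min(range(len(values)), key=lambda i: values[i])


def kmeans_iteration(
    points: Sequence[Point], centroids: Sequence[Point]
) -> Tuple[List[int], List[List[float]]]:
    dim = len(centroids[0])

    # Assign each point to its nearest centroid.
    labels = [
        argmin([squared_euclidean(p, c) for c in centroids]) for p in points
    ]

    # Accumulate per-cluster coordinate sums and member counts.
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for p, k in zip(points, labels):
        counts[k] += 1
        for j in range(dim):
            sums[k][j] += p[j]

    # Division step: new centroid = coordinate sum / member count.
    new_centroids = []
    for k, c in enumerate(centroids):
        if counts[k] == 0:
            # Assumption of this sketch: an empty cluster keeps its centroid.
            new_centroids.append([float(v) for v in c])
        else:
            new_centroids.append([s / counts[k] for s in sums[k]])
    return labels, new_centroids


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
    labels, cents = kmeans_iteration(pts, [(0.0, 0.0), (10.0, 10.0)])
    print(labels, cents)
```

In the outsourced setting described above, the points would belong to the two data owners and be held by the cloud platform only in encrypted form, so each of the three marked steps would be replaced by an interactive protocol instead of a local computation.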