Research And Application Of Clustering Algorithm Based On Data Security And Privacy Protection In The Cloud Environment

Posted on: 2023-11-26
Degree: Master
Type: Thesis
Country: China
Candidate: G G Ge
Full Text: PDF
GTID: 2558307046492964
Subject: Engineering Cyberspace Security
Abstract/Summary:
With the massive growth of data, society has entered the era of big data, which places higher demands on data storage, processing, and analysis. Cloud computing platforms offer large storage capacity, rich computing resources, a high degree of virtualization, and high availability, all of which greatly improve the efficiency of data mining. As a result, big data mining systems built on cloud computing have developed rapidly. However, the cloud computing environment is complex and changeable, and cloud service providers and data owners may not be in the same trust domain. This creates data security and privacy leakage problems when big data is mined with clustering algorithms. In response, this thesis proposes a hybrid encryption scheme to ensure data security, adds a privacy protection mechanism to the k-prototypes clustering algorithm to improve data privacy, and applies both to the clustering analysis of educational big data. The main innovations and work of this thesis are as follows:

(1) A new hybrid encryption scheme is proposed for data security problems in the cloud computing environment. The scheme establishes a data sensitivity level model to determine the sensitivity level of the data and applies different encryption methods to data of different levels. Sensitive data is first encrypted with the ChaCha20 symmetric cipher, and the symmetric key is then encrypted with an ECC algorithm improved by the window-based Non-Adjacent Form (NAF) method. The experimental results show that the hybrid encryption scheme improves the efficiency of data encryption while ensuring the security of data in the cloud environment.

(2) To address the privacy leakage that may occur when big data is mined with the k-prototypes clustering algorithm, a privacy-preserving parallel k-prototypes clustering algorithm based on differential identifiability is proposed. The algorithm uses information entropy to improve the dissimilarity calculation, combines the exponential mechanism with the composition property of differential identifiability, and protects the privacy of clustered data in the cloud environment on top of the MapReduce framework. To verify the usability of the algorithm, it was evaluated on the Adult dataset and the online learning dataset. The experimental results show that the algorithm preserves the utility of the clustering results while protecting the privacy of the data.

(3) With the help of a cloud computing platform, this thesis applies the hybrid encryption scheme and the parallel differential identifiability k-prototypes clustering algorithm to a large volume of online learning questionnaire data collected by the Guangdong Provincial Department of Education, so that clustering is carried out with data security and privacy ensured. The experimental results show that students' online learning status is unrelated to their location and school type, but is related to the learning resources available to them and to their learning behaviors.
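For readers unfamiliar with k-prototypes, the following minimal sketch illustrates the standard mixed-attribute dissimilarity that the algorithm in item (2) builds on: squared Euclidean distance over numeric attributes plus a weighted count of mismatches over categorical attributes. It is not the thesis's method. The entropy-based improvement, the differential identifiability mechanism, and the MapReduce parallelization are not specified in the abstract and are not reproduced here. The function names `mixed_dissimilarity` and `nearest_prototype` and the weight parameter `gamma` are illustrative assumptions.

```python
from typing import List, Sequence, Tuple

# A prototype is a pair: (numeric attribute values, categorical attribute values).
Prototype = Tuple[Sequence[float], Sequence[str]]


def mixed_dissimilarity(
    x_num: Sequence[float],
    x_cat: Sequence[str],
    proto_num: Sequence[float],
    proto_cat: Sequence[str],
    gamma: float = 1.0,
) -> float:
    """Standard k-prototypes dissimilarity between a record and a prototype.

    gamma (an assumed, user-chosen weight) balances the categorical part
    against the numeric part.
    """
    if len(x_num) != len(proto_num) or len(x_cat) != len(proto_cat):
        raise ValueError("record and prototype must have the same attributes")
    # Numeric part: squared Euclidean distance.
    numeric = sum((a - b) ** 2 for a, b in zip(x_num, proto_num))
    # Categorical part: simple matching, i.e. the number of mismatches.
    categorical = sum(1 for a, b in zip(x_cat, proto_cat) if a != b)
    return numeric + gamma * categorical


def nearest_prototype(
    x_num: Sequence[float],
    x_cat: Sequence[str],
    prototypes: List[Prototype],
    gamma: float = 1.0,
) -> int:
    """Return the index of the prototype with the smallest dissimilarity."""
    return min(
        range(len(prototypes)),
        key=lambda i: mixed_dissimilarity(
            x_num, x_cat, prototypes[i][0], prototypes[i][1], gamma
        ),
    )
```

In a full clustering loop, each record would be assigned with `nearest_prototype`, after which every prototype is updated to the mean of its numeric attributes and the mode of its categorical attributes. A privacy-preserving variant would presumably perturb this update step, but the abstract does not say how.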
Keywords/Search Tags: Cloud Computing, Data Security, Hybrid Encryption, Privacy Protection, Data Cluster Analysis, K-prototypes Clustering