| With the development of Internet technology and advances in big data analysis, social networks have brought large benefits to many fields. User privacy is a key issue in the big data industry, and social networks have been tied to this issue since their inception. Future data mining research can be sustainable only if it focuses on protecting users' privacy. A social network can be represented as a graph in which nodes represent users and edges represent relationships. Many kinds of network data can be abstracted into graph-based networks, such as Wi-Fi traces, Bluetooth traces, instant messaging, and social networks. When a graph-based network is studied, it is anonymized to protect users' privacy. De-anonymization lets us test the effectiveness of anonymization techniques, so advancing de-anonymization is a good way to improve anonymization and protect users' privacy. First, a de-anonymization model based on network representation and deep learning is proposed for multiple different anonymized datasets generated from the same original data. The node corpus is generated by the Random Walk method, node vectors are then trained with the Skip-Gram method, and finally a deep learning framework is designed. Experimental comparison shows that the model de-anonymizes well. Second, because the initial model still has room for optimization, this paper further explores the network structure, extracts features of the network nodes to form node feature vectors, and uses these vectors to optimize the model. Experimental comparison shows that the result is 10% higher than before the improvement. Third, in practice a large number of seed node pairs is not available in advance. To come closer to the real anonymous data publishing setting, it is necessary to study seedless de-anonymization. In Section 4 of this paper, an algorithm based on K-clique and node features is proposed to implement seedless node de-anonymization. Seed node pairs are first discovered through K-clique and node features, the seed nodes are then used to train the model, and finally the trained model is used to de-anonymize the remaining nodes. Experimental comparison shows that the de-anonymization performance of the algorithm is comparable to that of seed-based de-anonymization. The key factor in the algorithm's performance is the number of seed nodes. |
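
The corpus-generation step mentioned in the abstract can be illustrated with a minimal sketch. It assumes uniform, truncated random walks over an adjacency-list graph. The function name `random_walk_corpus`, its parameters, and the toy graph are hypothetical and are not taken from the paper, whose actual walk strategy may differ.

```python
import random

def random_walk_corpus(adj, walks_per_node=10, walk_length=20, seed=0):
    """Generate truncated uniform random walks starting from every node.

    adj: dict mapping each node to a list of its neighbours (assumed format).
    Returns a list of walks, each a list of node ids rendered as strings,
    so that a walk can be treated as a "sentence" of node "words".
    """
    rng = random.Random(seed)
    nodes = sorted(adj)
    corpus = []
    for _ in range(walks_per_node):
        rng.shuffle(nodes)  # visit start nodes in a fresh order on each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbours = adj[walk[-1]]
                if not neighbours:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbours))
            corpus.append([str(n) for n in walk])
    return corpus

# Toy undirected graph, for illustration only.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
corpus = random_walk_corpus(adj, walks_per_node=2, walk_length=5)
```

Each walk can then be passed as a sentence to a Skip-Gram trainer (for example, gensim's `Word2Vec` with `sg=1`) to obtain node vectors. That pairing is one common choice and is an assumption here, not a detail confirmed by the abstract.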