| With the accelerating construction of the digital society, enterprises and research institutions collect large amounts of data, such as patients’ medical records or personal social network data. These data often hold considerable value, and publishing them can support scientific research and benefit society. For example, drug research institutions can analyze patient data released by hospitals to help develop drugs for intractable diseases, and government departments can analyze residents’ financial data released by banks to make more targeted adjustments to the economic structure. However, the data also contain users’ sensitive information. Releasing them directly without privacy protection not only creates serious privacy risks for users but may also cause legal problems. This paper therefore investigates privacy-preserving data publishing methods. Relational data (such as customer physical examination data in medical centers) and graph data (such as patient spatio-temporal communication data in infectious disease hospitals) are two common data types that are closely tied to scientific research, production, and daily life. Existing privacy-preserving publishing methods for relational datasets are sensitive to data dimensionality, and the utility of the published data needs improvement. Existing privacy-preserving publishing methods for graph datasets cannot simultaneously capture graph structure information and satisfy differential privacy. To address these problems, this paper studies privacy-preserving data publishing for two typical data types (relational data and graph data) and achieves the following results: (1) For relational data publishing, this paper proposes a relational data publishing method based on generative adversarial networks and differential privacy, which is
named DP-RGAN. DP-RGAN extracts the distribution features of relational data using a generative adversarial network and a feedforward neural network. Because only the discriminator network accesses the original data during training, the gradient information produced when training the discriminator is perturbed with differential privacy noise, which provides privacy protection for the data publishing task. To reduce the impact of gradient outliers during gradient clipping under differential privacy perturbation, this paper proposes a gradient clipping method based on a clustering algorithm. In addition, to capture the distribution features of the original data more fully, a dynamic multi-generator selection method based on a genetic algorithm is proposed. Experimental results show that DP-RGAN captures the distribution characteristics of relational data well, and that the published data retains high utility while maintaining the strength of privacy protection. (2) For graph data publishing, this paper proposes a graph data publishing method based on generative adversarial networks and differential privacy, which is named DP-GGAN. DP-GGAN extracts the distribution features of graph data using a generative adversarial network, a graph convolutional network, and a variational graph autoencoder. To capture both the connection characteristics of nodes and the global graph structure characteristics while providing privacy protection for the data publishing method, the Wasserstein distance is used to measure the distance between the distributions of the generated graph and the original graph, as obtained by the graph convolutional neural network and the feedforward neural network in turn, and differential privacy noise is added to the sensitive gradient information during training. In addition, an adaptive privacy budget allocation method is
proposed, which adjusts the privacy budget allocation according to the degree of convergence of model training, so as to accelerate convergence and improve the utility of the final generated data. Experimental results show that DP-GGAN captures graph distribution characteristics well, and that the published graph data retains high utility while maintaining the strength of privacy protection. (3) Based on the data publishing methods studied, a privacy-preserving data publishing system based on generative adversarial networks is designed and implemented. In this system, the data publishing models for relational data and graph data are encapsulated as the core module of the system (the data publishing module), providing data publishers and data users with a data publishing platform that satisfies differential privacy. The system provides convenient operation pages for both data publishers and data users. For data publishers, it provides data upload, data preprocessing, data publishing, data management, download permission management, and other functions. For data users, it provides data browsing, data retrieval, data permission application, data download, and other functions. Application and test results show that the system has practical value in real data publishing scenarios... |
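
The gradient perturbation step used when training the discriminator can be illustrated with a minimal sketch. It assumes the standard recipe of clipping each per-sample gradient to a fixed L2 norm and adding Gaussian noise to the sum; it is not the clustering-based clipping proposed in this paper. The names `clip_by_l2_norm`, `privatize_gradients`, `clip_norm`, and `noise_multiplier` are hypothetical and chosen only for illustration.

```python
import math
import random


def clip_by_l2_norm(grad, clip_norm):
    """Scale a gradient vector so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    # Gradients already inside the ball (or all-zero) are left unchanged.
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def privatize_gradients(per_sample_grads, clip_norm, noise_multiplier, rng):
    """Clip each per-sample gradient, sum, add Gaussian noise, then average.

    Assumption: fixed-threshold clipping with Gaussian noise whose standard
    deviation is noise_multiplier * clip_norm. The clustering-based choice of
    clipping threshold described in the abstract is not reproduced here.
    """
    if not per_sample_grads:
        raise ValueError("per_sample_grads must not be empty")
    dim = len(per_sample_grads[0])
    clipped = [clip_by_l2_norm(g, clip_norm) for g in per_sample_grads]
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    sigma = noise_multiplier * clip_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [v / len(per_sample_grads) for v in noisy]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the illustration is repeatable
    grads = [[3.0, 4.0], [0.0, 0.5]]  # toy per-sample discriminator gradients
    print(privatize_gradients(grads, clip_norm=1.0, noise_multiplier=1.1, rng=rng))
```

Clipping bounds how much any single record can change the summed gradient, which is what lets the noise scale be tied to `clip_norm`. In this sketch only the discriminator's gradients would pass through `privatize_gradients`, since the abstract states that the discriminator is the only network that accesses the original data.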