To meet the needs of social development in complex data environments, data-driven applications based on information extraction have recently grown into a new industry. Meanwhile, as public awareness of personal privacy strengthens, laws and regulations increasingly restrict the use of private data, forcing deep learning methods that rely on large-scale data to balance data availability against data privacy. To address this problem, collaborative deep learning frameworks built on distributed data environments have become a new research focus. Such a framework avoids both the plaintext collection of private data and complex ciphertext processing: each data holder completes model training locally and updates the system model through a parameter server's aggregation. However, the latest research shows that the collaborative deep learning framework is still marred by privacy leakage; in particular, it has been proved susceptible to the Generative Adversarial Network (GAN) attack. In addition, we extend the existing collaborative deep learning framework to the scenario of multi-user parallel training and show that it remains vulnerable to the GAN attack. How to design a collaborative deep learning framework that resists the GAN attack under both serial and parallel training modes is therefore an urgent problem. Based on an investigation and analysis of the current state of privacy preservation in collaborative deep learning, this thesis makes the following specific contributions:

(1) To address the inability of the existing collaborative deep learning framework to resist the GAN attack, we design, for the first time, a model-preserving stochastic gradient descent method based on matrix blinding technology, and construct a privacy-preserving collaborative deep learning framework upon it. The blinding technique breaks the local modeling and training process of the GAN attack, enabling our framework to defend against it. Moreover, by introducing user partitioning and a model pre-training process, we strengthen the initialization process in collaborative deep learning and improve the robustness of the model. Theoretical analysis and experimental results show that our framework satisfies stronger privacy-protection requirements while maintaining the training efficiency and model accuracy of the original framework.

(2) Since the existing framework still faces the GAN attack threat under the multi-user parallel training scenario, this thesis evaluates the risk of privacy leakage in the parallel mode. By combining the parallel stochastic gradient descent method with the matrix blinding technique, we design a model-preserving parallel stochastic gradient descent method and, on this basis, construct a privacy-preserving parallel collaborative deep learning framework. Furthermore, to deal with dynamic multi-user training scenarios, we introduce a user-dynamics coping strategy and a parameter-weight adjustment strategy. Theoretical analysis and experimental results show that the extended framework can handle a variety of parallel training scenarios and protects both training-data privacy and system-model privacy.

(3) Taking a collaborative deep learning instance as the reference object, and relying on user partitioning, the model pre-training process, matrix blinding, the parallel stochastic gradient descent method, and other key techniques, we design and implement a privacy-preserving collaborative deep learning system based on the two privacy-preserving frameworks above. Built with the TensorFlow deep learning framework, the Flask web framework, and the Keras API, our system supports both serial and parallel training modes for deep learning instances over multiple datasets. Furthermore, it can provide technical support and guarantees for the vigorous development of distributed data processing at home and abroad.
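The abstract does not spell out the concrete blinding construction, but the algebraic property that a model-preserving SGD via matrix blinding relies on can be sketched briefly. The NumPy sketch below is a minimal illustration under assumed conventions (multiplicative blinding of one weight matrix with two random invertible matrices, held as keys): because blinding is linear, an SGD step performed entirely in the blinded domain unblinds to exactly the plaintext SGD step. The key distribution and the full protocol are the thesis's contribution and are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(42)

def blind(W, P, Q):
    """Multiplicative matrix blinding: W_b = P @ W @ Q."""
    return P @ W @ Q

def unblind(W_b, P, Q):
    """Invert the blinding: W = P^{-1} @ W_b @ Q^{-1}."""
    return np.linalg.solve(P, W_b) @ np.linalg.inv(Q)

n = 4
W = rng.standard_normal((n, n))          # true model parameters (one layer)
P = rng.standard_normal((n, n))          # blinding keys; Gaussian matrices
Q = rng.standard_normal((n, n))          # are invertible with probability 1

W_b = blind(W, P, Q)                     # only W_b is ever distributed

# A participant performs an SGD step in the blinded domain.
G = rng.standard_normal((n, n)) * 0.01   # stand-in for a true gradient
G_b = blind(G, P, Q)                     # the gradient as seen when blinded
lr = 0.1
W_b_next = W_b - lr * G_b                # blinded SGD update

# Linearity means unblinding commutes with the SGD step,
# so the key holder recovers the plaintext update exactly.
W_next = unblind(W_b_next, P, Q)
```

An attacker who only observes `W_b` and `W_b_next` cannot reconstruct `W` without the keys `P` and `Q`, which is what breaks the GAN attack's local modeling of the current global model.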
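The parameter-weight adjustment and user-dynamics coping strategies for parallel training are likewise only named in the abstract. One common realization, shown as an illustrative sketch below, is to weight each user's submitted parameters by its local data size and re-normalize over whichever users are active in a given round, so that dropouts and late joiners are absorbed without changing the aggregation rule. All identifiers and the weighting choice here are assumptions for illustration, not the thesis's exact design.

```python
import numpy as np

def aggregate(updates, weights):
    """Weighted average of parameter vectors from the active users.

    updates: dict user_id -> np.ndarray of submitted model parameters
    weights: dict user_id -> positive weight (e.g. local dataset size)
    Users absent from `updates` (dropped out this round) are skipped;
    newly joined users contribute as soon as they submit parameters.
    """
    active = [u for u in updates if u in weights]
    total = sum(weights[u] for u in active)
    return sum(weights[u] / total * updates[u] for u in active)

# Round 1: three users train in parallel.
updates = {"u1": np.array([1.0, 1.0]),
           "u2": np.array([3.0, 3.0]),
           "u3": np.array([5.0, 5.0])}
weights = {"u1": 100, "u2": 100, "u3": 200}  # proportional to local data
w1 = aggregate(updates, weights)             # -> [3.5, 3.5]

# Round 2: u3 drops out; the same rule adapts to the remaining users.
del updates["u3"]
w2 = aggregate(updates, weights)             # -> [2.0, 2.0]
```

Weighting by data size keeps users with more training data from being diluted by small participants, while the per-round re-normalization is what lets the framework cope with a dynamically changing user set.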