
Research On Key Technologies Of Data Security And Privacy Protection In Deep Learning

Posted on: 2022-02-05
Degree: Doctor
Type: Dissertation
Country: China
Candidate: F Y Tang
Full Text: PDF
GTID: 1522306845450694
Subject: Science of Military Command
Abstract/Summary:
As a representative technology of artificial intelligence, deep learning, especially multilayer perceptrons, convolutional neural networks, and generative adversarial networks, has become an important approach for analyzing and processing massive amounts of high-dimensional, complex data in the era of big data. In recent years, deep learning has achieved excellent performance in areas such as image classification, speech recognition, natural language processing, intelligent recommendation, and malicious code detection, leading to ever wider application in automated driving, computer vision, smart homes, e-commerce, and information security. However, a steady stream of deep learning security issues has alarmed its researchers and users. One of the most prominent is that the important data assets in deep learning systems, consisting mainly of training data sets, trained models, and the inputs and outputs of model inference, are coveted by attackers and need to be secured. Because deep learning has inherent weaknesses that attackers can exploit through attack methods including poisoning attacks, model inversion attacks, model extraction attacks, and adversarial attacks, deep learning systems are vulnerable, which to some extent hinders their widespread application and their extension into critical areas such as the military. Moreover, the existing data security and privacy protection technologies for deep learning are not satisfactory enough to reassure the users of deep learning systems. This thesis addresses the above issues in order to protect the critical data assets of deep learning systems. Focusing on the security threats faced in the three key phases of deep learning, namely training data collection, model training, and model inference, and building on the theoretical foundations of cryptography, data security, and privacy protection, the thesis proposes flexible and efficient data security and privacy protection solutions from the perspectives of data owner privacy protection, federated learning participant data security, input and output security, and model security. The main findings and contributions of this thesis are as follows:

1. A deep learning training data generation technology that protects the privacy of data owners is proposed to address the lack of labeled high-quality data and data owners' concerns about privacy leakage during the training data collection process. The solution uses a conditional generative adversarial network as the main data generator, meeting the demand for large amounts of labeled training data, and combines it with a data morphing method to protect data privacy. In addition, to compensate for the loss of data availability caused by data morphing, a convolutional layer augmenting method is designed. The solution thus protects data privacy and maintains data availability while generating large amounts of labeled training data that are not limited by the size of the data owner's original dataset. A detailed security analysis shows that the proposed scheme guarantees data confidentiality. According to the experimental results, most of the operations introduced by data morphing and convolutional layer augmenting are performed by the data owner to protect data privacy, yet with high computational efficiency. Compared with other data generation schemes, the proposed scheme has advantages in data availability and data privacy protection, and can meet the security and practical requirements for training data during the data collection phase of deep learning.

2. To address the difficulty of verifying data usability, the low motivation of data providers, the risk of data privacy leakage, and the lack of fairness in data transactions during the training data collection process, a secure and fair training data collection technology with an incentive mechanism based on blockchain, differential privacy, and ring signatures is proposed. The solution allows data owners to provide their data for a fee, that is, to enjoy deep learning prediction services using the points gained from providing data, so it can be regarded as a data trading solution. To protect their privacy, data owners add controlled Laplace noise to their sensitive data for differential privacy before providing it. To achieve sustainable operation, special attention is paid to the fairness of data transactions: the quality of the data provided by data owners is evaluated and monitored, and the actions of data consumers in purchasing and using data and in providing deep learning services to compensate data owners can be tracked. A formal security proof and a comprehensive comparison of scheme properties demonstrate that the proposed scheme is secure and practical.

3. To address the risk of data privacy leakage and the need for separate secure channels between federated learning participants and the parameter server during model training, a deep federated learning model training technology is proposed that ensures the data security of participants even when one of them colludes with the parameter server. The solution introduces a key transforming server and applies homomorphic re-encryption to the asynchronous stochastic gradient descent process in model training. Security analysis and experimental simulations show that the proposed scheme achieves more security properties than the original federated learning scheme; although its communication overhead is higher, the computational overhead for each participant is similar. Overall, it is a safer and more accurate training scheme for distributed deep learning.

4. To address the risk of leaking input data, output results, and model parameters during model inference, as well as the possibility of workers being framed, a secure and fair deep learning model inference technology that preserves both data and model is proposed. The solution first builds a model inference service system with three workers based on secure three-party computation. A method is then proposed to generate commitment certificates for the input data, model parameters, intermediate computation results, and final output results to achieve fairness: cheating workers are detected and cannot repudiate their misbehavior, while honest workers cannot be framed, because their honest behavior is publicly verifiable. The scheme thus protects the privacy of the user's input data and inference results as well as the security of the server's pre-trained model, and it is fair to all participants. In short, it is a safer and fairer solution for providing deep learning model inference services.
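The differential-privacy step in contribution 2 can be illustrated with the standard Laplace mechanism. The sketch below is a minimal, generic illustration, not the dissertation's actual implementation; the function names and the example choices of `epsilon` and `sensitivity` are assumptions made for this example.

```python
import random

def laplace_noise(scale: float) -> float:
    """Sample from Laplace(0, scale): the difference of two i.i.d.
    exponential variables with mean `scale` is Laplace-distributed."""
    return random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)

def privatize(value: float, epsilon: float, sensitivity: float = 1.0) -> float:
    """Add Laplace noise calibrated to sensitivity/epsilon, so the released
    value satisfies epsilon-differential privacy for that sensitivity."""
    return value + laplace_noise(sensitivity / epsilon)

# A data owner perturbs a sensitive reading before providing it.
noisy = privatize(10.0, epsilon=1.0)
```

A smaller `epsilon` yields a larger noise scale and hence stronger privacy at the cost of accuracy; the "controlled" noise in the scheme corresponds to choosing this privacy budget.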
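The secure three-party computation underlying contribution 4 typically rests on additive secret sharing: a private value is split into three random shares, any strict subset of which reveals nothing about it, and linear operations are carried out share-wise by the workers. The following is a generic sketch of that building block under these assumptions, not the dissertation's concrete protocol; the modulus and function names are chosen for illustration.

```python
import random

MODULUS = 2**61 - 1  # illustrative prime modulus for the share arithmetic

def share(secret: int, n: int = 3) -> list:
    """Split `secret` into n additive shares that sum to it mod MODULUS."""
    parts = [random.randrange(MODULUS) for _ in range(n - 1)]
    parts.append((secret - sum(parts)) % MODULUS)
    return parts

def reconstruct(shares: list) -> int:
    """Recover the secret by summing all shares mod MODULUS."""
    return sum(shares) % MODULUS

# Each worker holds one share of x and one share of y; adding the shares
# locally yields shares of x + y without any worker seeing x or y in the clear.
x_shares = share(7)
y_shares = share(5)
sum_shares = [(a + b) % MODULUS for a, b in zip(x_shares, y_shares)]
```

Multiplications require an additional interactive protocol (for example, Beaver triples), and it is commitments to shares and intermediate results, as in the scheme's commitment certificates, that make cheating detectable and honest behavior publicly verifiable.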
Keywords/Search Tags: Deep Learning Security, Data Security, Privacy Preserving, Training Data Generation, Data Collection, Model Training, Model Inference