
Research On Regression Analysis Algorithm Based On Differential Privacy

Posted on: 2017-11-27  Degree: Master  Type: Thesis
Country: China  Candidate: H Z Zou  Full Text: PDF
GTID: 2350330488472327  Subject: Computer technology
Abstract/Summary:
With the continuing development of information technology, large-scale data analysis and data publishing have triggered a research boom, and regression analysis is used ever more widely in practice. The most important challenge now is how to protect the privacy of regression model parameters so that sensitive information is not disclosed. Differential privacy, an emerging privacy model, resists attacks based on arbitrary background knowledge and protects privacy without distorting the data. Because directly releasing the parameters of a regression model can leak information about the underlying dataset, this thesis focuses on applying differential privacy to linear regression and logistic regression models. The research consists of the following three parts:

1. It describes the traditional privacy models k-anonymity and l-diversity, illustrates their advantages and disadvantages with examples, and then introduces ε-differential privacy: its noise mechanisms, properties, protection framework, and evaluation.

2. Existing research on differentially private linear regression suffers from problems such as high sensitivity and large noise. To address this, the thesis presents Diff_LR, a differential privacy budget allocation algorithm. The algorithm first splits the objective function into two sub-functions, computes the sensitivity of each, allocates the privacy budget between them accordingly, and adds noise to each sub-function according to the Laplace mechanism. It then recombines the two noisy sub-functions and solves for the optimal linear regression model parameters. Diff_LR is proven to satisfy ε-differential privacy, and experimental analysis shows that it reduces both the sensitivity and the amount of added noise, so the linear regression model achieves higher prediction accuracy.

3. Existing research on differentially private logistic regression also has a problem: the prediction accuracy of the resulting models is low. The thesis proposes the Diff_Gene algorithm, which builds on the principle of genetic algorithms. In every generation it allocates a separate, appropriate privacy budget to the selection among several candidate parameters, uses the exponential mechanism to pick the top-k parameters, and compares them by fitness to keep the best. Repeated iteration yields the best logistic regression model parameters. Experimental analysis shows that Diff_Gene achieves better model prediction accuracy.

In summary, the main contribution of this thesis is a study of differential privacy for linear regression and logistic regression models, presenting new algorithms that improve the prediction accuracy of the regression model while protecting privacy and avoiding data distortion.
Keywords/Search Tags:differential privacy, linear regression model, logistic regression, model parameters
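
The abstract names two standard building blocks, the Laplace mechanism with a split privacy budget (Diff_LR) and exponential-mechanism top-k selection (Diff_Gene), without giving their details. The Python sketch below illustrates those two building blocks only; it is not the thesis's algorithm. Everything specific in it is an assumption made for illustration: the function names, the choice of the linear term X^T y and the quadratic term X^T X as the two sub-functions, the bounds |x_ij| <= 1 and |y_i| <= 1 used to derive the sensitivities, the even budget split, and the small ridge term.

```python
import numpy as np


def laplace_mechanism(values, sensitivity, epsilon, rng):
    """Add Laplace noise with scale sensitivity / epsilon to every entry."""
    values = np.asarray(values, dtype=float)
    return values + rng.laplace(0.0, sensitivity / epsilon, size=values.shape)


def private_linear_regression(X, y, epsilon, rng, split=0.5, ridge=1e-6):
    """Hypothetical sketch: perturb the two coefficient blocks of the
    least-squares objective separately, then minimise the noisy objective.

    Assumes every |x_ij| <= 1 and |y_i| <= 1, so replacing one record changes
    X^T y by at most 2d and X^T X by at most 2d^2 in L1 norm.
    """
    n, d = X.shape
    eps_lin = split * epsilon            # budget for the linear term
    eps_quad = (1.0 - split) * epsilon   # budget for the quadratic term
    b = laplace_mechanism(X.T @ y, 2.0 * d, eps_lin, rng)
    A = laplace_mechanism(X.T @ X, 2.0 * d * d, eps_quad, rng)
    # Symmetrise and lightly regularise so the noisy system is solvable.
    # With heavy noise A may still be indefinite; this is only a sketch.
    A = 0.5 * (A + A.T) + ridge * np.eye(d)
    return np.linalg.solve(A, b)


def exponential_mechanism_top_k(scores, k, sensitivity, epsilon, rng):
    """Pick k distinct indices, spending epsilon / k on each draw.

    Each draw selects index i with probability proportional to
    exp(eps_each * score_i / (2 * sensitivity)).
    """
    scores = np.asarray(scores, dtype=float)
    remaining = list(range(len(scores)))
    eps_each = epsilon / k
    chosen = []
    for _ in range(k):
        logits = eps_each * scores[remaining] / (2.0 * sensitivity)
        logits -= logits.max()           # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum()
        pick = int(rng.choice(len(remaining), p=probs))
        chosen.append(remaining.pop(pick))
    return chosen
```

By sequential composition the two Laplace releases together cost eps_lin + eps_quad = epsilon, and solving the noisy system is post-processing, so it consumes no further budget. In a genetic-algorithm setting, `scores` would hold the fitness of each candidate parameter vector and `sensitivity` a bound on how much one record can change that fitness; both depend on the fitness function, which the abstract does not specify.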