
Research On Privacy-Preserving Vertical Federated Learning Algorithm For Logistic Regression And Its Application

Posted on: 2023-06-18
Degree: Master
Type: Thesis
Country: China
Candidate: H Z Sun
Full Text: PDF
GTID: 2568306914964839
Subject: Computer Science and Technology
Abstract/Summary:
The development of artificial intelligence technology requires the support of massive data. However, as an important resource in various industries, data is often held by multiple enterprises and usually contains a large amount of sensitive personal information. Directly sharing or using this data may lead to serious privacy leakage. To alleviate these concerns, Google proposed federated learning, a distributed machine learning paradigm that allows parties to keep their data local while training models jointly. Vertical federated learning refers to the setting in which the parties' datasets share the same sample space but have different feature spaces. It has many applications in finance and recommendation and has become a focus of both academia and industry. Logistic regression is a simple and efficient machine learning model, and research on vertical federated learning algorithms for logistic regression has received extensive attention recently. However, existing work suffers from problems such as low model utility and potential privacy leakage, which limit its practicality and feasibility. To solve these problems, this thesis studies privacy-preserving vertical federated learning algorithms for logistic regression and their application.

First, this thesis proposes a privacy-preserving vertical federated model learning algorithm for logistic regression. In this algorithm, a model update strategy based on mini-batch SGD is designed to improve the algorithm's iteration efficiency while ensuring privacy protection, and a label protection scheme based on parameter encryption is designed to provide stronger privacy protection for the label data held by the active party. In addition, this thesis proposes a vertical federated model publishing algorithm for logistic regression that satisfies differential privacy, which can alleviate privacy leakage in the model publishing scenario. The experimental results show that the model learning algorithm effectively improves the prediction performance of the trained model, and that the model publishing algorithm preserves the validity of the model parameters while satisfying differential privacy.

In the field of financial analysis, vertical federated credit scorecards based on logistic regression help different institutions better evaluate customer credit ratings and improve their risk control capabilities. To this end, this thesis proposes a privacy-preserving vertical federated model construction algorithm for credit scorecards. In this algorithm, a vertical federated feature correlation analysis method based on homomorphic encryption and a vertical federated supervised optimized binning algorithm are first designed to optimize the processing of the training data. Second, the proposed vertical federated learning algorithm is applied to a constrained logistic regression step to ensure the validity and interpretability of the final scorecard. The experimental results show that the model construction algorithm can guarantee the prediction performance and interpretability of the final credit scorecard.
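
The vertical setting described in the abstract can be illustrated with a minimal plaintext sketch. It is not the thesis's algorithm: the encryption, label protection, and differential privacy steps are omitted, and all names (`train_vertical_lr`, `partial_logits`, `make_toy_data`), the toy data, and the hyperparameters are assumptions made for illustration. It shows only the data flow: each party computes partial logits on its own feature columns, the label-holding active party forms the residuals, and each party updates its own weights with mini-batch SGD.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def partial_logits(features, weights):
    """One party's contribution: dot product of its feature slice and its weights."""
    return [sum(x * w for x, w in zip(row, weights)) for row in features]


def train_vertical_lr(x_active, x_passive, labels,
                      lr=0.5, epochs=200, batch_size=8, seed=0):
    """Plaintext mini-batch SGD over a vertically split dataset (illustration only)."""
    rng = random.Random(seed)
    w_active = [0.0] * len(x_active[0])
    w_passive = [0.0] * len(x_passive[0])
    indices = list(range(len(labels)))
    for _ in range(epochs):
        rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Each party computes partial logits on its own features only.
            u_a = partial_logits([x_active[i] for i in batch], w_active)
            u_p = partial_logits([x_passive[i] for i in batch], w_passive)
            # The active party holds the labels and forms the residuals.
            # Assumption: a real protocol would exchange these values under
            # encryption; here they are shared in the clear.
            residuals = [sigmoid(a + p) - labels[i]
                         for a, p, i in zip(u_a, u_p, batch)]
            # Each party updates only its own weights from the residuals.
            for j in range(len(w_active)):
                grad = sum(r * x_active[i][j]
                           for r, i in zip(residuals, batch)) / len(batch)
                w_active[j] -= lr * grad
            for j in range(len(w_passive)):
                grad = sum(r * x_passive[i][j]
                           for r, i in zip(residuals, batch)) / len(batch)
                w_passive[j] -= lr * grad
    return w_active, w_passive


def predict(x_active, x_passive, w_active, w_passive):
    """Joint prediction: sum both parties' partial logits, then threshold."""
    u_a = partial_logits(x_active, w_active)
    u_p = partial_logits(x_passive, w_passive)
    return [1 if sigmoid(a + p) >= 0.5 else 0 for a, p in zip(u_a, u_p)]


def make_toy_data(n=64, seed=1):
    """Hypothetical toy data: one feature per party, labels from their sum."""
    rng = random.Random(seed)
    x_active, x_passive, labels = [], [], []
    for _ in range(n):
        a = rng.uniform(-1, 1)
        p = rng.uniform(-1, 1)
        x_active.append([a, 1.0])  # second column is a bias term
        x_passive.append([p])
        labels.append(1 if a + p > 0 else 0)
    return x_active, x_passive, labels
```

Calling `train_vertical_lr(*make_toy_data())` returns one weight vector per party. Neither party needs the other's raw feature columns, only the partial logits and residuals, which are the quantities a privacy-preserving protocol would then need to protect.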
Keywords/Search Tags:federated learning, logistic regression, credit scorecard, homomorphic encryption, differential privacy