| In the era of big data, machine learning is developing rapidly thanks to massive data and abundant computing resources. With the enactment of data privacy protection regulations, it has become difficult for individuals or enterprises to exchange data directly in order to train machine learning models on large-scale data. Federated learning has therefore emerged as a key technique for protecting data privacy in distributed machine learning; it aims to combine data from multiple parties for computation without leaking private data. Vertical federated learning is an important research direction within federated learning: it enables different institutions to train machine learning models jointly and can be applied in fields such as Internet financial lending and medical diagnosis. Data privacy protection is one of the main goals of vertical federated learning: participants must not be able to infer the local private information of others from the transmitted data. At present, vertical federated learning with data privacy protection faces several challenges: (1) High encryption cost for multiple participants. As the number of participants increases, the encryption cost grows exponentially and cannot support multi-participant training scenarios. (2) High communication cost of multi-participant distributed models. Transmitting parameters between participants consumes a lot of network bandwidth and communication time. (3) Lack of incentives in multi-participant models. Incentives could encourage participants to share data or models and collaborate on training. To address the problems of high encryption cost, inefficient communication, and the lack of an incentive mechanism, this paper improves vertical federated learning in terms of privacy protection, communication efficiency, and incentive mechanism. Specifically: (1) To address the problems of high encryption cost and low computational efficiency in traditional privacy protection 
methods, we propose a privacy protection method for vertical federated learning based on secret sharing, in which each participant uses pseudo-random values to mask its own local data. This method avoids the long encryption and decryption times of traditional homomorphic encryption algorithms and supports multiple participants. We demonstrate the security of this method through theoretical analysis. (2) To address the problem of inefficient model communication, we propose an asynchronous communication acceleration method for vertical federated learning based on lazy aggregation. The method adaptively skips the communication of a portion of slowly changing data and shortens the time participants spend waiting, which reduces communication bandwidth and eases the computational pressure on the server. In addition to reducing the communication overhead, this paper theoretically demonstrates that the method outperforms traditional gradient descent methods in terms of accuracy. (3) To address the problem that participants are difficult to motivate without a revenue distribution mechanism, we propose an incentive mechanism for vertical federated learning based on the kernel selection mechanism. The method designs a data importance assessment method to screen high-quality users and a frugal payment allocation algorithm to minimize the total platform expenditure. This paper conducts vertical federated learning simulation experiments on public datasets such as Boston house price prediction and compares traditional encryption mechanisms such as Paillier homomorphic encryption, differential privacy, and secret sharing methods. The experimental results show that the computational efficiency of the proposed method is significantly improved and that the accuracy of the training results is improved by 5%-10%. To verify the efficiency of the communication acceleration method, the paper conducts communication efficiency comparison experiments on datasets such as cancer prediction, and the 
results show that the method improves the convergence speed of model training and reduces both the amount of communication and the number of iterations, shortening the communication time without losing accuracy. Payment allocation experiments based on scenarios such as disease classification in hospitals and deposit prediction in banks show that the platform can motivate high-quality participants with minimal rewards, and the total payment cost is reduced by approximately 10%-23%. This paper also designs a vertical federated recommendation system and conducts federated recommendation simulation experiments on datasets such as movie ratings to verify the security and efficiency of the model’s privacy-preserving approach and communication acceleration algorithm. |
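
The masking idea in contribution (1), where each participant hides its local data behind pseudo-random values, can be illustrated with a minimal sketch. It is not the protocol proposed in this paper: it assumes a simple pairwise-cancelling mask scheme over integers, and the names `MODULUS`, `pairwise_masks`, `mask_values`, and `aggregate` are hypothetical. A seeded `random.Random` stands in for pairwise-agreed pseudo-random generators and is not cryptographically secure.

```python
import random

# Assumed working modulus for masked integer arithmetic (illustrative choice).
MODULUS = 2**61 - 1


def pairwise_masks(num_parties, seed=0):
    """Return one mask per party such that all masks sum to 0 mod MODULUS."""
    # A single seeded generator stands in for seeds agreed between each pair.
    rng = random.Random(seed)
    masks = [0] * num_parties
    for i in range(num_parties):
        for j in range(i + 1, num_parties):
            r = rng.randrange(MODULUS)
            masks[i] = (masks[i] + r) % MODULUS  # party i adds the shared value
            masks[j] = (masks[j] - r) % MODULUS  # party j subtracts it
    return masks


def mask_values(local_values, masks):
    """Each party hides its local value behind its own mask."""
    return [(v + m) % MODULUS for v, m in zip(local_values, masks)]


def aggregate(masked_values):
    """Sum the masked values; the masks cancel, leaving the true sum."""
    return sum(masked_values) % MODULUS


local_values = [12, 30, 7]  # made-up local values, one per party
masks = pairwise_masks(len(local_values), seed=42)
masked = mask_values(local_values, masks)
total = aggregate(masked)
```

Under these assumptions the aggregator only sees the masked values, yet `total` equals the sum of the local values because every pairwise value is added once and subtracted once. Real-valued model updates would first need a fixed-point encoding, which this sketch omits.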