The optimization and industrialization of deep learning have entered a stage of rapid development. The effectiveness of deep learning models depends heavily on large volumes of high-quality training data, whose quantity and quality bound the performance that deep learning applications can achieve. Federated learning is intended to protect the privacy of user and organizational data: it provides a multi-party joint machine learning framework in which participants build a shared model by exchanging model gradients instead of local data. However, gradient leakage attacks can reconstruct a participant's local data from the shared gradients, posing a new challenge to the privacy guarantees of federated learning.

On the attack side, existing research on gradient leakage is difficult to apply in practical scenarios: it cannot reconstruct target labels when the label repetition rate is high, cannot reconstruct target sample features when the batch size exceeds 48, and produces reconstructions that differ visibly from the target samples. On the defense side, gradient perturbation mechanisms such as the commonly used differential privacy can prevent target samples from leaking, but the added noise degrades model test accuracy too severely, making it difficult to balance privacy protection and model usability. To address these issues, this paper proposes new approaches to both attack and defense.

(1) On the attack side, this paper proposes a novel End-to-End Gradient Inversion (E2EGI) method. E2EGI constructs a constraint relationship between labels and gradients and derives a new label reconstruction algorithm that relies only on gradient information, achieving 81% label reconstruction accuracy in scenarios with a 96% label repetition rate, 27% higher than existing methods. The minimum-loss-combination regularization designed in E2EGI selects, from multiple sets of reconstructed samples with different initializations, the sample combination whose gradient differs least from the target gradient, and uses it to correct the other reconstructed samples, yielding reconstructions with higher similarity to the target samples. A distributed gradient inversion algorithm, designed following ideas from distributed machine learning, enables gradient attacks with batch sizes from 8 to 256 on the deep network ResNet-50 and the ImageNet dataset.

(2) On the defense side, this paper proposes a label-based defense method. By reducing the rank of the coefficient matrix of the non-homogeneous linear equations relating the model parameters, gradients, and input samples, the difficulty of solving for the correct inputs is increased; the designed experiments also demonstrate that the labels of the input samples play a key role in the success of gradient attacks. A new method that applies special operations to label duplication and ordering is proposed. Under a similar level of privacy protection, it has the least impact on model test accuracy compared with differential privacy and rank-analysis baselines: in a ResNet-18 training task, test accuracy drops by only 2.3%, while other methods reduce it by at least 9.78%.

This paper focuses on gradient leakage attacks and defenses in federated learning, and its results can support the design of more secure federated learning frameworks.
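The abstract does not spell out the constraint between labels and gradients that E2EGI exploits, but the intuition can be sketched from a well-known single-sample property of softmax cross-entropy: the gradient of the final-layer bias equals p − y (predicted probabilities minus the one-hot label), so its sole negative entry reveals the true label. The sketch below is illustrative only; the class count, logits, and function names are assumptions, not the paper's E2EGI algorithm, which handles batches with repeated labels.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [v / s for v in exps]

def infer_label_from_gradient(bias_grad):
    # For a single sample under softmax cross-entropy, the bias gradient
    # is p - y, so the unique negative entry marks the true label.
    return min(range(len(bias_grad)), key=lambda i: bias_grad[i])

# Hypothetical 5-class example: the attacker only sees the shared gradient.
logits = [1.2, -0.7, 0.3, 2.1, 0.0]
true_label = 3
p = softmax(logits)
bias_grad = [p[i] - (1.0 if i == true_label else 0.0) for i in range(len(p))]

print(infer_label_from_gradient(bias_grad))  # → 3
```

In batches with a high label repetition rate, the per-sample sign pattern is averaged away, which is why a dedicated label reconstruction algorithm such as the one E2EGI proposes is needed.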