Deep neural network models have been widely applied in various fields, where the size of the dataset determines the upper limit of model performance. However, concerns about data privacy from the organizations and companies that hold the data have made data circulation and sharing increasingly difficult. Federated Learning (FL) is a distributed machine learning paradigm designed to address the problem of data silos under privacy constraints: by uploading gradients instead of raw sensitive data, FL alleviates data scarcity. However, recent research indicates that FL still faces privacy and security issues, since attackers can infer private data from shared gradients. Additionally, the distributed structure of FL results in non-independent and identically distributed (non-IID) data across clients, causing each client to update towards its own local optimum and thereby degrading the performance of the global federated model. To address these problems, this thesis makes the following contributions.

First, this thesis implements a distributed federated learning system within a Trusted Execution Environment (TEE) to protect gradients. The framework consists of local client members, local client leaders, and a central server arranged in a hierarchical structure. To overcome the limited physical memory of the TEE while still protecting privacy, local client members apply an important-gradient filtering mechanism that selects the "important" gradients requiring privacy preservation and uploads only these to the TEE of their local client leader. Furthermore, to improve gradient aggregation and maximize the cohesion of the important gradients, the thesis adopts a cohesion-based grouping strategy that places the members with the highest mutual cohesion in the same group. To preserve the original accuracy of the federated model, both the non-important and the important gradients of the group members are uploaded to the server, and the computation is verified using the TEE's integrity mechanisms.

Second, this thesis proposes a global shared knowledge strategy to correct the update direction of local clients. The algorithm introduces different global knowledge-transfer loss functions that carry knowledge transfer through the entire training process and prevent local clients from becoming trapped in local optima. Moreover, the method modifies the model's classification layer to maximize the use of global knowledge while alleviating the differences in data distribution among clients.

Evaluation results demonstrate that, compared with existing methods, the proposed framework reduces computational cost and improves model performance while ensuring data privacy.
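As an illustration of the important-gradient filtering step, the minimal PyTorch sketch below splits a gradient tensor into an "important" part destined for the TEE and a non-important remainder. The top-k magnitude criterion, the `keep_ratio` parameter, and the function name `split_important_gradients` are assumptions for exposition, not the thesis's concrete mechanism.

```python
import torch

def split_important_gradients(grads: torch.Tensor, keep_ratio: float = 0.1):
    """Split a gradient tensor into 'important' and 'non-important' parts.

    Assumption: importance is measured by absolute magnitude and the top
    keep_ratio fraction of entries is kept; the thesis does not fix the
    criterion here. Returns (important_values, important_indices, residual).
    """
    flat = grads.flatten()
    k = max(1, int(keep_ratio * flat.numel()))
    # Indices of the k largest-magnitude entries (the "important" gradients).
    _, idx = torch.topk(flat.abs(), k)
    important = flat[idx]
    # Non-important remainder: zero out the important positions.
    residual = flat.clone()
    residual[idx] = 0.0
    return important, idx, residual
```

Under this split, only `important` (and its indices) would travel to the leader's TEE, keeping the protected payload within the enclave's limited memory, while `residual` follows the ordinary upload path.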
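The cohesion-based grouping strategy could, for example, be realized by measuring pairwise similarity between members' flattened gradient vectors and greedily grouping the most similar ones. The sketch below assumes cosine similarity as the cohesion measure and a greedy pairing scheme; neither choice is specified in the abstract.

```python
import torch
import torch.nn.functional as F

def cohesion_groups(member_grads, group_size: int = 2):
    """Greedily group members whose gradients have the highest pairwise
    cosine similarity ('cohesion'). member_grads is a list of tensors,
    one per client member. Both the similarity measure and the greedy
    scheme are illustrative assumptions.
    """
    vecs = torch.stack([g.flatten() for g in member_grads])
    # Pairwise cosine-similarity matrix, shape (n, n).
    sim = F.cosine_similarity(vecs.unsqueeze(1), vecs.unsqueeze(0), dim=-1)
    sim.fill_diagonal_(float("-inf"))  # never pair a member with itself
    remaining = set(range(len(member_grads)))
    groups = []
    while len(remaining) >= group_size:
        i = next(iter(remaining))
        # Choose the most cohesive still-unassigned partners for member i.
        partners = sorted(remaining - {i}, key=lambda j: sim[i, j].item(),
                          reverse=True)[:group_size - 1]
        group = [i, *partners]
        groups.append(group)
        remaining -= set(group)
    if remaining:  # leftover members form a final, smaller group
        groups.append(sorted(remaining))
    return groups
```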
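For the global knowledge-transfer loss, a common realization is to add a distillation term that pulls local predictions toward the global model's predictions at every local step, counteracting drift toward local optima. The sketch below assumes a standard KL-divergence distillation loss with temperature scaling and an `alpha` mixing weight; the thesis's actual family of transfer losses may differ.

```python
import torch.nn.functional as F

def local_loss(logits, labels, global_logits,
               temperature: float = 2.0, alpha: float = 0.5):
    """Local training objective with a global knowledge-transfer term.

    Assumption: a KL-based distillation loss is used; global_logits are
    the outputs of the frozen global model on the same batch.
    """
    # Standard supervised loss on the local data.
    ce = F.cross_entropy(logits, labels)
    # Pull local predictions toward the global model's soft predictions.
    kd = F.kl_div(
        F.log_softmax(logits / temperature, dim=1),
        F.softmax(global_logits / temperature, dim=1),
        reduction="batchmean",
    ) * temperature ** 2
    return (1 - alpha) * ce + alpha * kd
```

Because the distillation term is present throughout local training, each client's update direction stays anchored to the globally shared knowledge rather than drifting fully toward its own non-IID distribution.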