
Research On Federated Learning Resource Scheduling Mechanism For Non-iid Data

Posted on: 2024-05-07
Degree: Master
Type: Thesis
Country: China
Candidate: H Wu
Full Text: PDF
GTID: 2568307079475384
Subject: Electronic information
Abstract/Summary:
With the large-scale application of artificial intelligence and machine learning services in fields such as the internet, healthcare, finance, and insurance, the resource scheduling problem of federated learning has become increasingly urgent. The objective of this thesis is to optimize the training efficiency of federated learning models through the collaborative scheduling of computing and communication resources in large-scale, spatio-temporally heterogeneous network scenarios, which are characterized by data heterogeneity, communication heterogeneity, and computation heterogeneity. The thesis aims to bridge the theory and practice of federated learning and to provide a theoretical basis and technical means for its practical application.

To address data heterogeneity in federated learning, this thesis focuses on the weighting algorithm and the aggregation step performed at the parameter server. First, the impact of data heterogeneity on the trained model parameters is analyzed in depth, and the way gradients reflect the underlying data distribution is characterized. Second, the image recognition problem under label heterogeneity is studied, and the Softmax with Cross Entropy layer of the neural network is analyzed. This analysis identifies the cause of the model offset that arises during parameter server aggregation and leads to insufficient accuracy or slow convergence. On this basis, a weighted aggregation algorithm based on parameter update trends is proposed to counter the drop in training accuracy of the globally aggregated model under Non-IID data. Simulation results show that the proposed dynamic weight aggregation algorithm improves training accuracy in data-heterogeneous scenarios compared with existing federated averaging algorithms.

To optimize federated learning in complex scenarios where multiple kinds of heterogeneity coexist, this thesis focuses on node selection based on multi-objective optimization. First, a model of the impact of communication, computation, and data heterogeneity on training is established, and the differences in training time and data distribution among nodes within a single round are characterized. Second, the difficulty of node selection in complex scenarios is analyzed; it lies mainly in balancing training time against model loss. Therefore, building on the federated learning architecture, this thesis proposes a dynamic probability selection mechanism based on reinforcement learning, which decides the next round's selection from the feedback of each training round. The selection mechanism ensures that most nodes participate, that the model converges quickly, and that the system has high fault tolerance, thereby further improving the availability of federated learning. Simulation results show that the proposed node selection method improves training performance when multiple kinds of heterogeneity coexist.

In summary, this thesis addresses the low accuracy and efficiency of model training caused by data heterogeneity (Non-IID) in existing federated learning systems. Two methods are used: optimization of client data selection through improved aggregation weights, and optimization of training node selection using reinforcement learning. Both belong to the category of resource scheduling. The contribution of this thesis is to address data heterogeneity with different resource scheduling methods that are applicable to various training algorithms.
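As a rough illustration of the two mechanisms the abstract refers to, the sketch below shows a generic weighted aggregation step at the parameter server and a probability-based node sampler. It is not the thesis's algorithm: the abstract does not specify how the weights are derived from parameter update trends or how the reinforcement learning feedback updates the selection probabilities, so both are taken here as plain inputs. The function names `weighted_aggregate` and `select_nodes` are hypothetical, and parameters are represented as flat lists of floats for simplicity.

```python
import random


def weighted_aggregate(client_params, weights):
    """Weighted average of client parameter vectors.

    With weights equal to client sample counts this is the usual
    federated averaging step; a dynamic scheme would supply other
    weights (how they are computed is NOT specified in the abstract).
    """
    if not client_params or len(client_params) != len(weights):
        raise ValueError("need one weight per client")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    dim = len(client_params[0])
    aggregated = [0.0] * dim
    for params, w in zip(client_params, weights):
        for i, p in enumerate(params):
            aggregated[i] += (w / total) * p
    return aggregated


def select_nodes(probabilities, k, rng=random):
    """Sample up to k distinct node indices without replacement.

    Nodes are drawn in proportion to their selection probability
    (assumed positive). Updating these probabilities from training
    feedback is left out because the abstract gives no details.
    """
    candidates = list(range(len(probabilities)))
    probs = list(probabilities)
    chosen = []
    for _ in range(min(k, len(candidates))):
        pos = rng.choices(range(len(candidates)), weights=probs, k=1)[0]
        chosen.append(candidates.pop(pos))
        probs.pop(pos)
    return chosen


# Two clients with weights 1 and 3: the second dominates the average.
global_params = weighted_aggregate([[1.0, 2.0], [3.0, 4.0]], [1, 3])
selected = select_nodes([0.5, 0.3, 0.2], k=2, rng=random.Random(0))
```

Keeping the weights and probabilities as inputs separates the scheduling policy from the aggregation and sampling mechanics, so a different weighting or feedback rule can be swapped in without changing either function.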
Keywords/Search Tags: Federated Learning, Non-IID Data, Federated Average Algorithm, Node Selection, Distributed Machine Learning