The strengthening of data privacy protection laws makes it difficult for many companies to collect user data directly to train models. Federated learning protects privacy by aggregating models trained locally on clients, so that the data never leaves the client. However, obtaining a high-performance global model requires many rounds of communication between the clients and the server, which incurs a large communication cost. To address this challenge, this thesis studies methods for improving the communication efficiency of federated learning from two aspects: reducing the number of communication rounds and reducing the traffic transmitted per round.

To reduce the number of communication rounds, this thesis studies and improves the Norm-Normalized algorithm. Norm-Normalized accelerates convergence by amplifying the norm of the global model update. However, when client data are not independent and identically distributed (Non-IID), the update direction of the global model deviates from the ideal direction, and amplifying the update norm along a biased direction is not necessarily effective. To address this problem, this thesis proposes federated norm-normalization with client gradient modification (FedNNCGM) and federated norm-normalization with server gradient modification (FedNNSGM). FedNNCGM reduces the angle between the gradient directions of different clients on the client side, alleviating the large directional differences caused by Non-IID data. FedNNSGM corrects the direction of the global gradient on the server by taking into account the gradient directions of clients that did not participate in the current round, preventing their gradient information from being forgotten and effectively mitigating the impact of Non-IID data. Together, these two algorithms resolve the shortcomings of the Norm-Normalized algorithm and effectively improve the communication efficiency of federated learning.

Building on FedNNSGM, we further propose the AdaSTC algorithm, which reduces both the number of communication rounds and the traffic in federated learning. The sparse ternary compression (STC) algorithm applies the same sparsity rate to every layer, which causes layers with fewer parameters to lose more information and degrades model performance. To solve this problem, AdaSTC adjusts the sparsity rate of each layer adaptively: the number of parameters transmitted for layers with many parameters is moderately reduced, while the number transmitted for layers with few parameters is moderately increased. In addition, although STC compresses well and effectively reduces the traffic in federated learning, its convergence accuracy and speed are only comparable to FedAvg. Because FedNNSGM achieves good convergence accuracy and speed, this thesis combines FedNNSGM with AdaSTC, which not only reduces the traffic between the clients and the server but also improves the convergence speed and accuracy of the model.

Extensive experiments on MNIST and CIFAR-10 confirm that AdaSTC based on FedNNSGM converges considerably faster and to higher accuracy than STC and FedAvg under different sparsity rates. With a sparsity rate of 0.1, AdaSTC based on FedNNSGM achieves performance similar to FedNNSGM while reducing the traffic per round, giving it a clear advantage in improving the communication efficiency of federated learning.
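To make the server-side idea concrete, the following is a minimal sketch of a norm-normalized aggregation step with a correction term built from cached updates of clients that were absent in the current round, in the spirit of FedNNSGM. The combination weight `beta`, the amplification factor `gamma`, the cache handling, and the choice of target norm are illustrative assumptions, not the thesis's exact formulation.

```python
# Sketch of a norm-normalized server update with a correction term from clients
# that did not participate this round. Parameter choices are assumptions.
import numpy as np

def server_update(global_weights, round_updates, cached_updates, beta=0.5, gamma=1.0):
    """
    global_weights : flat np.ndarray holding the current global model.
    round_updates  : dict {client_id: flat np.ndarray} from this round's participants.
    cached_updates : dict {client_id: flat np.ndarray} with the last update received
                     from every client seen so far (stale for absent clients).
    beta           : assumed weight of the correction term from absent clients.
    gamma          : assumed amplification factor for the normalized global update.
    """
    # Average the updates of the clients that participated this round.
    participants = np.mean(list(round_updates.values()), axis=0)

    # Build a correction direction from clients that were NOT sampled this round,
    # so their (stale) gradient information is not forgotten.
    absent = [u for cid, u in cached_updates.items() if cid not in round_updates]
    if absent:
        correction = np.mean(absent, axis=0)
        combined = (1.0 - beta) * participants + beta * correction
    else:
        combined = participants

    # Norm normalization: rescale the combined direction so its norm matches the
    # average norm of this round's client updates, amplified by gamma.
    target_norm = gamma * np.mean([np.linalg.norm(u) for u in round_updates.values()])
    combined = combined / (np.linalg.norm(combined) + 1e-12) * target_norm

    # Refresh the cache with the updates actually received this round.
    cached_updates.update(round_updates)
    return global_weights + combined

# Toy usage: 4 clients, 2 of which participate in the current round.
d = 10
w = np.zeros(d)
cache = {i: np.random.randn(d) * 0.01 for i in range(4)}
w = server_update(w, {0: np.random.randn(d) * 0.01, 1: np.random.randn(d) * 0.01}, cache)
```

Because the cached updates of absent clients are stale, `beta` would typically be kept small in practice; the sketch only illustrates how stale directions can be folded into the normalized global update.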
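Similarly, the layer-adaptive compression in AdaSTC can be illustrated by the sketch below: each layer is ternarized as in STC, but the per-layer sparsity rate is reallocated so that small layers keep relatively more entries while the total transmitted volume stays close to the global budget. The inverse-square-root weighting in `adaptive_rates` is an assumed allocation rule used only for illustration, not AdaSTC's exact rule.

```python
# Sketch of layer-adaptive sparse ternary compression; the allocation rule is assumed.
import numpy as np

def ternary_compress(tensor, rate):
    """Keep the top `rate` fraction of entries by magnitude and quantize them
    to +/- the mean magnitude of the kept entries (the standard STC step)."""
    flat = tensor.ravel()
    k = max(1, int(round(rate * flat.size)))
    idx = np.argpartition(np.abs(flat), -k)[-k:]      # indices of the k largest magnitudes
    mu = np.abs(flat[idx]).mean()                      # shared magnitude of the ternary code
    out = np.zeros_like(flat)
    out[idx] = mu * np.sign(flat[idx])
    return out.reshape(tensor.shape)

def adaptive_rates(layer_sizes, base_rate):
    """Give smaller layers a higher rate and larger layers a lower one, while keeping
    the total number of transmitted values close to base_rate * total parameters."""
    sizes = np.asarray(layer_sizes, dtype=float)
    raw = base_rate / np.sqrt(sizes / sizes.mean())    # assumed inverse-sqrt weighting
    budget = base_rate * sizes.sum()
    return np.clip(raw * budget / (raw * sizes).sum(), 0.0, 1.0)

# Example: three layers of very different sizes with a global sparsity rate of 0.1.
layers = [np.random.randn(n) for n in (500, 50_000, 5_000_000)]
rates = adaptive_rates([l.size for l in layers], base_rate=0.1)
compressed = [ternary_compress(l, r) for l, r in zip(layers, rates)]
```

In this example the smallest layer retains most of its entries while the largest layer keeps under 10% of its entries, and the total transmitted volume remains close to the 10% budget.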