Research On Federated Learning Methods For Unbalanced Data

Posted on:2022-06-14

Degree:Master

Type:Thesis

Country:China

Candidate:Y Liu

Full Text:PDF

GTID:2518306338467124

Subject:Electronics and Communications Engineering

Abstract/Summary:

With the arrival of the big data era, data from the same or different industries can be distilled into great value by artificial intelligence (AI) based technologies. Traditional centralized AI techniques gather data in one place and extract information for a specific task. However, with the rapid development and application of big data, data security and privacy protection have attracted much more attention, and many laws and regulations have been issued to restrict the arbitrary circulation and use of private data, which results in the "data islands" problem. In response to this problem, Google proposed federated learning (FL), a machine learning framework that enables data owners to share the value of their data, rather than the data itself, through collaborative training while preserving data security and data privacy. As a mainstream FL model, the Federated Averaging algorithm proposed by Google shares learned knowledge by computing and exchanging model parameters or gradient information, which has the following shortcomings: 1) severe performance degradation when the object types are significantly imbalanced between the partners; 2) security risks arising from the sharing of model parameters.

In this thesis, two improved FL models are developed to address these two problems on severely imbalanced datasets. For severely imbalanced datasets with low privacy requirements on the machine learning model, a model parameter sharing-based FL model is designed, which modifies the training method by iteratively transferring model parameters between partners. Its performance is evaluated on an open dataset, and the results verify that it approaches the performance of its centralized counterpart. For severely imbalanced data with much higher sensitivity requirements on the machine learning model, a knowledge distillation and transfer-based FL model is developed, which can flexibly support heterogeneous models between partners. Moreover, a fake public dataset generation method is designed to solve the problem of insufficient public data, which greatly improves classification accuracy.
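For readers unfamiliar with the baseline the abstract refers to, the aggregation step of Federated Averaging can be sketched as a sample-count-weighted mean of the partners' model parameters. The following is only an illustrative sketch of that baseline, not the thesis's improved models: the function name `federated_average`, the flat-list parameter representation, and the example numbers are assumptions made here for illustration.

```python
from typing import List, Sequence


def federated_average(client_weights: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Average flat parameter vectors, weighting each partner by its sample count.

    Hypothetical helper for illustration; real systems average per-layer tensors.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per non-empty list of partners")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_weights[0])
    if any(len(w) != dim for w in client_weights):
        raise ValueError("all parameter vectors must have the same length")

    averaged = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        share = size / total  # this partner's fraction of all training samples
        for i, value in enumerate(weights):
            averaged[i] += share * value
    return averaged


# Two partners with made-up parameters; the second holds three times as much data.
global_params = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

Because the weights depend only on sample counts, a partner whose local data covers few object types still pulls the global parameters toward its own skewed optimum, which is one common way to understand the degradation under imbalance that the abstract describes.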

Keywords/Search Tags:

federated learning, machine learning, unbalanced and highly sensitive data sets, knowledge distillation

Related items

1	Design And Implementation Of Federated Learning Algorithms For Heterostructure And Heterogeneous Data
2	Federated Learning Based On Knowledge Distillation
3	Robust And Communication Efficient Multi-Model Solution For Cross-Silo Federated Learning
4	Optimization Method For Federated Learning Model With Unbalanced Dataset
5	Personalized Federated Learning Method For Heterogeneous Data
6	Research On Optimization And Application Of Heterogeneous Data Fusion In Federated Learning
7	Research On Federated Learning Method For CTR Prediction Scenarios
8	Research On Federated Learning With Heterogeneous Data
9	Research On Key Technology Of Ensemble Knowledge Distillation In Federated Learning
10	Design And Implementation Of Personalized Federated Learning Algorithm Based On Multiple Aggregation Servers