
Design And Implementation Of Federated Learning Algorithms For Heterostructure And Heterogeneous Data

Posted on: 2024-04-06    Degree: Master    Type: Thesis
Country: China    Candidate: L J Peng    Full Text: PDF
GTID: 2568307079472454    Subject: Electronic information
Abstract/Summary:
Data has emerged as a new factor of production in the era of the digital economy, carrying both resource value and strategic importance. Machine learning algorithms, as the main drivers of data value extraction, have been widely applied across industries and yield significant economic benefits. However, as legal regulation matures, privacy protection requirements have become increasingly stringent, and the era of consolidating heterogeneous, differently structured data into a single dataset for mining has passed. Instead, data now sits in isolated "data silos": correlated data exists in structurally or statistically heterogeneous form and cannot be physically centralized, so solutions must be designed around the actual distribution of the data in order to extract its value. This demand raises several challenges. When features are split across differently structured datasets at different physical sites, those features must be combined and their correlations identified, yet traditional centralized recommendation algorithms cannot run in such a distributed environment and raise data-privacy concerns. When data distributions are highly heterogeneous, improving algorithm performance and convergence speed becomes crucial. This thesis addresses these challenges by drawing on federated learning, knowledge distillation, and related theory. The main contributions are as follows.

First, the Deep Factorization Machine (DeepFM) algorithm is adapted to distributed storage of structurally heterogeneous data and extended with a vertical federated learning approach. The computation structure of DeepFM is analyzed in detail and redesigned to meet the requirements of a distributed computing environment, and privacy protection based on obfuscation and differential privacy keeps intermediate data private (the first sketch below illustrates this kind of intermediate-output perturbation). The performance of the improved algorithm is comparable to that of the original centralized algorithm.

Second, the federated distillation technique from horizontal federated learning is applied to address statistically heterogeneous (non-IID) data. Distilling over a public dataset reduces the need for encryption during training and resolves the privacy issues surrounding intermediate data. The factor-transfer technique is introduced to exploit the intermediate knowledge inside local models, with paraphraser and translator modules transferring knowledge factors and thereby accelerating convergence. On this basis, the distance between the global convergence point and each local convergence point is estimated from the local training loss of each node; by adjusting the learning rate of local training, the global convergence point can be shifted toward a relatively fair convergence point for all nodes, effectively improving performance in highly heterogeneous distributed storage environments (the second sketch below illustrates these two ideas).
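The following sketch, written in Java for consistency with the system described later, shows one way a vertical-federated DeepFM score could be split across parties, with each party perturbing its partial result before sharing it. It is a minimal illustration under assumed class and method names, not the thesis code, and the local interaction term is deliberately simplified.

```java
import java.util.Random;

/**
 * Minimal sketch (not the thesis implementation) of a vertically federated
 * DeepFM-style scoring step: each party scores only its own feature slice and
 * perturbs the partial result, so the coordinating party only sees noisy
 * intermediate values. All names here are illustrative assumptions.
 */
public class VerticalDeepFmSketch {

    /** Partial score produced by one party over its local feature slice. */
    static double partialScore(double[] localFeatures, double[] linearWeights,
                               double[][] embeddings) {
        // First-order (linear) term over this party's features only.
        double linear = 0.0;
        for (int i = 0; i < localFeatures.length; i++) {
            linear += linearWeights[i] * localFeatures[i];
        }
        // Simplified stand-in for the local part of the FM interaction /
        // deep component: a pooled-embedding quadratic term.
        double[] pooled = new double[embeddings[0].length];
        for (int i = 0; i < localFeatures.length; i++) {
            for (int k = 0; k < pooled.length; k++) {
                pooled[k] += localFeatures[i] * embeddings[i][k];
            }
        }
        double interaction = 0.0;
        for (double v : pooled) {
            interaction += 0.5 * v * v;
        }
        return linear + interaction;
    }

    /** Laplace-style perturbation of an intermediate value (differential-privacy flavour). */
    static double perturb(double value, double scale, Random rng) {
        double u = rng.nextDouble() - 0.5;
        return value - scale * Math.signum(u) * Math.log(1 - 2 * Math.abs(u));
    }

    /** The coordinating party combines noisy partial scores and applies the sigmoid. */
    static double combine(double[] noisyPartialScores) {
        double sum = 0.0;
        for (double s : noisyPartialScores) {
            sum += s;
        }
        return 1.0 / (1.0 + Math.exp(-sum));
    }
}
```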
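The next sketch illustrates, under assumed interfaces, the two horizontal-federation ideas summarized above: exchanging predictions on a shared public dataset instead of model parameters, and scaling each node's learning rate by how far its training loss sits from the average. The averaging and scaling rules shown are assumptions for illustration, not the thesis's exact formulas.

```java
/**
 * Minimal sketch of (1) federated distillation, where clients share logits on
 * a public dataset rather than model parameters, and (2) a loss-aware
 * learning-rate adjustment meant to pull the global model toward a fairer
 * convergence point. Both rules below are illustrative assumptions.
 */
public class FederatedDistillationSketch {

    /** Server-side step: average per-client logits on the public dataset. */
    static double[][] aggregateSoftLabels(double[][][] clientLogits) {
        int numSamples = clientLogits[0].length;
        int numClasses = clientLogits[0][0].length;
        double[][] avg = new double[numSamples][numClasses];
        for (double[][] logits : clientLogits) {
            for (int s = 0; s < numSamples; s++) {
                for (int c = 0; c < numClasses; c++) {
                    avg[s][c] += logits[s][c] / clientLogits.length;
                }
            }
        }
        return avg;  // broadcast back to clients as distillation targets
    }

    /**
     * Client-side step: scale the local learning rate by the ratio of the
     * node's training loss to the mean loss, so poorly fitted nodes pull the
     * global convergence point toward themselves (assumed rule).
     */
    static double adjustedLearningRate(double baseLr, double localLoss, double meanLoss) {
        if (meanLoss <= 0) {
            return baseLr;
        }
        return baseLr * (localLoss / meanLoss);
    }
}
```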
Finally, this thesis designs and implements a federated learning system in the Java environment. The system supports both horizontal and vertical federated learning and provides interfaces for training control, logging, and computation-structure construction, among other functions, enabling secondary development (an illustrative sketch of such extension points follows).
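As an illustration only, the extension points of such a system might resemble the hypothetical Java interfaces below; the names are invented for this sketch and are not the thesis system's actual API.

```java
/**
 * Hypothetical extension points for a Java federated-learning system that
 * supports training control, logging, and computation-structure construction.
 */
public interface TrainingController {
    /** Start a federated job described by a computation-structure definition. */
    void startJob(String jobId, ComputationGraphSpec spec);

    /** Stop a running job and release participant resources. */
    void stopJob(String jobId);
}

interface ComputationGraphSpec {
    /** Whether the job runs in horizontal or vertical federated mode. */
    String federationMode();
}

interface TrainingLogger {
    /** Record a per-round metric such as loss or accuracy. */
    void logMetric(String jobId, int round, String name, double value);
}
```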
Keywords/Search Tags: Federated learning, Machine learning, Knowledge distillation