Font Size: a A A

Research On Distributed Online Algorithm Via Bandit Feedback

Posted on:2022-08-28Degree:MasterType:Thesis
Country:ChinaCandidate:X M ZhuFull Text:PDF
GTID:2480306530959719Subject:Operational Research and Cybernetics
Abstract/Summary:PDF Full Text Request
In recent years,with the increase in the scale of data and the popularization of high-speed streaming generation methods,algorithms based on distributed optimization problems have achieved tremendous development in both theory and application.Many algorithms based on this framework have been designed and successfully applied in practice.However,with the explosive growth of data scale,centralized optimization algorithms are difficult to solve large-scale optimization problems due to the limitation of the computing bottleneck of a single machine.The distributed mechanism of multicomputer cooperation can greatly reduce the computational burden of a single computer.At the same time,in a distributed network,through mutual coordination and cooperation between nodes,it can effectively solve large-scale practical problems such as smart grids and sensor networks,and can improve data transmission efficiency and enhance network robustness.However,in practical applications,distributed networks generally run in a dynamic environment.Traditional batch learning algorithms are time-consuming to process massive amounts of data,while online learning has the characteristics of updating models in real time and can dynamically adjust models according to data changes.Which can more efficiently complete the processing of a large amount of real-time data,and has demonstrated important application value in machine learning,online recommendation systems and resource allocation.However,in actual situations,such optimization problems that gradient information cannot be directly obtained or are difficult to obtain occupy a very important position in distributed online optimization.Therefore,it is of vital significance to study such problems.This thesis focuses on the related algorithms and convergence results of a class of distributed online optimization problems.The rest of the thesis is arranged as follows:In the first chapter,introduces the relevant background knowledge needed in this thesis,and briefly summarizes the main research content and innovations of this thesis.In the second chapter,consider a distributed online optimization problem that is difficult or impossible to obtain under an undirected graph,that is,the Bandit problem.Using Bandit feedback technology,the Bandit distributed online algorithm for this problem is designed,and the relevant convergence analysis is given.At the same time,numerical simulation experiments prove that the algorithm is effective.In the third chapter,considering that under the directed graph,the existing distributed online algorithms are not allowed to be applied to some Bandit problems based on propagation networks.Using Bandit feedback technology and the nature of row randomness,an algorithm for this problem is designed,and the relevant convergence analysis is given.Finally,a numerical example is given to prove the effectiveness of the algorithm.In the fourth chapter,consider the Bandit problem that cannot be handled by the existing distributed online algorithms in the dynamic communication graph scenario under the time-varying directed graph.Using Bandit feedback technology and time-varying random properties,the algorithm of this problem is designed,and the relevant convergence analysis is given.Finally,a numerical example proves that the algorithm is effective.In the fifth chapter,summarizes the research of this thesis and makes a prospect for the follow-up research work.
Keywords/Search Tags:Multi-agent network, Distributed optimization, Online optimization, Mirror descent algorithm, Dual average algorithm, Bregman divergence, Bandit feedback, Smoothing function
PDF Full Text Request
Related items