
Research On Reinforcement Learning Algorithms For Complex Problems

Posted on: 2022-05-05    Degree: Master    Type: Thesis
Country: China    Candidate: F Y Liu    Full Text: PDF
GTID: 2518306323462434    Subject: Computer application technology
Abstract/Summary:
Reinforcement learning is a class of machine learning methods widely regarded as one of the key techniques for achieving artificial general intelligence. As problems in real-world application scenarios grow more complex, the study of efficient reinforcement learning algorithms has received increasing attention. On the one hand, to solve complex problems, reinforcement learning methods often use deep neural networks to represent policies and value functions, which leads to non-convex and non-smooth optimization problems. Derivative-based reinforcement learning methods therefore easily fall into local optima, while derivative-free reinforcement learning methods avoid this problem but suffer from extremely low sample efficiency when the problem dimension is high. How to improve the sample efficiency of derivative-free reinforcement learning methods is thus worth studying. On the other hand, many complex real-world problems contain multiple different, and possibly conflicting, objectives, whereas most reinforcement learning methods assume a single objective, and existing multi-objective reinforcement learning algorithms suffer from poor scalability, time-consuming training, and low quality of the obtained Pareto set. How to effectively remedy these shortcomings of multi-objective reinforcement learning methods is also worth studying. The main work of this paper includes:

(1) An ES-based derivative-free reinforcement learning algorithm, SGES, is proposed to solve high-dimensional reinforcement learning problems. This work addresses the high variance of the gradient estimator in ES, the representative derivative-free reinforcement learning algorithm, which leads to its low sample efficiency. The proposed SGES algorithm effectively reduces the variance of the gradient estimator by using historical gradient estimates to construct a gradient subspace and its orthogonal complement (see the first sketch below). This paper shows that the variance of the gradient estimator in SGES can be much smaller than that of the gradient estimator in ES. Experimental results verify this theoretical conclusion and also show the superiority of SGES over other algorithms.

(2) A multi-objective reinforcement learning algorithm based on meta-learning, PG-Meta-MORL, is proposed to solve multi-objective reinforcement learning problems. PG-Meta-MORL models the multi-objective reinforcement learning problem as a meta-learning problem: it iteratively optimizes a meta-policy over multiple tasks selected by a fitted prediction model, which guides the overall optimization in the direction that best improves the quality of the current Pareto set (see the second sketch below). Experimental results show that PG-Meta-MORL not only finds a high-quality approximate Pareto set but also adapts quickly to newly given objective preferences.
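The following is a minimal, illustrative sketch of the subspace-guided gradient estimation idea behind SGES in (1), not the thesis's implementation: an antithetic ES estimator that draws some search directions from the subspace spanned by recent gradient estimates and the rest from its orthogonal complement. The function name sges_gradient and the parameters alpha, n_pairs, and sigma are assumptions made for illustration.

```python
import numpy as np

def sges_gradient(f, theta, grad_history, n_pairs=16, sigma=0.05, alpha=0.5, rng=None):
    """Illustrative antithetic ES gradient estimate for maximizing f(theta).

    With probability `alpha` a search direction is drawn from the subspace
    spanned by recent gradient estimates, otherwise from its orthogonal
    complement; without history it falls back to plain isotropic sampling.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = theta.size
    U = None
    if grad_history:
        # Orthonormal basis of the subspace spanned by historical gradient estimates.
        U, _ = np.linalg.qr(np.stack(grad_history, axis=1))
    grad = np.zeros(d)
    for _ in range(n_pairs):
        z = rng.standard_normal(d)
        if U is None:
            eps = z                          # plain isotropic ES direction
        elif rng.random() < alpha:
            eps = U @ (U.T @ z)              # direction inside the gradient subspace
        else:
            eps = z - U @ (U.T @ z)          # direction in the orthogonal complement
        eps /= np.linalg.norm(eps)
        # Antithetic finite-difference estimate of the directional derivative.
        grad += (f(theta + sigma * eps) - f(theta - sigma * eps)) / (2 * sigma) * eps
    return grad / n_pairs
```

In use, grad_history would hold the last few estimates returned by this function (for example, a deque of length 5), and theta would then be updated with a step along the estimated gradient.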
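Below is a loose sketch of the kind of workflow described in (2), assuming a Reptile-style outer update, a weighted-sum scalarization of the objectives, a two-point ES inner loop standing in for policy gradients, and a caller-supplied improvement_model standing in for the fitted prediction model; these choices and all names are illustrative assumptions, not the algorithm as specified in the thesis.

```python
import numpy as np

def meta_morl_step(meta_params, candidate_weights, evaluate_objectives,
                   improvement_model, n_tasks=4, inner_steps=5,
                   inner_lr=0.02, meta_lr=0.1, sigma=0.05, rng=None):
    """One meta-update over scalarized multi-objective tasks.

    `evaluate_objectives(params)` returns the vector of objective returns, and
    `improvement_model(w)` predicts how much training on preference vector `w`
    would improve the current Pareto set (both supplied by the caller).
    """
    rng = np.random.default_rng() if rng is None else rng

    # Select the preference vectors with the largest predicted improvement.
    scores = np.array([improvement_model(w) for w in candidate_weights])
    chosen = [candidate_weights[i] for i in np.argsort(scores)[-n_tasks:]]

    adapted = []
    for w in chosen:
        params = meta_params.copy()
        for _ in range(inner_steps):
            # Two-point ES step on the weighted-sum scalarized task.
            eps = rng.standard_normal(params.size)
            plus = w @ evaluate_objectives(params + sigma * eps)
            minus = w @ evaluate_objectives(params - sigma * eps)
            params = params + inner_lr * (plus - minus) / (2 * sigma) * eps
        adapted.append(params)

    # Reptile-style outer step: move the meta-policy toward the adapted policies.
    return meta_params + meta_lr * (np.mean(adapted, axis=0) - meta_params)
```

In this sketch, the per-preference adapted policies would serve as candidates for the approximate Pareto set, and the prediction model would be refitted on the observed improvements after each meta-iteration.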
Keywords/Search Tags:Reinforcement learning, Derivative-free optimization, Evolution strategies, Multi-objective optimization, Meta-learning