Research On Value Reinforcement Learning Based On Generalized Fixed Points

Posted on:2024-02-15

Degree:Master

Type:Thesis

Country:China

Candidate:Y Z Lv

Full Text:PDF

GTID:2558307136495234

Subject:Computer technology

Abstract/Summary:

In recent years, reinforcement learning has become a standard paradigm for solving sequential decision problems. For large-scale or continuous problems, a widely used technique is value function estimation, whose accuracy directly affects the performance of reinforcement learning. Linear value function estimation is one such method, but in practice its accuracy remains an open problem. To address this, a fixed-point perspective has been introduced into linear value function estimation: the estimation problem is recast as the problem of finding a fixed point, which can yield more accurate value function estimates. However, the existing fixed-point solutions in reinforcement learning are not optimal, and each has its own defects. Which fixed-point solution is better, and how to express and approximate the optimal solution, are two main open problems, and they are the problems this thesis sets out to solve. The main work and contributions of this thesis are as follows:

1. To address the question of which fixed-point solution is better, this thesis proposes a model design for a generalized fixed-point solution. It makes two contributions: an extension of the fixed-point solution based on the n-step bootstrap method, and a construction of the fixed-point solution based on linear interpolation. The idea is applied to the mature CBMPI algorithm framework, yielding the CBMPI(n,β) algorithm based on the generalized fixed point.

2. To address the question of how to express and approximate the optimal solution, this thesis proposes parameter optimization of the generalized fixed-point solution based on Bayesian optimization, together with a higher-quality solution based on ensemble learning, with the aim of approximating the optimal solution and finding a better sub-optimal solution.

3. The effectiveness of the proposed algorithm is verified in the classic Tetris game environment and compared with methods reported in the literature.
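The abstract does not state the formulas behind the generalized fixed point, so the following is only an illustrative sketch of how an n-step extension and a linear interpolation parameter β could enter a linear value function estimate. It assumes a known transition matrix `P`, reward vector `r`, and state weighting `d`, and it uses a textbook-style blend between the projected (TD) fixed point (β = 0) and the Bellman-residual minimizer (β = 1). The function names, the toy chain, and the specific blending rule are assumptions made for illustration and are not taken from the thesis.

```python
import numpy as np

def n_step_model(P, r, gamma, n):
    """Return (r_n, M_n) such that the n-step Bellman operator is
    T^n v = r_n + M_n v, with M_n = gamma^n P^n."""
    r_n = np.zeros(len(r))
    M = np.eye(len(r))
    for _ in range(n):
        r_n = r_n + M @ r      # accumulate gamma^k P^k r
        M = gamma * (M @ P)    # advance to gamma^(k+1) P^(k+1)
    return r_n, M

def generalized_fixed_point(Phi, P, r, d, gamma, n=1, beta=0.0):
    """Solve for weights theta of the linear estimate v = Phi @ theta.

    beta = 0 gives the projected fixed point of the n-step operator;
    beta = 1 gives the n-step Bellman-residual minimizer;
    values in between interpolate linearly (assumed blending rule).
    """
    r_n, M = n_step_model(P, r, gamma, n)
    D = np.diag(d)
    residual_features = Phi - M @ Phi        # (I - gamma^n P^n) Phi
    test_features = Phi - beta * (M @ Phi)   # blend of Phi and residual features
    A = test_features.T @ D @ residual_features
    b = test_features.T @ D @ r_n
    return np.linalg.solve(A, b)

# Hypothetical 3-state chain with two features per state.
P = np.array([[0.9, 0.1, 0.0],
              [0.0, 0.9, 0.1],
              [0.1, 0.0, 0.9]])
r = np.array([0.0, 0.0, 1.0])
d = np.full(3, 1.0 / 3.0)   # uniform weighting (stationary for this chain)
gamma = 0.9
Phi = np.array([[1.0, 0.0],
                [1.0, 1.0],
                [1.0, 2.0]])

theta = generalized_fixed_point(Phi, P, r, d, gamma, n=3, beta=0.5)
v_hat = Phi @ theta         # approximate state values
```

In this sketch `n` and `beta` are free parameters, which is the kind of setting where a search procedure such as Bayesian optimization could be used to pick them. When the features can represent the true value function exactly, every choice of `n` and `beta` returns the same answer, so the parameters only matter under approximation.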

Keywords/Search Tags:

Linear value function estimation, Fixed point, Bayesian optimization, Ensemble learning, Tetris

Related items

1	Study On Fast Fixed Point Algorithm And Its Application Based On Compressed Sensing
2	A Bayesian decision theoretical approach to supervised learning, selective sampling, and empirical function optimization
3	Receiver’s Algorithm Design And Fixed-point Simulation In 802.11ax
4	Fixed-Point Inference Of Neural Image Compression
5	Fixed Point Realization Of Channel Estimation Algorithm Based On FFT For OFDM Systems
6	Fixed Point Realization Of Channel Estimation Algorithm Based On FFT For OFDM Systems
7	Research On MPEG-4 AAC Encoding Technology And Implement Of Fixed-point DSP Program
8	Multistability Analysis Of Recurrent Neural Networks With Generalized Piecewise Linear Activation Functions
9	Research On Rigid And Non-rigid Point Set Registration Based On Iterative Linear Optimization
10	Research On Image Integrity Authentication Based On Fixed Point Theory