
Research On Action Control And Decision Based On Reinforcement Learning

Posted on: 2021-01-25
Degree: Master
Type: Thesis
Country: China
Candidate: J Xu
Full Text: PDF
GTID: 2370330602985483
Subject: Control engineering
Abstract/Summary:
Human learning proceeds from initial ignorance to adaptation to the environment, and it rests on the individual's continuous exploration and trial and error. If this process is transferred to intelligent systems such as robots, can an agent explore a specific environment in the same way that humans learn? Reinforcement learning, a branch of machine learning, gives an agent this ability to explore its environment. However, simple exploration combined with the storage of large amounts of data consumes time and space while doing little to improve the agent's underlying intelligence. As reinforcement learning algorithms have developed, the field has expanded from the original algorithms that assume a known model to algorithms that work when the model is unknown, and a variety of algorithms based on Q-Learning have emerged to solve different machine learning problems.

Against this background, this thesis takes the theory of Markov decision processes as its starting point and the actions and decisions of agents during environmental exploration as its research object. It analyzes and compares existing reinforcement learning algorithms and, taking the practical scenario and background requirements into account, designs and implements an improved algorithm based on Q-Learning. To study the agent's action control and decision-making behavior, Atari 2600 was selected as the experimental environment, and the improved algorithm was tested with respect to the agent's action behavior, decision judgment, and related aspects.

After sufficiently long training, the improved algorithm is more intelligent and autonomous in motion control and behavior decision-making than the unimproved comparison algorithm. In the same experiments it also achieves better overall performance and higher episode scores. It reaches higher learning efficiency within a shorter training period, which indicates that the improved algorithm not only greatly improves control of the agent's actions but also adapts to the environment faster and accumulates learning experience quickly. Comparative analysis of the experiments shows that the improved algorithm designed in this thesis improves the agent's performance in motion control and decision-making.
Keywords/Search Tags:Reinforcement learning, Q-Learning, motion control, decision-making
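
The abstract does not reproduce the improved algorithm, so the following is only a minimal sketch of the standard tabular Q-Learning update that such work builds on, not the thesis's method. The five-state chain environment, the function names (`q_update`, `epsilon_greedy`, `step`, `train`), and all hyperparameter values are illustrative assumptions; the thesis itself uses Atari 2600, which would need a far larger state representation.

```python
import random

def q_update(q, state, action, reward, next_state, done, alpha=0.1, gamma=0.99):
    """Apply one standard Q-Learning update in place and return the new value."""
    # Terminal transitions have no future value to bootstrap from.
    best_next = 0.0 if done else max(q[next_state])
    td_target = reward + gamma * best_next
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]

def epsilon_greedy(q, state, epsilon, rng):
    """Explore with probability epsilon, otherwise act greedily (random tie-break)."""
    if rng.random() < epsilon:
        return rng.randrange(len(q[state]))
    best = max(q[state])
    candidates = [a for a, value in enumerate(q[state]) if value == best]
    return rng.choice(candidates)

def step(state, action, n_states):
    """Hypothetical toy chain: action 1 moves right, action 0 moves left.

    Reaching the rightmost state ends the episode with reward 1.
    """
    if action == 1:
        next_state = min(state + 1, n_states - 1)
    else:
        next_state = max(state - 1, 0)
    done = next_state == n_states - 1
    return next_state, (1.0 if done else 0.0), done

def train(n_states=5, episodes=500, epsilon=0.2, seed=0):
    """Learn a Q-table for the toy chain by trial and error."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = epsilon_greedy(q, state, epsilon, rng)
            next_state, reward, done = step(state, action, n_states)
            q_update(q, state, action, reward, next_state, done)
            state = next_state
    return q

if __name__ == "__main__":
    for s, row in enumerate(train()):
        print(s, [round(v, 3) for v in row])
```

In this sketch the agent starts with no knowledge (all values zero) and improves purely through exploration and reward feedback, which mirrors the trial-and-error process the abstract describes. After training, the learned value of moving right should exceed that of moving left in every non-terminal state.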