The Control Of The Inverted Pendulum Based On Reinforcement Learning

Posted on: 2005-10-29  Degree: Master  Type: Thesis
Country: China  Candidate: H Zhang  Full Text: PDF
GTID: 2168360122998827  Subject: Computer software and theory
Abstract/Summary:
Since the 1970s, researchers have explored many learning strategies and learning algorithms while applying learning to a wide range of problems. These efforts were highly successful and accelerated the development of machine learning. In 1980, the first international workshop on machine learning, held at CMU, marked the rise of machine learning worldwide. In 1989, Carbonell published an article identifying four research directions in machine learning: connectionist learning, symbolic inductive learning, genetic learning and analytic learning. Nearly a decade later, in 1997, Dietterich proposed four new directions: ensembles of classifiers, supervised learning algorithms for massive data, reinforcement learning, and learning complex statistical models.

The terms "reinforcement" and "reinforcement learning" were first proposed by Minsky in 1954 and appeared in the engineering literature [26]. In 1965, Waltz and Jingsun Fu independently introduced the concept into control theory. Research on reinforcement learning progressed slowly during the 1960s and 1970s. From the 1980s onward, driven by research on neural networks and advances in computer technology, interest in reinforcement learning surged, and it gradually became an active field of machine learning. Researchers around the world proposed many learning algorithms and learning strategies and applied reinforcement learning in many fields: game playing, where the earliest example is Samuel's checkers program; scheduling optimization; and, most widely, robotics and control problems, where the representative example is the control of the inverted pendulum.

Among stability control problems, the inverted pendulum is both universal and representative. As a piece of equipment, its cost is low and its structure is simple.
As a control object, however, it is a far more complex system: high-order, unstable, nonlinear and strongly coupled, so only an effective control method can stabilize it. When a new theory or method is proposed and cannot be strictly proved, the inverted pendulum system can be used to validate its correctness and practicability. Research on the inverted pendulum therefore has both profound theoretical significance and an important engineering background. A helicopter, a rocket in flight, an artificial satellite in orbit, and a robot lifting weights, performing gymnastics or walking all pose problems similar to the stability control of the inverted pendulum system. Research on the inverted pendulum is thus of practical importance to high technology such as rocket flight and robot control.

Building on a thorough survey of machine learning, reinforcement learning and the inverted pendulum, this paper applies the idea of reinforcement learning to the control of the one-link and two-link inverted pendulum and analyzes the learning results. The innovations of the paper are as follows:

First, this paper combines the idea of reinforcement learning with multidimensional linear interpolation to control the inverted pendulum. In this method, the state space is discretized, the value function is represented by rules, and reinforcement learning directly learns the force used to control the inverted pendulum. The learning results indicate that the learned force is almost linear in the state variables. This result is a necessary preparation for learning the coefficients of the control equation of the inverted pendulum.

Second, by learning the coefficients of the control equation of the one-link and two-link inverted pendulum, the inverted pendulum can be controlled well.
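
To make the first method more concrete, the following is a minimal sketch of multidimensional linear interpolation over a discretized state space. It is not the thesis's implementation: the function name `multilinear_interpolate`, the two-variable grid (pole angle and angular velocity) and the table values are assumptions chosen for illustration. The table stores one value per grid node, and a continuous state is evaluated as a weighted sum over the corners of the grid cell that contains it.

```python
import itertools
import numpy as np

def multilinear_interpolate(axes, table, state):
    """Interpolate node values in `table` at a continuous `state`.

    axes  : list of 1-D increasing arrays, one per state variable
    table : array of shape (len(axes[0]), ..., len(axes[-1]))
    state : sequence of floats; values outside the grid are clipped
    """
    lower = []  # index of the lower grid node along each dimension
    frac = []   # fractional position between the lower and upper node
    for axis, x in zip(axes, state):
        x = float(np.clip(x, axis[0], axis[-1]))
        i = int(np.searchsorted(axis, x, side="right")) - 1
        i = min(max(i, 0), len(axis) - 2)
        lower.append(i)
        frac.append((x - axis[i]) / (axis[i + 1] - axis[i]))

    value = 0.0
    # Sum over the 2^d corners of the enclosing grid cell.
    for corner in itertools.product((0, 1), repeat=len(axes)):
        weight = 1.0
        index = []
        for i, t, c in zip(lower, frac, corner):
            weight *= t if c else (1.0 - t)
            index.append(i + c)
        value += weight * table[tuple(index)]
    return value

# Hypothetical grid and table: a force that is linear in the state.
angle_axis = np.linspace(-0.2, 0.2, 5)   # pole angle (rad), assumed range
rate_axis = np.linspace(-1.0, 1.0, 5)    # angular velocity (rad/s), assumed range
A, R = np.meshgrid(angle_axis, rate_axis, indexing="ij")
force_table = 10.0 * A + 2.0 * R         # illustrative coefficients only

force = multilinear_interpolate([angle_axis, rate_axis], force_table, [0.03, -0.4])
```

Because multilinear interpolation reproduces any function that is linear in the state exactly, a learned table whose interpolated output is close to linear suggests that a linear control equation may be sufficient, which is consistent with the observation reported in the abstract.
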
For the two-link inverted pendulum, this paper analyzes how the initial values of the control-equation coefficients influence learning. The experiments show that the initial values have some influence on the learning time but little influence on the learning result; the last controll...
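
The idea of learning the coefficients of a linear control equation from different initial values can be sketched as follows. This is an illustration under stated assumptions, not the method of the thesis: the abstract does not state the learning rule, so a simple random-search hill climber stands in for it, and only the one-link (cart-pole) case is simulated. The physical parameters are the values commonly used for the cart-pole benchmark, and all names (`step`, `balance_steps`, `learn_coefficients`) are hypothetical.

```python
import math
import random

# Commonly used cart-pole benchmark parameters (assumed, not from the thesis).
GRAVITY, CART_MASS, POLE_MASS = 9.8, 1.0, 0.1
HALF_LENGTH, DT, MAX_FORCE = 0.5, 0.02, 10.0

def step(state, force):
    """Advance the cart-pole one Euler step under the given force."""
    x, x_dot, theta, theta_dot = state
    total = CART_MASS + POLE_MASS
    pml = POLE_MASS * HALF_LENGTH
    sin_t, cos_t = math.sin(theta), math.cos(theta)
    temp = (force + pml * theta_dot ** 2 * sin_t) / total
    theta_acc = (GRAVITY * sin_t - cos_t * temp) / (
        HALF_LENGTH * (4.0 / 3.0 - POLE_MASS * cos_t ** 2 / total))
    x_acc = temp - pml * theta_acc * cos_t / total
    return (x + DT * x_dot, x_dot + DT * x_acc,
            theta + DT * theta_dot, theta_dot + DT * theta_acc)

def balance_steps(coeffs, max_steps=1000):
    """Count time steps survived with the control law F = sum(k_i * s_i)."""
    state = (0.0, 0.0, 0.05, 0.0)  # small initial tilt (assumed)
    for t in range(max_steps):
        force = sum(k * s for k, s in zip(coeffs, state))
        force = max(-MAX_FORCE, min(MAX_FORCE, force))
        state = step(state, force)
        # Failure: cart leaves the track or the pole tilts too far.
        if abs(state[0]) > 2.4 or abs(state[2]) > 0.21:
            return t
    return max_steps

def learn_coefficients(initial, iterations=200, noise=1.0, seed=0):
    """Hill climbing: perturb the coefficients, keep changes that do not hurt."""
    rng = random.Random(seed)
    best, best_score = list(initial), balance_steps(initial)
    for _ in range(iterations):
        candidate = [k + rng.gauss(0.0, noise) for k in best]
        score = balance_steps(candidate)
        if score >= best_score:
            best, best_score = candidate, score
    return best, best_score

# Start the search from two different initial coefficient vectors.
for initial in ([0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0]):
    coeffs, steps = learn_coefficients(initial)
    print(initial, "->", steps, "balanced steps")
```

Recording the iteration at which the score stops improving for each starting vector is one way to compare learning time across initial values, while the final score gives a rough measure of the learning result.
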
Keywords/Search Tags:machine learning, reinforcement learning, inverted pendulum