Research On Intelligent Agricultural Vehicle Behavior Decision-Making Algorithm Based On Deep Reinforcement Learning

Posted on: 2021-01-27
Degree: Master
Type: Thesis
Country: China
Candidate: J Zou
GTID: 2493306605495114
Subject: Vehicle Engineering
Abstract/Summary:
As an important carrier of precision agriculture, intelligent agricultural vehicles are widely used across modern agricultural production. Behavioral decision-making is a key topic in research on intelligent agricultural vehicles. Traditional rule-based behavioral decision-making algorithms adapt poorly to complex and changing agricultural production environments, and advances in computer science and artificial intelligence offer a promising way to address this problem. Building on the behavioral decision-making mechanism of intelligent agricultural vehicles, this thesis proposes a behavioral decision-making algorithm based on deep reinforcement learning. Once trained, the algorithm relies on fewer sensors to realize "end-to-end" behavioral decision-making in different scenarios: it controls the direction and speed of the car to achieve path planning and dynamic obstacle avoidance, and it shows good environmental adaptability. The main research contents of this thesis are as follows:
(1) The characteristics of the current agricultural vehicle behavior decision-making process and of artificial intelligence research are analyzed, and both an algorithm framework for intelligent agricultural vehicles, spanning environment perception to behavior decision-making, and an intelligent agricultural vehicle based on the ROS system are designed. To improve the convergence speed of the deep reinforcement learning algorithm, this thesis proposes DDPGwTE, a deep reinforcement learning algorithm that incorporates teaching experience, following the idea of learning from teaching, and improves the experience replay mechanism by prioritizing the replay of experiences with high TD error.
(2) Each part of the algorithm is designed and improved: ① the environmental state input and behavior output of the algorithm are modeled by combining the communication mechanism of the ROS system with the odometry (Odom) system; ② the motion of the four-wheel differential car is analyzed and resolved into linear velocity and angular velocity to suit ROS control; ③ a reward function is designed according to the characteristics of the state update; ④ the deep neural network in the algorithm model is constructed, and the corresponding loss function is designed according to the prioritized experience replay weights; ⑤ a sampling ratio mechanism between the artificial teaching experience pool and the exploration experience pool is designed.
(3) Simulation scenarios are designed and built in Gazebo. Teaching experience is obtained by manually operating the car in the simulation environment with a Logitech gamepad. The classic DDPG algorithm and the DDPGwTE algorithm proposed in this thesis are then simulated in three scenarios. The simulation results show that, combined with manual teaching experience, the DDPGwTE algorithm converges within 1000 training rounds, and that its total number of steps over 5000 rounds in the three scenarios is 7.17%, 12.86%, and 12.77% lower, respectively, than that of the classic DDPG algorithm.
(4) A ROS smart car is designed and built, and a transfer experiment is carried out with the algorithm that converged during training in the simulation environment. Following the simulation scenarios, experiments are designed for three scenarios, namely no obstacles, fixed obstacles, and moving obstacles, with the dynamic window approach (DWA) as the experimental control group. First, laser SLAM mapping of the experimental environment is carried out. The speed and instantaneous reward of the car controlled by the DDPGwTE algorithm are then read during the experiment through SSH remote login to the car, and the trajectories of the two algorithms during navigation and obstacle avoidance are recorded. The effectiveness of the behavioral decisions of the deep reinforcement learning algorithm is analyzed through the speed and instantaneous reward changes corresponding to the car's trajectory.
In summary, taking into account the characteristics of intelligent agricultural vehicle control and of the agricultural production environment, this thesis designs an "end-to-end" behavior decision algorithm based on deep reinforcement learning, trains it in a simulation environment, and tests it on an experimental car, both based on the ROS platform. The experimental results verify the effectiveness of the algorithm's behavioral decision-making in complex environments, which is of great significance for raising the level of agricultural intelligence.
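The replay scheme named in item (1) and in sub-item ⑤ of item (2) can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so everything here is an assumption: the sketch uses the common proportional form of prioritized replay (priority = (|TD error| + eps)^alpha, with importance-sampling exponent beta), a fixed `teach_ratio`, and computes importance weights separately within each pool as a simplification. The names `PrioritizedPool`, `sample_mixed`, `teach_ratio`, `alpha`, `beta`, and `eps` are hypothetical and are not taken from the thesis.

```python
import random


class PrioritizedPool:
    """Fixed-size experience pool sampled in proportion to |TD error| ** alpha."""

    def __init__(self, capacity, alpha=0.6, eps=1e-3):
        self.capacity = capacity
        self.alpha = alpha      # assumed: 0 gives uniform sampling, 1 fully proportional
        self.eps = eps          # assumed: keeps zero-error transitions sampleable
        self.items = []
        self.priorities = []
        self.next_slot = 0      # slot overwritten once the pool is full

    def __len__(self):
        return len(self.items)

    def _priority(self, td_error):
        return (abs(td_error) + self.eps) ** self.alpha

    def add(self, transition, td_error=None):
        # Without a TD error yet, reuse the current maximum priority so the
        # new transition has a high chance of being replayed soon.
        if td_error is None:
            p = max(self.priorities, default=1.0)
        else:
            p = self._priority(td_error)
        if len(self.items) < self.capacity:
            self.items.append(transition)
            self.priorities.append(p)
        else:
            self.items[self.next_slot] = transition
            self.priorities[self.next_slot] = p
        self.next_slot = (self.next_slot + 1) % self.capacity

    def sample(self, k, beta=0.4, rng=random):
        """Draw k transitions (with replacement) and their importance weights."""
        n, total = len(self.items), sum(self.priorities)
        idx = rng.choices(range(n), weights=self.priorities, k=k)
        # Importance-sampling weights offset the bias of non-uniform sampling.
        weights = [(n * self.priorities[i] / total) ** (-beta) for i in idx]
        return idx, [self.items[i] for i in idx], weights

    def update(self, idx, td_errors):
        """Refresh priorities after the critic recomputes TD errors."""
        for i, err in zip(idx, td_errors):
            self.priorities[i] = self._priority(err)


def sample_mixed(teaching, exploration, batch_size, teach_ratio, rng=random):
    """Build one batch from both pools; teach_ratio is the teaching share."""
    n_teach = round(batch_size * teach_ratio)
    batch = []
    for name, pool, k in (("teaching", teaching, n_teach),
                          ("exploration", exploration, batch_size - n_teach)):
        if k == 0:
            continue
        idx, items, weights = pool.sample(k, rng=rng)
        batch += [(name, i, t, w) for i, t, w in zip(idx, items, weights)]
    # Scale weights to at most 1 so they only ever shrink the loss terms.
    w_max = max(w for _, _, _, w in batch)
    return [(name, i, t, w / w_max) for name, i, t, w in batch]
```

In a training loop under these assumptions, each batch entry's weight would multiply that transition's squared TD error in the critic loss, and `update` would be called on the pool the transition came from once new TD errors are available. How the thesis actually sets or schedules the ratio between the two pools is not stated in the abstract.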
Keywords/Search Tags:Behavioral decision-making, Deep reinforcement learning, Teaching experience, ROS system