
Research On AUV Task Planning Technology Based On Reinforcement Learning

Posted on: 2020-05-19    Degree: Master    Type: Thesis
Country: China    Candidate: C A Dai    Full Text: PDF
GTID: 2392330575968645    Subject: Design and manufacture of ships and marine structures
Abstract/Summary:
In recent years, marine resources, rights, and interests have received extensive attention. The autonomous underwater vehicle (AUV), as an intelligent underwater platform, has developed rapidly, and mission planning is an important component of AUV technology: planning results directly affect both mission completion and the vehicle's own safety. Focusing on AUV mission planning, this thesis studies system architecture design, route planning, and multi-target-point detection mission planning.

To improve the intelligence and adaptability of a seabed-survey AUV, a self-learning system architecture is designed for the "Chengsha" AUV. Drawing on reinforcement learning, a learning-evolution unit is designed so that the AUV improves the accuracy of its actuator outputs through interaction with the environment, continuously raising its intelligence level as it learns. A learning-supervision unit is added so that the action-selection strategy can be adjusted dynamically online according to the environmental complexity and the AUV's learning progress, better matching the environment and accelerating learning. A task-hierarchy and re-planning unit, together with a hierarchical environment-space module, decomposes the working environment into subtasks at different levels; the re-planning module reorders tasks according to the subtasks affected by emergencies, improving the accuracy and timeliness of planning. Combining these units completes the framework design of the "Chengsha" AUV self-learning system architecture.

For the global route-planning problem, Q-learning can improve the AUV's intelligence level and the reasonableness of its plans, but the exploration-exploitation balance determines the quality of the planning results: insufficient exploration traps the algorithm in locally optimal solutions, while excessive exploration reduces learning efficiency. To address this, an adaptive Q-learning method based on environmental-complexity evaluation is proposed. Evaluating the environmental complexity determines the exploration-exploitation strategy, and during planning the strategy is continually adjusted according to the actual situation, avoiding local optima, accelerating convergence, and speeding up route planning. For route optimization, a mathematical model of the optimal turning angle is proposed, improving the reasonableness and safety of the smoothed route.

For multi-target-point detection missions, existing algorithms require a large amount of computation and long planning times, so a hierarchical Q-learning algorithm with regional division is proposed. The environmental state space is partitioned according to the regions of the target points to obtain subdomains and subtasks, transforming a complex mission into a sequence of simple subtasks and thereby reducing the computation and time required for planning. A dynamic online planning scheme is also designed to improve the timeliness of planning, completing the route planning and re-planning of the AUV's multi-target-point detection task.

Finally, the learning models are evaluated in realistic island environments: the "Long Island" environment is used for simulation experiments with the adaptive Q-learning model, and the "Longsea" environment for the hierarchical Q-learning model with regional division. Analysis of the simulation results verifies that the proposed learning models optimize mission-planning results and improve mission-planning efficiency.
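The adaptive exploration idea described in the abstract can be illustrated with a minimal sketch. This is not the thesis's actual algorithm: the toy grid environment, the local complexity measure (fraction of blocked neighbouring cells), and all parameter values are assumptions chosen only to show how an exploration rate can be driven by environmental complexity and learning progress.

```python
import random

# Toy seabed grid: 0 = free, 1 = obstacle. A hypothetical stand-in for the
# thesis's survey environment; layout and rewards are assumptions.
GRID = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 1, 0, 1],
    [1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
]
START, GOAL = (0, 0), (4, 4)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def complexity(state):
    """Local environmental complexity: fraction of neighbouring cells blocked."""
    r, c = state
    nbrs = [(r + dr, c + dc) for dr, dc in ACTIONS]
    blocked = sum(1 for nr, nc in nbrs
                  if not (0 <= nr < 5 and 0 <= nc < 5) or GRID[nr][nc] == 1)
    return blocked / len(nbrs)  # 0.0 (open water) .. 1.0 (boxed in)

def adaptive_epsilon(state, progress, eps_min=0.05, eps_max=0.9):
    """Explore more in complex regions and early in training; less otherwise."""
    scale = 0.5 * complexity(state) + 0.5 * (1.0 - progress)
    return eps_min + (eps_max - eps_min) * scale

def step(state, a):
    r, c = state[0] + ACTIONS[a][0], state[1] + ACTIONS[a][1]
    if not (0 <= r < 5 and 0 <= c < 5) or GRID[r][c] == 1:
        return state, -5.0, False          # collision: stay put, penalty
    if (r, c) == GOAL:
        return (r, c), 100.0, True
    return (r, c), -1.0, False             # step cost favours short routes

def train(episodes=2000, alpha=0.1, gamma=0.95, seed=0):
    random.seed(seed)
    Q = {(r, c): [0.0] * 4 for r in range(5) for c in range(5)}
    for ep in range(episodes):
        s, done, t = START, False, 0
        while not done and t < 200:
            eps = adaptive_epsilon(s, progress=ep / episodes)
            a = (random.randrange(4) if random.random() < eps
                 else max(range(4), key=lambda i: Q[s][i]))
            s2, rwd, done = step(s, a)
            Q[s][a] += alpha * (rwd + gamma * max(Q[s2]) - Q[s][a])
            s, t = s2, t + 1
    return Q

def greedy_path(Q, limit=50):
    """Follow the learned policy greedily from START toward GOAL."""
    s, path = START, [START]
    while s != GOAL and len(path) < limit:
        s, _, _ = step(s, max(range(4), key=lambda i: Q[s][i]))
        path.append(s)
    return path
```

After training, `greedy_path(train())` yields a collision-free route to the goal; the point of the sketch is only that the exploration rate is no longer a fixed constant but a function of where the vehicle is and how far learning has progressed.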
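The regional-division idea for multi-target-point missions can likewise be sketched. In this hypothetical illustration, target points are grouped into square regions (the subdomains), a high-level pass orders the regions, and a low-level pass orders the points inside each region; greedy nearest-neighbour ordering stands in for the learned policies, and the grid-cell size is an assumption.

```python
import math

def region_of(p, cell=10.0):
    """Assign a target point to a square region; regions define the subtasks."""
    return (int(p[0] // cell), int(p[1] // cell))

def nearest_order(start, points):
    """Greedy nearest-neighbour ordering (stand-in for a learned low-level policy)."""
    order, cur, rest = [], start, list(points)
    while rest:
        nxt = min(rest, key=lambda p: math.dist(cur, p))
        order.append(nxt)
        rest.remove(nxt)
        cur = nxt
    return order

def plan_mission(start, targets, cell=10.0):
    """Two-level plan: order the regions, then the targets inside each region."""
    regions = {}
    for p in targets:
        regions.setdefault(region_of(p, cell), []).append(p)

    def centroid(pts):
        return (sum(x for x, _ in pts) / len(pts),
                sum(y for _, y in pts) / len(pts))

    plan, cur, remaining = [], start, dict(regions)
    while remaining:
        # High level: visit the region whose centroid is closest.
        key = min(remaining, key=lambda k: math.dist(cur, centroid(remaining[k])))
        # Low level: order the targets within that region.
        sub = nearest_order(cur, remaining.pop(key))
        plan.extend(sub)
        cur = sub[-1]
    return plan
```

For example, `plan_mission((0, 0), [(1, 1), (2, 3), (25, 26), (24, 22), (3, 2)])` finishes the near cluster before crossing to the far one. Because each nearest-neighbour pass sees only one region's points, the per-subtask search space shrinks, which mirrors the abstract's claim that decomposition reduces the computation and time required for planning.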
Keywords/Search Tags: autonomous underwater vehicle, mission planning, system architecture, route planning, Q-learning