Research On Multi-UAV Cooperative Decision-making For Target Tracking Under Partial Observable Conditions

Posted on: 2020-08-14 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: Y Y Zhao
GTID: 1362330611993097 | Subject: Control Science and Engineering
Abstract/Summary:
Cooperative tracking of multiple targets by unmanned aerial vehicles (UAVs) is an important research direction in UAV decision-making, with broad military and civilian applications and considerable theoretical significance. Taking multi-UAV cooperative reconnaissance as its background, this dissertation focuses on the cooperative decision-making problem of multiple fixed-wing UAVs tracking multiple moving ground targets under partially observable conditions. Based on information geometry, it studies system modeling, algorithm design, and solution optimization. The main contributions are as follows:

(1) Considering the key features of the multi-UAV cooperative target tracking problem, this dissertation establishes and examines a decentralized partially observable Markov decision process (Dec-POMDP) decision model and proposes a corresponding solution framework for multi-UAV cooperative tracking of multiple targets. The model and framework integrate state uncertainty and partial observability into a unified decision-making framework and have good adaptability. Based on multi-UAV cooperative target reconnaissance missions, the dissertation analyzes the partially observable characteristics of a multi-UAV system with limited perception and communication abilities, and formally defines the multi-UAV cooperative decision-making problem for tracking multiple targets under partially observable conditions. Specifically, a POMDP decision model is established by analyzing the single-UAV tracking decision-making problem. To match the features of on-board sensors and meet the requirements of online decision-making, a Fisher-information-based reward function design method is proposed. The single-UAV decision process is then extended to multiple UAVs, and a multi-UAV decision-making model is established based on Dec-POMDPs.

(2) To handle the inconsistency in perceptual information across a multi-UAV system, a max-consensus-based distributed information fusion estimation algorithm is designed and its convergence is analyzed. The method reaches a global consensus in a short time using only local information exchanges. In multi-UAV target tracking, fusing the observations from different UAVs and obtaining accurate, consistent target state estimates is fundamental to optimizing the UAV cooperative strategy. Considering the limited communication range of the UAVs and the limited observation distance of on-board sensors, this dissertation proposes a Kalman-filter-based distributed max-consensus information fusion algorithm to achieve consistency in distributed fusion estimation. In this algorithm, the number of information exchanges required to reach information consistency is lower bounded. Simulation results show that the proposed algorithm is scalable, adapts to time-varying communication topologies, and reduces communication overhead.

(3) For the multi-UAV decision-making problem oriented toward optimal target state estimation, a near-optimal method for the single-UAV sequential action strategy is designed based on dynamic programming, and a distributed solution method for the multi-UAV cooperative strategy is proposed. These methods significantly improve computational efficiency. More specifically, a receding-horizon (rolling time domain) approximate solution method based on nominal belief-state optimization (NBO) is proposed to solve the single-UAV decision-making problem, and the stability of the method is proved. Because the optimal strategy of multiple UAVs is difficult to derive under the Dec-POMDP framework, this dissertation establishes a distributed evaluation model of multi-UAV behavior strategies that uses the determinant of the Fisher information matrix as the objective function, and then proposes a Kuhn-Munkres (KM) based distributed cooperative strategy solution method. Simulation results show that the proposed algorithm outperforms the decentralized methods in the literature and performs similarly to the centralized method, while offering higher scalability and computational efficiency.

(4) For the model-free multi-UAV decision-making problem, a natural-gradient-based reinforcement learning method for UAV target tracking is proposed to guarantee learning convergence and improve learning efficiency. In addition, a centralized-Critic method and a fully distributed reinforcement learning method are given to solve multi-UAV cooperative learning problems. Based on the Fisher information metric from information geometry, the natural gradient replaces the conventional gradient in the temporal-difference actor-critic (TC-AC) algorithm, which improves the efficiency of reinforcement learning. To address the instability of multi-UAV reinforcement learning, a centralized-Critic reinforcement learning method is proposed under the multi-UAV target tracking decision-making framework, using centralized evaluation and distributed execution. It uses the same value function to evaluate the action value of each Actor correctly. Furthermore, to meet the requirements of a distributed architecture, a distributed learning method is proposed in which the Critic is designed to be decentralized and the actions of the other UAVs are included in the evaluation system of each Actor. When these two algorithms are applied to classic multi-UAV tracking systems, the learning process converges well. Because a linearized strategy model is obtained through learning, the generalization performance and computational efficiency of online decision-making are remarkably improved.
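As an illustration of the max-consensus idea referred to in contribution (2), the following is a minimal sketch of plain scalar max-consensus, not the Kalman-filter-based fusion algorithm of the dissertation. The function name `max_consensus`, the four-node line topology, and the scalar values are all hypothetical and chosen only for the example. Each node repeatedly replaces its value with the maximum over itself and its neighbors; on a connected, static, undirected graph this synchronous update reaches agreement in at most as many rounds as the graph diameter.

```python
def max_consensus(values, neighbors, max_rounds=None):
    """Synchronous max-consensus using only local exchanges.

    values:    dict mapping node id -> initial scalar value (hypothetical data)
    neighbors: dict mapping node id -> list of neighboring node ids
    Returns (final_values, rounds_used).
    """
    current = dict(values)
    # The node count is a safe upper limit for a connected static graph.
    limit = len(current) if max_rounds is None else max_rounds
    rounds = 0
    for _ in range(limit):
        # Every node takes the max of its own value and its neighbors' values.
        updated = {
            node: max([current[node]] + [current[n] for n in neighbors[node]])
            for node in current
        }
        if updated == current:  # nothing changed: consensus reached
            break
        current = updated
        rounds += 1
    return current, rounds


# Hypothetical example: four UAVs connected in a line 0 - 1 - 2 - 3.
values = {0: 0.2, 1: 0.9, 2: 0.4, 3: 0.1}
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
final_values, rounds_used = max_consensus(values, neighbors)
print(final_values, rounds_used)
```

In this example the largest value starts at node 1, so the farthest node (node 3) needs two exchange rounds to receive it. A fusion algorithm would apply the same exchange pattern to richer quantities than a single scalar; how the dissertation does so is not specified in this abstract.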
Keywords/Search Tags: Multi-UAV cooperative decision-making, Target tracking, Information geometry, Dynamic programming, Reinforcement learning