| An opportunistic network is a new type of wireless mobile network in which nodes have no stable communication links because of frequent movement, sparse distribution, and limited communication range, so message forwarding must rely on chance encounters between nodes. How to select the best next-hop node is therefore the central question in research on opportunistic network routing algorithms. In view of these characteristics, this thesis investigates a Q-Learning-based routing algorithm for opportunistic networks. First, the theoretical background and key technologies of opportunistic networks are studied, the current state of research on opportunistic network routing algorithms is analyzed, and the application of reinforcement learning to opportunistic networks is introduced in detail, laying the foundation for the design and simulation of the routing algorithms. To address the changing topology of opportunistic networks, a Q-Learning-based opportunistic network model is established. To reduce the estimation bias of the Q-Learning algorithm and overcome its inability to make full use of multi-step future information, a K-Step Double Q-Learning Routing (K-DQLR) algorithm for opportunistic networks is proposed, which combines multi-step learning with Double Q-Learning. The algorithm adopts a dynamic reward mechanism that differentiates the reward levels of different forwarding paths. Specifically, two metrics, the number of hops and node centrality, are used to evaluate the propagation paths of messages in the network. The number of hops reflects the span of information propagation in the network, while node centrality represents the importance of nodes in the information 
dissemination process. By considering these two features, the algorithm better captures the characteristics and patterns of information dissemination in the network, thereby improving the efficiency and accuracy of information dissemination in practical applications. In addition, to uncover potential forwarding nodes, a value transfer mechanism is designed to provide a better reward base for potential next-hop nodes. Experimental results show that the K-DQLR algorithm improves the message delivery rate, reduces transmission delay, and lowers the consumption of network resources, which indicates that it offers performance advantages while using network resources more efficiently. Furthermore, considering the complexity and diversity of node attributes in opportunistic networks, the node state is extended to include the position and velocity of the current node, the position and velocity of the encountered node, and the destination node. Deep Reinforcement Learning (DRL) is also introduced, and a Deep Double Q Network Routing (DDQNR) algorithm for opportunistic networks is proposed, which replaces the Q table with a deep neural network to handle the high dimensionality of the node state. In addition, node speed is taken into account in the immediate reward: faster-moving nodes have a higher chance of meeting the destination node and delivering messages to it sooner, which reduces delay. According to the experimental results, DDQNR achieves a higher message delivery ratio than the K-DQLR algorithm. |
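
The combination of multi-step returns with Double Q-Learning described above can be illustrated with a minimal tabular sketch. It is not the thesis's implementation: the class name `KStepDoubleQ`, the use of node identifiers as states and next-hop identifiers as actions, and the parameter values are all assumptions made for illustration, and the rewards are passed in as plain numbers rather than computed from hop count and node centrality.

```python
import random
from collections import defaultdict


def k_step_return(rewards, gamma):
    """Discounted sum of the k observed rewards: r_0 + gamma*r_1 + ..."""
    return sum((gamma ** i) * r for i, r in enumerate(rewards))


class KStepDoubleQ:
    """Hypothetical tabular k-step Double Q-Learning agent (illustrative only)."""

    def __init__(self, alpha=0.1, gamma=0.9, seed=0):
        self.qa = defaultdict(float)  # first Q table
        self.qb = defaultdict(float)  # second Q table
        self.alpha = alpha            # learning rate
        self.gamma = gamma            # discount factor
        self.rng = random.Random(seed)

    def update(self, state, action, rewards, last_state, candidates):
        """Update one table from a k-step trajectory.

        rewards     -- the k rewards observed after taking `action` in `state`
        last_state  -- the state reached after k steps
        candidates  -- next hops available in `last_state` (may be empty)
        """
        # Randomly choose which table learns; the other one evaluates.
        if self.rng.random() < 0.5:
            learn, evaluate = self.qa, self.qb
        else:
            learn, evaluate = self.qb, self.qa

        target = k_step_return(rewards, self.gamma)
        if candidates:
            # Select the greedy action with the learning table, but
            # evaluate it with the other table to reduce estimation bias.
            best = max(candidates, key=lambda a: learn[(last_state, a)])
            target += (self.gamma ** len(rewards)) * evaluate[(last_state, best)]

        key = (state, action)
        learn[key] += self.alpha * (target - learn[key])

    def value(self, state, action):
        """Average of the two tables, used when ranking next-hop candidates."""
        return 0.5 * (self.qa[(state, action)] + self.qb[(state, action)])


if __name__ == "__main__":
    agent = KStepDoubleQ(alpha=0.5, gamma=0.5)
    # Node "n1" forwarded to "n2"; three rewards were observed along the path.
    agent.update("n1", "n2", [1.0, 2.0, 4.0], "n4", [])
    print(agent.value("n1", "n2"))
```

Selecting the greedy next hop with one table and evaluating it with the other is what distinguishes Double Q-Learning from the standard update, while the k-step return lets rewards from several hops ahead reach the current estimate in a single update.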