Research On Resource Allocation Of Cognitive Internet Of Things Based On Deep Reinforcement Learning

Posted on: 2024-07-08
Degree: Doctor
Type: Dissertation
Country: China
Candidate: S A Guo
Full Text: PDF
GTID: 1528307064975159
Subject: Information and Communication Engineering
Abstract/Summary:
The vision of the Internet of Everything was proposed alongside 5G mobile communication technology to cope with the explosive growth of mobile data traffic, massive device connections, and the emerging services and application scenarios of the future. However, as the Internet of Everything takes shape, the scarcity of spectrum resources has become increasingly prominent. To cope with this scarcity and to meet the huge spectrum demand of the upcoming 6G era, improving spectrum utilization is a serious challenge. Cognitive radio (CR) technology addresses the underutilization of licensed spectrum by allowing that spectrum to be reused. Introducing cognitive radio into the Internet of Things (IoT) can therefore effectively alleviate the conflict between the access of massive numbers of devices and the shortage of spectrum resources. Beyond spectrum utilization, two further challenges deserve attention: improving energy efficiency to address the very high energy consumption of massive IoT devices, and expanding IoT coverage to achieve seamless global service anywhere and anytime. From the perspective of green communication, radio frequency (RF) energy harvesting (EH) technology converts received RF signals into electrical energy, while ambient backscatter communication (ABC) technology uses surrounding RF signals to transmit data. These two technologies can effectively alleviate the energy consumption problem. To realize ubiquitous IoT services on a global scale, unmanned aerial vehicle (UAV)-assisted communication, with its high flexibility and maneuverability, provides a reliable and low-cost solution.

The main difficulty in this research is how to achieve effective resource allocation in a highly dynamic and complex network environment. In a cognitive IoT, IoT devices act as secondary users (SUs): they dynamically adjust their transmission parameters according to the environment and use licensed frequency bands through opportunistic access. Provided that the communication of the primary users (PUs) is not affected, a reasonable allocation of the limited resources in the IoT is crucial to ensuring the communication quality of both the primary and secondary networks and to improving resource utilization. At present, most studies of resource allocation in cognitive IoT assume that prior statistical knowledge of the environment is available. However, the cognitive IoT is a highly complex system, and such prior knowledge is difficult to obtain in practice. Resource allocation in a cognitive IoT without prior statistical knowledge is therefore an extremely challenging task. Reinforcement learning, as a model-free method, can find an optimal strategy through continuous trial-and-error learning driven by feedback from the environment, without knowing the environmental model in advance. Deep reinforcement learning (DRL) algorithms are therefore a promising means of solving the dynamic resource allocation problem in cognitive IoT. Against this background, this dissertation focuses on three application scenarios of cognitive IoT, combining RF energy harvesting, ambient backscatter communication, and UAV-assisted communication, and proposes several DRL-based resource allocation algorithms for these scenarios. The main work and contributions are as follows:

(1) To meet the demand for green communication in cognitive IoT and to address its high energy consumption and power supply difficulties, this dissertation constructs a cognitive IoT system with RF energy harvesting. For this system, it formulates an optimization problem that jointly covers multi-user access scheduling, operating mode selection (transmission mode/energy harvesting), and power allocation of the SUs, with the goal of maximizing the throughput of the secondary system. Because the channel occupancy states of the PUs, the energy arrival model, and the statistical knowledge of the channel states cannot be obtained in advance, the optimization problem is transformed into a Markov decision process (MDP) model, and two DRL-based algorithms are proposed: the deep Q network (DQN)-based joint mode selection and discrete power allocation (MS-DPA) algorithm, and the deep deterministic policy gradient (DDPG)-based joint mode selection and continuous power allocation (MS-CPA) algorithm. The feasibility and effectiveness of the proposed algorithms are verified by extensive computer simulations. The simulation results show that the proposed algorithms effectively improve the throughput of the secondary user network while converging quickly.

(2) To further improve the spectrum utilization and energy efficiency of green communication-oriented cognitive IoT, this dissertation introduces ambient backscatter communication (ABC) into the RF-powered cognitive IoT and combines it with non-orthogonal multiple access (NOMA) technology to establish an RF-powered cognitive backscatter IoT system. Two MDP-based optimization problems are formulated for two spectrum sharing modes, namely underlay-interweave and overlay-interweave. To ensure that the communication quality of the PU is not affected by the SUs, reward functions with penalty terms are designed for the two MDPs. Because the environmental model of the dynamic system cannot be obtained in advance in practice, a DDPG-based joint reflection coefficient adjustment and resource allocation (JCARA) algorithm is proposed to solve the two optimization problems. For the underlay-interweave scenario, the JCARA algorithm jointly optimizes the transmit power and the reflection coefficients of the SUs; for the overlay-interweave scenario, it additionally optimizes the time resource. The simulation results show that the proposed JCARA algorithm achieves higher throughput than the comparison algorithms.

(3) To meet the demand for wide coverage in cognitive IoT, including scenarios without ground infrastructure coverage, this dissertation investigates the resource allocation problem of a cognitive satellite-aerial network for IoT applications. In this network, the UAVs, as SUs, access the spectrum of the satellite network in an underlay spectrum sharing mode, on the condition that the total interference they cause to the satellite network stays below the interference temperature threshold. To meet the delay-sensitive quality of service (QoS) requirement of the secondary network, a joint optimization problem of trajectory control and power allocation for multiple UAVs is formulated to minimize the transmission latency for all ground users over a long-term task period. To solve this complex non-convex optimization problem with multiple constraints while reducing the computational complexity and signaling exchange in the execution phase, the original problem is transformed into a partially observable Markov decision process (POMDP)-based multi-agent reinforcement learning (MARL) problem. A multi-agent deep deterministic policy gradient (MADDPG)-based joint trajectory control and power allocation (JTCPA) algorithm is then proposed to solve it. The simulation results show that, compared with other typical methods, the proposed algorithm makes better decisions from less information about the environment and effectively reduces the transmission latency.
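As an illustration of the reward functions with penalty terms mentioned in contribution (2), the following is a minimal sketch, not the dissertation's actual formulation. It assumes the reward is the SUs' sum rate (Shannon capacity per unit bandwidth) and that a fixed penalty is subtracted whenever the aggregate interference at the PU exceeds a threshold. All names (`penalized_reward`, `interference_threshold`, `penalty`) and the interference-free rate model for the SUs are hypothetical simplifications.

```python
import math


def penalized_reward(su_powers, su_gains, interference_gains,
                     noise_power, interference_threshold, penalty):
    """Hypothetical reward: SU sum rate minus a penalty on PU interference.

    su_powers          -- transmit power chosen by each SU (the action)
    su_gains           -- channel gain from each SU to its own receiver
    interference_gains -- channel gain from each SU to the PU receiver
    """
    # Sum rate of the secondary users; inter-SU interference is ignored
    # here for simplicity (an assumption, not the source's model).
    sum_rate = sum(math.log2(1.0 + p * g / noise_power)
                   for p, g in zip(su_powers, su_gains))

    # Aggregate interference the SUs cause at the primary receiver.
    interference = sum(p * h for p, h in zip(su_powers, interference_gains))

    # Penalty term: discourage actions that would disturb the PU.
    if interference > interference_threshold:
        return sum_rate - penalty
    return sum_rate
```

A DRL agent trained on such a reward is pushed toward power choices that keep the PU protected, since any violating action loses `penalty` units of reward regardless of the throughput it achieves.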
Keywords/Search Tags:Cognitive IoT, Resource allocation, Energy harvesting, Ambient backscatter communication, Cognitive satellite-UAV network, Deep reinforcement learning, Multi-agent reinforcement learning