
Research On High-Dimensional And Variable-Dimensional Video Caching Selection Method Based On Deep Reinforcement Learning

Posted on: 2024-02-14
Degree: Master
Type: Thesis
Country: China
Candidate: R Chen
Full Text: PDF
GTID: 2568307136494894
Subject: Software engineering
Abstract/Summary:
People watch videos more and more frequently, and multimedia services and applications now reach every aspect of daily life. When a large number of users request videos from the backbone network simultaneously, the resulting traffic puts heavy pressure on that network. By combining video caching with edge computing, an edge server can cache the videos that users watch most often, which improves the performance of the multimedia service system. Video caching selection has three characteristics. It is dynamic, because the popularity of videos changes over time. It is high-dimensional, because the edge server must select a subset of videos for caching from a large number of videos. It is variable-dimensional, because the number of online videos varies over time. To enable the edge server to select videos for caching efficiently, this thesis studies a high-dimensional and variable-dimensional video caching selection method based on deep reinforcement learning. The main contributions of this thesis are as follows:

(1) To address the dynamic and high-dimensional characteristics, a high-dimensional video caching selection method based on an improved deep deterministic policy gradient (DDPG) algorithm is proposed. First, the system model of the high-dimensional video caching selection problem is constructed. Second, a decoder is used to improve DDPG, and the video caching action is then selected by the improved DDPG. Finally, simulation results show that, compared with similar methods, the proposed method reduces both the video transmission delay and the users' traffic cost.

(2) To address the dynamic and variable-dimensional characteristics, a variable-dimensional video caching selection method based on an improved proximal policy optimization (PPO) algorithm is proposed. First, the system model of the variable-dimensional video caching selection problem is constructed. Second, incremental learning is used to adjust the dimension of the variable-dimensional data generated by video caching selection. Third, a model-based method is used to improve PPO, and the video caching action is then selected by the improved PPO. Finally, simulation results show that the proposed method performs better than similar methods. In particular, for the video provider it increases income and reduces cost, and for users it reduces both the video transmission delay and the traffic cost.
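To make the dynamic characteristic concrete, the following is a minimal sketch of a popularity-based caching baseline. It is not the DDPG or PPO method proposed in the thesis. The function names, the request logs, and the cache capacity are all hypothetical and chosen only for illustration. The sketch caches the most-requested videos of one time window and shows how the hit ratio can drop when popularity shifts in the next window.

```python
from collections import Counter


def select_cache(requests, capacity):
    """Return the ids of the `capacity` most-requested videos (hypothetical baseline)."""
    counts = Counter(requests)
    # Rank by request count (descending); break ties by id so the result is deterministic.
    ranked = sorted(counts, key=lambda video: (-counts[video], video))
    return set(ranked[:capacity])


def hit_ratio(requests, cached):
    """Fraction of requests that the edge cache can serve directly."""
    if not requests:
        return 0.0
    hits = sum(1 for video in requests if video in cached)
    return hits / len(requests)


# Hypothetical request logs for two consecutive time windows.
window_1 = ["a", "a", "a", "b", "b", "c"]
window_2 = ["c", "c", "c", "d", "d", "a"]

# Cache chosen from the first window, then evaluated on the second window.
cached = select_cache(window_1, capacity=2)
stale = hit_ratio(window_2, cached)

# Cache re-selected with knowledge of the second window's popularity.
fresh = hit_ratio(window_2, select_cache(window_2, capacity=2))

print(f"stale cache hit ratio: {stale:.2f}")
print(f"fresh cache hit ratio: {fresh:.2f}")
```

A static ranking like this must be recomputed whenever popularity shifts, and it does not handle a catalogue whose size changes over time. Those are the dynamic and variable-dimensional characteristics that the abstract says the learned methods are designed to address.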
Keywords/Search Tags:Video caching selection, Deep reinforcement learning, Deep deterministic policy gradient, Proximal policy optimization, Edge computing